Posted on Leave a comment

Pandoc, Markdown, XeLaTeX, EPUB

An EPUB document is essentially a zipped collection of files: HTML, CSS, images, and a few XML files. There are several ways of organizing these, but the most straightforward is one HTML document for each chapter (or section), a set of images organized in a subfolder, and a few metadata files describing the collection. An EPUB can be even simpler and consist of a single HTML file, no images, and a few metadata files.
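To make the "single HTML file plus a few metadata files" case concrete, here is a rough shell sketch that lays out such a collection on disk and, if the zip tool is installed, packs it. The function names, the OEBPS/ directory, chapter1.xhtml, the title, and the identifier are all placeholders invented for this example, and the result has not been run through an EPUB checker.

```shell
#!/bin/sh
# Sketch of the smallest layout described above: one HTML file, no images,
# and a few metadata files. All names and values here are placeholders.

make_epub_skeleton() {
    root="$1"
    mkdir -p "$root/META-INF" "$root/OEBPS"

    # Marker file identifying the archive as an EPUB (no trailing newline).
    printf '%s' 'application/epub+zip' > "$root/mimetype"

    # Metadata: tells the reader where the package file lives.
    cat > "$root/META-INF/container.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
EOF

    # Metadata: the package file lists every file and the reading order.
    cat > "$root/OEBPS/content.opf" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2018-11-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
EOF

    # Metadata: the navigation document (table of contents).
    cat > "$root/OEBPS/nav.xhtml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>Contents</title></head>
<body>
  <nav epub:type="toc"><ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol></nav>
</body>
</html>
EOF

    # Content: the single HTML document.
    cat > "$root/OEBPS/chapter1.xhtml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Chapter 1</title></head>
<body><h1>Chapter 1</h1><p>Text of the only chapter.</p></body>
</html>
EOF
}

pack_epub() {
    root="$1"
    out="$2"   # use an absolute path, since the function changes directory
    # mimetype goes in first and uncompressed; everything else is compressed.
    ( cd "$root" &&
      zip -X0 "$out" mimetype &&
      zip -Xr9D "$out" META-INF OEBPS )
}

# Usage: make_epub_skeleton ./book && pack_epub ./book "$PWD/book.epub"
```

A multi-chapter book follows the same pattern, with one more manifest item, spine entry, and nav link per chapter file.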



Image / Scaling / Compression

Size matters, and the smaller the better, when it comes to the generation, modification, transmission, and storage of information. The vast number of unoptimized documents and images on my very own local storage, never mind what we send and receive all the time, is astounding. The idea that we need 100 GB or 1 TB of storage (thank you Dropbox, not) is sheer waste and sloth. I've addressed these issues a bit in the past, but it is time to take a bigger-picture approach.

Note that this refers not only to images but also to what are essentially collections of images, namely PDF documents and video.



The Problem with H1

H1 and Content Boundaries on the Web and EBook Publications


NOTE: H1 isn't really a problem; the thing is that it defines a chapter, not a work (though a chapter could be considered a work in the sense of songs being considered works that are part of a collection). H1 is a problem when using a Markdown linter and concatenating all chapters into a single document (which makes more sense than one might think when working in tools such as VSCode).


There is a problem embedded in EPUB, which is that it is normally composed of several HTML documents, one per chapter. Converters that create EPUBs (such as Pandoc) split the source on H1, so that each H1 signifies the beginning (and title) of a new HTML document (which is a chapter).

However, as we know, an HTML document itself (especially on the web) should have only one H1. Therefore, if the native (single) document being edited is itself a book, that single document will have multiple H1s embedded within it.

This means there is a basic disconnect between a book being an ODT file, an HTML file, or even a PDF, and a book being an ebook.
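The H1-as-chapter-boundary idea can be illustrated with a few lines of shell and awk that cut a single concatenated Markdown file into one file per chapter. This is only a sketch of the principle, not how Pandoc does it internally: the function name split_at_h1 and the chapter-NN.md naming are invented here, only ATX-style headings (a line starting with "# ") are recognized, and a shell comment inside a code block would be mistaken for a heading.

```shell
#!/bin/sh
# Sketch: split one concatenated Markdown book into per-chapter files at each H1.
split_at_h1() {
    src="$1"      # the single-file book
    outdir="$2"   # directory that receives the chapter files
    mkdir -p "$outdir"
    awk -v dir="$outdir" '
        # A line starting with "# " is an ATX-style H1: begin a new chapter file.
        /^# / {
            if (out != "") close(out)
            n++
            out = sprintf("%s/chapter-%02d.md", dir, n)
        }
        # Anything before the first H1 is dropped in this sketch.
        out != "" { print > out }
    ' "$src"
}

# Usage: split_at_h1 book.md chapters/
```

As far as I know, Pandoc exposes the heading level it splits on as an option (--epub-chapter-level in older releases, --split-level in 3.x), so the level-1 default can be changed.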

Baker & Taylor require EPUBs to have only one H1, which is the title of the work, and everything else H2 (e.g., chapter headers). However, the spec and common use have an H1 for each chapter.
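One way to get from the H1-per-chapter convention to the single-H1 form is to demote every heading by one level and put the title of the work on top as the only H1. A rough sketch with sed, assuming ATX-style (#) Markdown headings; the function name demote_headings is made up for this example, and lines starting with # inside code blocks would be shifted as well.

```shell
#!/bin/sh
# Sketch: emit a single-H1 version of a Markdown book.
demote_headings() {
    title="$1"   # title of the work, becomes the only H1
    src="$2"     # Markdown source with one H1 per chapter
    printf '# %s\n\n' "$title"
    # Add one "#" to headings of level 1-5; level-6 headings are left alone.
    sed 's/^\(#\{1,5\}\) /#\1 /' "$src"
}

# Usage: demote_headings "My Book" book.md > book-single-h1.md
```

Recent Pandoc releases have a --shift-heading-level-by option that handles the shifting part, if that is the tool in use.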



Pushing the problem from H1 to Meta Title gives the same problem: An epub ebook has multiple html documents (one per chapter). On the web, there should be one and only one H1 (for Google's purposes, and possibly in the HTML spec). The Meta Title is not (necessarily) displayed to the user (though browsers traditionally put it into the browser title bar as well), whereas the H1 does get displayed to the user, so these definitely have two different uses in terms of user/display.

The problem I see comes when people want to explicitly tag H1, H2, etc., and your application decides if and when it will override them. This is (partially) what I mean by having semantics (markup via markdown/copymarkup) be primary. By not having this and dealing with HTML export/native file format, you put all the control into your application, but take it away from the user and the documents.

Further, I believe that documents themselves should not be the top level, but collections of documents (libraries).

This allows people to have a single editor instance and navigate across multiple documents, books, and book elements, which is very fast when bouncing around between various documents. Granted, it can slow down on load and save if the entire structure is written out.

This helps further define things such that a book is not a single document, but a collection of documents (the idea of "book" being a container). Not only does this work with EPUB thinking but also with website thinking. A website is a collection of documents, but not a document itself (it is an address). This idea also fits the fact that each web page has its own address, as well as a meta title and an H1. In essence this means that a web page is a chapter in a website (book).

This means that EPUBs and websites are on the same page, as it were, whereas PDF and ODT are (or rather, can be) at odds insofar as a single instance can be a collection of chapters (and accompanying images, usually including a cover image).


Note that in a collection, if the title of the collection is an H1, there is still the problem of each work having either a single H1 or multiple H1s. How to parse this as a tree is interesting both in the production of EPUBs and of PDFs. For a multiple-document (book) collection with a generated ToC, there are many possibilities, such as:

  • Title Page (hidden, unlisted, unnumbered)
  • Copyright (hidden, unlisted, numbered)
  • ToC (hidden, unlisted, numbered)
  • Preface / Introduction
  • FIRST BOOK
    • Title Page (hidden, unlisted, unnumbered)
    • ...
  • SECOND BOOK

Telegram for Social Networking

Telegram is a great chat app, but there is both more and less to it than, say, Twitter and Facebook. The first thing is that much of the gamification of likes and thumbs-up is gone. Want to know if someone read your post? That has to be done either via direct message or in a group (and the person has to respond). Recently, new APIs have appeared that enable discussions on posts, as well as connecting channel posts as announcements in groups.

Types of Accounts in Telegram

There is a single namespace in Telegram for all entities: users, channels, groups, and bots. Users are individual accounts tied to a phone number (I think that is mandatory). Telegram Channels are one-way broadcast accounts, which can have multiple admins (but messages are signed by the channel). Membership in channels is unlimited. Telegram Groups can include up to 200,000 users, and everyone can post.

Using Bots for Commenting and Discussion

Note that for feedback on channel posts one can add a like bot or other such simple feedback, or add a discussion group and put that information in the channel description. A third, newer option is a comment system using an app which is also available on the web as a preview (without logging into Telegram). The preview bot that does this works nicely and shows off what kind of API/developer support Telegram offers.

No Manipulation or Advertising

Telegram is free of the constant intrusion, 99% of it annoyance in the form of timeline distortion and advertising, found in Facebook and Instagram (and to some extent Twitter, which is going down that same path).

Essentially, the use of channels with comments can replace any given social network (other limitations apply), such as Twitter, Facebook, and Instagram. While those platforms still have the lion's share of engagement and users, moving over to the Telegram way of things makes sense.

Telegra.ph for Longform

Telegra.ph is a longform microblog platform which is very simple and also has zero advertising. There is a nice Telegraph App in the Google Play store.

Installing Telegram

For the Linux and ChromeOS world, the options are: Telegram Desktop (for Linux) and Telegram Android App (for ChromeOS).


Podcast Platforms

Podcasting is growing (slowly) and offers a great opportunity for brand engagement. Generally free, the idea is to be where the audience already is, and have a reliable host for content and the rss feed.

Media and RSS Hosting

Google Podcasts and Google Play Music Podcasts

Note, these are two different things:

First Thing
- Google Podcast (part of Google Search)
- Google Podcast Publisher Tools
- Google Podcasts App

Second Thing
- Google Play Music Podcasts

Pocket Casts (#4 platform)

Stitcher (#3 platform)

Spotify (#2 platform)

iTunes/Apple Music (#1 platform)

WordPress Plugins


Epub Editing Tools

Tools change over time, but it seems that in the Epub world we have more of the same. As of November 2018:

- Calibre's Epub Editor is pretty nifty
- Sigil development stalled, then picked up again
- Pagina Epub Checker is still under development and useful
- Pandoc with or without some kind of TeX, LaTeX, or XeLaTeX (the last one is better for font support)

Things haven't really changed much over the past X years. Certainly not since the 2017 note on Epub tools.

Some Pandoc Resources


Kindle Paperwhite 4th Gen

I've used a Kindle since the Kindle Keyboard (3rd gen), and since then purchased and used the DX for a while (the much larger model). On 06 September 2012 the Kindle Paperwhite was released and I registered mine on 10 September. I broke that model within six months by wedging it in a bag that had too many objects in it, but Amazon sent out a replacement free-of-charge (which included free shipping, and I live outside the United States).



Dokuwiki – The Canonical Wiki

Dokuwiki, over the last 10 years, has become the canonical wiki. By this I mean that Dokuwiki is the go-to wiki for most uses. While there are many other wikis which are popular and in use (e.g., Xwiki, MoinMoin, TikiWiki, etc.), the competitors (other than Mediawiki) do not exceed half of Dokuwiki's popularity. The only real competitor in terms of global mindshare is Mediawiki, and the only reason for that is of course Wikipedia and the other wiki properties run by the Wikimedia Foundation. Since Mediawiki is pretty much a shit show when it comes to management and resource consumption, Dokuwiki is the winner by default. Even with such a behemoth as a competitor, Dokuwiki has reached the point where it has more than half the generic searches in Google worldwide compared with Mediawiki.



Caret vs. Caret – A Tale of Two Editors

Caret the Chrome App vs. Caret the PC App -- not sure which came first, but they are very different (except for the name, and the fact they are open source).

Caret the Chrome App

Note that Caret may possibly replace Atom in my workflow.

- Caret in the Chrome Store
- Caret website
- Caret source on Github
- Caret wiki on Github

Caret the PC App (Linux, OSX, Windows)

Note that while Caret the Chrome App may possibly replace Atom, Caret the PC App has some great built-in Markdown display (it is Markdown-focused rather than general-text-editor-focused).

- Caret.io website
- Caret on Github
- Caret wiki on Github
- Caret on Twitter


Grav CMS on Debian

This post will be frequently (or infrequently) updated. It is meant to help me learn Grav and Gravcart, and in particular migrate off of WordPress and Woocommerce.

Related Articles in Debian Services and Applications

- Debian on AWS Lightsail
- OpenVPN on Debian + UFW Firewall
- Nginx and Letsencrypt on Debian
- PHP & MariaDB on Debian
- Grav CMS on Debian

Grav, Gravcart vs. WordPress, Woo

WordPress and Woocommerce have such overhead, including dependencies such as MySQL, that it is important to seek out a functional but higher performing option to manage modern websites and web storefronts.

Installing and Configuring Grav

The best approach is to download the Grav + Admin zip file, unzip it, and move the contents to the webroot. I've had issues with using GitHub and Composer, so the zip file is a less problematic place to start.

... details to come ...

Finally, run bin/grav install to get plugin and theme dependencies:

bin/grav install

File Rights

I've found that permissions get jammed every now and then. Overwriting them with a script is the easiest approach, as follows:

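# Hand ownership of the whole webroot to the web server user and group
chown -R www-data:www-data /var/www/WEBROOT
# Directories: rwxrwxr-x plus setgid (the leading 2), so new files inherit the group
find /var/www/WEBROOT -type d -exec chmod 2775 {} \;
# Redundant after 2775, but harmless: make sure setgid is set on every directory
find /var/www/WEBROOT -type d -exec chmod g+s {} \;
# Regular files: rw-rw-r--
find /var/www/WEBROOT -type f -exec chmod 0664 {} \;
# The Grav command-line scripts in bin/ must stay executable
find /var/www/WEBROOT/bin -type f -exec chmod 0755 {} \;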
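
Before overwriting everything, it can help to see which files have actually drifted. A small sketch, assuming the same target modes as the commands above (0664 for regular files, 0755 for files under bin/); the function name list_odd_files is hypothetical, and it only reads modes, so it changes nothing.

```shell
#!/bin/sh
# Sketch: list files whose mode differs from what the permission fix would set.
list_odd_files() {
    root="$1"   # webroot, given without a trailing slash
    # Files outside bin/ are expected to be 0664.
    find "$root" -path "$root/bin" -prune -o -type f ! -perm 0664 -print
    # Files under bin/ are expected to be 0755.
    if [ -d "$root/bin" ]; then
        find "$root/bin" -type f ! -perm 0755 -print
    fi
}

# Usage: list_odd_files /var/www/WEBROOT
```

An empty result means the file modes already match and only ownership might need attention.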

Resources for Grav and Gravcart