Posted on Leave a comment

The Problem with H1

H1 and Content Boundaries on the Web and EBook Publications


There is a problem embedded in EPUB: a book is normally composed of several HTML documents, one per chapter. However, tools that create EPUBs (such as Pandoc) split the source on H1, so that each H1 marks the beginning (and title) of a new HTML document (which is a chapter).
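To make the splitting concrete, here is a minimal sketch of cutting one HTML body into per-chapter chunks at each H1. It is not how Pandoc does it internally; `split_on_h1` is a hypothetical helper, and it uses a regular expression rather than a real HTML parser, so it assumes simple, well-formed markup.

```python
import re


def split_on_h1(html: str) -> list[tuple[str, str]]:
    """Split one HTML body into (title, chunk) pairs, one per <h1>.

    Hypothetical helper for illustration only. Anything that appears
    before the first <h1> is dropped.
    """
    # Offsets of every opening <h1> tag (closing </h1> tags do not match).
    starts = [m.start() for m in re.finditer(r"<h1\b", html, flags=re.IGNORECASE)]
    chapters = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(html)
        chunk = html[start:end].strip()
        # The chapter title is the text inside the first <h1>...</h1>.
        found = re.search(r"<h1\b[^>]*>(.*?)</h1>", chunk, flags=re.IGNORECASE | re.DOTALL)
        title = re.sub(r"<[^>]+>", "", found.group(1)).strip() if found else ""
        chapters.append((title, chunk))
    return chapters


book = "<h1>One</h1><p>First.</p><h1>Two</h1><h2>Part</h2><p>Second.</p>"
chapters = split_on_h1(book)
for title, chunk in chapters:
    print(title, len(chunk))
```

Each chunk would then be written out as its own HTML file inside the EPUB, which is exactly where the "several H1s in one source document" conflict comes from.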

However, as we know an HTML document itself (especially regarding the web) should only have one H1. Therefore if the native (single) document being edited is itself a book, then the single document will have multiple H1s embedded within it.

This means there is a basic disconnect of a book being an ODT file or HTML file or even a PDF vs. being an Ebook.

Baker & Taylor require epubs to have only one H1, which is itself the title of the work, and everything else H2 (e.g., chapter headers). However, the spec and common use have an H1 for each chapter.

See:


The good news here is that once a piece of software has parsed the document and has a copy in memory, it's relatively trivial to change all the H1s to H2s, H2s to H3s (and back), so this won't be a problem to deal with, and in UX Write it won't be something the user has to do manually.
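As a rough illustration of how small that renumbering step is, here is a sketch that shifts every heading tag up or down by a fixed offset. `shift_headings` is a hypothetical name, not part of UX Write, and the regex approach assumes plain markup; an editor with the document already parsed would walk its tree instead.

```python
import re


def shift_headings(html: str, offset: int) -> str:
    """Renumber <h1>..<h6> tags by offset (positive demotes, negative promotes).

    Hypothetical helper. Levels are clamped to 1..6, so a round trip is only
    lossless when no heading is pushed past either end. Tag names are
    emitted in lowercase.
    """

    def renumber(match: re.Match) -> str:
        slash, level, rest = match.group(1), int(match.group(2)), match.group(3)
        new_level = min(6, max(1, level + offset))
        return f"<{slash}h{new_level}{rest}"

    # Matches opening and closing heading tags, keeping any attributes.
    return re.sub(r"<(/?)h([1-6])\b([^>]*>)", renumber, html, flags=re.IGNORECASE)


original = "<h1>A</h1><h2 id='b'>B</h2>"
demoted = shift_headings(original, 1)
print(demoted)
print(shift_headings(demoted, -1))
```

Demoting on export and promoting on import would let the user keep thinking in chapters while the file follows whatever the publisher demands.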

Using a single H1 for the book title only is stupid IMHO. That's what the title tag in HTML is for, or you can define a custom CSS class called "title". H1 as far as I'm concerned should be for chapters.

But again, this problem is easy to get around in software, and I could perhaps provide some options for EPUB export where you can specify how you want to handle things like this.


Pushing the problem from H1 to Meta Title gives the same problem: An epub ebook has multiple html documents (one per chapter). On the web, there should be one and only one H1 (for Google's purposes, and possibly in the HTML spec). The Meta Title is not (necessarily) displayed to the user (though browsers traditionally put it into the browser title bar as well), whereas the H1 does get displayed to the user, so these definitely have two different uses in terms of user/display.

The problem I see comes when people want to explicitly tag H1, H2, etc., and your application decides if/when it will do overrides. This is (partially) what I mean by having semantics (markup via markdown/copymarkup) be primary. By not having this and dealing with HTML export/native file format, you put all the control into your application, but take it away from the user and the documents.

Further, I believe that documents themselves should not be the top level, but collections of documents (libraries). This is what Scrivener allows for, where you can define which part of a document collection tree is the top level for an export/compilation.

What this allows for is people to have a single editor instance and navigate across multiple documents, books and book elements. This is very fast when wanting to bounce around between various documents. Granted it can slow down on load and save if the entire structure is written out, but that is not much of a problem in Scrivener.

This helps further define things such that a book is not a single document, but a collection of documents (the idea of "book" being a container). Not only does this work with epub thinking but also website thinking. A website is a collection of documents, but not a document itself (it is an address). Also, this idea helps out that each web page itself has an address, as well as meta title and h1. In essence this means that a web page is a chapter in a web site (book).

This means that epubs and websites are on the same page, as it were, whereas PDF and ODT are (or rather, can be) at odds insofar as a single instance can be a collection of chapters (and accompanying images, including usually a cover image).


Yes I've been having a look at the EPUB spec and realised this is something I'll have to deal with, particularly if I have support for opening existing EPUB files and editing them (as opposed to just exporting).

My understanding was that H1 was intended for top-level sections of a document, where "top-level section" can have different meanings depending on the type of document. For example in a book this would be chapter (or for a really large book, possibly even part), and for a smaller article this would be section. In LaTeX for example there is the book class and the article class; in the former you have \part{...} and \chapter{...} commands, and \section{...} and \subsection{...}. The article class only has the latter two.

So it becomes a matter of mapping what is in the file to the different levels of headings, and this could be different for different file formats (or variations thereof). For example, when importing a LaTeX file, it would look at the document class to determine whether H1 is chapter or part, or whether H1 should be section instead.
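A sketch of what that mapping could look like for the two LaTeX classes mentioned. The function name and the `use_parts` switch are made up for illustration; a real importer would have more commands and classes to consider.

```python
def heading_levels(document_class: str, use_parts: bool = False) -> dict[str, str]:
    """Map LaTeX sectioning commands to HTML heading tags.

    Hypothetical helper: 'book' starts at chapter (or part when use_parts
    is set); any other class is treated like 'article' and starts at section.
    """
    if document_class == "book":
        commands = ["chapter", "section", "subsection"]
        if use_parts:
            commands.insert(0, "part")
    else:
        commands = ["section", "subsection"]
    # The first command in the list becomes h1, the next h2, and so on.
    return {command: f"h{level}" for level, command in enumerate(commands, start=1)}


print(heading_levels("book"))
print(heading_levels("book", use_parts=True))
print(heading_levels("article"))
```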

Fortunately there are two things which work in our favour regarding this:

  1. The H* tags are semantic only; their appearance can be customised totally. So you could quite reasonably use H2 as the tag for chapters, but have the style name as displayed in the UI as "Chapter", with H3 displayed as "Section" and H1 as "Title". This way the user wouldn't have to think in terms of the HTML tag names but rather their meaning. Also the outline navigator could be configured to start at H2 instead of H1 in this case. So there's potential for providing flexibility here as to how the different levels are presented in the UI vs. what specific tag names are used in the file.

  2. As I mentioned before, it's possible to "push up" or "pull down" the different heading tags by renumbering on load/save, if necessary. This is something which could be provided as an option to the user, or perhaps configured in a template/profile where you set up how you want your epub file structured (including single file vs. multiple files), depending on publisher's requirements.

To date, UX Write has been designed around the concept of working with everything within a single document. It's certainly capable of doing this in terms of performance, but with EPUB this throws a bit of a spanner in the works because some EPUB books use separate files. Also I've had a few people raise the desire to view individual sections by themselves, similar to what you have in Scrivener. So it might be worthwhile expanding this to allow you to work with multiple files that are all in the one "package", much like Scrivener does. For some types of writing you just want a single document, but for others (esp. books) you want multiple documents packaged together. So it would be good to support both approaches.

I'd be very keen to get your input on the UI for this, what sort of options should be provided, and how we could come up with something that gives the right level of control to authors about how their document is structured. And perhaps with the market research work we've discussed part of what you could do is present some UI mockups to various authors and get their input. What do you think?


Yes, I am interested in doing work in this area. It is a problem area I am working out myself as a "lead user". http://en.wikipedia.org/wiki/Lead_user

For me the key is to work out problems in the areas mentioned that will not cause problems in other areas, or indeed that can help lead to solutions in those areas.

For example: self-publishers also need a website. Some of their content (usually a few chapters) is hosted on their website. Exporting the document to support markdown and/or html is fairly straightforward if the epub is taken as the model (one html page per chapter). Also, conceptually, the idea that a website is a book and a web page is a chapter does make some sense (though some web page "articles" are multi-page, that is usually for advertising revenue or attention-tracking or SEO rather than readability/usability).

Also, the ability to edit on the website and have those changes pushed back into the source documents (or some kind of synchronization) would be helpful (though not essential).

Another issue: when using Google Docs for collaboration with writers and editors, we always created one document per chapter for editing purposes. So this makes sense as well.

As far as I know, most content management systems for documents are collections of files, in some kind of nested folder structure. Most are too much trouble to deal with, other than making files available in the cloud for backup and sharing (Dropbox, etc.).

The Calibre ebook library software, which is tedious to use (essentially a searchable but single list as a collection, but using tags and authors for grouping purposes, kind of like another itunes) does have the advantage of being able to import multiple versions, create multiple versions, and transfer ebooks between a desktop and an ebook reader device.



Paypal vs. Stripe

Paypal Sucks

Paypal sucks, as everyone knows. It has one benefit for the consumer (those buying with paypal) and that is their dispute resolution, which I've used perhaps ten times and have never lost. Dispute resolution with credit cards and banks is much, much more difficult and harder to win.

For the seller, however, Paypal is horrible. Their rates are higher, their exchange rates typically add on 5%, and they also have fees for transfer to one's bank (in Thailand that is now 50 THB, though it used to be free for transfers over a certain (small) amount).

Stripe - A knight in shining armor

Stripe is the great competitor. It is available in far fewer countries than Paypal, but in places where it matters, such as Vietnam, India, and the Philippines, Paypal is still largely unavailable for use. Stripe is available in India, but in Southeast Asia only in Malaysia and Singapore. That said, Stripe has a way for those operating in unsupported countries to be fully functional: setting up a US corporation, with an easier banking setup. This service is called Stripe Atlas, and it is in fact much better than anything Estonia is marketing under its misguided and mischaracterized e-residency, which is neither a residency nor any kind of incorporation or accounting infrastructure (basically a chip-and-PIN ID that works poorly, if at all).

If we look at fees for an organization working out of Thailand (that repatriates funds in Thai Baht), here are some numbers:

Example: 3 transactions of $33 USD/month

Paypal fees

$ 5.25 transaction fees
$ 5.00 currency conversion fee
฿ 50.00 bank transfer fee (~ $ 1.75)

Total: $ 12.00 USD ≈ 12 % of $ 99

Stripe + Transferwise fees

$ 3.78 transaction fees
$ 0.24 currency conversion fee (mid-market rate)
฿ 69.00 bank transfer fee (~ $ 2.20)

Total: $ 6.22 USD ≈ 6.3 % of $ 99

Note that the Stripe transactions scale much better as the currency conversion fee is fixed unlike the Paypal currency conversion. In addition, as we see the Stripe fees are lower for any given transaction.
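For anyone who wants to rerun the comparison with their own numbers, here is a small sketch of the arithmetic. The fee amounts are the ones listed above (with the bank transfer fees already converted to USD as shown); `fee_summary` is just an illustrative helper, and the percentage is taken against the actual $99 gross rather than a rounded $100.

```python
def fee_summary(fees_usd: list[float], gross_usd: float) -> tuple[float, float]:
    """Return (total fees in USD, fees as a percentage of gross)."""
    total = round(sum(fees_usd), 2)
    percent = round(100 * total / gross_usd, 2)
    return total, percent


gross = 3 * 33.00  # three transactions of $33

# transaction fees, currency conversion fee, bank transfer fee (in USD)
paypal = fee_summary([5.25, 5.00, 1.75], gross)
stripe = fee_summary([3.78, 0.24, 2.20], gross)

print("Paypal:", paypal)
print("Stripe + Transferwise:", stripe)
```

On these figures Paypal comes out at roughly twice the cost of Stripe plus Transferwise.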

Other reasons Paypal is worse than Stripe

  • Paypal transfer payments take a long time. Of course for no good reason other than that Paypal wants the use of other people's money for longer (4-7 business days in Thailand).
  • Paypal doesn't deal with Transferwise or Payoneer banks, and refuses to send/receive money to them. No good reason here, other than seeing these fintech companies as competition and making their customers suffer for using them.
  • Paypal technical support/customer service is notoriously poor.

Resources


WooCommerce Paypal Settings

Updated July 2019 - I no longer use Paypal, as it is roughly twice as expensive as Stripe when looking at $100 USD/month in transactions (more than 10% with transaction fees and currency conversion), and without currency conversion it is still 30% more when only looking at transaction fees (usually because of the international transaction fees which are higher). If one is not a US national or without a US company, then use Stripe Atlas to incorporate and get banking set up.

There are several areas within the Paypal site and Woocommerce settings tabs that need to be configured to get everything working properly, including auto-return, IPN notifications, etc.


Telegram for Social Networking

Telegram is a great chat app, but there is both more and less to it than, say, Twitter and Facebook. The first thing is that a lot of this gamification of likes/thumbs-up is gone. Want to know if someone read your post? That has to be done either via direct message, or in a group (and the person has to respond). Recently there are new APIs that help enable discussions on posts, as well as connecting channel posts as announcements in groups.

Types of Accounts in Telegram

There is a single namespace in Telegram for all entities: users, channels, groups, and bots. Users are individual accounts tied to a phone number (I think that is mandatory). Telegram Channels are one-way broadcast accounts, which can have multiple admins (but messages are signed by the channel). Membership in channels is unlimited. Telegram Groups can include up to 200,000 users, and everyone can post.

Using Bots for Commenting and Discussion

Note that for feedback on channel posts one can add a like bot or other such simple feedback, or add a discussion group and put that information in the channel description. A third new option is to have a comment system using an app which would also be available on the web as a preview (without logging into Telegram). The preview bot that does this works nicely and shows off what kind of API/developer support Telegram offers.

No Manipulation or Advertising

Telegram has none of the constant intrusion (99% annoyance) of timeline distortion and advertising found in Facebook and Instagram (and to some extent Twitter, which is going down that same path).

Essentially, the use of channels with comments can replace any given social network (other limitations apply), such as Twitter, Facebook, and Instagram. While those platforms still have the lion's share of engagement and users, moving over to the Telegram way of things makes sense.

Telegra.ph for Longform

Telegra.ph is a longform microblog platform which is very simple and also has zero advertising. There is a nice Telegraph App in the Google Play store.

Installing Telegram

For the Linux and ChromeOS world, the options are: Telegram Desktop (for Linux) and Telegram Android App (for ChromeOS).


Syncthing = Dropbox & GDrive Alternative

Syncthing

Google Drive (GDrive) and other cloud storage alternatives such as Dropbox and Microsoft OneDrive all have the serious drawback of keeping one's information in a third party cloud repository. Privacy and security are generally compromised this way, even when paying for storage (as opposed to having an advertising model, which is worse in many ways).



Delete Site Cache from Chrome

Chrome, why are you such crap at simple things? I need to delete the cache/cookies from a single website. It appears impossible these days. There is an odd workaround, as follows:

> Three Dots 
    > Advanced 
        > Content Settings 
            > Cookies 
                > See All Cookie and Site Data 
                     > {Search for site} 
                          > Remove All Shown

For good measure, also go do:

> Three Dots 
    > Advanced 
        > Clear Browsing Data 
            > Cached Images and Files (only)

Yeah, what a joke. I sure wish there was an extension/plugin that would allow for a single click, but not that I can find.


Tidying Up Digitally

Marie Kondo is an expert on tidying a house. Her Netflix series Tidying Up with Marie Kondo and two books (both of which are worth reading, best in chronological order) are best-sellers:



Podcast Platforms

Podcasting is growing (slowly) and offers a great opportunity for brand engagement. Generally free, the idea is to be where the audience already is, and have a reliable host for content and the rss feed.

Media and RSS Hosting

Google Podcasts and Google Play Music Podcasts

Note, these are two different things:

First Thing

  • Google Podcast (part of Google Search)
  • Google Podcast Publisher Tools
  • Google Podcasts App

Second Thing

  • Google Play Music Podcasts

Pocket Casts (#4 platform)

Stitcher (#3 platform)

Spotify (#2 platform)

iTunes/Apple Music (#1 platform)

WordPress Plugins


Generic Roadmap

This is meant to be a reminder of important issues/decisions that already have some thought put in them (usually by others).

Stick with what we know in the marketing channels we know. Expand products, and channels for those products.


DNS Records and Services

First, there are two kinds of DNS records: those for client lookup, and those for a server.

Client Lookup

I don't trust Google DNS, though for a while it was the go-to DNS, and easy to remember at 8.8.8.8 and 8.8.4.4. For privacy, for me, there are two options, with the first being just better:

  • dns.watch: 84.200.69.80 / 84.200.70.40
  • 1.1.1.1 / 1.0.0.1

If one wants some security (as a service), then Quad9 is worth a look.

DNS Services

There are several DNS services to choose from. Dyn and related companies are the worst. Free DNS services such as afraid.org and he.net are unreliable, or simply not reliably fast. It makes the most sense to go with a top-rated DNS service (highly available and fast resolve times), and pay for this service (though less is more when it comes to expenses).

  • DNSmadeEasy.com: Silly name, $30/year for 10 domains, fast and reliable. Generally in the top 10 of private resolvers. I've not found better/faster for cheaper.

DNS Records

NS Records

There are several records to worry about. The first are nameservers, which are put into the registrar database. This can be as few as two or as many as six (possibly more).

A Records

Depending on the DNS server, these can have wildcards or not. Generally there are at least three A records to have:

  • Root domain
  • www subdomain
  • * wildcard

Certain services require a www. host, and people also mistype this, so it is best to have it as a (sub)domain, to have it on the SSL certificate, and to have a redirect from www. to the root domain.

CNAME Records

Usually only Bing Webmaster Tools requires a CNAME record. Otherwise these are generally worthless.

MX Records

These are for the mail server. Usually a few are needed: one plus two backups. Gsuite has five records, but that is overkill; the top three make the most sense. Each record also has a priority number (e.g., 1, 5, 10): the lowest number is tried first, and records with the same priority are tried round-robin style.

  • 1, aspmx.l.google.com.
  • 5, alt1.aspmx.l.google.com.
  • 5, alt2.aspmx.l.google.com.
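A quick sketch of how a sending server would order those three records, assuming the simplest possible reading of the priority numbers. `delivery_order` is an illustrative helper, not a real mail library call; real mail servers typically also randomize among hosts that share a priority, which this sketch does not do.

```python
# (priority, host) pairs as listed above, deliberately out of order.
mx_records = [
    (5, "alt1.aspmx.l.google.com."),
    (1, "aspmx.l.google.com."),
    (5, "alt2.aspmx.l.google.com."),
]


def delivery_order(records: list[tuple[int, str]]) -> list[str]:
    """Return hosts sorted by priority, lowest number first.

    sorted() is stable, so hosts with equal priority keep their input order.
    """
    return [host for _, host in sorted(records, key=lambda record: record[0])]


for host in delivery_order(mx_records):
    print(host)
```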

TXT Records

TXT records are the go-to place for every third party to put their info. Several examples of TXT records include:

  • Yandex Webmaster Tools validation
  • Google Webmaster Tools/Analytics/GSuite/etc. validation
  • _acme-challenge records for DNS-based authentication for LetsEncrypt

PTR Records

PTR records are essentially a reverse so that an IP address is associated with a host.domain.tld. This is key for sending email.
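The name that a PTR record lives under is just the IP address written backwards under a special zone. Python's standard `ipaddress` module can show this; the address below is simply one of the dns.watch resolvers used as a sample, and `ptr_name` is an illustrative wrapper.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the reverse-lookup name under which the PTR record for ip is stored."""
    return ipaddress.ip_address(ip).reverse_pointer


print(ptr_name("84.200.69.80"))  # 80.69.200.84.in-addr.arpa
```

The PTR record at that name is what points back to host.domain.tld, and it is normally set by whoever controls the IP block (the hosting provider) rather than in the domain's own zone.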

DKIM, SPF, DMARC

These are all records for email security, at various levels. DKIM and DMARC are TXT records, and SPF can be TXT or specific SPF records, depending on the DNS service provider.

  • Setting up Gsuite DKIM, SPF, DMARC
  • Google on DMARC records
  • Test SPF and DKIM
  • Google on SPF
  • DKIM on Gsuite
  • Google: About DKIM

SPF Records

SPF looks like:

host.domain.com / "v=spf1 include:_spf.google.com ~all"

SPF records are among the earliest and easiest email security records to set up, and they state specifically which hosts can send email for the domain.
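To read the record above: `include:` pulls in another domain's allowed senders, and the prefix on the final `all` says what to do with everyone else (`~` is a soft fail, `-` a hard fail). Here is a minimal sketch that picks those two pieces out of the value; `parse_spf` is a hypothetical helper that only understands `include` and `all`, not the full SPF syntax.

```python
# Qualifier prefixes and the result they produce (no prefix means "+").
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record: str) -> dict:
    """Extract include domains and the result of the 'all' mechanism.

    Hypothetical helper: other mechanisms (ip4, a, mx, ...) are ignored.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    includes = []
    all_result = None
    for term in terms[1:]:
        qualifier = term[0] if term[0] in QUALIFIERS else "+"
        mechanism = term.lstrip("+-~?")
        if mechanism.startswith("include:"):
            includes.append(mechanism.split(":", 1)[1])
        elif mechanism == "all":
            all_result = QUALIFIERS[qualifier]
    return {"includes": includes, "all": all_result}


print(parse_spf("v=spf1 include:_spf.google.com ~all"))
```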

CAA Records

These records help tell SSL cert providers which of those providers can generate a cert for the domain records. Each host needs two records:

  • Name (host), Type: iodef, Value: "mailto:address@domain.com"
  • Name (host), Type: issue, Value: "letsencrypt.org"
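DNS control panels all present these fields a little differently, so it can help to see the same two records in plain zone-file form. A small sketch that prints them; `caa_lines` is an illustrative helper, the host name is a placeholder, and the flag value 0 is assumed because it is the usual non-critical setting.

```python
def caa_lines(host: str, issuer: str, report_to: str) -> list[str]:
    """Render the issue and iodef CAA records for host in zone-file syntax."""
    return [
        f'{host} IN CAA 0 issue "{issuer}"',
        f'{host} IN CAA 0 iodef "{report_to}"',
    ]


# "domain.com." is a placeholder host, matching the example address above.
for line in caa_lines("domain.com.", "letsencrypt.org", "mailto:address@domain.com"):
    print(line)
```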