HTML as TeX replacement

Stuart notes Lee Phillips’s critique of HTML compared to TeX.

However, Phillips’s idea of HTML is not quite up to date; he ignores how CSS and SVG combine with HTML to add richer typography. First he complains about hyphenation and ligatures. Hyphenation is in CSS Text Level 3 and is implemented in many browsers, though not yet Chrome. Ligatures are in CSS Fonts Level 3 and supported in many browsers too; Apple has done it for years. Here we have the TeX example, rendered live by your browser, and what Safari Mac [and Firefox Mac] made of it. Note the hyphenation and the ligatures. Also, I took out the spaces around the em-dashes that Lee Phillips oddly put in.

Call me Ishmael. Some years ago​—​never mind how long precisely​—​having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off​—​then, I account it high time to get to sea as soon as I can.
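The hyphenation and ligatures in a paragraph like this come down to a few CSS declarations plus a language tag. What follows is a minimal sketch rather than this page's actual stylesheet: the class name `sample` is made up for illustration, and the prefixed properties are there for browsers that still required them.

```html
<style>
  /* "sample" is a hypothetical class name, not taken from the real page */
  .sample {
    /* automatic hyphenation; the browser needs a lang attribute to pick a dictionary */
    -webkit-hyphens: auto;
    -moz-hyphens: auto;
    -ms-hyphens: auto;
    hyphens: auto;
    /* standard ligatures such as fi and fl */
    font-variant-ligatures: common-ligatures;
    /* lower-level fallback for engines without font-variant-ligatures */
    font-feature-settings: "liga" 1;
    text-align: justify;
  }
</style>
<p class="sample" lang="en">Whenever I find myself involuntarily pausing
before coffin warehouses, and bringing up the rear of every funeral I meet.</p>
```

Whether a given browser actually hyphenates depends on it shipping a hyphenation dictionary for the declared language.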

Next Phillips takes on mathematical equations. His first example is e<sup>iπ</sup> = −1. Note how that was displayed fine inline, just by using <sup>, which has been in HTML for years, along with <sub>, which I used to lower the E in TeX. Writing in UTF-8 means I don’t need a special sequence like \pi for π.
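As a sketch of the markup involved (the wrapping `<p>` and `<i>` elements are illustrative choices, not necessarily what this page uses):

```html
<!-- Euler's identity inline: the exponent is just a <sup> element -->
<p><i>e</i><sup><i>i</i>π</sup> = −1</p>

<!-- The TeX logo with a lowered E via <sub>; the last letter is a Greek capital chi -->
<p>T<sub>E</sub>Χ</p>
```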

Phillips is right that doing more complex equation layout in pure HTML is difficult. Fortunately, we do have SVG for arbitrarily precise positioning of text and graphics. I took his example of Stokes' theorem and put it through Troy Henderson's LaTeX Previewer (which I found by googling 'tex to svg'). Here we are:


Here is the elementary version of Stokes' Theorem:
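For reference, LaTeX input for this equation would look roughly like the following. This is a reconstruction from the SVG text version further down, not necessarily Phillips's exact source, and it leaves out the bold styling of the surface symbol.

```latex
\[
  \int_{\Sigma} \nabla \times \mathbf{F} \cdot d\Sigma
  = \oint_{\partial\Sigma} \mathbf{F} \cdot d\mathbf{r}
\]
```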


Now, the SVG there, though scalable, is not ideal: it renders as paths, not characters. If I use SVG text, I can get it selectable:

∫𝝨 ∇×𝐅∙𝑑𝝨 = ∮∂𝝨 𝐅∙𝑑𝐫

Here's the SVG code for that. You can see the tighter control.

<svg xmlns="http://www.w3.org/2000/svg" width="200" height="44" >
<text x="0" y="30" style="line-height:125%; font-size:18px; font-family:Serif;">
 <tspan style="font-size:36px;">∫</tspan>
 <tspan style="font-size:12px;" baseline-shift="sub">𝝨</tspan>
 ∇×𝐅∙𝑑𝝨 = <tspan style="font-size:36px;">∮</tspan>
 <tspan style="font-size:12px;" baseline-shift="sub" dx="-7px">∂𝝨</tspan> 𝐅∙𝑑𝐫
</text>
</svg>

Here is how Chrome Mac and Safari Mac render this:

However, you may not see all the glyphs, as I am using the special Unicode characters for Mathematical letters, and your browser or device may not have those. (Update: I included the STIX font, so you should see them now.) Here's a version with ordinary Latin and Greek letters:

∫Σ ∇×F∙dΣ = ∮∂Σ F∙dr
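Including a web font for the Mathematical letters, as the update above describes, could be sketched like this. The font URL, the family name and the `math` class are placeholders for illustration, not the ones this page uses.

```html
<style>
  /* Placeholder family name and file path: substitute a real STIX font file */
  @font-face {
    font-family: "STIX Math Sketch";
    src: url("fonts/stix-math.woff") format("woff");
  }
  /* Inline SVG text can be styled from the page's stylesheet */
  svg text.math {
    font-family: "STIX Math Sketch", serif;
  }
</style>
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="44">
  <text class="math" x="0" y="30" font-size="18">∇×𝐅∙𝑑𝝨</text>
</svg>
```

If the font fails to load, the browser falls back to `serif`, and the missing-glyph problem described above comes back.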

Phillips may be superficially right that HTML doesn't give as much typographic control as TeX, but when you compare to the full web suite, including CSS and SVG, that conclusion can't be sustained. Indeed, even his point about macros could be addressed by using JavaScript as well, though I prefer my web pages to be declarative.

That said, many of the CSS specs I have linked to are still being edited, so this is a good time to try out authoring your mathematical papers that way and possibly proposing changes.

Update 2015-10-09

Lee Phillips has posted a reply.

There is a clash of worldviews going on here, that reminds me of this discussion. TeX and HTML have a lot in common, in that they are both plaintext authoring environments for documents, but they have philosophical differences too, in that TeX is meant to be compiled into a specific pagination, and HTML is meant to flow dynamically. I'm sorry that my initial article came off as rhetorical point-scoring; I was genuinely trying to work out which parts of the HTML+SVG+CSS+JS toolkit were in a state to represent maths well.

What I was clumsily trying to show was that HTML+SVG output from TeX could be better than the current default, which is HTML+bitmaps, or PDF. The way we get better support for ligatures, hyphenation and justification in browsers is by trying them out, seeing where they go wrong and sharing these cases with the spec editors and browser writers.

Now to address Phillips's specific points there. Yes, I am updating markup in this post. My initial pass at representing equations with SVG text looked good in the Mac browsers I was using, and on my Chromebook, but that was relying on font substitution for the Mathematical letters, which I later realised were rare in actual Unicode fonts. The edit history is in my GitHub repository. Apologies for not making that clear. Do send me a pull request if you have further edits.

My choice of "Hoefler Text" as default typeface for this site has also apparently been causing layout problems: John Lenton got all caps on Ubuntu, and Firefox seems to use really wide spaces for it. I've put in 'Computer Modern Serif', which Firefox seems to prefer. HTML font handling is fiddly and annoying; I now appreciate TypeKit a lot more.

Our em-dash differences are clearly one of those typographic holy wars, like Oxford commas, that generate more heat than light; the Gutenberg text I took the Moby Dick from was without spaces.

I have added the DOCTYPE and lang declarations to encourage Firefox to hyphenate more; that and the typeface change seem to have helped somewhat, but it is still setting the spacing loosely. Before and after in Firefox on my Mac:
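For anyone trying the same thing, the declarations in question sit at the very top of the document. This is a bare skeleton under the assumption of an English-language page, not a copy of this page's source:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>HTML as TeX replacement</title>
</head>
<body>
  <!-- hyphens: auto in the stylesheet relies on the lang attribute above -->
  <p>...</p>
</body>
</html>
```

The doctype puts the browser into standards mode, and the `lang` attribute tells it which hyphenation dictionary to apply.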

That said, one of the advantages of the HTML worldview is that invalid markup is still rendered readably; again this is a culture clash of compiled versus dynamic worldviews. SVG, being pure XML, is draconian like TeX.

Encouraging browsers to adopt the Knuth and Plass justification algorithm, hyphenation and hanging indents is definitely worth doing. I added Microsoft's text-justify: newspaper property, which apparently turns it on there. In practice, avoiding text-align: justify on the web (as both Phillips and I do on our sites) still produces more reliably legible text for now.
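The two approaches can be put side by side. This is a sketch with made-up class names (`justified`, `ragged`); whether `newspaper` has any visible effect depends on the engine, since browsers that do not recognise the value simply drop that declaration.

```html
<style>
  /* Option 1: justified text, with hints for engines that understand them */
  p.justified {
    text-align: justify;
    text-justify: newspaper; /* legacy Internet Explorer value */
    hyphens: auto;           /* hyphenation keeps the word gaps tighter */
  }
  /* Option 2: ragged right, the more predictable choice for now */
  p.ragged {
    text-align: left;
  }
</style>
<p class="justified" lang="en">It is a way I have of driving off the spleen
and regulating the circulation.</p>
<p class="ragged" lang="en">It is a way I have of driving off the spleen
and regulating the circulation.</p>
```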

My point with <sup> and <sub> was that they suffice for simpler equations. I think we agree that SVG rather than bitmaps is a good idea for complex ones.

My attempt at using SVG text rather than paths was an experiment, and clearly an unsuccessful one; I ended up with equations that looked good in my browser and looked like crap in many others, and the selectability was not worth the aggravation. I think there is potential for a new TeX to SVG converter that uses SVG text rather than paths, but I am not the one to write it; SVG as a PDF replacement would be a good thing.

The macros point was a response to Phillips's discussion of TeX as Turing complete and thus hard to translate into HTML. What I was trying to say is that if you want Turing-complete macros in HTML, you use JavaScript for them.
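A very small illustration of that idea follows. Everything here is hypothetical: the `data-macro` attribute, the macro names and their expansions are invented for the sketch, and a real system would want functions with arguments rather than fixed strings.

```html
<p>Euler: <span data-macro="euler"></span></p>
<p>Stokes: <span data-macro="stokes"></span></p>
<script>
  // Hypothetical macro table: names and expansions are made up for illustration
  var macros = {
    euler: "<i>e</i><sup><i>i</i>π</sup> = −1",
    stokes: "∫<sub>Σ</sub> ∇×F∙dΣ = ∮<sub>∂Σ</sub> F∙dr"
  };
  // Expand every element whose data-macro attribute names a known macro
  var nodes = document.querySelectorAll("[data-macro]");
  for (var i = 0; i < nodes.length; i++) {
    var name = nodes[i].getAttribute("data-macro");
    if (Object.prototype.hasOwnProperty.call(macros, name)) {
      nodes[i].innerHTML = macros[name];
    }
  }
</script>
```

The cost is that the page no longer reads correctly without script, which is the reason for preferring declarative markup where it suffices.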

The point about the CSS specs still being edited was to encourage those who are generating and publishing mathematical papers to join in the discussions there, with use cases and examples, to try to get the browser engineers to converge on these typographic issues.

I'm not hostile to TeX and those who find it a productive tool; I do think it could be better translated for the Web than by rendering to PDF. Mostly I'd just like TeX fans to stop making only 2-column fixed-size PDF papers that I can't read on my phone or tablet without a lot of zooming, panning and squinting. I'm sure it is a powerful enough toolbox to do better at that.

Rich typography: HTML versus TeX kevinmarks.com/htmlversustex.… twitter.com/kevinmarks/sta…
@kevinmarks Good work! I believe that Lee Phillips, the author of the LWN article, is @lpfeed and it'd be interesting to hear his response.
HTML as TeX replacement kevinmarks.com/htmlversustex.… by @kevinmarks as a reaction on lwn.net/Articles/66205… (via @mahemoff) #typography
@mahemoff @sil the text in that article seems to be trying to say that html typography is not that bad, and yet...
HTML as TeX replacement kevinmarks.com/htmlversustex.…
HTML as TeX replacement: kevinmarks.com/htmlversustex.… Comments: news.ycombinator.com/item?id=105184…
HTML as TeX replacement | #SVG alone settles this argument in favor of #html #tex kevinmarks.com/htmlversustex.…
@kevinmarks update didn't work... Up-to-date Android Chrome
@jacob_mcdonald web fonts are harder than they look. I'll try again.
@jacob_mcdonald on which OS?
@kevinmarks Heads up: All the images from Twitter are blocked by Firefox’s tracking protection!
@kevinmarks Wouldn't that page have looked better using MathML? @firefox
@aslakr @firefox clearly I taunted the web typography gods and they are punishing me.
@kevinmarks @firefox FWIW, it appears related to Hoefler Text. When I get rid of that in the font declaration the weird spacing goes away.
@kevinmarks @sil You did it wrong. It has to be "DOCTYPE", in upper case. Read the spec (your way won't work).
@kevinmarks Thanks for the thoughful update. "Encouraging browsers to adopt the Knuth and Plass [...]" We agree about this.
@lpfeed do you have a computer that runs IE that you can check text-justify: newspaper on ?
@kevinmarks I only have laptops running Ubuntu now. (And I didn't have the all-caps problem with your article.)
@kevinmarks yes! Even inline maths is better! The SVG formula using paths is jagged, but ok; the one using text...
There should definitely be a better way for the TeX to work with HTML: HTML as TeX replacement http://www.kevinmarks.com/htmlversustex.html
@kevinmarks @leyawn it will be like new coke. After seeing how annoying it is. I think we are here on twitter for the short bursts.
@kevinmarks @KatiMichel @lakens impressive but a too rube goldbergy for collaborations methinks
@rogierK @KatiMichel @lakens html in github is not too bad to collaborate on. Not quite gdocs easy
@kevinmarks I remember reading that! I'll look into it :-)
@kevinmarks on the web you would have to enforce a (web)font, I think. But there's no solution like that afaik
I'd like html/css to not be used for documents like this, because the rendering can vary between browsers,
and the format changes too frequently.
Oh my gosh this is COOL
right, that's what I was saying about philosophical differences. HTML is meant to adapt, TeX tries not to.
totally. I don't believe that a document format should try to adapt too much if it's to be kept for a long
time. I can totally understand the other point of view, tho :)
that's not quite right - old html documents are still displayable by current browsers, indeed that has improved
im not convinced the rendering of modern websites will stay consistent in the next 10 years
Counterargument: it's LaTeX which should replace HTML. What my experiments with various content creation and markup systems over the past 30 years have shown me is that it's far less details of presentation which are crucial, but in enforcing document structure itself. Presentation changes with technology -- I've seen and used systems with toggle-and-light outputs, true ttys (paper), glass ttys, various terminal and console outputs, the "standard" 24x80 terminal, desktop GUIs, and now handheld and mobile GUI d
All I know is I had a 300 page journal to lay out. I was given word docs. If you think I wasn't doing it in InDesign...well...yeah. I forget what I used but it basically made images of the equations and it was not ideal.
Yes, a lot of the tools make bitmaps of the equations from TeX, which is a shame when we have vector formats like SVG and epsf
Didn't we already do html vs latex? kevinmarks.com/htmlversustex.…