SILE Typesetting System (sile-typesetter.org)
90 points by _pfxa on Feb 19, 2017 | 23 comments



Compared to TeX I miss two things: Math mode and TikZ.

One of the quality marks of TeX, for me, is that the fonts match everywhere. For inline math, the x-height of the text font and the math font is the same. The font and text size used in plots are identical to those of the body text.


No math is pretty much a dealbreaker for most people, I guess.


It depends on what world you operate in. For science-related work, yes, I agree it's a dealbreaker. SILE is currently focused on humanities publishing, where not having math support is not really that big a deal. But I would like to see it added, and have been looking into how hard it would be to get MathJax support.


This was probably posted because it was discussed in the thread for the Patoline typesetting system yesterday: https://news.ycombinator.com/item?id=13674879


Yes, it made me recall this, though I first saw it in the FOSDEM '15 talks [1]. I thought it'd be nice to see what people think about this, as I'm on the lookout for a TeX alternative.

[1] https://www.youtube.com/watch?v=5BIP_N9qQm4


SILE indeed looks great, promising to come on par with the typesetting quality we have come to expect since TeX: Knuth-Plass line breaking, Unicode and OpenType font features support, complete with contextual shaping, Cassowary constraint solver, parallel text, multiple apparati, foot and marginal notes, vertical typesetting… This is a typographer’s dream!

But, as with TeX and friends (LaTeX, ConTeXt, etc.), creating stylesheets and document templates still looks like a pain with SILE and the like. At least from the perspective of designers who have come to expect a strict separation of concerns between document structure/semantics and its styling, and who are used to working with a declarative stylesheet-based language like the prevalent CSS, as opposed to the macros of TeX and its document model, which conflates semantic markup with inline styling instructions.

Similar initiatives to SILE that attempt to port TeX to newer languages while untangling the macro spaghetti, like Cló¹ and Rinohtype², didn’t consider CSS-based stylesheets either. Which is a pity, especially since the highly developed W3C Working Draft specifications³ for paged media have been out for quite some time now.

That’s exactly what makes PDF formatters like Prince⁴, which accepts standard HTML and CSS as its input, so immensely attractive: users can continue to use the (Web) technologies they already know (HTML, JavaScript, CSS) and enjoy a strict separation of concerns between document contents, templates and make-up. Unfortunately, there are no FLOSS alternatives.

Once in a while, people ask whether there are TeX flavors which support `.css` as an input.⁵ It is peculiar, too, that no project exists to create a compiler that maps CSS style rules, selectors and properties to something TeX understands. (Except maybe this⁶ one.)

[1] https://www.youtube.com/watch?v=824yVKUPFjU
[2] https://github.com/brechtm/rinohtype
[3] http://www.w3.org/TR/css3-page/
[4] http://www.princexml.com/
[5] e.g. http://tex.stackexchange.com/questions/139067/i-have-a-dream...
[6] https://github.com/yannisl/phd/blob/master/phd.dtx


As a lazy Debian user, I checked all of these and found none in the Debian repositories, which, thanks to Ubuntu and other direct or indirect derivatives, are probably the best way of making such software widely available and known.


So I'm curious and click on the "Examples" link in the navigation, and there's just no justification for the first picture. ;)


And a glaring "overfull hbox" between the two columns, at "means".

And shouldn't the "grid layout" feature ensure that the baselines of the two columns match?

And the kerning problem in "T able of Contents"

;)


Oh dear, yes. You just keep seeing more and more errors the longer you look.

This isn't meant to demean the people working on SILE. It just goes to show how complex typesetting is.


Kerning is hard. Like, really hard. There is a good reason that kerning is mostly tweaked by hand.

Re. the grid layout: if you define a page height, I would expect it to produce a layout like the one the example page shows; otherwise there will be a void between the grid and the bottom frame.

But the "means" issue is simply not okay.


So, I'm really sorry about the examples. There are two issues here: first, I'm not good at coming up with compelling examples. Second, I'm not very good at keeping the examples on the web site up to date with the current progress of the code. In fact, I haven't regenerated those examples, like, ever. Try the PDF examples in the repository as they should be better; but at the same time we should still auto-regenerate them periodically, maybe on release. (And convert them for the web site.) Really sorry about this.

Of the issues you've mentioned: I now justify the columns; the overfull box is gone as I tightened up the linebreak tolerance on narrow columns; kerning most definitely happens automatically if the font provides kerning pairs; grid layout now works better.

Any other issues, please file bug reports and I'll fix them!


What goes into writing a typesetting system like this? I know kerning is one thing that is needed, but I'm assuming there's a lot more than just spacing letters.

I'd love for someone to write a full post about how it works.

Also this does look really good.


TeX by Topic was a much clearer resource to me than the TeXbook, although one eventually has to go back and read that. TeX seems like a reasonable model of what a typesetting system involves, but it's definitely slanted towards text-heavy and mathematical stuff, which is pretty far from what something like InDesign is mostly about. I'm just an appreciator, not an implementer, but this is my overview of typesetting text:

First, you have to build words from letters; this is where kerning comes into it.

Second, you have to build lines. TeX calls this "horizontal mode" and this is where inter-word spacing and justification matter. TeX uses a "boxes-and-glue" model, so during line building it's essentially putting stretchable glue between each word, and then when the line is built, it spreads the space evenly between all the pieces of glue to achieve justification. Glue has a natural size as well as minimum and maximum values; this along with hyphenation has to do with calculating the line "badness."
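To make the glue idea concrete, here is a minimal sketch of setting one line. It assumes glue is modeled as a natural width plus stretch and shrink amounts (which is how TeX expresses the "minimum and maximum"), and it uses the usual approximation of badness as 100 times the cube of the adjustment ratio, capped at 10000. The names `Glue` and `set_line` are made up for illustration and are not TeX or SILE APIs.

```python
from dataclasses import dataclass


@dataclass
class Glue:
    natural: float  # preferred width
    stretch: float  # how much it may grow
    shrink: float   # how much it may shrink


def set_line(box_widths, glues, target):
    """Spread the leftover space evenly over the glue.

    Returns (glue_widths, ratio, badness). Raises ValueError when the
    line cannot be set at all (overfull, or nothing to stretch/shrink).
    """
    natural = sum(box_widths) + sum(g.natural for g in glues)
    diff = target - natural
    if diff == 0:
        ratio = 0.0
    else:
        # Use stretch when the line is too short, shrink when too long.
        capacity = sum(g.stretch if diff > 0 else g.shrink for g in glues)
        if capacity == 0 or diff / capacity < -1:
            raise ValueError("line cannot be set to the target width")
        ratio = diff / capacity
    widths = [
        g.natural + ratio * (g.stretch if ratio >= 0 else g.shrink)
        for g in glues
    ]
    # Approximation of TeX's badness: 100 * |ratio|^3, capped at 10000.
    badness = min(10000, round(100 * abs(ratio) ** 3))
    return widths, ratio, badness
```

For example, `set_line([30, 40, 20], [Glue(5, 3, 1), Glue(5, 3, 1)], 104)` has 4 units of space to distribute over 6 units of stretch, so each glue grows by the same proportion of its stretch. A line breaker would compare the badness of many such candidate lines rather than set just one.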

Third, you have to build paragraphs. This is what TeX calls "vertical mode." Between each line of text, it inserts vertical glue that works similarly to the horizontal glue mentioned above. Another glue is used between paragraphs. TeX worries about widow and orphan lines here; other calculations are made to prevent or allow them. This is like badness but I think it has another name.
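For what it's worth, the name TeX uses for the widow/orphan cost is a penalty: `\clubpenalty` is charged for a page break after the first line of a paragraph and `\widowpenalty` for a break before its last line, both 150 in plain TeX. A rough sketch of that bookkeeping, with a hypothetical helper `break_penalty` that is not part of any TeX implementation:

```python
def break_penalty(line_index, n_lines, club_penalty=150, widow_penalty=150):
    """Penalty for a page break right after line `line_index` (0-based)
    of a paragraph that has `n_lines` lines in total."""
    penalty = 0
    if n_lines > 1:
        if line_index == 0:
            # Break would strand the first line at the bottom of a page.
            penalty += club_penalty
        if line_index == n_lines - 2:
            # Break would strand the last line at the top of the next page.
            penalty += widow_penalty
    return penalty
```

In a two-line paragraph the only interior break is both a club and a widow break, so the two penalties add up.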

Finally, you have to build pages. With TeX, this happens periodically as it notices it has enough material (or you force it by calling the right kind of eject). By default, this invokes a page building macro which you can customize. This part of TeX feels pretty 1970s and hackish, but essentially, there is a variable that contains a box with the supposed contents of the page; your job is to assign a box to a certain location and remove some of the contents of the input box. LaTeX's "float" mechanism is actually built on top of this mechanism, but as it's pretty procedural, you can certainly achieve other effects here.
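As a very loose sketch of that control flow (not how TeX actually does it, since the real page builder also weighs glue and penalties), imagine a loop that collects lines until the page is full and then hands the collected material to a customizable callback. `build_pages` and `output` are invented names for illustration:

```python
def build_pages(line_heights, page_height, output):
    """Greedy page builder: call `output` with the line indices of each page."""
    current, used = [], 0.0
    for index, height in enumerate(line_heights):
        # Page is full: hand the collected material to the output callback.
        if current and used + height > page_height:
            output(current)
            current, used = [], 0.0
        current.append(index)
        used += height
    if current:
        output(current)  # flush the final, possibly short, page
```

Passing `pages.append` as `output` just records the pages; a more interesting callback could add headers, place floats, or hold material back for the next page, which is the kind of freedom the procedural design gives you.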

So that's the bird's-eye view I acquired a few years ago when I did some in-depth playing with TeX. First you build words, then lines, then paragraphs, then pages. There are different considerations at each stage of that, but TeX attempts to subsume everything in that overall flow.

I think that process probably generalizes nicely for some things and is almost inadequate for others. I'm not sure how ConTeXt achieves the grid layout; I imagine they are relying on some complicated Lua code to pull it off, because TeX really doesn't care unless you make it.


There's also TeX: The Program, which is the literate source code of TeX. It's in Pascal though, and littered with literate macros, so it's rather hard to read.


Do any of these work in HTML, or web browsers, where the page changes size and doesn't have fixed breaks? Where you can just scroll and scroll?

I'd love beautiful typesetting on the web, where I actually write and read. I don't use paper.


Note that I use paper, because I am an old person. They're very different problems.

In print, you have to deal with page breaks and positioning of figures, but you know everyone will see the same thing. Doing this right requires human attention, but you can produce beautiful layouts, even with multiple columns and oddly-shaped text boxes.

On the web, everyone sees a single "page," but those pages are all different. Even simple two-column justified text is mostly garbage on the web, and forget trying to float figures close to their references. Web designers have mostly given up, resorting to either huge margins, or an ugly wide wall of giant-font text. Print-quality layout and typesetting on the web is very far away.


Completely true, but in fairness, the reason is that the web was not created as a tool for text layout. The goal was exactly the opposite: to make the text independent of layout, so that it could be displayed differently in different browsers. Doing good text layout on the web is practically an intractable problem. On the other hand, we already have a format capable of doing this, namely PDF.


I don't know if Tim Berners-Lee thought about layout. SGML was text with semantic information, and HTML was text with cross-references. Back when displays were tiny low-resolution things little better than terminals, and most web pages were just text with a few images, layout didn't matter. It was all ugly. Now that displays are larger than books, and pages are heavy enough to contain the entire Gutenberg Bible, people are starting to care.

Maybe in 5-10 years, with enough server-side computing power and bandwidth, sites will ask for the client's screen size and desired font size, then serve an appropriate PDF.


Does this support arbitrarily nested environments?

One thing that always bugged me in LaTeX was exactly this. Nesting environments often gave me odd results, or errors.


Looking at the FOSDEM'15 talk linked above, it appears to support this.


It seems the latest release (0.9.4) is from August 2016.

Any idea when the 1.0.0 milestone will be reached?


1.0.0 will be reached when, given an appropriate style sheet, SILE can perform unsupervised typesetting of an arbitrary USX file (XML-based Bible translation document) to publication standard.



