Writing my PhD using groff (jstutter.netlify.app)
236 points by yockyrr on July 23, 2022 | 130 comments


Groff is great. I went through a phase of doing reports for work using groff, and one of the cool things is that W Richard Stevens' website had all the groff macros he used to produce the beautiful diagrams in TCP/IP Illustrated etc. So I used to have lovely diagrams with spline curves and the like, thanks to W Richard Stevens.

The great thing about groff (compared, in my experience, with LaTeX) is that you spend basically zero time on formatting or messing about once you have a set of macros you like, and the document production cycle is really fast: you edit with zero distractions in basically plain text (a lot like markdown), and any time you want to see the finished product it renders very quickly.
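
For anyone curious, a minimal document with the ms macros looks something like this (a hypothetical example; build it with `groff -ms -Tpdf report.ms > report.pdf` on a recent groff):

  .TL
  Quarterly Report
  .AU
  Jane Doe
  .NH 1
  Introduction
  .PP
  Plain paragraphs of body text, with
  .B bold
  and
  .I italic
  words inline.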


The LaTeX criticisms of the article really resonated with me. Long compile times and a narrow "happy path" are the things where I feel LaTeX makes me less productive.

This is a pity because, otherwise, it is a great tool with its focus on document structure and output quality. I'm currently working on a LaTeX successor which seeks to address these issues, but it is really hard to make the right design compromises here -- what can be programmed? What is accessible through dedicated syntax? How does the structure drive presentation?

Computer typesetting is a rabbit hole, but a fascinating one. And I'm sure the last word on it has not been spoken yet :)


• As a rough rule of thumb, TeX can do about 1000–3000 pages a second on today's computers.[1] This is for a (presumably typical) book that was written in plain TeX.

• So if your LaTeX document is taking orders of magnitude more than about a millisecond a page, then clearly the slowdown must be from additional (macro) code you've actually inserted into your document.

• TeX is already heavily optimized, so the best way to make the compilation faster is to not run code you don't need.

• Helping users do that would be best served IMO not by writing a new typesetting engine, but by improving the debugging and profiling so that users understand what is actually going on: what's making it slow, and what they actually need to happen on every compile.

To put it another way: users include macros and packages because they really want the corresponding functionality (and everyone wants a different 10% of what's available in the (La)TeX ecosystem). It's easy to make a system that runs fast by not doing most of the things that users actually want[2], but if you want a system that gives users what they'd get from their humongous LaTeX macro packages and yet is fast, it would be most useful to help them cut down the fluff from their document-compilation IMO.

---

[1] Details, in case you want to try it out yourself: take the file gentle.tex, the source code to the book "A Gentle Introduction to TeX" (https://ctan.org/pkg/gentle), and time how long it takes to typeset 8 copies of the file (with the `\bye` only at the end). On my laptop, the resulting 776 pages are typeset in 0.3s by `tex`, 0.6s by `pdftex` and `xetex`, and 0.8s by `luatex`.
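
For example, something along these lines (a rough sketch; it assumes gentle.tex is in the current directory and that its `\bye` sits on its own line):

  # build one file containing 8 copies, with \bye only at the very end
  for i in 1 2 3 4 5 6 7; do sed '/^\\bye/d' gentle.tex; done > gentle8.tex
  cat gentle.tex >> gentle8.tex
  time tex -interaction=nonstopmode gentle8.tex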

[2] For that matter, plain TeX is already such a system; Knuth knew a thing or two about programming and optimization!


I never got deep into TeX, but I browsed the code at one time and some of what I found seemed utterly insane to me. For example, it includes an IEEE floating point implementation, based entirely on TeX string expansion [1]. I don't know if it is widely used, but I'm not surprised by slow LaTeX compiles anymore.

You say "TeX is already heavily optimized", but that's only true for the layout engine. The input language is entirely based on macros and string expansion. That's fine if you're only going to use it for a bit of text substitution. But as a programming language it's inherently slow. (To be fair, I believe Knuth expected that large extensions, such as LaTeX, would be implemented in WEB.)

[1] https://github.com/latex3/latex3/blob/main/l3kernel/l3fp-bas...


That's precisely my point: the slowness comes not from TeX but from the LaTeX macro packages that the user has included on top. And if you want to make things faster for the user, you don't have to replace TeX (which is already plenty fast); you have to replace the macro packages or provide faster alternatives.

If you're going to create a new system with a new way of doing things (i.e. not using the existing popular LaTeX macro packages), then you can already do that on top of TeX, by just not using those packages! (Use LuaTeX and program in Lua instead of via TeX macros, or do the programming outside and generate the TeX input, or whatever.)
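
To illustrate (a minimal sketch, compiled with plain `luatex`): even in a plain-TeX document you can push loops and arithmetic into Lua rather than macro expansion:

  \directlua{
    local total = 0
    for i = 1, 100 do total = total + i end
    tex.print("The sum of 1..100 is " .. total .. ".")
  }
  \bye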

What I'm proposing is that the hard/worthwhile problem is to take real users' real LaTeX documents and give them ways of profiling (what inefficiently written macro packages are making this so slow, because surely it's not the typesetting) and replacing the slow parts.


This is a topic I have been interested in for a while. Is it viable to compose fancy large documents in plain TeX without a lot of effort replicating functionality provided by LaTeX (if your requirements stay constant)?

I am a heavy user of the memoir class, and I have always suspected moving to plain TeX would not be that hard. However, the fraction of users doing this seems pretty slim so modern TeX workflows do not seem really well documented.


I’ve tried. Coauthors and journal requirements were my limiting factors, not anything inherent to the typesetting engines…

These days (in industry) I manage to use pandoc markdown-to-Word for everything (for similar reasons), which is even more limiting than plain TeX. You learn to write around the limitations pretty quickly. :)
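
The whole pipeline is basically a one-liner (a hypothetical invocation with made-up file names; --citeproc needs a reasonably recent pandoc):

  pandoc report.md --citeproc --bibliography=refs.bib -o report.docx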


For a dissertation you often have diagrams and a bibliography, and those take a while. Even my resume takes more than 1s per build because of that.


How does LaTeX compare to XeTeX, ConTeXt, and LuaTeX in compilation speed?


This is not really a meaningful comparison: You have your TeX compiler (pdfTeX, XeTeX, LuaTeX, and the older e-TeX) that takes a document in some format and produces a PDF or DVI. In my tests (that did not include e-TeX) pdfTeX tends to be the fastest here, but sometimes you need modern fonts, so you have no choice but to use the other two.

The TeX compiler then loads a format like plain TeX (which the above commenter uses), LaTeX, or ConTeXt. The format defines what macros are available. LaTeX adds a package system, as does ConTeXt (modules), so you can import even more macros on demand. These TeX formats differ in scope and thus speed; LaTeX tends to be a bit heavier, but what really weighs it down is the myriad of packages it is usually used with.

Many TeX distributions will define aliases like pdflatex in your path such that you can preload pdfTeX with the LaTeX format, but they are not really separate compilers.
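
For example, `pdflatex document.tex` behaves roughly like the following (the -fmt option is web2c's way of selecting a preloaded format):

  pdftex -fmt=pdflatex document.tex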


But XeTeX and LuaTeX are still billed as "end-user-facing" systems, alternatives to LaTeX and ConTeXt. I appreciate the distinction between the actual typesetting engine and a macro library layered on top, but as someone who has only ever used LaTeX I would definitely be interested in a thorough comparison of their capabilities, speed, user experience, etc. One can of course include plain TeX itself in such a comparison!


I became productive in LaTeX once I stopped doing any typesetting in it until there was a real need for it due to publisher requirements. LaTeX looks great out of the box: I just finished a book that I had to deliver camera-ready, and the publisher (not a LaTeX shop) was very impressed with the quality. It was the standard Memoir book template with almost no changes. Ironically, the documentation for many special typesetting packages in LaTeX looks very bad. Generally, the less you change, the better.

LaTeX really fails at "register-true" typesetting, though. You have to allow it to extend pages here or there by a line or be willing to fix many orphans and widows by hand. AFAIK, this has to do with the text flow algorithms which are paragraph-based and cannot do some global optimizations. (Correct me if I'm wrong, I'm not an expert.)

Btw, I cannot confirm the compile-time criticisms. A whole book takes just a few seconds on my machine for one run. I wonder what people are doing who get slow compile times.


There are some packages that slow LaTeX down... TikZ is one, I think.

My masters thesis was written on an old netbook with an Atom processor, plenty of graphics, the compile times got pretty ugly. But I did different files for each section, and set it up so the latex process would automatically kick off and run in the background after writing to the file in vim. Working within constraints like that is sort of fun, it forces you to get the slow operations off the critical path.

Currently I use a script like:

  inotifywait -e close_write,moved_to,create -m . |
  while read -r directory events filename; do
    if [ "$filename" = "$1" ]; then
      latexmk -interaction=nonstopmode -f -pdf "$1" 2> "$1.errlog"
    fi
  done
to just re-compile the .tex whenever it changes. I'm not really a bash programmer though so I guess this will probably be ripped apart by somebody here, haha (the top couple lines were probably taken from some post on the internet somewhere).


FWIW, `latexmk` has a "Watch the sources for this document and rebuild if they change" mode builtin. It gets activated if you pass it the command line flag `-pvc`.
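
For example (assuming your main file is called thesis.tex):

  latexmk -pdf -pvc -interaction=nonstopmode thesis.tex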


Suggesting things on this site is such an easy way to get better solutions. Thanks!


Memoir has options like \sloppybottom. But honestly, the reason it doesn't do this by default is that it's virtually impossible to have an algorithm that gives you the best layout 100% of the time. Sometimes it's physically impossible not to have orphans or awkward spacing with the text you've given it. You can never remove the human from the equation.


Yes, that's my point. \sloppybottom can look fine, but it violates the requirement of real register-true typesetting, where each typeblock has the same size on every page and the lines on a double page match exactly. Some publishers have this requirement and it's hard to work around. This could in principle be improved by some subtle tradeoffs between line breaks in previous paragraphs, paragraph breaks, and page breaks. It's a kind of global optimization that is not possible in TeX due to fundamental limitations of the engine. See Section 4 of [1].

[1] https://tug.org/TUGboat/tb11-3/tb29mitt.pdf


Simon Cozens spent some time writing a new typesetter called SILE:

  https://sile-typesetter.org/
One of the design goals was to be able to achieve exactly that kind of line matching. IIRC it can ensure that lines on the front & back of a page line up exactly too; apparently this is important for bibles?

Worth taking a look at. It recently acquired TeX style mathematics typesetting ability & has a small but active developer group.


I figure I should also mention that my LaTeX alternative is called Typst. We do not have much public detail yet but there is a landing page [1] to sign up for more info and beta access as soon as it becomes available.

[1]: https://typst.app/


It looks really nice! I’ve sent this to a few friends to check it out.

To give some context, I'm a professor in theoretical computer science, so I write a lot of LaTeX documents and notes.

Some observations on my workflow:

  - Writing: I'm writing the source, and occasionally look at the output. So as long as the output time is reasonable, it is sufficient.
  - Editing: I'm reading the output, and then editing the source. So going from the output to the part of the source I have to edit should be as smooth as possible.
  - Typesetting is the least of my concerns. I only check for glaring typesetting problems right before we publish. This takes at most 1% of the total time spent preparing a document.
  - Live editing almost never happens. But I see why it might be useful to incorporate it into the workflow. (A cursor on the rendering during live editing would be very nice.)
There are some choices on how to present source and the rendering.

Typst went with the 2-panel design: one side source, one side rendering. I, however, have found that something close to WYSIWYG is better for editing. But full WYSIWYG is hard to get right and comes with its own problems. Currently, there are a few common approaches people take with respect to source/rendering:

  - WYSIWYG editors, which render everything (Word, TeXmacs, LyX). Editing is done in the rendering. It is smooth, but takes a long time to get used to.
  - The app Typora, which renders everything except the part you are editing (which shows as source). This can be generalized to render all except the current line, or something similar. Editing is done in the source, but feels like editing in the rendering. This is extremely smooth for my editing work, and is my preferred way.
  - Apps like Compositor (https://compositorapp.com/) that render everything, but can call out the selected part of the source.
  - The source and rendering are in two different panels. Editing is done in the source. Usually one can click part of the rendering and the cursor jumps to the corresponding part of the source. This introduces some friction, as the eyes have to do a jump, and also a quick context switch.


The screencast looks good! For parallel/prior work in this sort of "live update" of the typeset document (and to learn from their experiences), you may also want to look at:

• SwiftLaTeX (https://github.com/SwiftLaTeX/SwiftLaTeX / https://www.swiftlatex.com / https://doi.org/10.1145/3209280.3209522 — the cool demo that used to be on their site seems to be gone, but see HN discussion: https://news.ycombinator.com/item?id=21710105)

• Texpad https://www.texpad.com/

• BaKoMa TeX (http://www.bakoma-tex.com/) — its eponymous author Basil K. Malyshev passed away recently, but the product and page still exist for now

• VorTeX (see Pehong Chen's PhD thesis from 1988 https://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-436... — it actually discusses the issues of quiescence, etc).


My main concern with rented cloud software for this space is that it seems like a great way to lose access to editing your own past academic/technical work.


Very valid concern. We aim for an open-core model, so you can always take your projects and compile them locally.


If you are working on a LaTeX successor, you might be interested in TeXmacs, a LaTeX successor which works very nicely in many ways, except apparently at selling itself well :-) You could see there how it was designed and how the author answered the questions you are asking.


I wrote my MA thesis using markdown (with extended syntax). Structuring the document is easy: just use one hash for top-level sections and more for subsections. Footnotes are easy: you just add a footnote[^reference] somewhere and then add the footnote text on a separate line somewhere:

  [^reference]: some text
Inline math works by adding $x = \frac{y}{z}$, or in a separate math block by adding two $ signs before and after.

The syntax of markdown is easier there, but LaTeX is arguably much more powerful: e.g. you can load tables from CSV data, generate graphs, make it deal with your bibliography, draw circuit diagrams, etc. And the layouts tend to just look good.

I ended up converting the markdown to InDesign IDML and using that as a source in an Adobe InDesign layout, where I could do all the basic typographic settings and styling once and update it on changes.


Splitting a long document into chunks and using the `draft` option while writing speeds up compilation times considerably. Otherwise you're producing a finalised typeset document of 100+ pages every time you hit F5; no wonder it takes ~10s to finish ;)
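
The `draft` part is a one-line change (with most classes it makes includegraphics draw placeholder boxes and marks overfull lines):

  \documentclass[draft]{memoir}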


IMO draft seems like a crutch: Because TeX has to reprocess the whole document and the sty-files of each imported package every time, you do not have a huge budget for your document content. Instead, you are given the option to sacrifice image decoding, plot drawing, and output fidelity to keep TeX humming along.

Sure, that's better than nothing but I cannot help but wonder whether there could be an architecture where you cut down on repeating work and get faster recompiles that way!


If you want something LaTeX-like, but with a wider happy path, you should try SILE.


Does pandoc help at all?


I had a go at pandoc when I was writing my (social science) PhD, but I gave up and went back to just doing it all in LaTeX fairly quickly.

From memory the compile times didn't worry me at all. What did worry me was making it look like I'd put far more effort into document presentation than I actually had. Which worked well for me: a big shout-out especially to the hyperref package, for getting my far-too-many acronyms linked back to the acronym definition table for every mention, links to every citation, and proper links inside the document to each section/page reference.

Then on top of that my hacked-together proofreading tools in emacs, and a torturous 10k-word chapter 2 to sideline a difficult political problem, and I passed first time!


In use cases like the Markdown to PDF pipeline described in the article, sure! Documents there are also simple enough so that compile times aren't too much of a problem.

However, many of the documents we like to set in TeX are more complex than that: bibliographies, figure placement, special typographical flourishes... And here is where the complexity of LaTeX and its macros adds to the inherent complexity of what we are trying to accomplish (and compile times quickly balloon again).

So, sometimes it helps...?


I can relate quite well to the author's pursuit of tinkering with their typesetting workflow. When I wrote my bachelor's thesis, I also spent a great deal of time coming up with a custom LaTeX template and workflow. Like the author, one of the pain-points was the relatively slow edit-compile-review cycle of modern LaTeX engines like LuaLaTeX.

In my case, I was mainly concerned with making the resulting thesis.pdf PDF/A compliant. PDF/A is an archival standard dedicated to the long-term digital preservation of PDF files.

Predictably, I got way too carried away, and ended up trying to create fully-reproducible LaTeX PDFs as well. It was probably overkill for my use-case, but it did result in a fun blog post where I documented the process [1].

[1] https://shen.hong.io/reproducible-pdfa-compliant-latex/


> Like the author, one of the pain-points was the relatively slow edit-compile-review cycle of modern LaTeX engines like LuaLaTeX.

This depends a lot. In most cases the delay is only about 1 second on modern PCs; a bit more when you cite and build the document twice.

You can use LaTeX in many different ways. There are built-in editors and web services such as Overleaf. In the end, they all use the same workflow or dependencies for building the document, but might add an additional delay.

I too have tweaked my environment a lot, and ended up testing almost every LaTeX workflow.

I finally ended up just using vim and zathura; an optimised Docker image with LuaLaTeX builds the document. Second favorite would be the LaTeX plugin for JetBrains products. Overleaf is only good for collaborating.

On my desktop PC, which has 16 CPU cores, there is only very little latency when compiling. But for text editing, it is a bit rare that you need such a PC…


> This depends a lot. In most cases the delay is only about 1 second on modern PCs; a bit more when you cite and build the document twice.

I agree that for many (or even most) documents, LaTeX's compilation delay is generally manageable. However, when it comes to documents with bibliography management, footnotes, margin-notes, and multiple figures, the compilation delay can get quite high.

In my own experience, I had a document of notes containing over a hundred citations managed by biblatex and bibmla [1]. It also had footnotes and margin-notes, requiring an additional repaint. The compilation time on that document was well over several seconds on my laptop, up to dozens of seconds when on battery-power.

> I finally ended up just using vim and zathura; an optimised Docker image with LuaLaTeX builds the document. Second favorite would be the LaTeX plugin for JetBrains products. Overleaf is only good for collaborating.

I'm very curious to hear about the docker image that you are using. What purpose does the docker image serve in the build pipeline? I know that for compiled software, sometimes having a build environment allows you to better define the environment variables, but to my understanding this is not a worry for LaTeX.

[1] https://github.com/ShenZhouHong/sartre-notes


> I'm very curious to hear about the docker image that you are using. What purpose does the docker image serve in the build pipeline? I know that for compiled software, sometimes having a build environment allows you to better define the environment variables, but to my understanding this is not a worry for LaTeX.

Using Docker brings several benefits. It allows me to share the same build environment across multiple machines. I can even use my desktop remotely for building the documents if I want, just by sharing the Docker daemon.

Sometimes a package breaks after an update, and Docker allows me to roll back to a working environment. I can also declare additional packages and fonts deterministically if I need them. Overall, LaTeX is a quite complicated and huge system, and I'd rather keep it away from my host machine. Maybe Docker is a bit overkill, but I have never wasted time fighting package conflicts or installing LaTeX once again with extra packages for a different machine.
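
The invocation can then be as simple as (a hypothetical sketch using the public texlive/texlive image; in practice you would pin a digest or dated tag to get real rollbacks):

  docker run --rm -v "$PWD":/work -w /work texlive/texlive:latest \
    latexmk -lualatex -interaction=nonstopmode thesis.tex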


Surely PDF/A can be created just by passing any PDF through Ghostscript with the right flags such as -dPDFA and -sPDFACompatibilityPolicy=1 ?
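
Something like this, I mean (a hypothetical invocation; as I understand it, a strict conversion also wants an ICC profile supplied via a PDFA_def.ps file):

  gs -dPDFA=2 -dBATCH -dNOPAUSE -sDEVICE=pdfwrite \
     -sColorConversionStrategy=UseDeviceIndependentColor \
     -dPDFACompatibilityPolicy=1 -o out.pdf in.pdf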


I don't have any experience with using Ghostscript as a post-processing step, and I am curious to know if it works well for complex documents.

LaTeX does have a native way of generating PDF/A-compliant documents, using the pdfx package. It's still in beta, but it is quite stable and works very well. The advantage of enforcing PDF/A compliance in native LaTeX is that it allows you to take the further step of implementing reproducible builds. Once that is done, you can be certain that, given a LaTeX source file, you will be able to generate a bit-for-bit identical document.
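
A minimal sketch of what that looks like (pdfx reads document metadata from a \jobname.xmpdata file next to the source, so this assumes a thesis.xmpdata with Title, Author, etc.):

  % thesis.tex -- compile with pdflatex or lualatex
  \documentclass{article}
  \usepackage[a-2b]{pdfx}   % target PDF/A-2b
  \begin{document}
  Archival-friendly output.
  \end{document}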

Additional post-processing steps will have to be at least documented, and will probably tie the output to the specific version of your post-processor.


I found that those who can’t get consistent styling and have laggy behavior on large documents don’t know how to configure it. I regularly wrote hundred-page reports with embedded Excel and images, all in Word, math included, and got pretty proficient. There are basically several things you need to do:

* Actually set up a named style for every type of content you have. Creating shortcuts for the common ones doesn’t hurt

* Use whatever the paid version that powers the free equation editor is. It was miles better about 10 years ago

* Use a master-document/sub-document approach for categorizing things. You wouldn’t have a single text file that’s 100 pages long. Split up Word that way too

I’m pretty sure I got to a state where I was using the tooling as intended, because I wasn’t actually fighting the WYSIWYG. Now, I did switch to LaTeX at the end because I was tired of not having easy version control. Word has it if you enable change tracking, but it can’t beat normal tooling. Also I wanted to learn LaTeX because it felt like a worthwhile investment (it was - writing formulas in LaTeX is wayyy faster to write and easier to maintain).

So I liked LaTeX just fine. I prefer Markdown / wiki these days because I don’t work with math formulas.

Disclaimer: I have zero experience with the web version and have no idea how it scales. I imagine it still does quite well on large documents but maybe browser rendering is not so good.


I think the point with LaTeX is that you can automate the document generation process to a great extent. For example, if you have some data, some Python scripts that process the data, and some other scripts that generate figures, you can put all of that in a pipeline and build a new version of your document automatically after the data changes.
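
e.g., a hypothetical two-step pipeline (script and file names made up):

  # regenerate the figures from the data, then rebuild the document
  python scripts/make_figures.py data/results.csv figures/
  latexmk -pdf -interaction=nonstopmode thesis.tex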


Oh man, I completely forgot that wouldn't even be possible (presumably) in Word. What a pain!


The equivalent would be OLE, I think. Yes, Word can't trivially use the text output of a Python program, but then LaTeX can't trivially embed a chunk of an Excel document and have it updated automatically. Both systems make assumptions about what they will integrate with.


Thanks for the info. I still consider LaTeX more modular since the only assumption it makes is the name of file you want to include.


That carries quite a few assumptions - for instance that it is a file that you are including. As the name implies, with OLE you are not including a file, but an object provided by another actively running program. Let's say you wanted a text document which contained a live video view from an IP camera. That would be feasible in OLE. Obviously you would not want to print such a document, but it's not an outlandish thing to want to make for viewing on screen.

It's not really obvious which is "better" as the two mechanisms work in very different ways. If anything, I would say that OLE can be more general, but the complexity of a minimal program to supply OLE objects is quite high compared with LaTeX.


This is possible in Word (and Excel).

You can link Word tables to Excel and, provided your analysis updates your spreadsheet, refresh all data instantly.

You can also refer to values in tables in the body of the text.

I know this because I worked in an environment where Word was the only option!


I’ve also programmatically generated Word documents using CPAN modules in a past life. I’m sure there are better packages these days.


I wrote multiple papers during my PhD. The theoretical one, with lots of equations, I wrote in LaTeX; it’d be stupid not to. Overleaf helped, though I wrote the paper over 6 years, so it only helped at the end.

Then I wrote two bio-heavy papers, using Word. My thesis was in Word too. If you have a ton of figures and not a ton of equations it’s not the best choice to use latex.


> If you have a ton of figures and not a ton of equations it’s not the best choice to use latex.

Last time I used Word for anything significant (a thesis) it was either Word 2003 or 2007, and adding a table or inserting a new paragraph somewhere before a figure could mess up all the figure placement (sometimes the thing would literally disappear).

After that I switched to LaTeX and never looked back. Has this recently become better, or was I just unlucky/inexperienced?


If you patiently and systematically try to figure out all the layout options and settings in word, you start to understand what will and will not happen when you do any of these things. It’s still a deterministic piece of computer code? I certainly spent less time figuring out word than latex at least.


> If you patiently and systematically try to figure out all the layout options and settings in word, you start to understand what will and will not happen when you do any of these things

Well, this is exactly what I don’t want to do in a WYSIWYG editor, but glad that it worked out for you.


I wrote my thesis in Word (graduated last Friday, viva'd last March). The key to stopping things moving around in Word is to use the "in line with text" option for images, and to treat images like text. If you want something centred, then justify it to the centre; don't try to place things manually.

In previous documents I've written with Word I would do things like tight layouts on images, maybe with anchors, but that's a recipe for things moving around.

When I came to compile my chapters into a final document I used master document/subdocument to pull everything together. I only had a few issues with blank pages being added when exporting to PDF, and that was due to my use of page breaks and section breaks.


> don't try to place things manually

Ah, so the solution is to use Word as if it was a worse LaTeX, I see :^)

Jokes aside, the precise manual positioning of figures and such (e.g. figures at a certain height of a paragraph, with text flowing around it) was the only potential attractiveness of Word.

If that’s still broken I really don’t see the reason for switching to a program with worse typesetting (and not only) capabilities, given my use case and the fact that I’m quite comfortable with LaTeX by now


> Has this recently become better?

You should try TeXmacs; it is not recent, but it has become smooth and it is superior to both LaTeX and Word from every point of view.


I find that lots of figures are a good reason to use LaTeX. You can programmatically scale and crop, captions never get separated from the figures, and figure placement is handled in a sane, predictable way.


> figure placement is handled in a sane, predictable way.

That is not the LaTeX I know.


It does take some getting used to, I'll give you that. But it's very predictable.

Annoyingly, 'new' users, especially those led on by KOMA-Script hints, are drawn to the 'total' positioning that seems to be promised by the H, B, P and other placement options. However, since these don't hold that promise - at least not the misunderstood promise of 'absolute' positioning - frustration very often creeps in.

I've had too many co-authors and friends ask me how to push figures to certain positions, where all I saw was premature optimization of positions and tons of wasted cycles (CPU and user) to get to intermediate solutions that would be completely unnecessary if they were just saved for the final layouting runs. The change in document creation paradigm (to not care about the layout until the far end) is what 'manages' expectations, and not making that change is where the perceived errors mostly come from.


> If you have a ton of figures and not a ton of equations it’s not the best choice to use latex.

Strong disagree; wysiwyg editing in Word is an exercise in endless frustration, and Word’s typesetting and fonts are so ugly that it’s painfully obvious when a paper has been written using Word.

Just write LaTeX and let it do the typesetting, figure layout, citation formatting, etc. It’s less painful than Word WYSIWYG editing, and the result is far more polished.


While I have a lot of complaints with Word, I have to very tepidly take issue with the accusation of ugly fonts. You may like TeX's default typefaces more than Word's, but those are just the defaults. You can set a LaTeX document in Calibri and (presuming you have an OTF version of the fonts) a Word document in Computer Modern.

Word doesn't really do typesetting, though. You can make credible camera-ready output with it if you're rigorous with styles and learn how to anchor figures and images correctly, but the line and page breaks will still say "hi, I'm a word processor."


> If you have a ton of figures and not a ton of equations it’s not the best choice to use latex.

I don't really agree. You do, however, have to accept and become happy with LaTeX's ideas about what makes for good figure placement.


> writing formulas in LaTeX is wayyy faster to write and easier to maintain

I was told by a friend that the Equation Editor in Word will silently accept LaTeX math-mode equation syntax and convert it automatically. Besides trying it out briefly, I never used it extensively, so I'm not sure how complete it is. Still, it's there.


I think the LaTeX compatibility came in a fairly recent version, but the usual Unicode-based syntax is already a lot easier to type (especially on non-US keyboard layouts) and read, since it's a lot more compact and uses whitespace to separate things by default instead of requiring large amounts of curly braces.


Can confirm. I use the LaTeX mode in equation editor in Word quite religiously for both equations and symbols.


After years of using hacks in MS Word trying to make my CV look the way I wanted, one day I bit the bullet and rewrote it in LaTeX. The 3+ hours spent learning LaTeX basics and doing the rewrite were disproportionately small compared to the huge jump in the quality of the output. Having used troff for writing man pages eons ago, this blog makes me interested in learning groff, rewriting my CV in it, and comparing the experience with that of LaTeX.


As a counterpoint, I had to ditch my LaTeX CV when I realised that applicant tracking systems were struggling to properly parse the PDF.

Switching back to a simple Word template (no use of tables; just heading styles and bullet points) and submitting the .docx resolved these issues.


> (no use of tables; just heading styles and bullet points)

Couldn't you have done this with LaTeX anyway?


yikes! you must have felt so dirty having to do that!


W. Richard Stevens wrote some excellent Unix and TCP/IP books in the 1990s. He used troff/groff.

RIP Richard, your books were amazing. And his son thought he was cool because his book was in Wayne's World 2.

http://www.kohala.com/start/

https://www.salon.com/2000/09/01/rich_stevens/


Speaking of typesetting…

This article is incorrectly scaled for mobile. There's no padding around the text so it butts up against the edge of my display. The line widths are way too long for comfortable reading. The blog entry also starts off with an unsemantic blockquote element that quotes nothing from a source.

But yes, Pandoc is a cool piece of software.


OP here: thanks for the feedback, added padding and correct scale. Should look better now!


+1 for using `ch` as your max-width unit and `rem` for your font size. Pixels are definitely the wrong unit. I have my user agent set up with a slightly larger default font size to make things easier for me to read. I appreciate that you respect my settings by using these relative values.


Also, weirdly enough, browsers can seem to go into reader mode to compensate. I've seen this before, but in this case it seems a little weird that reader mode wouldn't work.


This together with some padding could help:

  <meta name="viewport" content="width=device-width, initial-scale=1">


Nice tour of student typesetting today. Not surprising to find roff still in service, too. My thesis in the late 80s was set using nroff, fig and eqn, all of which I have fond memories of.

Surely WYSIWYG and "office" suites were a disaster for writing. Students seem to spend lost weeks and months fiddling with MS-Word only to create mediocre looking output.

Personally I'd say it's hard to beat Org-mode: separate plain-text files, then adding the desired exporter and style files at the last minute.


> Students seem to spend lost weeks and months fiddling with MS-Word only to create mediocre looking output.

I am surprised, and keep being surprised, that people haven't yet figured out that there is an excellent tool, TeXmacs, that manages to make WYSIWYG the best way to write structured documents, while giving complete control over the output and never requiring you to fiddle with details.


This was not my experience with TeXmacs. I tried it recently after using LaTeX for all of my papers, and while it is nice, it is not replacing LaTeX for me.

Table output in particular was much lower quality than LaTeX with booktabs. I think I had to manually resize columns, which was tedious, and I never had to do that with LaTeX. There were a lot of similar situations I found myself in, where I wound up needing to fight TeXmacs quite a bit to get it to output what I wanted.

I prefer my LaTeX workflow where I can edit markup in Emacs, and have a preview almost instantly generated next to my editor by a filesystem watcher & makefile. TeXmacs necessitates using its own interface (which lacks my vim keybindings and Emacs customizations) and I could not find many resources on editing TeXmacs documents in external programs.

I did appreciate that the general typesetting in TeXmacs was high quality, and the ability to type TeX macros and get e.g. enumerated lists quickly was very nice. But overall, I prefer LaTeX.


> TeXmacs necessitates using its own interface (which lacks my vim keybindings and Emacs customizations)

TeXmacs's own interface is deeply customizable by the user via Scheme.

I think you can set it up to have vim keybindings---see experimental code at https://github.com/chxiaoxn/texmacs-vi-experiment and comments at http://forum.texmacs.cn/t/a-very-tiny-vim-in-texmacs/176 (I know that the lack of a block cursor has put someone off, but I did not find that comment in the brief search I did just now).


It's always been a mystery to me why TeXmacs is so good yet obscure. I've stayed away from it because I feared there was a caveat that I didn't know about.


I have been using it for several years and I have yet to find the caveat :-)

It has low discoverability, so you have to go through the manuals, the mailing list and the forum (and maybe the blog too, at https://texmacs.gitee.io/notes/docs/main.html) to figure out all that it can do, and to have complete control. On the other hand, it is quite usable with default settings.


It has a terrible name, conflating TeX and Emacs signals something arcane, doubly so.


Fond of Heirloom Troff (rather than GNU Troff). See https://n-t-roff.github.io/heirloom/doctools.html Native support for TTF and OTF fonts. Knuth's algorithm for formatting paragraphs.


Will definitely try this. I sometimes used LaTeX at work for things like contracts and other documents that should look formal. But occasionally you need to share a document with someone to get their input before it is final, and lots of people are unfamiliar with LaTeX. So I switched to markdown. Markdown does not get in your way, so even those unfamiliar with it get the hang of it.


I wrote my Masters thesis in LaTeX, which is why I wrote my PhD thesis in Word.


I've seen people crying over Word for not being able to work with proper styles or deal correctly with cross references, bibliography included, all of which is relatively easy in LaTeX. Bibliographies in Word are almost impossible without a third-party plugin like Zotero, and less able people don't even know such plugins exist.

There's a well-working line of business in my uni that consists of properly final-formatting theses in Word.

Luckily for Microsoft's "easy" products, there is a legion of people who work for free as technical support.


I wrote my dissertation in Word, and I found it more than sufficient. WYSIWYG is still the best way to edit documents, but it’s not great for version control. Word’s equation editor is great though, and I enjoyed the ability to precisely place figures. Resolving references can take a while when there are hundreds of them, though; I think they could stand to improve that.


Yeah there was a time when I thought my speed of typing was the thing slowing down my thesis writing. So I spent a week training Dragon Naturally Speaking to be able to transcribe my voice.

Turns out that really wasn't the bottleneck, and I had just spent another week distracting myself with technology to avoid writing.


I wrote my masters thesis using troff in the early 1980s. Later that decade, I used a version of nroff on PC-DOS for my job. It seems, viewed from a sufficient distance, that this wheel has been re-invented a number of times since then.


In the mid-70s, I typed my senior thesis on a reconditioned manual (Underwood) and a borrowed electric typewriter. By the time I did my masters in the late 80s, all my papers were composed in vde on CP/M and formatted with TeX.


Having used many *roff variants (e.g., troff, nroff, ditroff, groff) over decades and also having rather extensive experience with LaTeX, I'd now definitely choose the latter for any serious typesetting task.

Pain points include the many customization tasks: re-creating exact document specifications provided externally, using specific typefaces, creating your own macros... Oh, and leaving ASCII (or ISO-8859-1) behind for multi-script text.

Today's groff is very fine software, if you are satisfied with its default settings and your task is in the domain it handles.


groff can do all of the above. The big issue is finding the documentation for it.


Oh, I know how to do most of these, by now. But they are still pain points.


Similar to the experiences of other commenters, I find the LaTeX edit-compile-review cycle to only grow unreasonably slow when none of the incremental compilation features are used. For larger documents I recommend (i) splitting the document to leverage the \include and \includeonly commands, and (ii) using the Tikz library "external" to avoid the unnecessary recompilation of unchanged graphics. PGF/TikZ is often a bottleneck.
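
A minimal sketch of that setup (hypothetical file names; externalization needs -shell-escape and a figures/ directory to exist):

  % main.tex
  \documentclass{book}
  \usepackage{tikz}
  \usetikzlibrary{external}
  \tikzexternalize[prefix=figures/]  % cache each TikZ picture as its own PDF
  \includeonly{chapter2}             % recompile only the chapter being edited
  \begin{document}
  \include{chapter1}
  \include{chapter2}
  \include{chapter3}
  \end{document}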

I agree though that it would be nice if the compilation (esp. from scratch) were generally faster.


Even using all of that, my thesis with heavy inline TikZ took about 5 minutes per run (about 120 pages). And a full rerun with all TikZ graphs redone (about 20 of them) took just shy of 20 minutes if the indexes already existed. That was all on a Surface Pro 4 from ~2015.


Wow. That's competing with some C++ projects.


The number one reason to lean heavily on sectioning via \include is debugging. Debugging LaTeX is a disaster, and it is only by compartmentalizing code into smaller sections that you have a hope of isolating the problem.


I first tried writing my MSc thesis as a set of AsciiDoc(tor) files. I really enjoyed how much more flexibility AsciiDoc gave me over MD, so I was pretty set on it. But I _really_ hated the equations it generated, and AsciiDoc isn't a Pandoc source format, sadly. Even worse, the tooling was monstrous: I had entire build scripts that were getting more and more convoluted.

I relented and went to LaTeX, and while the limitations mentioned here resonate with me, I've found it totally doable.


Here's an old but good book on Unix Text Processing with nroff/troff and its companion tools eqn, pic, tbl. It explains how to use the tools and how to write your own macro packages for them:

https://www.oreilly.com/openbook/utp/


The article is making me want to try it, but it’s a bit light on technical details and I’m concerned about having to go down a rabbit hole of learning a bunch of new tech.

Perhaps posting a git repository of a sample PhD thesis (with a couple of empty chapters, sample figures/images, tables) could be something that others would really benefit from.




groff -step -k -Tpdf file.groff > file.pdf

For mom, there's pdfmom -step -k. No need to waste time and SSD space on pandoc.


Nitpick here, but this should really be titled "Using groff to render markdown to PDF faster".


I may have misunderstood the article, but by the end I think they were writing directly in groff.


> I wrote my PhD thesis using groff

I was hoping it would be via gnu ed too, but they used vim. Shame.


@OP: incidentally, I'd love to see examples (thesis or otherwise) of your groff documents and related makefiles, if you have any publicly available.


Writing a PhD is much easier than writing a book.

To begin with, it has a very simple and constrained structure: abstract, problem description, state of the art, interesting subtopic 1, interesting subtopic 2, interesting subtopic 3, results and perspectives.

Interesting subtopics are also just previous articles that you can recycle.


> Writing a PhD is much easier than writing a book.

IMHO that's a pretty broad statement, one that's wrong as often as it's true. Surely it depends on the book, the thesis, and the discipline/genre.

Firstly, not all disciplines have theses broken down in the way you've outlined. Theses in certain humanities often more resemble non-fiction books than those of other disciplines.

Of course, some disciplines or departments or schools or supervisors will have you write a "thesis by manuscript", in which you present manuscripts you've written as chapters and write little interludes connecting them, as well as a unifying intro and conclusion.

On the face of it this might seem like "just recycling" previous articles, but I think that overlooks the fact that those manuscripts must be written, at least in majority, by you. Even when you aren't writing a "thesis by manuscript", most people I know write chapters as they go along in their PhD.

And finally, a bit of a digression, but I don't think it's reasonable to exclude the amount of research it takes to write "books" or theses from the estimation of the effort it takes to write them. It's an integral part of the process.


Wow, that looks nice :)


Ahh the procrastination when writing a PhD.

Instead of actually writing it, you research a million different ways how to render it, and then you write a blog post on it :)


I can relate all too well to the procrastination involved in fiddling with the typesetting engine! When I was writing my bachelor's thesis (in Philosophy, no less!), I spent an inordinate amount of time tweaking my LaTeX template and workflow.

I ended up falling way too deep into the rabbit hole, and started using NixOS just to write the thesis itself. It did eventually result in a fun blog post, though!

https://shen.hong.io/nixos-for-philosophy-installing-firefox...


Great blog post! "Plato didn't write his dialogues on Microsoft Word, and neither should you".


Ha, familiar. I decided I wanted the pdf of my dissertation to be reproducible and hence packaged it with Nix.


In the same way, Knuth wrote a set of books on typesetting. Procrastination at its finest.

https://en.wikipedia.org/wiki/Computers_and_Typesetting


He didn’t just write books on typesetting, he took a year off of work on Volume 2 of The Art of Computer Programming to _write a new typesetting system_. Naturally, he had to invent/implement vector fonts along the way. But even that wasn't quite enough, so he also took the time to invent literate programming, a style of programming where the source code is also a book with a narrative structure that guides the reader to a complete understanding of every nuance of the source code. If you compile the TeX source code one way, you get a program for typesetting documents. Compile it another way and you get a typeset book, TeX: The Program. Same with the Metafont source code. All together I think it delayed Volume 2 by a decade.

And then he continued publishing books about typography, in his spare time, for the next few decades.


TeX itself took Knuth about 10 years to complete, if I recall correctly.


I have been guilty of doing this in so many contexts: obviously with LaTeX while in college, later with logging libraries, config libraries, server frameworks, etc. Instead of doing the actual functional task at hand, the compulsion to play with tools, libraries and frameworks was strong. Lately, I'm convinced all these tools, with their own DSL syntax, complex mental models and configuration options, are just unnecessary cognitive overhead, especially for a one-off task you do once in a while. We should make it easy to learn concepts with a rich visual representation in the graphical interface and apply them on the fly; this saves cognitive load for the initial few iterations of the task. If users find themselves doing the task many times, they can then advance to learning a DSL specific to that tool. Even here, I wish there were a widely accepted universal declarative syntax/grammar for all kinds of DSLs: configurations, policies, typesetting, etc.


I finished my PhD almost 30 years ago, and finding the motivation to actually finish was as much of a problem as it is today. And tinkering with typesetting was a familiar procrastination magnet. My friends who used LaTeX easily tacked multiple months onto their effort, and my hunch is that CPU time was not the limiting factor.

My advice is: Remember, nobody's going to read it.

My parents' theses were about 50 pages, typewritten, with equations and chemical diagrams entered by hand. They got the same grade as I did. ;-)


It’s a rite of passage with LaTeX. Each one of us has to spend the time and build our own LaTeX setup the first time we write with it. Hopefully never again, though.


Spent my teens and twenties in mathematics departments. This is a very polarized behavioral pattern:

1. The majority of people just don't care very much. Just get LaTeX working locally, get it to run the packages you need - amsmath, etc. - and start working on your mathematics.

2. There is a large minority who dive deep into the rabbithole on their editing environment, typography, diagramming, etc.

Amusingly, you can tell how likely someone is to fall into either camp by the amount of care they take over their notes (mostly hand-written in my day).


Ever since Overleaf (and ShareLaTeX), I would say this is no longer the case. It's not optimal, but it works well enough.


I did exactly the same for my master's thesis. This is why I am so not interested in doing a PhD.


Same for my undergrad thesis. I learned so much more about the art of typesetting that year than I did about Hypergeometric Integrals or String Amplitudes: the topics of the thesis. That's why I left before doing a masters.


I even took the Artificial Intelligence with Andrew Ng course at Stanford through Coursera. That's how much I was procrastinating. :)

My conclusion is that writing a thesis fucking sucks and damaged me.


I wrote mine in Word and it's SO UGLY. I wish I'd procrastinated more by fiddling with different rendering methods!


But starting with LaTeX almost always takes ~0 minutes.

You can just write the text, write the equations between dollar signs, and it renders book-quality output by default.

(You can also tweak it as much as you want, and spend as much time on it as you like.)


I wrote my PhD in LaTeX with the simplest template I could find online (luckily someone had put one up formatted for my university's engineering department, and I hardly had to mess with it at all).

But once I was done, I wanted to blow off some steam and started writing a silly little tabletop RPG. I decided the rulebook would be text-only for portability with box drawing borders and ASCII tables and stuff, so I spent the first week or so writing a small ASCII typesetting engine in Prolog (because logic programmer).

And then automated the ToC and section numbering.

And then I spent more time writing a vim syntax file so I could read the glorious ASCII with syntax highlighting.

Here:

https://github.com/stassa/nests-and-insects

I'm still looking for ANSI/ASCII art contributions, btw.


> then you write a blog post on it

Not so fast. You also need to try a million different SSGs for your blog, then eventually write your own.


It's a miracle that the developer of GNU roff actually managed to complete the suite before procrastinating big-time into SGML. AFAICT, he resisted even the urge to implement the roff dot command syntax on top of SGML SHORTREF.


Well, following great ones - Knuth started doing TeX before writing TAOCP


That is incorrect. The first 3 volumes of "The Art of Computer Programming" were published in 1968, 1969, and 1973. Knuth started working on TeX in 1977, because he was disappointed by the quality of the galley proofs for the 2nd edition of TAOCP.


I stand corrected


The blog post will likely have a wider audience than hundreds of pages droning on about "The Clausula as Fundamental Unit".



