If your only target from Markdown is LaTeX, I recommend the LaTeX markdown package, which uses Lua to parse and convert the input Markdown: https://github.com/Witiko/markdown
Pandoc shines when targeting multiple backends, like EPUB, HTML and LaTeX/PDF. But the LaTeX markdown package has one advantage when only targeting LaTeX, especially in academic settings: it translates everything to the LaTeX you'd expect. For example, citations are translated into biblatex calls, while Pandoc generates all of the citations and the bibliography by itself, bypassing the style and behavior you and your professors might expect.
Just spend 20 minutes learning LaTeX. If you’re not doing anything more complicated than what you can do with Markdown, that’s about as long as it will take.
I have found this to be more true than makes sense.
Even going with bare TeX is surprisingly nicer for plaintext files than I would have imagined. Yes, figures can be hard. But not really much harder than they are in any other tool. With the bonus that I have never really seen a document regress with the addition of new sections.
Really the hardest part of figures and such is coming up with the actual figure in the first place. Most layout issues are ultimately solved by editing the figure.
20 minutes? I've used LaTeX on and off for a decade and I still feel like I don't really understand it. I mean, I can copy-paste my way to do anything with it, but usually I'm Googling "latex how do I X" over and over.
If you learned it in 20 minutes please point me to the tutorial.
I think the point was that if you are doing something you could otherwise do with Markdown, then it really isn't that difficult. For the complicated things, the answer in Markdown is simply that you can't do them.
What is X though? If it's "paragraphs" then fair enough, but if it's "obscure tikz diagram" etc. etc. then that's a different conundrum.
If this is actually a point about LaTeX being difficult to (say) write libraries for, then I agree profusely. But equally, if you are writing LaTeX-without-LaTeX or whatever, you still have to deal with LaTeX's wackiness, even if it's got a shiny new frontend with its own quirks.
Some billionaire funding a well-designed LaTeX killer would probably be a real gift to science. There is probably a lot of ground to be made up in spreading information (e.g. rendering LaTeX for the web is surprisingly hard) and in adding features like (say) structured programming to things like writing mathematics in LaTeX. I'd love for my work to be checked by a CAS automatically, for example.
From my experience, doing anything too advanced is a distraction. It sort of ruins the point of latex for me, if I have to use too many tweaks. So I always try to keep it as vanilla as possible.
LaTeX can be fine for some, but it requires either an internet connection (Overleaf) or the use of some custom IDE. If you write in a plain-text editor, custom compilation sucks.
As a personal aside, I can't stand the LaTeX syntax; needing to have a \begin{itemize}, \end{itemize}, \item for every list, as opposed to a simple dash uses way more characters, adds more visual noise, and makes it less pleasant to read when editing. Ditto for something like \begin{tabular} as opposed to a pleasant ascii grid.
Write in a plain text editor and have evince open the generated document; it will reload when the file is updated. Then use inotifywait to regenerate on save, or just write a hook in your text editor that will update the PDF.
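For anyone who wants the regenerate-on-save part without inotify, here is a minimal polling sketch in Python. It assumes pdflatex is on your PATH; the names `watch` and `pdflatex_build` and the file `paper.tex` are made up for illustration, and polling is a cruder stand-in for a real file-system watcher.

```python
import os
import subprocess
import time
from typing import Callable, Optional


def watch(source: str, build: Callable[[], None], interval: float = 1.0,
          max_polls: Optional[int] = None) -> int:
    """Poll `source` and call `build` whenever its modification time changes.

    Returns the number of builds triggered (only reachable if max_polls is set).
    """
    last_mtime = None
    builds = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        mtime = os.path.getmtime(source)
        if mtime != last_mtime:
            # First poll, or the file was saved since we last looked.
            last_mtime = mtime
            build()
            builds += 1
        polls += 1
        time.sleep(interval)
    return builds


def pdflatex_build(source: str) -> Callable[[], None]:
    """Return a build step that runs pdflatex on `source` (assumed to be on PATH)."""
    def run() -> None:
        # check=False so a LaTeX error does not kill the watcher.
        subprocess.run(["pdflatex", "-interaction=nonstopmode", source], check=False)
    return run

# Usage (runs until interrupted):
# watch("paper.tex", pdflatex_build("paper.tex"))
```

Leave the PDF viewer open next to the editor and it should pick up each rebuild.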
You don't need to use overleaf to get the overleaf experience...
Adding a hook for compilation in most decent editors is only a few lines of config, and many popular editors (GNU Emacs, vim, m$ vscode) nowadays have packages that have these hooks predefined, so it pretty much just comes down to installing a package, which I hope you don't think is beyond the capabilities of the average user. Alternatively there are editors like texworks which provide exactly this feature out of the box.
There's a pretty incredible amount of text preparation which can be done only in bog-standard Markdown, and if that's the case, go for it.
Markdown has its limitations, and if you find yourself starting to fight it ... consider that it's probably time to take the next step.
LaTeX itself is surprisingly easy to write, again most especially if you're writing simple documents.
And using Markdown to bootstrap creating a LaTeX document is also a good option. I find Markdown slightly lower-friction to write, and will use it when either composing or copying works initially. If they do require more complex formatting, I'll generate the first iteration of the LaTeX source with Pandoc, then dive into that and add additional elements or tweak complex sections as needed.
There's really no reason you need to switch. You can put raw LaTeX directly into Pandoc Markdown documents; it'll be ignored in non-LaTeX backends, and in LaTeX it'll just be reproduced verbatim. If you need to tweak the template that Pandoc puts "around" the Markdown content, you can do that too (and then centralize your changes so you don't need to do it over and over again). Eisvogel is an example of a project that does this [1].
I think this is really the best of both worlds, because you can produce documents that (fairly) easily generate print books via LaTeX while simultaneously generating webpages or ePubs for electronic distribution.
Whilst true, you do lose the capability of using a single document source to generate multiple output formats (or at least do so easily) once you do that.
A Markdown + LaTeX file processed with Pandoc will generate LaTeX or PDF outputs fine. It will stumble (or more likely: create incompletely formatted outputs) when you try generating other output formats. I often use a single source to create a number of output formats simultaneously (typically: PDF, HTML, ePub, possibly also plain text and PS).
So long as I stick to a single markup language, Pandoc has a much easier time of it.
Yes, you can embed multiple native formatting bits with some clever playing around with comment characters (I did manage to get both HTML and LaTeX going in one document I was working on, with Markdown as the nominal primary). That's ... not very elegant and tends to be fragile and error-prone.
(I've a StackExchange post somewhere detailing how to do this for those interested.)
My experience is that LaTeX is sufficiently simple and powerful that it's arguably the best underlying universal source.
Using Pandoc, you can still produce HTML or ePub from LaTeX source.
You might want to include or exclude some different front- and back-matter elements or other enclosures. For that I'd recommend developing a good standardised template model and build system (e.g., Makefiles).
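A rough sketch of that build-system idea, in Python rather than a Makefile: one source file goes in, one Pandoc command per output format comes out. The function name, the `build` directory and the file names are invented for the example; `--standalone` and `-o` are ordinary Pandoc options, and Pandoc infers the output format from the target's extension.

```python
from pathlib import Path


def pandoc_commands(source: str, formats: list[str],
                    out_dir: str = "build") -> list[list[str]]:
    """Return one pandoc command line per requested output extension."""
    stem = Path(source).stem
    commands = []
    for ext in formats:
        target = str(Path(out_dir) / f"{stem}.{ext}")
        # --standalone asks for a complete document rather than a fragment.
        commands.append(["pandoc", source, "--standalone", "-o", target])
    return commands

# Usage, assuming pandoc is installed and the build directory exists:
# import subprocess
# for command in pandoc_commands("book.tex", ["html", "epub"]):
#     subprocess.run(command, check=True)
```

Front- and back-matter differences per format would then be extra arguments added to the relevant command, which is roughly what per-target Makefile rules give you.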
You don’t need to play tricks with comment characters. From the Pandoc manual (https://pandoc.org/MANUAL.html) [HN is messing with the backticks, but you get the idea]:
Extension: raw_attribute
Inline spans and fenced code blocks with a special kind of attribute will be parsed as raw content with the designated format. For example, the following produces a raw roff ms block:
`````````{=ms}
.MYMACRO
blah blah
`````````
And the following produces a raw html inline element:
This is `<a>html</a>`{=html}
As you suggest, it turns out to be the raw_attribute extension which enables this. Note that this requires maintaining parallel content for each output format.
Yes, it requires parallel content. But I still think this is a better situation than a LaTeX source. While Pandoc can read LaTeX, it's limited to a fairly narrow subset. Anything you do outside of that will generate the raw formatting blocks the gp showed (and won't in general appear in any other format output). And you won't be left with any way to represent what should go in those places in HTML, ePub, etc.
To put it in a more programmer-centric way, Pandoc allows you to have the moral equivalent of ifdefs instead of maintaining separate source files for each output format. That's a huge advantage when you want to keep the content in sync.
Though there are other tools for mapping LaTeX to specific formats.
I'll keep this in mind, and I'm aware that all document formatting specification conversions are lossy, ambiguous, or both. LaTeX seems to me to be the least lossy and most consistent over time, as well as sufficiently lightweight in authoring.
There's an idea I've been using for myself, of a minimum sufficient level of typographic specification. For many works that is truly de minimis. I've typeset book-length works using nothing but Markdown, almost always sufficiently (there are a couple of limitations, with underlining and text justification (right or centre) being the principal ones).
There's no specific hierarchy of formatting levels, though there's a rough classification, from whitespace and emphasis to formulae and interactive diagrammes, as well as document-level constructs such as tables of contents / figures / tables, etc., foot / side / end notes, bibliographic references, and indices. LaTeX supports all of these rather well. Markdown can achieve some of these. HTML itself natively has no inherent concept of many of these, though the typographic and some semantic primitives (e.g., hyperlinks) from which they can be created do exist.
So if I'm looking for a consistent sufficient standard, I'd lean strongly toward LaTeX, and find ways to deal with the edge cases in conversion as they crop up.
That’s exactly it. I usually use these raw_attribute blocks for things like tables, when I want more control than I get when leaving it to Pandoc’s translation from Markdown tables to HTML or LaTeX.
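To make the "parallel content" idea concrete, here is a small Python sketch that writes the same table once per output format as raw_attribute blocks. The helper names and the table contents are invented for the example, and it assumes Pandoc passes each raw block through only to the matching output format, as described earlier in the thread.

```python
FENCE = "`" * 3  # built this way so the example nests cleanly inside a code fence


def raw_block(fmt: str, content: str) -> str:
    """Wrap `content` in a raw_attribute fenced block for one output format."""
    return f"{FENCE}{{={fmt}}}\n{content}\n{FENCE}"


def per_format(variants: dict[str, str]) -> str:
    """Emit one raw block per format, in the order given."""
    return "\n\n".join(raw_block(fmt, body) for fmt, body in variants.items())


# Hypothetical hand-written table, once for HTML and once for LaTeX.
table = per_format({
    "html": "<table><tr><td>flour</td><td>500 g</td></tr></table>",
    "latex": "\\begin{tabular}{lr}\nflour & 500 g \\\\\n\\end{tabular}",
})
```

Pasting `table` into the Markdown source gives each backend its own version of the table, at the cost of keeping the two versions in sync by hand.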
Another alternative often missed is LyX. It is a WYSIWYM editor with LaTeX in the core. As with other WYSIWYM or hybrid systems, it lacks the precise formatting
I remember using early versions of LyX back in college in 1995.
I feel that it led to the development of KDE. I remember Matthias Ettrich released LyX and it used the Qt toolkit.
The next year I remember Matthias announcing the "Kool Desktop Environment", and it was also based on Qt. I don't know if Matthias just really liked Qt or if he was happy with the success of LyX and decided to make an entire desktop environment.
This is definitely the answer... people glossing over this should note that the What You Mean part is reflected in the GUI and makes production of quite sophisticated documents easy. As you expand your LaTeX knowledge you can move on to insert it directly into documents.
Highly recommended if you just want to get started.
This is really cool, and not something you'd be able to typically do in Vim or the like. But it also uses latex notation, which I can't stand for things like itemized lists, haha.
Actually, you can just write "numbersections: true" in your front matter. You can also add "toc: true" to get a Table of Contents.
For better control over styles, you can also tell Pandoc to use a DOCX or ODT document (the latter doesn't seem to work as well as the former) as a reference to how to style specific styles when generating a word processor document.
TeXmacs is getting a lot of love recently. My favourite feature is the way you generate maths symbols. -> makes an arrow. Hit tab a few times to cycle through different arrow styles. So intuitive. I wish LyX had this feature. But as a dedicated hater of TeX maybe I should give TM another try.
Pandoc is a great piece of software. It's great on its own, but it is also great as a basis for automatic book generation with R Markdown or the Julia Books.jl package that I wrote. People told me not to use Pandoc because I would have a bad day if I ever ran into the boundaries of what is possible. That turned out never to be a problem: the templating system is extremely powerful and provides a lot of flexibility in combination with passing variables directly into the templates.
A lesser-known feature of Pandoc that I love is filters [0]. They allow you to manipulate the document tree mid-conversion, before the output is generated. I've been working on a forked Pandoc filter that executes code blocks with Jupyter kernels and then inserts the output [1].
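For a feel of what a filter looks like, here is a deliberately tiny one in plain Python with no helper library. It is not the Jupyter filter mentioned above, just a toy that turns emphasis into strong emphasis. It relies on Pandoc's JSON AST, where each element is an object with a "t" (type) key and usually a "c" (content) key; the function names are my own.

```python
import json
import sys


def walk(node, action):
    """Apply `action` to every AST element, then recurse into the result."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            # Elements carry their type under "t"; other dicts (such as the
            # top-level document object) are only traversed.
            node = action(node)
        return {key: walk(value, action) for key, value in node.items()}
    return node


def emph_to_strong(element):
    """Turn Emph elements into Strong ones; return everything else unchanged."""
    if element["t"] == "Emph":
        return {"t": "Strong", "c": element["c"]}
    return element


def run_filter(stream_in=sys.stdin, stream_out=sys.stdout):
    """Read the JSON AST Pandoc sends on stdin and write the transformed AST."""
    document = json.load(stream_in)
    json.dump(walk(document, emph_to_strong), stream_out)

# To use it for real, call run_filter() at the bottom of an executable script
# and pass that script to pandoc with --filter.
```

Libraries such as panflute wrap this traversal for you, but the underlying model is just JSON in and JSON out.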
LaTeX starts being a massive pain once you write tables—to remediate that I use Org mode’s table editor[0]. Combined with the ability to run code blocks and parse data, it makes LaTeX a very flexible document format. For instance I had a data table that would be rendered via gnuplot and the resulting PNG inserted into the document.
To be fair, most things become a pain once you start doing tables. Just look at the HTML way of getting numbers aligned on a decimal point. Heaven help you if you want to align anything on an equals sign. :D
I wrote the huxtable R package to output tables in HTML, Word, RTF and TeX, and TeX is more painful than all the others combined. Basic tables are less capable than HTML, and for anything more advanced you are quickly in a maze of competing packages.
I'm curious what sort of tables you have in mind. At large, tables that I would use for a report don't seem that hard.
Now, as soon as I'm reaching for packages outside of core, things change. But, again, as soon as I do that in any other package, heaven help me. We still haven't made our tables on our tools site align the numbers on the decimal point.
Rowspans. Coloured backgrounds. Non-standard borders. Vertical alignment (the LaTeX version of this does not do anything intuitive).
And if you ask TeX people how to do these things, they will tell you you don’t need to, at which point you will die a little inside.
I should add that I disagree that TeX has a good solution for decimal-point aligned columns. It has an absurd solution which involves dividing the column into two “columns”. As with many TeX features, as soon as you want to do this programmatically, it becomes a nightmare. (I don’t claim that any other markup languages are better at this: HTML still sucks for it, though Excel may do okay.)
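For readers who haven't met the two-column trick, this is roughly what generating it looks like, sketched in Python. The column spec `r@{.}l` right-aligns the integer part, left-aligns the fraction, and prints a literal "." between them in place of the usual column gap. The function names and the sample rows are made up, and values are assumed to arrive as strings that contain a decimal point.

```python
def split_decimal_row(label: str, value: str) -> str:
    """Format one table row with the number split around its decimal point."""
    whole, _, fraction = value.partition(".")
    return f"{label} & {whole} & {fraction} \\\\"


def decimal_table(rows: list[tuple[str, str]]) -> str:
    """Build a tabular whose last two columns are joined by a literal '.'."""
    lines = [r"\begin{tabular}{l r@{.}l}"]
    lines += [split_decimal_row(label, value) for label, value in rows]
    lines.append(r"\end{tabular}")
    return "\n".join(lines)


print(decimal_table([("pi", "3.14"), ("e", "2.718")]))
```

Even this toy shows the awkwardness: every number becomes two cells, a value without a fraction still gets a trailing ".", and a header over the numeric column needs a \multicolumn{2} to span both halves.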
I am probably more aligned with the folks that don't like colors and such. I hasten to add that for interactive and color screens, I change my mind.
Agreed that the TeX solution is bad if you want semantic markup of a table. If you abandon the idea of selective fields, it doesn't seem that bad to me.
One option for Markdown-esque input and LaTeX output is R Markdown. RStudio does a nice job of allowing you to write Markdown and embed references, code cells, and visualizations. I used it in grad school and only rarely had to drop down to LaTeX to do something more customized.
I just used markdown-to-pandoc-to-latex to typeset my NaNoWriMo novel, and ordered myself a 6x9 trade paperback from Amazon KDP for Christmas. It worked great. I used a custom LaTeX template.
I'm also trying to do it with a choose-your-own-adventure style novel and am less sure I'll be able to do that since it requires more custom stuff to get the page refs right. But I think it's possible.
I'm not really sure what would be impossible in pandoc-to-latex that is only possible in latex. Can't you always do custom latex templates and filters, and pass through other latex code?
You can also use the mathpix.com web editor to export Markdown to LaTeX, or even directly to Overleaf. We also support converting Markdown to docx/pdf, and we do OCR for full PDF documents (it works well for scientific documents). Disclaimer: I am the founder.
I haven’t used Pandoc in years. Do they allow Markdown tables to be rendered as a PDF with the grid separator now? At the time, iirc, a Pandoc dev took the very opinionated stance that you never needed the grid between fields, and if you did, you simply had poor typography and needed to adjust your use case.
I think so. My memory fails me (this was 6 or 7 years ago and I’m getting to the bottom of the bottle here); it might have had something to do with the padding in the tables themselves, so that they were easier to read. The specific use case was getting recipe ingredients and their quantities presented nicely so they were clear and easy to read at a glance. I wanted to do them all in Markdown and export to PDF with a bash script, but in the end I used plain text and judicious application of white space for maximum portability.
They might have it now, at the time I was shocked the dev refused to even consider inclusion of the feature, even if a patch was submitted (from what I remember).
Yes, there is org-cite, which handles references very well (and reads your bib file), and it can inline-render pretty much any math (using a LaTeX backend).