To be fair, there are a number of advantages to having content in PDF's rather t...

harshreality · on April 23, 2013

PDF preserves layout whereas HTML generally doesn't (although HTML can preserve layout somewhat if you specify one font, fix font sizes, and fix all dimensions), but that's a negative for PDFs.

Different displays have different resolutions and different dimensions. That makes fixed layouts obnoxious to read on e-readers. The most obvious example of how fixed formatting is an abomination is a PDF on an ebook reader.

There is one reigning reason for having a fixed page layout: ability to cite pages, e.g. for academic citations.

There is no need to cite pages for works that have good digital versions, other than antiquated citation requirements (e.g. MLA/CMS). Ebook readers have text search. Cite a digital form of the work, and anyone with that work in digital form can find the passage by searching for it... an exception being OCR'd PDFs and ebooks converted from PDFs, because OCR often screws up line breaks or has other errors that make searching unreliable.

For less-granular searching, there's section, table, and image numbering/labeling. Again, fixed layout is not required to make content searchable.

mcintyre1994 · on April 23, 2013

I think you're missing a key part of citing (at least for me): paraphrasing. When discussing sentiment or similar without quoting, the reader wouldn't know what to search for. It makes sense to give a page number so they can review the source and get more information. Without that, they wouldn't know where to look and would have to read the entire source.

For text, only the section numbering from your "less-granular searching" would work, and not necessarily particularly well. A page number would often be more granular, and that's what's needed for paraphrasing: exactly where to find what you're discussing, but not exactly what it said.

harshreality · on April 23, 2013

If you're paraphrasing, then you could direct-quote a critical part of the text so that a reader knows what to search for.

A writer could also adopt page-neutral paragraph or sub-paragraph numbering. This is common, for instance, in classical texts (where such numbering has been done retrospectively, by scholars, because of the difficulties of citing passages with so many translations - the same general problem that paraphrasing has). To cite a few major examples, Plato's and Aristotle's works have such numbering, as does Hobbes' Leviathan.

Page numbering has poor granularity for citations in real scholarly work (where you'd ideally want to be able to cite within a few lines at most), and it prohibits flowed text. If someone has some case where they think they need page numbers for citations, they should figure out an alternative (a few of which have been outlined above).

rzt · on April 23, 2013

No, I totally agree with you on all those points, and they definitely highlight the value of PDF as a format.

My quarrel is with misapplications of PDF as a way of presenting regular old page content that would probably never get printed, stuff such as lists of contacts, FAQs, blog posts, etc.

I work with folks across a university who know how to make a PDF and embed a hyperlink into a page...and that's all they can do. Their content might be better off as a regular page, but it's a PDF because that's how they learned to upload content to a website in the absence of a CMS. I know, it's weird, but that's what I see on a day-to-day basis.