Don Knuth releases Volume 4, Pre-fascicle 6A [gzipped ps] (stanford.edu)
141 points by dhruvbird on Aug 2, 2012 | hide | past | favorite | 47 comments



On page 29, Knuth says, "We can regard both variables and clauses as active agents, who continually tweet to their neighbors in this social network." This is in the context of statistical physics.

Seeing Knuth talk about tweets feels weird.

On another note, anyone actually run into a random k-SAT problem in "real life"?


> On another note, anyone actually run into a random k-SAT problem in "real life"?

Because there's a lot of research into solving SAT "pretty quickly", one of the half-reasonable ways to approach an NP-complete problem is to find an efficient reduction and use a premade SAT solver/heuristic. I don't think nearly as much work has gone into that sort of thing as, say, integer programming (http://en.wikipedia.org/wiki/Integer_programming), but I remember reading something about it a couple of years ago.
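To make the "reduction plus premade solver" idea concrete, here's a toy sketch (my own example, not from the thread) using the DIMACS-style convention real SAT solvers consume — each clause is a list of nonzero integers, where k means variable k is true and -k means it is false — with a brute-force check standing in for the solver:

```python
from itertools import product

def satisfiable(clauses, n_vars):
    """Brute-force SAT check. Clauses use the DIMACS convention:
    the literal k means variable k is true, -k means it is false."""
    for bits in product([False, True], repeat=n_vars):
        # bits[k-1] is the value assigned to variable k
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(satisfiable([[1, 2], [-1, 3], [-2, -3]], 3))  # True
```

A real workflow would write the clauses out as a DIMACS CNF file and hand it to an off-the-shelf solver instead of enumerating assignments.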

Of course, when people realise their problem is NP-complete they tend to go a different way with things. The usual answers are:

1. Problems in practice are small enough for brute-force (or a simple backtracking search)

2. We probably don't need optimality, just use some kind of heuristic or metaheuristic.

3. Too hard, let's do something else.


Yeah in my experience reductions to IP are simpler and result in smaller problem sizes.


Or even mixed integer problems.

From what I've heard, researchers actually rejoice once a problem is proven to be in NP (instead of something worse, or of unknown status): because that means it's quite likely that most of its instances are tractable, and we know a lot about how to solve NP problems.


It's quite a common reaction in AI, yeah: once we can reduce the problem to some standard NP-problem solver, we're probably in good shape.

I believe SatPlan (1992) initiated that trend, by showing that compiling classical planning problems to SAT can get significant speedups over traditional planning algorithms in some cases. This ties into another, mostly informal hallway debate so far, over whether AI needs a replacement idea of "AI-completeness" to specify the problems that are really hard from an AI perspective. I wrote a short opinion piece on that a few years ago: http://www.kmjn.org/notes/nphard_not_always_hard.html

A more common problem than SAT not being fast enough is the reduction being intractable. It's ok if it's "big" (SAT solvers can solve for millions of variables), but it's easy for naive encodings to generate truly absurd blowups, gigabytes or more, which can make it intractable to even state the SAT problem, i.e. write it to disk or send it over a pipe. If you avoid that problem, the actual SAT-solving step is usually fast. I believe that's where some of the interest in alternatives to SAT comes from, as compilation targets, so to speak, that are easier to generate non-blown-up code for.
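A standard illustration of that blowup (my own example, not from the thread): the naive pairwise encoding of an "at most one of these n variables is true" constraint emits one clause per pair, i.e. n(n-1)/2 clauses, which gets painful fast when the constraint appears many times:

```python
def at_most_one_pairwise(vars_):
    """Naive encoding of 'at most one of vars_ is true':
    for every pair (a, b), add the clause (not a or not b)."""
    return [[-a, -b] for i, a in enumerate(vars_) for b in vars_[i + 1:]]

for n in (10, 100, 1000):
    clauses = at_most_one_pairwise(list(range(1, n + 1)))
    print(n, len(clauses))   # 45, 4950, 499500 -- quadratic growth
```

Sequential ("commander"-style) encodings bring this down to a linear number of clauses at the cost of auxiliary variables, which is one reason the choice of encoding matters so much.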

SAT-heritage grounding targets are still common in logic programming, though, e.g. http://potassco.sourceforge.net/ uses a SAT-like approach, though one specialized for answer-set programming rather than directly targeting SAT.


Thanks for the insights! My area of expertise is closer to integer programming. I'll check out the links.


I've got a bit of a mirror image of that: I use these "AI-heritage" finite-domain solvers like SAT, ASP, clp(fd), but have been wondering lately if tools from other communities, such as integer programming, could be more useful to me for some applications. :)

My impression is that there's a little bit of tool-choice segregation by community, with OR people, AI people, software-verification people, and PLs people each having their own favorite tools, and not as much overlap as there could be.


More than weird, will a 'tweet' have relevance 50 years from now? Even a social network? Seems using things that have a temporal cultural context is a bit iffy.

Granted, I haven't mustered the courage to begin reading TAOCP, but my impression is that it's a timelessly relevant masterwork of his.


> More than weird, will a 'tweet' have relevance 50 years from now? Even a social network? Seems using things that have a temporal cultural context is a bit iffy.

I don't think Knuth is under any illusions about staying relevant: the original editions had code examples for a machine with 6-bit bytes and used self-modifying code to store subroutine return addresses rather than a stack.

In later editions he updated the examples, but even with respect to algorithms, the new fascicles reference quite recent papers and results. The books, timeless as they are, are products of their time more so than most books on mathematical topics.


The whole thing is filled with dry humor. If you’re not chuckling when you read it, you’re not paying enough attention. :-)

First example from this new section about satisfiability, the epigraphs:

> “He reaps no satisfaction but from low and sensual objects, or from the indulgence of malignant passions.” – David Hume, The Skeptic (1742)

> “I can’t get no ...” — Mick Jagger & Keith Richards, “Satisfaction” (1965)

Or an example from the text, from a few pages later on:

> “Start with any truth assignment. While there are unsatisfied clauses, pick any one, and flip a random literal in it.”

> Some programmers are known to debug their code in a haphazard manner, somewhat like this approach, and we know that such “blind” changes are foolish because they usually introduce new bugs....
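For fun, the quoted procedure is short enough to sketch directly (my code, not Knuth's; clauses are lists of DIMACS-style signed integers):

```python
import random

def random_flip_sat(clauses, n_vars, max_flips=100000):
    """Sketch of the quoted local search: start with any assignment,
    and while some clause is unsatisfied, pick one and flip a random
    literal in it."""
    assign = [random.choice([False, True]) for _ in range(n_vars)]

    def sat(lit):
        return assign[abs(lit) - 1] == (lit > 0)

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(lit) for lit in c)]
        if not unsat:
            return assign          # satisfying assignment found
        lit = random.choice(random.choice(unsat))
        assign[abs(lit) - 1] = not assign[abs(lit) - 1]
    return None                    # gave up (instance may be unsat)

# prints a satisfying assignment (with overwhelming probability)
print(random_flip_sat([[1, 2], [-1, 3], [-2, -3]], 3))
```

On satisfiable instances this kind of random walk often succeeds quickly; on unsatisfiable ones it simply runs out of flips — much like the haphazard debugging Knuth is poking fun at.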


"Tweet" is such a good replacement for "asynchronously broadcast" that it won't matter whether Twitter is still around. The meaning is clear from context. It would not surprise me if "tweet" became a reserved word in computer science research.


The word tweet has existed for over a hundred years. Assuming that birds still exist in 50 years, its meaning should be pretty easy to infer in that context.

I'm betting "social" and "network" will still be recognizable words in 50 years too.


Bit off topic, but I have to say "temporal cultural context" is pure wordsmith genius. Thank you.


> More than weird, will a 'tweet' have relevance 50 years from now?

It might just give the whole work a nice and quaint touch.


> Seeing Knuth talk about tweets feels weird.

Does it make you feel better that he released the book as a .ps?


The big issue for me is the bitmap fonts. Fortunately he used a modern dvips, so you can run it through "pkfix" to get a version with scalable fonts. (Once it's converted to PDF, it's more challenging to fix.)


What part of it has bitmapped fonts?


Sorry for the late response, I missed your reply.

All of the fonts in that PostScript file are bitmaps. Knuth started with Metafont, but the file he distributed only contains bitmaps.

If you look inside the file, you'll find the comment "DVIPSBitmapFont". And if you convert it to PDF and zoom in, you'll see that the fonts are bitmaps. This is typical of TeX files run through dvips before the Type 1 equivalents of the Computer Modern fonts were widely adopted.
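That comment check is easy to script; a minimal sketch (the function name and the synthetic demo file are mine — point it at the real fasc6a.ps to check the actual file):

```python
import os
import tempfile

def has_bitmap_fonts(ps_path):
    """Heuristic: dvips labels embedded PK bitmap fonts with a
    'DVIPSBitmapFont' structuring comment, so its presence suggests
    bitmap rather than Type 1 fonts."""
    with open(ps_path, "rb") as f:
        return b"DVIPSBitmapFont" in f.read()

# Demo on a synthetic file (not the real fascicle):
with tempfile.NamedTemporaryFile("wb", suffix=".ps", delete=False) as f:
    f.write(b"%!PS\n%%DVIPSBitmapFont: F1 cmr10 10\n")
    path = f.name
print(has_bitmap_fonts(path))   # True
os.unlink(path)
```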

Fortunately, the file was generated by a recent version of dvips, so the "pkfix" script can be used to rewrite it with Type1 fonts substituted for the bitmapped fonts. I'd recommend doing this before converting it to PDF.

It's a bit harder to fix this after PDF conversion. A few years ago, I wrote some Java code to do it (inspired by "pkfix-helper"), but it worked off of font metric signatures so it's not very accurate for small font subsets (e.g. exponents). To fix this, I need to actually look at bitmaps (or ask the user), and I never got around to implementing that.

I suppose it's a minor issue, but it is a pet peeve of mine. I've come across a lot of papers on the net in PDF or PS format that look unnecessarily bad when viewed on a computer screen.


OK, I didn't know enough about this format to look inside it for the fonts. But I zoomed in as far as I could in Evince and everything stayed smooth, so I just assumed it wasn't bitmapped.


I would have preferred a .dvi! (font embedding issues notwithstanding).


Would prefer a warning that it's a download link.


Agreed. The link should have taken us to a page describing what it's about, with the option to download.


I think the title says what's necessary. But if it did link to such a page (googling the URL, I couldn't find any page that describes just this release), someone would complain that it's just blogspam, post the download link in a comment, and get upvotes. I found tweets that reference it; who knows what sort of anger linking to a tweet would have generated. (There's already discomfort on this page over Knuth just using the word 'tweet'!)


I see what you mean: the link to the download is buried deep in the OP. Though I don't think anyone would feel it's blogspam if you'd linked to it (as it is clearly Donald Knuth's), directing readers to the relevant link on that page is nearly impossible, and linking directly to the download is more efficient. I think you made the right trade-off here.


Maybe I'm "out of the loop", but not a single word in that title means anything to me. I don't know who Don Knuth is, don't know why it's volume 4, don't know what "pre-fascicle" means... And I'm usually pretty with it on Hacker News-related interests. Not everyone here is part of the Ivy League school circle jerk.

I would have to google pretty much all of that, which is inconvenient. It should have linked to a page describing what it is, even if the OP had to make one himself. It's only blogspam if you put ads on your blog... which I'd hope no one here does.


You're not only out of the loop, you're under a rock if you haven't even heard of Don Knuth before and you're in the computer science world at all. (I guess if you're just a humble web programmer ignorance is understandable.) His material is not for Ivy League students, it's for anyone interested in computer science itself rather than just programming. A "fascicle" is the book-form of a new episode in a television series' season, you learned a new word! (I guess only Ivy League students like learning new words? I'm not one but I like to learn new words.) It's part of the ongoing http://en.wikipedia.org/wiki/TAOCP You did understand "volume 4" I hope, which indicates there is a volume 1, 2, and 3. There are also other inferences you can make from the use of "volume" as opposed to another word (like "part").

For your assertion that it should link to a page describing what it is, I think you're wrong and I don't think you addressed my point about that being less efficient. Even in your case, I think it's more efficient this way. I don't think you would understand much in the submission without a lot of googling, and that's if you're even interested in understanding (which I doubt given your resistance to merely typing "define:fascicle" into Google) in the first place. A blog post describing the work would tell you as much, but since you didn't even understand the title, you already knew that you likely wouldn't understand the work. (Gods help you if you're on Windows and can't read PS files without downloading stuff from websites you've never visited.) An incomprehensible (to you) title saved you time. It's better for you this way. You did ruin your potential efficiency gains by spending time complaining about the submission type in the comments, much more time than it would take to google the parts of the title you didn't understand, but you knew that already when you ventured to the comments (which people seem to do regardless of the format of the content--I know I often get more out of the comments than the submission itself and sometimes will skip the submission entirely).

Edit: I will grant that there is one improvement to be made to the title: adding "on Satisfiability" at the end. That wouldn't help anyone not inclined to google, but it does at least say more specifically what it's about.


Honestly, what else would you expect?

Reading the TeX source online?


I had no idea what it was. When I clicked, all I knew was that I had heard of Donald Knuth before.


Fixed


Change the link.


I can't edit it any more. It seems that the edit timeout has been reached.


You can't change links, you would have to submit a new story.


Just curious: how did you find this? I don't see a link to it anywhere on Knuth's website. The only reference I can find to it is on Twitter[1].

[1] https://twitter.com/pervognsen/status/230892091188846592


It was on Reddit a few days ago... http://www.reddit.com/r/compsci/comments/xb6ft/


[deleted]


On Windows I use GSView to view PS files.

http://pages.cs.wisc.edu/~ghost/gsview/get50.htm

You must also install GhostScript in order to use it:

http://pages.cs.wisc.edu/~ghost/doc/GPL/gpl902.htm


You could also use evince or okular on windows.


Any suggestions for software to read this on a desktop?


Gnome's standard document reader Evince works out of the box. As does Okular, KDE's standard reader. (Evince is also available for Windows, I don't know about Okular.)



Google Docs^WDrive works .. eventually. Seems to take a while to process.

I'd make it public, but that would be a copyright violation.

I wonder if Google hashes and caches the result. If they do, the next upload should be much faster for everyone.


Also, on OS X, Preview works fine for the .ps.


A (very incomplete) draft of section 7.2.2.2... still great.


It is not a gzipped file; open with any postscript viewer.


It is gz-compressed, but (1) your browser might decompress it on the fly after seeing the "Content-Encoding: x-gzip" header, and (2) most postscript viewers decompress on the fly, as gzipped postscript is so prevalent.

    $ curl -s http://www-cs-faculty.stanford.edu/~knuth/fasc6a.ps.gz | dd bs=16 count=1 | xxd

    0000000: 1f8b 0808 689e 1150 0203 6661 7363 3661  ....h..P..fasc6a
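The two bytes at the front of that dump, 1f 8b, are the gzip magic number; a quick way to confirm it in code (round-tripping synthetic data rather than the actual file):

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"

def looks_gzipped(data):
    """Check for the two-byte gzip magic number shown in the xxd dump."""
    return data[:2] == GZIP_MAGIC

# Round-trip demo on synthetic data (not the fascicle itself):
compressed = gzip.compress(b"%!PS-Adobe-2.0\n")
print(looks_gzipped(compressed))           # True
print(gzip.decompress(compressed)[:4])     # b'%!PS'
```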


> [...] most postscript viewers decompress on the fly as gzipped postscript is so prevalent.

Not only prevalent, but also the right thing to do. PDF was an attempt to (among other aims) achieve smaller file sizes than PS. But that was premature optimization: while a PS file is usually bigger than a PDF, gzipped PS beats PDF.


    $ file fasc6a.ps.gz
    fasc6a.ps.gz: gzip compressed data, was "fasc6a.ps", from Unix, last modified: Thu Jul 26 12:45:44 2012, max compression


Just made my morning, thanks!



