
For one assignment, a cryptography professor recommended that, instead of trying to recognize the correct plaintext by its similarity to English, I should try to disqualify incorrect plaintexts by recognizing them as non-English. (The similarity approach yielded results that could be manually adjusted into the correct plaintext, but it wasn't able to find the plaintext by itself.)

The approach there was just to ingest a long book from Project Gutenberg, record all of the trigrams (including across word boundaries), and disqualify a candidate plaintext if it had more than some low number (3, but it was a short text) of unrecognized trigrams. This worked beautifully.
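A rough sketch of that filter in Python, not the original code. The function names, the normalization (lowercase letters, with every run of other characters collapsed to a single space so that trigrams span word boundaries), and the default threshold of 3 are assumptions for illustration.

```python
import re

def normalize(text):
    # Assumption: keep only a-z, and collapse everything else to one space
    # so that trigrams can straddle word boundaries.
    return re.sub(r"[^a-z]+", " ", text.lower()).strip()

def trigrams(text):
    # Set of every 3-character window in the normalized text.
    s = normalize(text)
    return {s[i:i + 3] for i in range(len(s) - 2)}

def looks_non_english(candidate, known, max_unknown=3):
    # Count windows that never appeared in the reference book and
    # disqualify the candidate once the count exceeds the threshold.
    s = normalize(candidate)
    unknown = sum(1 for i in range(len(s) - 2) if s[i:i + 3] not in known)
    return unknown > max_unknown
```

Here `known` would be `trigrams(book_text)` for whatever book was ingested. A longer candidate text would presumably need a higher `max_unknown`, since even real English will contain a few trigrams the book never used.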

For reasons I don't recall, I wrote the assignment in C. I didn't want to do the parsing of the book into a trigram table in C. So I did that separately, and emitted a line of C code assigning into a three-dimensional character array for every trigram I found. Then my assignment was the concatenation of some code initializing the array to zeros, the emitted code setting entries to one, and, thousands of lines later, the actual decryption code.
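The generation step might have looked something like this sketch. The generator language, the array name `seen`, and the index mapping (a-z to 0-25, space to 26) are all assumptions; the original only says that C assignments were emitted for each trigram.

```python
# Assumed alphabet: index in this string is the array index in the C code.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def emit_c_assignments(trigram_set, array_name="seen"):
    # One C statement per trigram, e.g. "the" -> seen[19][7][4] = 1;
    lines = []
    for tri in sorted(trigram_set):
        # Skip anything that does not fit the assumed 27-symbol alphabet.
        if len(tri) != 3 or any(ch not in ALPHABET for ch in tri):
            continue
        i, j, k = (ALPHABET.index(ch) for ch in tri)
        lines.append(f"{array_name}[{i}][{j}][{k}] = 1;")
    return "\n".join(lines)
```

The output would be pasted between a zero-initialized declaration such as `static char seen[27][27][27];` and the decryption code, which can then test a trigram with a single array lookup.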



