Implementing interpreters is cool. But the key value of studying compilers is that it provides you with the tools to build robust, extensible input-processing (and, to a slightly lesser extent, data-processing and output) code. If programmers are confident using regular expressions and context-free grammars as appropriate, even if it's just coding up recursive descent parsers, I think they can produce much better input processing in their software.
And a large quantity of software needs to parse input, and possibly turn it into different output, at some point in its execution.
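To make that concrete, here's the kind of throwaway recursive descent parser I mean: a sketch in Python for integer arithmetic. The grammar and names are my own for illustration, not from any particular library.

```python
import re

# Tokenizer: integers, the four operators, and parentheses.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"bad input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens):
    """Recursive descent over the grammar:
         expr   := term (('+'|'-') term)*
         term   := factor (('*'|'/') factor)*
         factor := INT | '(' expr ')'
    Evaluates as it parses (integer division, for simplicity)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def factor():
        nonlocal pos
        tok = peek()
        if tok == "(":
            expect("(")
            value = expr()
            expect(")")
            return value
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected a number, got {tok!r}")
        pos += 1
        return int(tok)

    def term():
        value = factor()
        while peek() in ("*", "/"):
            op = peek()
            expect(op)
            rhs = factor()
            value = value * rhs if op == "*" else value // rhs
        return value

    def expr():
        value = term()
        while peek() in ("+", "-"):
            op = peek()
            expect(op)
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input: {peek()!r}")
    return result
```

The point isn't the calculator; it's that one function per grammar rule scales up to real input formats, and the error messages come almost for free.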
This reads to me much less like "What every CS major should know" than "Wouldn't it be nice if all CS majors knew...." What I mean by that is I'm not sure it's possible to learn all of that stuff in 4 years at any school anywhere, but it's all useful in certain very broad contexts.
Of course, I may lack perspective, having been a math major (or due to some other deficiency in my background), but I consider myself pretty comfortable with about half the list and somewhat familiar with about half the remainder. How much of it did I learn as an undergrad, though? Not very damn much, and I don't think I would have been exposed to all of it even if I had done CS instead of math.
Of course book-learning and in-class learning need to be supplemented with a large amount of self-directed learning. As a professor, I'm sure Matt is perfectly well acquainted with the limitations of a 4-year degree program.
This brings up an interesting question: assuming that these things can't be taught in four years and that they matter, should we expect that a degree in CS should require more time?
Keywords that I believe would be worth mentioning:
- Lambda Calculus
- Reflective and Meta-programming
- Meta-object Protocol
- Closures
- Continuations
- Monads
- Arrows
- First-class Everything
- Stack and Register-based Programming
- XML
- Linear Algebra
- Fractal and Wavelet Image Compression
- Regular Expressions
- Clojure
- LaTeX
NB. The concepts to which the above keywords refer may or may not have been covered by the article. The keywords themselves, however, are absent.
Additional reading suggestions:
- Jon Bentley's Programming Pearls
- Tom Mitchell's Machine Learning
- Douglas Hofstadter's Gödel, Escher, Bach
- Brian Kernighan and Rob Pike's Unix Programming Environment
I recommend The Incompleteness Phenomenon over GEB[1]. It thoroughly presents Gödel's Completeness and Incompleteness theorems at a level appropriate to the collegiate Math or Computer Science student.
The discussion of Gödel's theorems is not the only thing that makes GEB an interesting read.
What's more, GEB presents Aristotelianism vs. Platonism in the context of _computation_.
_That_ is a question CS majors should at least be aware of.
The question of whether computations can _ever_ be brought to have consciousness.
Then again, given my limited knowledge, I am far from being an authority on the topic and look forward to gaining more insight by hearing other opinions.
Undoubtedly. However, from the standpoint of teaching a class to Math and Computer Science students, you can and should present the backing theories in full detail. Students benefit greatly from having a common write-up of the proofs. The philosophical discussion can be saved for the classroom (with supporting supplementary readings).
The best match among the existing categories, IMHO, is "Theory," which I assume refers to "Theory of Computation."
It is there that formal systems are brought up and CS majors should be aware of the hierarchy and limitations of formal systems, which are discussed in GEB.
If I were to place GEB under an arbitrary category, I would probably suggest "Philosophy of Computation."
You mention version control in passing, but it probably deserves more emphasis. CS majors who aren't comfortable using version control are likely to be at a disadvantage.
OCaml may work as an alternative to SML.
You could also mention Information Retrieval, alongside Machine Learning.
Compilers/parsing/interpreters/etc. could really be its own top level item.
Learning to use profilers / performance optimization. _The Practice of Programming_ covers this (and many other things!) quite well.
All in all, a great list. Probably also useful to suggest starting points for people (like myself) who are self-taught.
To be clear: I consider understanding version control essential. I'll emphasize it.
I agree that OCaml and SML are roughly interchangeable for educational purposes. I picked SML because it's slightly "cleaner."
As a compilers prof, I agree that compilers should be its own item, but politically, that's a tough sell. Teaching compilers needs to be sold in the service of another end for most faculties.
Cleaner, yes. I picked OCaml because it seemed easier to find information about, and more widely ported.
Also, I think most people fail to appreciate how useful compiler techniques are outside of "compilers". Even just knowing a little about lexing and parsing goes a long way.
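To illustrate "a little lexing goes a long way": a serviceable lexer is about a dozen lines with regular expressions and named groups. This is just a sketch, and the token set is made up for illustration.

```python
import re

# Token types and their patterns, tried in order. ERROR catches anything
# the earlier patterns miss, so bad input fails loudly instead of silently.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[=+\-*/]"),
    ("SKIP",   r"\s+"),
    ("ERROR",  r"."),
]
LEXER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def lex(text):
    """Yield (token_type, lexeme) pairs, skipping whitespace."""
    for m in LEXER.finditer(text):
        kind = m.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {m.group()!r}")
        yield kind, m.group()
```

The same pattern handles log lines, config files, query strings, and so on; the payoff is that the downstream code deals in tokens, not raw characters.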
For Prolog, I'd also recommend _The Art of Prolog_ by Sterling & Shapiro and/or Clocksin's _Clause & Effect_.
I think probabilistic learning (or machine learning) should have its own section. Recommender systems, sentiment analysis, and graphical models do not really fall into the traditional understanding of AI. It is more stats than CS and logic.
Also, 'Reflections on Trusting Trust' is a good one to add under security. I'd also recommend 'What's your threat model?' http://iang.org/ssl/wytm.html
The only thing that I had in my undergrad education that I miss in this list is experience with an assembly language. It probably fits in best in the discussion on Architecture: programming in an assembly language, at the very least, makes programmers appreciate high-level languages and makes them think about the underlying hardware.
Missing: "How to install, use and administer Windows."
It is awfully sad how many self-described "hackers" are unable to reconfigure an IP address on the world's most popular desktop OS.
I have trouble trusting such "engineers" with the ability to design great software when they haven't even educated themselves on both major perspectives on OS design.
For the sake of discussion, here's a counter argument to that. Windows is the most common desktop OS, as you note.
That means it is the one most likely to be officially supported everywhere you go. You get to your new job, you'll find a Windows box on your desk, set up by IT, configured the way they like it--and the way they want it to stay.
If you run OS X or Linux at home, and only use Windows on boxes provided and administered by someone else, why learn how to administer Windows boxes? None of us have time to become experts in every technology we encounter as users. Not knowing how to reconfigure an IP address on Windows seems no worse than not knowing how to change a spark plug on a rental car.
Are there more significant inabilities you are _actually_ complaining about here?
I don't have any idea how Windows 7's (or Vista, or OS X, etc. hell, I vaguely remember how to do this on my *nix box) networking software works, but I guarantee you I could google it and solve my problem in minutes.
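For what it's worth, the thing you'd google up is a one-liner. From memory (so treat the exact syntax and names as approximate), on Windows it's along the lines of:

```shell
# Set a static address. The interface name varies per machine;
# "Local Area Connection" is just the old default, so check
# `netsh interface show interface` first. Addresses here are made up.
netsh interface ip set address name="Local Area Connection" static 192.168.1.50 255.255.255.0 192.168.1.1

# And back to DHCP:
netsh interface ip set address name="Local Area Connection" source=dhcp
```

Which rather supports the point: it's a lookup, not a body of knowledge.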
People may be used to different technologies. Many of us just google it or check the docs on the fly for that kind of troubleshooting or configuration.
But that kind of skill/knowledge is different from the understanding a computer scientist would require. I would be more worried if someone who claims to know networking doesn't know popular protocols or algorithms, for example the sliding window protocol.
The issues you face in designing, understanding, and analyzing such algorithms will change you more than knowing where Windows stores the setting for one of its protocol parameters.
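For anyone who hasn't run into it, the sender side of a sliding window protocol (Go-Back-N flavor) fits in a toy Python class. This is a simplified model for the sake of the argument, not a real implementation: no timers, no actual I/O, and the class name is mine.

```python
class GoBackNSender:
    """Toy Go-Back-N sliding-window sender: at most `window` unacknowledged
    frames in flight; a cumulative ACK for sequence number n acknowledges
    everything up to and including n."""

    def __init__(self, data, window=4):
        self.data = list(data)
        self.window = window
        self.base = 0      # oldest unacknowledged sequence number
        self.next_seq = 0  # next frame to transmit

    def sendable(self):
        """Frames that may be transmitted right now (within the window)."""
        frames = []
        while (self.next_seq < len(self.data)
               and self.next_seq < self.base + self.window):
            frames.append((self.next_seq, self.data[self.next_seq]))
            self.next_seq += 1
        return frames

    def on_ack(self, ack):
        """Cumulative ACK: slide the window forward past `ack`."""
        if ack >= self.base:
            self.base = ack + 1

    def on_timeout(self):
        """Go back: schedule every unacknowledged frame for retransmission."""
        self.next_seq = self.base
```

Even this toy raises the interesting questions (why cumulative ACKs? what does the window size trade off?), which is exactly the kind of thinking the Windows settings dialog never will.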
Do you think that malware deserves its own section? Or might it be simply a related field? I have yet to hear of a computer security course that does (real) forensics or encourages creativity to better understand the criminal mind.