Must Read CS Books For Self-Taught Programmers
251 points by stefanve on April 6, 2011 | 61 comments
In light of past discussion about Self-Taught Programmers vs CS-Educated Programmers:

What are the must-read CS books (or other resources) that help you in your daily work? (So I don't mean Code Complete or Head First Design Patterns.)

I'm a self-taught programmer and I think I'm missing some knowledge of algorithms and discrete math, but perhaps there are other great subjects/books (compilers etc.).

EDIT: Thanks everybody for the great suggestions

EDIT 2: The list so far (unsorted)

http://sharetext.org/YUE




I am also self-taught, although I dropped out of University two years into a CompSci/Engineering double major, so I will recommend some resources to you that helped me immensely. Each of these books has an associated MIT course with lecture video, notes, etc. available online.

First is Structure and Interpretation of Computer Programs[1], which you can read online, and the associated course at MIT, which is 6.001[2]

Second is the famous 'Dragon Book', Compilers: Principles, Techniques, and Tools[3], and the associated course, which is 6.035[4]

Extras would be the Python book 'How to think like a Computer Scientist'[5]. MIT course 6.00[6] uses the book as a reference, and the courseware is again available online.

Other than that - the usual suspects on learning C (K&R), UNIX (TAOUP[7]), the bash shell along with grep, sed, awk, more algorithms (CLRS[8]), functional programming and machine learning. Take your time: it takes years to build the relevant experience and knowledge, and you are never done.

I love the MIT courses. Work and learn at your own rate. I feel that it is important to implement all the code yourself even if it looks easy in a lecture - there are little things you pick up as you write algorithms out.

Even though I had worked through SICP I still watched all the lectures again and implemented all the examples with benchmarks and unit tests. I usually set aside one day on the weekend to work on study, and usually an extra evening or two mid-week to read papers and books. Once you get into the routine it is great.

The best approach might be to set yourself a timetable and weekly schedule, just like at university (i.e. every Saturday plus Tuesday and Thursday nights), and work through the MIT courseware and associated books in order (6.00, 6.001, 6.035). The more advanced MIT courseware is an excellent bonus.

[1] http://xrl.us/sicp

[2] http://xrl.us/6001

[3] http://xrl.us/dragonbook

[4] http://xrl.us/6035

[5] http://xrl.us/thinkcs

[6] http://xrl.us/6000

[7] http://xrl.us/artunix

[8] http://xrl.us/clrs


That's a great collection of books, but when I see a list of link-shortened URLs to what context dictates are book web pages, I automatically assume they are affiliate links. Please just use the real URLs so people know what they're getting into.

For the record the shortened URLs are not affiliate links, and I'm not trying to make any statement about affiliate links. Here are the original URLs:

[1] http://mitpress.mit.edu/sicp/full-text/book/book.html

[2] http://ocw.mit.edu/courses/electrical-engineering-and-comput...

[3] http://dragonbook.stanford.edu/

[4] http://ocw.mit.edu/courses/electrical-engineering-and-comput...

[5] http://greenteapress.com/thinkpython/

[6] http://ocw.mit.edu/courses/electrical-engineering-and-comput...

[7] http://www.faqs.org/docs/artu/

[8] http://www.amazon.com/dp/0262032937/


Ahh sorry - I am usually totally against shortened links but:

1. Didn't know that HN would trim the long links

2. I remember the shortcodes/aliases used at xrl.us (such as the MIT course names), although browser smart address bars and Google are just as easy.

Thanks for putting the links up, I don't need to update my comment.


If you did not tell me that SICP was actually made for something else, I would think that it was custom-designed for self-taught programmers to fill in their gaps, at least the ones that matter. If you actually finish that book you'll be ahead of the vast bulk of programmers who graduated with computer science degrees in terms of practical software engineering.


Great list. Two things:

(1) MIT has also posted lectures for "Intro to Algorithms" taught by Leiserson (one of the authors of the famous textbook). The course number is 6.046J[0].

(2) You mention shell scripting and sed/awk/grep...can you recommend any resources for those?

[0] http://ocw.mit.edu/courses/electrical-engineering-and-comput...


The best way to learn the UNIX tools is in the context of shell scripting and using the combination of tools to achieve a purpose (such as parsing log files, setting up shortcuts as part of your dev toolchain)

1. Understand the philosophy behind UNIX, i.e. everything is a file, and small apps that are each very good at a single task can be combined through pipes etc. to process larger tasks (this has lost its way somewhat in Linux, but still holds true).

2. Look at your computer usage across development, sysadmin work, etc., find the parts that you want to automate, and then set out to write scripts to complete those tasks. For example, in each one of my projects I have a script called start.sh which has a bunch of tasks implemented (e.g. backup, serve, dns (to update dyndns for callbacks), push, diff, deploy, stage (run remote commands) etc.). Remember that a number of git commands are themselves shell scripts, as are many server management commands. Take a peek inside each of these to learn how they are implemented.

3. Once you have a good understanding of what you want to implement, just go for it. Learn along the way by using Bash scripting, regular expressions, an understanding of UNIX/Linux, sed, awk, and yacc/bison (to analyze code and extract info). You will end up building your own collection of shell scripts, environment variables, .vimrc, ssh, curl (you can do almost any API request and auth with curl and a shell script: automate tweets, RSS feeds) etc. (I have been meaning to publish my own. The most recent one I wrote greps my code for TODOs and pushes them to a simple webapp where I can view them, sort them, etc. alongside my 'personal' todo list.)

4. You need two sets of resources, one is for learning, the other is for reference which you keep handy. Here are my own recommendations:

Learning:

Which learning-style book suits you comes down largely to personal preference. You can go wrong, so get a feel for each topic by reading online tutorials, and then scan the TOC and sample chapters of books that look good. A subscription to O'Reilly Safari comes in handy. There are now also a lot of screencasts online: try searching YouTube for the topic with 'screencast' or 'tutorial', and once you find a good screencast publisher, look at the rest of his/her videos.

You can also get the O'Reilly books on special in bundles sometimes, check their website

There is no single book (a 'UNIX for Developers', say) that introduces UNIX and then covers most of these topics as I described.

* UNIX:

- FreeBSD handbook: http://www.freebsd.org/doc/handbook/

- Linux Documentation Project: http://tldp.org/LDP/

- Linux Command Line and Shell Scripting Bible: http://www.amazon.com/Linux-Command-Shell-Scripting-Bible/dp...

- UNIX In a Nutshell (I haven't read this in a while but it is on my shelf): http://oreilly.com/catalog/9781565924277

- Learning UNIX (also O'Reilly): http://www.amazon.com/Learning-UNIX-Operating-System-Fifth/d...

* Bash:

- http://linuxcommand.org/

- http://bash.cyberciti.biz/guide/Main_Page (better, and very good)

- Learning the Bash shell: http://oreilly.com/catalog/9780596009656/

- Advanced Bash Scripting (online): http://tldp.org/LDP/abs/html/

- YouTube playlist: http://www.youtube.com/view_play_list?p=2284887FAE36E6D8

* Regular Expressions:

- Mastering Regular Expressions (this is such a great book. I didn't like regexp until I picked this up eons ago): http://oreilly.com/catalog/9781565922570

* sed/awk/grep etc.

- sed & awk (O'Reilly): http://oreilly.com/catalog/9781565922259

* Vim/Vi:

- vimcasts (screencasts): http://vimcasts.org

- more screencasts (these are great): http://vimeo.com/user1690209/videos

- learning vi: http://www.amazon.com/dp/1565924266

Reference:

On my browser toolbar, I have a folder called '_ref'. It has 70+ bookmarks (online documentation, cheat sheets) in it, each titled with what the link is a reference for (i.e. 'vi', 'bash' etc.). I have collected the links over the years. I will publish the folder online and post the link here.
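Going back to step 3 above, the TODO-collecting idea can be sketched in a few lines. This is only a rough illustration in Python rather than the shell script described: the function name, the comment markers it looks for, and the file suffixes are all assumptions of mine, and the "push to a webapp" half is left out entirely.

```python
import re
from pathlib import Path

# Assumed convention: a TODO lives in a "#" or "//" comment, e.g.
# "# TODO: fix this" or "// TODO fix this".
TODO_PATTERN = re.compile(r"(?:#|//)\s*TODO\b[:\s]*(.*)")


def find_todos(root, suffixes=(".py", ".sh", ".js")):
    """Yield (path, line_number, text) for every TODO comment under root."""
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files whose suffix we were not asked to scan.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            match = TODO_PATTERN.search(line)
            if match:
                yield str(path), number, match.group(1).strip()
```

Calling `find_todos(".")` from a project root and printing each tuple gives roughly what `grep -rn TODO` would, but in a form that is easy to send on to something else.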



It really depends on what you feel you're missing and what you're hoping to do (definitions of "daily work" vary widely). If you're looking to get up on theory by doing your own program of sorts, you could do worse than start with these (in roughly this order):

Structure and Interpretation of Computer Programs - Abelson, Sussman, and Sussman

Introduction to Algorithms - Cormen, Leiserson, Rivest, and Stein

The Art of Assembly Language - Hyde

a digital logic book (not sure which is most recommended), and an architecture book (see reply by tftfmacedo)

Modern Operating Systems - Tanenbaum

Introduction to the Theory of Computation - Sipser

Compilers: Principles, Techniques, and Tools - Aho, Lam, Sethi, and Ullman (a.k.a. "Dragon Book")

Programming Language Pragmatics - Scott

A database design book (one that covers Relational Algebra, not just a book on SQL), and maybe a book on Networks. Also, Roy Fielding's paper on REST is both academic and applicable (and more approachable than you'd expect of a Ph.D paper). If you want to go all the way, an undergraduate program usually also has Calculus, Discrete Math, Linear Algebra, and Statistics. Some schools would also require Physics and Differential Equations. I'm sure I'm missing some topics, too, particularly electives.

If you can get through those and the associated problem sets, you'll have a better foundation than most.


Also, a worthy introduction to the field would be The New Turing Omnibus by A.K.Dewdney.

In fact, I'd recommend this book first, for it gives self-taught programmers a taste of nearly everything in computer science (and thus equips them to know which branch they'd like to pursue next).


i'd probably put some kind of ordering on these.

start with algorithm analysis and basic data structures etc in CLRS. (ch 3,4,10,11,12) and then go back and fill in the more advanced concepts. dasgupta, papadimitriou, vazirani is good too http://www.cs.berkeley.edu/~vazirani/algorithms.html

for the compilers book, it's necessary to go through the assembly, and some of sipser (need DFA/NFA/context free grammars), as well as structure and interpretation of computer programs.

OS could come after the algorithms and preferably after assembly but before the compilers book.
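For anyone wondering what the DFA material from Sipser looks like in practice, here is a minimal sketch of a table-driven DFA, the kind of machine a compiler's lexer is built on. The state names and the choice of language (unsigned decimal literals like "42" or "3.14") are my own assumptions for illustration, not something taken from any of the books.

```python
DIGITS = "0123456789"


def char_class(ch):
    """Map a character to the input symbol the DFA understands."""
    if ch in DIGITS:
        return "digit"
    if ch == ".":
        return "dot"
    return "other"


# Transition table: (state, symbol) -> next state.
# Any pair not listed here is a dead end, so the input is rejected.
TRANSITIONS = {
    ("start", "digit"): "int",
    ("int", "digit"): "int",
    ("int", "dot"): "dot",
    ("dot", "digit"): "frac",
    ("frac", "digit"): "frac",
}
ACCEPTING = {"int", "frac"}


def accepts(text):
    """Return True if text is an unsigned decimal literal like 42 or 3.14."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING
```

The machine only ever remembers its current state, which is why it is enough for tokens but not for nested syntax. That gap is where context-free grammars come in.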


Good idea, I'll do that.


So far the comments in this thread cover Functional (SICP, Haskell books), Logic (Prolog Books), and Procedural/OO (Almost everything else) Languages, so I figure I should add a reference for Stack/Concatenative Languages:

Thinking Forth - Brodie


Although you might prefer to start off with "Starting Forth" first.


I'd recommend this one on architecture:

Computer Architecture: A Quantitative Approach - Hennessy and Patterson


For networks, I would recommend Interconnections - Perlman.


While not strictly a CS book, "Godel, Escher, Bach" by Douglas Hofstadter definitely has strong roots in the area. It's not a textbook, nor anything even close. However, there is a reasonable amount of mathematics and programming language design, which makes it educational as well as inspirational (particularly formal logic systems, around which the premise of the book is built).

The book is somewhat life changing, in the questions that it asks. You might find yourself thinking about things differently, such as what it is to be conscious, can we ever achieve artificial intelligence, is there such a thing as fate, how was J.S. Bach able to produce such stunning compositions, etc.

It's quite heavy going, but there's a slightly more succinct version which he wrote a few years ago, called "I Am a Strange Loop". This book takes the point he was trying to make in the first book and expands on it while adding clarification. It does lack a lot of the story that the original contained, so it's not a complete replacement.

While I think of it, there's also Operating System Concepts by Silberschatz, Galvin and Gagne - http://www.amazon.com/Operating-System-Concepts-Windows-Upda.... It's an extremely detailed look at how operating systems work, down to the lowest level, and it explains a large number of things that we interact with on a daily basis.


I think GEB is an excellent book because it basically demonstrates in a reasonably accessible way that CS can be about rather more than "how to write programs".


Along with the usual classics, I highly recommend Computer Systems: A Programmer's Perspective by Randal E. Bryant and David R. O'Hallaron of Carnegie Mellon. The authors wrote it after teaching a class on the subject. It's extremely readable and gives you an excellent introduction to machine-level code, processor architecture and memory, as well as a solid foundation in higher-level concepts including networking and concurrency. If you're considering programming as a career, I'd say this book (or something similar, probably spread across multiple books) is a must-read. It's used by CMU, Stanford, Caltech, UIUC, Harvard and dozens of other schools.

http://csapp.cs.cmu.edu/ http://www.amazon.com/Computer-Systems-Programmers-Perspecti...


Over the years I've found the following CS books helpful, though only a minority of them in my day-to-day work. Your mileage may vary, as would the utility of these to you.

Algorithms -> Algorithms + Data Structures = Programs by Wirth (worth its weight in gold if you can get past the Pascal syntax)

OS -> Operating System Concepts by Silberschatz et al (The dinosaur book)

CS Theory -> Introduction to Automata Theory, Languages, and Computation by Hopcroft, Ullman

Programming Languages Theory -> Programming languages: design and implementation by Pratt et al

Database Theory -> Database Design by Wiederhold

Architecture -> Structured Computer Organization by Andrew S Tanenbaum


My vote goes for The Little Schemer. It's short (but don't read it all in one sitting), entertaining, and will teach you some important concepts.

http://www.amazon.com/Little-Schemer-Daniel-P-Friedman/dp/02...


This book, like no other, just makes me smile. I read it right after I graduated and started my "real" learning in programming and math. The intermission page lists books on logic and set theory.

FYI: it is the first book in a "trilogy" -- but your next book could be either of the other two (their only prereq is The Little Schemer). Listed below:

Reasoned Schemer: http://www.amazon.com/Reasoned-Schemer-Daniel-P-Friedman/dp/...

Seasoned Schemer: http://www.amazon.com/Seasoned-Schemer-Daniel-P-Friedman/dp/...


Programming is more about thinking in a certain way than algorithms or data structures (those are the tools). You should check out the book Structure and Interpretation of Computer Programs -- I found it "enlightening"


SICP is also available as a series of videos of the lectures at http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussma... - very enlightening to this self-taught programmer


You can find some resources in a similar discussion here http://news.ycombinator.com/item?id=297289

You can find some resources, mostly books, in the following links

http://stackoverflow.com/questions/194812/list-of-freely-ava...

http://stackoverflow.com/questions/1711/what-is-the-single-m...


One book that I would suggest to anyone is Introduction to Automata Theory, Languages, and Computation - HMU. It is very approachable and presents some very interesting topics (so you won't write a regex for matching HTML and will learn what P vs NP means). On the more practical side, I think that a must-read for machine learning is Tom Mitchell - Machine Learning. Another book that, from what I've heard, is easier to digest is Data Mining: Practical Machine Learning Tools and Techniques.
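To make the regex-and-HTML remark concrete: regular expressions in the textbook sense describe regular languages, and those cannot keep track of arbitrarily deep nesting, which is exactly what HTML elements do. Below is a rough sketch of the alternative, where a regex is used only to pick out individual tags and a stack does the matching. It is deliberately simplified (no attributes, comments, or self-closing tags), and the names are mine, not from the book.

```python
import re

# Recognising a single tag is a regular task, so a regex is fine for this part.
# Assumption: tags look like <name> or </name>, with nothing else inside.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>")


def is_well_nested(markup):
    """Return True if every opening tag is closed in the right order."""
    stack = []
    for match in TAG.finditer(markup):
        closing, name = match.group(1), match.group(2).lower()
        if not closing:
            # Opening tag: remember it until its partner shows up.
            stack.append(name)
        elif not stack or stack.pop() != name:
            # Closing tag with no partner, or closing the wrong element.
            return False
    # Anything left on the stack was opened but never closed.
    return not stack
```

The stack is the piece a finite automaton lacks. Real HTML needs a proper parser, of course. This only shows why the nesting part is out of reach for a plain regex.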


The Mitchell book is definitely showing its age these days. It's not terrible, but anyone thinking of buying it should be aware that it is a broad but shallow tour of machine learning as it stood 15 years ago, and machine learning as a field has changed significantly since then. Most notably, there has been something of a Bayesian revolution that Mitchell basically ignores (understandable since the book predates it).

Don't take this as a recommendation, because I haven't read it, but Stephen Marsland's Machine Learning book appears from a glance at the table of contents to be a much more modern attempt to provide the same type of coverage as Mitchell. But again, I can't speak to its quality.

Chris Bishop's Pattern Recognition book is also very good, but it's not the same sort of book. Bishop is exhaustively deep on the narrower range of ML that he covers, but you won't get the same sort of coverage of the wider view of the field.


Also Bishop is much harder to read, so for a first introduction I think that Mitchell is good. There are some chapters on theoretical learning that you can skip, but I do think that it is good for a first overview of the field.

This is the first time I've heard about Marsland's book, so I can't comment on that.


This response is so late that I doubt it will even be read by the poster, but I will throw this out there anyway.

I was (am?) a self-taught programmer; I guess I am transitioning away from that label. I am a bit less than halfway through an MSCS program at the moment. I really cannot recommend it enough.

I think I was a pretty good software engineer prior to getting some formal education, but I cannot tell you how often in class the light from heaven just shines right down: "oh, so that's why x is y". If you enjoy the work, it's a real pleasure (albeit a painful amount of work at times).

So finally I get to the point. I can see that you have already received a lot of good recommendations. I think most of them are quite good. However, I have a couple of observations about specific books.

Intro to Algorithms - Cormen etc. If you feel you need a discrete math course, then this book is probably not a good place to start with algorithms. It is a rigorous treatment of the subject. However, if you lack mathematical sophistication, this book can be tough. I aced my discrete course prior to taking an algorithms course taught with this book, and I struggled mightily to get an A-. I found the proofs in the book difficult to understand on many occasions.

Modern Operating Systems - Tanenbaum. This book is very easy to understand and provided me with so many "A HA!" moments. A real pleasure. I am not sure what your current work is, but the only prerequisite for this book is a modest amount of C/C++ programming. I say this because I found that, with that background, this book allowed me to finally understand what is happening from compile time down to the CPU at runtime. A really rewarding journey.


What I often find to be the case is that a course in college only loosely follows the assigned book. Professors like to navigate through the subject material in a very personal way, which will often not be the way that it is covered in the book... if it is covered in the book at all! For this experience, I would suggest going through lecture notes and, when necessary, supplementing them with a book.

While books are certainly valuable in someone's education, I think we are forgetting about the projects. It is very instructive, not to mention very satisfying, to implement an operating system, a compiler, or a transport layer (that interoperates with real TCP!). More so than reading the books of a college course, I recommend doing its projects.

To get started, I recommend the Pintos operating system, designed for Stanford's operating systems course, CS 140, traditionally thought to be one of the more difficult programming courses in their undergraduate curriculum.

Some links.

http://www.scs.stanford.edu/11wi-cs140/ http://www.scs.stanford.edu/05au-cs240c/


Godel, Escher, Bach

SICP

Art of Computer Programming

C Programming Language

Introduction to Algorithms

Land of Lisp

-- Extracted from my wishlist - http://flipkart.com/wishlist/dhavaltrivedi


functional stuff is making a comeback. haskell can be daunting in its pure-ness sometimes, requiring monads for seemingly anything useful. but it's a wonderful language/lifestyle choice. this online book broaches the subject pretty well: http://learnyouahaskell.com/chapters


If you want to go down the Functional rabbit hole more, "Purely Functional Data Structures" by Chris Okasaki would be a good next step. There's a book and a paper - from what I hear, the paper is just Okasaki's original work, whereas the book is a more complete coverage of the field.


I really liked "The Implementation of Functional Programming Languages" by Simon Peyton Jones, which is available online:

http://research.microsoft.com/en-us/um/people/simonpj/papers...


To steal words from a comment earlier today, Haskell has a very steep but very rewarding learning curve.

It'll reinforce some concepts from math, and lead you toward writing code in a manner that's considered good practice in other languages - such that it's generally maintainable and testable. Similarly, if you're familiar with Design Patterns, you'll see that a lot of patterns are attempts to implement functional concepts in OO languages. That said, as a fan of Haskell I'm definitely biased, and a lot of that can be said for other functional languages as well.

"Real World Haskell" is also very good, but probably not the best place to start if you haven't written in a functional language before.


+1 on real world haskell, which is here: http://book.realworldhaskell.org/read/

definitely a lot less hand-holding than learnyouahaskell, but they complement each other pretty well.


Plus, learning a language from another paradigm, e.g. functional or logic programming, can also give you insight into your 'safe areas'[1]:

- After learning functional programming I realized how much of my programs do just mapping or folding over data. You start to spot the cases where this is not immediately obvious after some FP practice.

- After learning logic programming, I realized how much of my programs do a poor man's version of backtracking.

Maybe one of those languages is so much fun that you'll eventually change your 'lifestyle'. E.g. I use Haskell in many cases where I'd previously use Python or C.

[1] Of course, it doesn't always come naturally. The C++ STL does provide maps and folds, but since pre-C++0x doesn't have lambdas, using them requires more effort.
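As a small illustration of the "mapping or folding over data" point, here is the same computation written as an ordinary loop and as a map followed by a fold. It is in Python rather than Haskell, and the order data and function names are made up for the example.

```python
from functools import reduce

# Hypothetical data: (item name, quantity, unit price).
orders = [("widget", 2, 3.50), ("gadget", 1, 10.00), ("gizmo", 4, 1.25)]


def total_loop(orders):
    """The loop most of us write first."""
    total = 0.0
    for _name, quantity, price in orders:
        total += quantity * price
    return total


def total_fold(orders):
    """The same thing as a map (per-line totals) and a fold (sum them)."""
    line_totals = map(lambda order: order[1] * order[2], orders)
    return reduce(lambda acc, amount: acc + amount, line_totals, 0.0)
```

Both functions return the same total. Once the loop is seen as "transform each element, then combine", the pattern starts showing up in a lot of everyday code.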


Patterns of Enterprise Application Architecture by Martin Fowler helped me make the jump from small systems built for myself to large, sophisticated systems built for others.

The Mythical Man Month by Fred Brooks really helped me learn to think about large projects from a personnel and planning perspective. There are some ideas there that have become part of the CS canon; "no silver bullet" and the slightly sexist but accurate metaphor for throwing more people at an overdue project, "nine women can't make a baby in one month." The Mythical Man Month was written in 1975, but it holds up remarkably well.


Three books that helped me tie my daily work back to more abstract computer science concepts are:

- Ruby Best Practices

- JavaScript: The Good Parts

- Higher-Order Perl

I recommend picking these up after you've done work in the languages they're about. They assume that you're already comfortable with the language, then go back to show how that language uses CS concepts. They highlight functional programming and other classic introductory CS concepts, but stay practical. None of them are long reads, and there are clear takeaways that make you better at programming.


Two suggestions:

Elements of Programming (by Alexander Stepanov and Paul McJones) takes a mathematical approach to programming. Since its only prerequisite is a basic understanding of high school algebra, the book is very accessible and easy to follow.

Digital Design and Computer Architecture (by David Harris and Sarah Harris) is a great book on computer architecture that starts with digital logic design (i.e. gates and transistors) and ends with a subset of the MIPS instruction set. Though, it probably won't help you much in 'daily work'.


Updated list: (still unsorted)

http://sharetext.org/YVA


Can anybody recommend a good book or resource for learning discrete mathematics?


Discrete Mathematics and Its Applications by Rosen


The standard for computational grammars and parsers in natural language processing is:

Prolog and Natural-Language Analysis - Fernando C. N. Pereira and Stuart M. Shieber

The PDF is available from the publisher: http://www.mtome.com/Publications/PNLA/prolog-digital.pdf

It also serves as a great introduction to Prolog and logic programming.


I've heard "The Art of Prolog" is good, too... a quick glance at Amazon shows it to be ridiculously expensive, though.


Get the first edition for $3.82:

http://www.amazon.com/gp/offer-listing/0262192500/ref=tmm_hr...

...I got mine for like $2, and I've been very happy with it.


I haven't read "The Art of Prolog". But if you want to write correct and fast Prolog, reading "The Craft of Prolog" by Richard O'Keefe is a must.


It is quite good, and the hardcover version is beautiful as well -- good enough to be a coffee table book. But yeah, $250 is pretty steep.


You should be able to get it cheaper than that—I did. AbeBooks shows it available for $40 or so, plus p&p.


Hit up your local college library. Mine has a copy of it and it has been there every time I've gone by to check it out.


I can't remember who said it, but

"If you can write a compiler, you can write any program."

Hence, I'd get compiler books: Modern Compiler Implementation in ML, for one. SICP also has a couple of sections on compilation, and there's a computational theory book that I don't have on hand which would be useful to this end too.
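For a taste of what those books build up to, here is a toy sketch of the whole pipeline: a tokenizer, a recursive-descent parser that emits instructions for a stack machine, and a tiny interpreter for those instructions. It handles only integer arithmetic with parentheses, and the instruction names (PUSH, ADD and so on) are invented for this example rather than taken from any of the books.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")
OPS = {"ADD": operator.add, "SUB": operator.sub,
       "MUL": operator.mul, "DIV": operator.truediv}


def tokenize(source):
    """Split the source into number and operator tokens."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def compile_expr(source):
    """Compile an infix expression into a list of stack-machine instructions."""
    tokens, code, pos = tokenize(source), [], 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():  # expr := term (('+' | '-') term)*
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append(("ADD",) if op == "+" else ("SUB",))

    def term():  # term := factor (('*' | '/') factor)*
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            code.append(("MUL",) if op == "*" else ("DIV",))

    def factor():  # factor := NUMBER | '(' expr ')'
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            code.append(("PUSH", int(advance())))
        elif tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code


def run(code):
    """Execute the instructions on a stack and return the result."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[instr[0]](left, right))
    return stack.pop()
```

For example, `run(compile_expr("(1 + 2) * 3"))` evaluates the compiled instruction list on the stack machine. A real compiler adds variables, types, control flow and code generation for an actual machine, but the overall shape stays the same.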


I happened to be looking at Tony 'Quicksort' Hoare's page at Microsoft Research and I noticed this:

"His research goal was to understand why operating systems were so much more difficult than compilers"

http://research.microsoft.com/en-us/people/thoare/


How to Think Like a Computer Scientist - Python Version

http://greenteapress.com/thinkpython/thinkCSpy/html/


Programming Paradigms - Stanford course, available on iTunesU.


Eric Evans' "Domain-Driven Design: Tackling Complexity in the Heart of Software" blew my mind. Domain modeling gets to the heart of object-oriented programming. The book is a bit academic and long-winded, but very deep and complete. This is an immediate classic and required reading for any serious engineer.


Knuth's Art of Computer Programming

Aside from that, there are typically only 1-2 extremely well regarded books in any given area. If you're going to be doing something specific, grab the appropriate book.

eg:

Compilers - Dragon Book

AI - Russell/Norvig's Artificial Intelligence: A Modern Approach

Oh.. everyone needs a whiteboard, as well - they're quite useful


I'd like to hear HN's opinion on the quality of this list:

http://www.reddit.com/r/books/comments/ch0wt/a_reading_list_...


Peopleware - This isn't a programming book, but I feel that every professional programmer (and certainly every manager) should have read it. It is about the human aspect of programming. Programmers are people too!


Previous HN thread on a similar topic: http://news.ycombinator.org/item?id=2262527


The Annotated Turing by Charles Petzold


FYI, you have listed several books twice (SICP, Intro to Algorithms).




