Hacker News new | past | comments | ask | show | jobs | submit login
Introduction to Algorithms (2020) (ocw.mit.edu)
309 points by debanjan16 on Sept 17, 2022 | hide | past | favorite | 60 comments



I'm studying algorithms too right now.

As someone who sucks at maths and puzzle solving in general, Steve Skiena's book is proving to be quite approachable for me. Although I have to jump back and forth between the book and Khan Academy to look up some of the Maths.

He also has video lectures for the book: https://www3.cs.stonybrook.edu/~skiena/373/videos/

Another thing that helped me was brushing up on C programming skills. As a Python programmer, algorithms like hash maps don't make any sense because you just slap keys and values on a dictionary and call it a day but in C, you get to see how buckets and hash functions are used to build hash maps.


I love the idea of Skiena's book. I don't love his editor. The second edition has an error almost every other page. The third edition isn't much better.

2nd Ed. Errata: https://www3.cs.stonybrook.edu/~skiena/algorist/book/errata

3rd Ed. Errata: https://www3.cs.stonybrook.edu/~skiena/algorist/book/errata-...


Thanks for the heads up. I'll be sure to check the errata while going through the book.


I also want to recommend Steve Skiena's book. His math section in particular was my own 'ah hah!' moment where I connected math back to the code I wrote. It really unlocks the rest of the book.


Are you referring to section 2, Algorithm Analysis?


In the syllabus [1] they state that you need to be able to obtain above a grade C on "Problem Set 0" [2] before you take this course.

In my experience teaching 1st year undergrad algorithms, I'd be surprised if the average 1st year student that has just passed an "Introduction to algorithms" course would even be able to solve Problem Set 0.

However I don't think we should delay teaching algorithms until several years into the CS university curriculum. It's too important and central a topic to miss out on. We really need an intro algorithms curriculum that can teach both the basics (as in Problem Set 0) and the algorithms and data structures in the textbook.

[1] https://ocw.mit.edu/courses/6-006-introduction-to-algorithms...

[2] https://ocw.mit.edu/courses/6-006-introduction-to-algorithms...


MIT's discrete math course [1] is listed as a prerequisite for the algorithms course so Problem Set 0 doesn't seem unreasonable in that case.

Requiring an intro programming class and a discrete math class won't require you to delay teaching algorithms for "several years". Maybe just a semester or two.

Universities can also start teaching basic data structures and some algorithms in the programming classes. For example Stanford teaches some in CS106B Programming Abstractions [2] and then goes into more detail in CS161 Design and Analysis of Algorithms [3].

[1] https://ocw.mit.edu/courses/6-042j-mathematics-for-computer-...

[2] https://see.stanford.edu/course/cs106b

[3] https://stanford-cs161.github.io/winter2022/


Pretty common stuff that advanced kids learn in high school in the US[1]. I’d imagine MIT students can breeze through the questions in the Problem Set 0.

[1] Or so I saw in the bay area. Taking AoPS courses and even harder and broader content is pretty standard for kids in the bay area, for good or bad. It’s like there are two countries in the US. One has kids who take 10+ APs by grade 10. The other has schools that have median GPA 0.6. Sigh…


Breeze through is not true though. You can do a search on twitter and will see multiple 6.006 complaints.


The problem set covers the material of 6.0001 and 6.042. I guess you can take both in the first semester and do algorithms course in the second semester.


Here is my latest life hack that I have been using. Pick a major from MIT and see their degree program to form a basic knowledge graph of the major. Find them on https://ocw.mit.edu/ and study yourself. Usually, taking 1-2 classes gives you a great insight in to any topic so that you can at least collaborate better with the experts of those topics in a team environment.


I'm having the weirdest deja-vu right now.

Turns out I did see a thread almost like this, with a top comment almost like this a week ago: https://news.ycombinator.com/item?id=32793139

Felt like I was going insane.


Yup, decided to repost it, since someone replied with a link that even helped me further last time, so I was hoping to start another convo :) which seems like I have successfully achieved since this even sparked a longer thread this time, which I enjoyed reading.


Good memory. I wonder how often things like this happen. hmm


Yup, recognized it as soon as I saw it here because I distinctly remember thinking, when I read the first one, "hmm, I'm not sure I'd call this a 'hack'"


Nice job finding it! Bot or not?


same, thanks for digging up the link. yeah, seriously, verbatim same comment


> Pick a major from MIT and see their degree program to form a basic knowledge graph of the major

This was exactly how I taught myself, but I used CMU as my guide, lol. It is not hard to find good quality material, many universities have them open, but this curriculum is hard to 'graph' when you don't know where to start and what is an actual logical way to organize it.

So yeah, do this if you are a self-learner.


While good materials are easy to find, it'd be much easier if all assignments and solutions were available the way they are in this MIT course. I find a lot of value in verifying my solutions, or comparing to other valid solutions.

Cheating is a major problem, but I think the benefits (at least societally) would greatly outweigh the costs. I know sometimes courses don't change enough to make releasing solutions to assignments from past years a viable compromise.


I'd love to see a site that aggregates all of this content into an easily explorable format.


I made one once! That's a story for another time, but lessons learned:

- Elite schools like MIT are cesspools of crime and corruption, at least at the top. By "crime," I don't mean metaphorical "bad stuff" -- I mean actual, genuine, bona fide scary stuff like the movies. MIT's endowment is $20M per faculty member, and if you're in control of $20B in endowment with that much slush and that little oversight, it draws the wrong people.

- Don't accept money from elite schools. You're gonna get drawn into deeper shit than you want to know about.

- Less elite schools are more honest.

- If you do get in too deep, sign an NDA and non-disparage, and make sure you're connected to powerful friends. The calculus was: (1) if bad stuff happens to me, they hit front page of NY Times (powerful friends) (2) If we're both left in peace, I'm signalling I'm probably not going to expose them (NDA+non-disparage).

What I found is that there's much better content overall one tier down, although without MIT's PR budget. You gotta know where and how to find it. If all of that were aggregate, that'd be golden.


>What I found is that there's much better content overall one tier down

Which school's content are you talking about?


It's unfortunately distributed. ASU, WPI, Georgia Tech, etc. have impressive operations, but for the most part, you need to look a lot of places.

There's a ton of little resources like this: https://ximera.osu.edu/mooculus

Needs a ton of work on CSS and styling, but the content is great.

That's why we'd need an aggregator.


What courses did you use from CMU and which major?


I basically followed this: http://coursecatalog.web.cmu.edu/schools-colleges/schoolofco...

- but "took" more courses than necessary for my self-learning (I think I read through all of Logic / Language courses, insanely fascinating topic).

- I would not recommend a self-learner to follow it rigorously. Like if you are a working programmer, look for material that will help you at your work. Idea is to apply the knowledge.

- Some courses have assignments shared so that's great if you have the time check them out too. Definitely great for their database courses.


CMU courses doesn't have videos, atleast for most of the core courses. Was that a hurdle to your learning?


No, because I dislike videos as a source of information. They usually have absolute top tier lecture notes freely available. Like the Constructive Logic course.


Yes, even for their intro course 15-122 the course notes are excellent.


Here's a comment I left on a different thread a few months ago, which you may find useful if you're looking for CS contents:

---

There are a lot of book recommendations, but I would not focus on books if I wanted to get the equivalent of a solid CS education. Instead I would work through university lecture slides, assignments, and exams -- basically fast forward through a CS undergrad leveraging what you already know. I'm partial to CMU's CS syllabus for obvious reasons, but I find that it's also one of the most open and available resources on the web; i.e. not locked in on an intranet, etc.

With pre-existing background in software development the basic syllabus is more than doable in a 3-4 months. There are two sequences below: programming and theoretical fundamentals; you can do them in parallel.

Programming:

15-211: Introduction to Data Structures. Used to be in C++; now looks like it's in Java. https://www.cs.cmu.edu/~mjs/121/lectures.html

15-212: Principles of Programming. Still in ML. https://www.cs.cmu.edu/~me/212/schedule.html | https://www.cs.cmu.edu/~fp/courses/96-212/ (unlocked assignment pages)

15-213: Intro to Computer Systems. I hesitate to recommend this; 90% of the value in this course is in the labs and assignments, so it's difficult to do on one's own, but I would at least go through the slides and try to work through exams. https://www.cs.cmu.edu/~213/schedule.html

Theoretical:

15-251: Great Ideas in Theoretical Computer Science. Used to be Discrete Math with a heavy CS lean; it may have evolved. https://www.cs.cmu.edu/~15251/schedule.html

15-451: Algorithms. Here's a decade worth of lectures, exams and assignments - take your pick: https://www.cs.cmu.edu/~15451/ The 2013 course looks pretty complete: https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-f...

Next steps:

Since you are interested in VR, etc, you should probably look into the Computer Graphics courses. Note that the undergrad and grad courses are combined; the only difference is the expectations:

15-462: Computer Graphics http://15462.courses.cs.cmu.edu/fall2020/ Exam problems and solutions are gold: http://15462.courses.cs.cmu.edu/fall2020content/exams/finals...

15-463: Computational Photography http://graphics.cs.cmu.edu/courses/15-463/

15-464: Technical Animation http://graphics.cs.cmu.edu/nsp/course/15464-s21/www/syllabus... http://graphics.cs.cmu.edu/nsp/course/15464-s21/www/assignme...

Hope this helps. Good luck!


I did a couple of MIT projects in some of their graduate classes (OS, distributed systems) which was a lot of work. At least one week full time for each of them. And that's just the lab, I barely looked at the lectures (which were based on key research articles on the field).

But looking at the first classes of each topic to have a broad view seems like a good idea too.

I wish I could go back to school. CS is changing so fast, it's impossible to keep up to date within your free time.


> I wish I could go back to school. CS is changing so fast,

Is it, really? I got an undergrad degree 25 years ago, and looking at the requirements for the same at a few different universities, they appear largely the same, other than a few extra courses on ML. You still have pretty much the same math courses, theory courses, algorithm/OS/network/database courses, etc. The languages and tools have changed, but the fundamentals have not.


I got my degree 25 years ago too. At that time, a lot of things which are pervasive now barely existed. Virtual machines, clouds, smartphones, javascript, CI/CD... Fundamentals haven't changed but the tools/languages/systems we need to understand to do our jobs have changed a lot. The fundamentals are important but they are a fraction of the things we need to know.


I have also come to the same conclusion, and I believe it would be a very great idea to create a program that graph’s a topic’s “dependency courses”, perhaps with some must-read books for each subject.


It is not such a great idea, the dependency courses. Most prerequisites for a self-learner are kind of overblown IME. If you're applying the knowledge and / or you don't have time to go through them like a college student, your prerequisites have a different function for you (the least amount that makes you understand the main topic you want to understand in order to apply that knowledge, this might mean just literally skimming a paper or reading a subchapter of a chapter in some other book - you are not there for the prerequisites).

I talk about applying the knowledge a lot because it is a wonderful constraint, it makes it so that you don't accidentally read things cover to cover, but do like a depth first search into the subject instead.


The authors that can write a book such that you can understand it without previous knowledge are wonderful, but it is heavily dependent on the subject. It’s hard to talk about differential equations without first knowing calculus.


It is not possible and I didn't mean to imply that. There'll be endless prereqs. most of the time.

Just that if you need to understand differential equations, and you don't know calculus, the answer in this context isn't reading Spivak from start to finish (here I am assuming a working programmer / self-learner that wants to apply the knowledge - constrained by time and application much more heavily than a student of mathematics for example).


Could you please expand on what you mean by applying knowledge? But sure, I don’t believe reading a book from cover to cover is necessary in most cases, and I seldom do so with scientific books.


I firmly believe you need a software problem / idea to go with your learning. When you run into problems fixing / making what you want, search for what might help you.

Like most algorithm courses are fairly abstract form of programming (writing a line of code) where as a working programmer is a software engineer constrained by time and resources. This means that for example this algorithm course, it teaches you to generalize your solutions but that's not always realistic goal, or even desirable goal, in software engineering, and you might find out this if you implement one of the algorithms to do something for you in a small program.

Applying what you've learned in some software project of yours constraints you nicely such that you can't waste your time reading stuff from cover to cover.


I've spent a lot of time learning algorithms, and learning that certain ones exist, without learning their precise details (for future use). They have never got used. Maybe I used a bloom filter once.

It's had some value in knowing when a certain algorithm is inappropriate, but otherwise it's been a massive waste. Surely things can be better than this?

Jobs bleat for "creative, intelligent people with a knowledge of algos + data structures", so why won't you use this?


You might not work in context where you have to recognize these patterns out in the wild and apply the knowledge.

I have friends 2 decades into programming as a profession who've never implemented any of these.


That's exactly my point. And it is, I think, a massive waste of what I (and surely many others) can do.


You can do light study without having to do MIT-level exam questions (but you should still do some easy problems) and that can get you familiar enough to recognize the patterns and give you the vocabulary to google things

For people uninterested in studying CS itself, this has just become a job signal to work towards.


I don't study things for the love of it (well, I suppose I do) but to apply it, for what value is a skill if not applied?

I learnt these things because they should be useful. But employers would rather throw big data software stacks and a ton of expensive hardware at things rather than use brain.

CS and business programming should NOT be two disjoint subjects[1].

[1] IMO



Erik Demaine is a genius


It is a known fact. And a child prodigy as well. But kudos also to the rest of the instructors. They contribute nicely to the course.


I have always wondered if there would be value in a certification or course track in just Algorithms and Algorithm Design. It's such an intricate but very important field that it can have its own curriculum and be separated from CS. Well, not necessarily separated, but can be considered independently of other CS fields even though it is still part of a CS degree.

There are tons of applications of competency with Algorithms beyond CS. I hate that a person who has great competency with this kind of stuff has to do and be evaluated on the whole CS bundle as if that is only where it matters.


It needs a broader knowledge to be useful at depth. For example understanding the performance of memory isn't linear, memory coherency for multi processor systems, and in recent times distributed systems.


I think, by the way how you're describing it, there already is - it's called "Discrete Mathematics"


https://www.edx.org/micromasters/ucsandiegox-algorithms-and-...

> The University of California, San Diego MicroMasters® Program in Algorithms and Data Structures

> What you will learn

> Understand essential algorithmic techniques and apply them to solve algorithmic problems

> Implement programs that work in less than one second even on massive datasets

> Test and debug your code even without knowing the input on which it fails

> Formulate real life computational problems as rigorous algorithmic problems

> Prove correctness of an algorithm and analyze its running time


What does "proficiency in algorithms"? mean? that's general question


To me, it means understanding the basics of data structures and algorithms, and knowing when and how to apply them to solve problems.

A couple of examples:

Seeing a sorted list of items and knowing how to find an item in that list most efficiently (binary search) [0].

Looking at a map of cities and the roads connecting them (including distances between cities) and knowing that you can use a graph to represent the cities and use Dijkstra's algorithm [1] to find the shortest path between any two cities.

Essentially, being able to look at a problem and see how that problem can be mapped to (and solved by) a specific type of algorithm, then being able to implement that algorithm to solve the problem.

[0] https://en.wikipedia.org/wiki/Binary_search_algorithm

[1] https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm


Just looked at the "Course Description" of the "Syllabus" as at

https://ocw.mit.edu/courses/6-006-introduction-to-algorithms...

The course takes itself very seriously and seems to ask each student to devote a lot of time to the course.

My summary reaction is that it would be a shame to devote that much time to what is basically so little material.

Yes, the "Syllabus" mentions

Introduction to Algorithms, Cormen, Leiserson, Rivest, and Stein, CLRS.

Years ago I downloaded a PDF and didn't see much beyond what I'd gotten from Knuth, etc.

I can give a fast overview here:

There I see their lists of topics:

dynamic arrays, heaps, balanced binary search trees, hash tables

and

sorting, graph searching, dynamic programming

I'm a little surprised at how old these topics are: I first learned several of the topics almost entirely from

Donald E. Knuth, The Art of Computer Programming, Volume 3, Sorting and Searching, 1973.

(1) Dynamic Array

A glance at how it works explains why I never heard of it:

Can see

"Dynamic array"

at

https://en.wikipedia.org/wiki/Dynamic_array

So, if have an array A and need more space, then allocate a larger array B and copy over the contents of array A to array B and continue with array B.

A guess is that in nearly all cases a better solution would be a tree where the array subscripts are used as keys and the array elements, as leaves.

Maybe the main reason to include dynamic arrays is to do some applied math to analyze by how much bigger array B should be than array A.

(2) Heaps

I like heaps.

At one point in the software of my startup, I have to search through maybe 1 million numbers and end up with the, say, 20 largest. For that I programmed a heap, and it has worked out great.

There are versions of the heap algorithm that are better on locality of reference and when the heap is carried mostly on slow, secondary storage.

(3) Balanced Binary Search Trees

AVL (Adelson-Velskii, Landis) trees are in the Knuth reference, and they are terrific. An alternative is red-black trees. One of those two is likely the key to .NET collection classes, and my startup uses two instances for a simple, light, fast key-value store instead of Redis.

(4) Hash Tables

Those are also in Knuth. Hashing usually leaves me in doubt due to its various possible problems. But better still sometimes is perfect hashing as I recall also in Knuth.

For hashing in general, a good step forward is in

Ronald Fagin, Jurg Nievergelt, Nicholas Pippenger, H. Raymond Strong, Extendible hashing-a fast access method for dynamic files, "ACM Transactions on Database Systems", ISSN 0362-5915, Volume 4, Issue 3, September 1979, Pages: 315 - 344.

We used that in an AI (artificial intelligence) product we shipped.

(5) Sorting

Knuth covers heap sort and shows that it meets the Gleason bound for sorting by comparing pairs of keys.

(6) Graph Searching

Looking at their lecture notes, it appears that they mean versions of shortest paths on networks.

These are all fairly simple except for minimum cost single commodity network flows where each arc has a maximum flow and also a cost per unit of flow. The problem is linear programming, and the classic simplex algorithm applies and takes on an especially simple form -- a basic solution corresponds to a spanning tree of arcs.

Some good news is that if the arc capacities are all integers and if start the algorithm with an integer solution, then the simplex algorithm maintains an integer solution and will terminate with one.

It is tempting to see that "good news" as a case of progress in NP-complete integer linear programming.

For such content, I recommend

Mokhtar S. Bazaraa and John J. Jarvis, Linear Programming and Network Flows, ISBN 0-471-06015-1, John Wiley and Sons, New York, 1977.

(7) Dynamic Programming

That can be a big subject but does not have to be. I got a good introduction from an expert in about 90 seconds while my cab was waiting to take me to the airport. I ended up writing my Ph.D. dissertation in dynamic programming. For an easy introduction, can have fun, like eating from the appetizer plate at Thanksgiving, say, a half hour at a bite from

Stuart E. Dreyfus and Averill M. Law, The Art and Theory of Dynamic Programming, ISBN 0-12-221860-4, Academic Press, New York, 1977.

One of the amazing advantages of dynamic programming is how well it handles randomness -- then have stochastic optimal control, Markov decision theory, etc.

The flavor of dynamic programming in computer science can be a bit different, e.g., applied to search for some string A in some string B.

Maybe another purpose of the course is to get everyone all wound up and fired up about the question of

P versus NP

My recommendation is to try quite hard to ignore that question.


You should read about dynamic arrays more carefully. They have amortized O(1) insertion which is better than a tree, and the data is contiguous in memory which gives it better cache locality than a tree. They are one of the most popular data structures.

Parts of your post also seem to me to be quite boastful and low-value: (paraphrasing) "the course takes itself very seriously", "why spend so much time teaching so little material", "these topics are mostly old; just read Knuth", and "dynamic programming is easy; I learned it in 90 seconds and then did my PhD in it".


I had a revision of that post, but I had it ready only just after the end of the 2 hour window for revisions.

The course pressed hard on the students to devote, what was it, 4 hours of time a week in group sessions with more hours in independent study. That's asking a lot from the students.

In response I had a lesson and purpose in that post: (A) That collection of fundamental algorithms hasn't changed very fast, were much the same 40 years ago. (B) Nearly all the algorithms are quite simple and each one can be learned quickly, including the derivation of its big-O performance, and coded, running, and tested in 2 hours or so. (C) I mentioned Knuth v.3 as a reference: Tough to improve on that Knuth volume as a reference for such algorithms. (D) For hashing, network flows (graph search), and dynamic programming I gave really good references -- tough to compete with any of them. I used some of my experience to illustrate (A) -- (D).

That lesson should be some quite good news for any HN readers considering studying the algorithms.

> Parts of your post also seem to me to be quite boastful and low-value:

No, I just used some of my experience to give examples of my points.

> You should read about dynamic arrays more carefully. They have amortized O(1) insertion ....

I saw all that. Get the O(1) property only with some assumptions and some math derivations, and I mentioned the math.

Two obvious problems:

(1) It is trivial for any of us to think of applications where the new allocations and copying would be wildly wasteful.

(2) For the assumptions, we will rarely have enough information to have much confidence in our math derivation.

We also can think of

(3) The reallocations, when there are a lot of them, will create problems for the memory management, garbage collection.

Sure, any of us can think of niche situations where (a) we do a few reallocations and (b) then go for hours, ..., months with no more reallocations and with the advantages of arrays.

Dynamic arrays don't belong on a list of Best Algorithms.

Again, my guess is that the interest of the course in dynamic arrays is an opportunity to do the math derivations.

The people that MIT course is intended for are maybe good HN readers, so an issue for HN readers is, should they devote a lot of time to that course? We review movies, restaurants, etc., and we might also review courses.

Your attack on my review was mostly an attack on me: You resented my post because I mentioned some of my background and in response attacked me. Instead, make a good contribution of your own, maybe as a review of that MIT course.

I'll state the basic lesson again:

The algorithms in that course are nearly all quite good but old, with some really good old references, and can be learned quickly, say, the whole course in a few weekends.


Thanks for taking the time to read and respond. I admit the second paragraph of my post was a bit aggressive and I was on the fence about posting it. I don't have a problem with you sharing your background but the parts I mentioned previously came off in a certain way to me.

I found your initial argument on dynamic arrays dismissive because you admitted you had never heard of it, then implied that they don't make sense as if to justify why you had never heard of them. I find that intellectually dishonest and it really ticked me off; it's just confirming one's own bias. I still find your argument a bit dismissive although we can agree to disagree. It's not a case of worthiness to be on a list of best algorithms or of fancy math derivations. They are widely used in practice, are O(1) for many operations, work well with caches, and are worth studying for that reason.

As for making a "good contribution of my own" by reviewing the course, I don't feel the need. It's a standard undergrad algorithms course of the kind that most CS students would take. I don't think there's any value in reviewing the syllabus when they all tend to cover the same material.

I'm probably won't reply again so (sincerely) have a good day. I realize you feel attacked but if you're going to opine on something then other people might opine on your opinion. You don't hesitate in your writing style so I didn't either. I just apologize if I made it too personal. I read some things that I couldn't let slide.


Algorithms at that level are maths for computers. And mostly are for big data.

I think it is critically missing the word "maths" somewhere.


I'm not sure why you feel the word "algorithms" needs the word "math" in the same sentence. I feel math is implied in the word. Like, how we don't need to say "mathematical statistics" or "mathematical calculus".

Anyway, first line of the course description gives you what you are asking for:

> This course is an introduction to mathematical modeling of computational problems


I agree with your point, just wanted to mention that mathematical statistics is a thing. It's usually a name given to textbooks or courses where statistics is built up from foundations in probability theory and where theorems are stated and proven vs introductory statistics courses that approach things with a more decidedly applied orientation that usually also touch lightly on subjects like experimental design and data collection


Yeah, I meant in the hacker news.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: