As someone who sucks at maths and puzzle solving in general, Steve Skiena's book is proving to be quite approachable for me, although I have to jump back and forth between the book and Khan Academy to look up some of the maths.
Another thing that helped me was brushing up on C programming skills. As a Python programmer, I never had to think about how a hash map works: you just slap keys and values on a dictionary and call it a day. In C, you get to see how buckets and hash functions are used to build one.
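To make that concrete, here is a toy sketch (in Python, for brevity, though the point is what C makes you write by hand) of the buckets-and-chaining machinery a dict hides. It's illustrative only: no resizing, and collisions are handled by chaining within a bucket.

```python
class BucketHashMap:
    """Toy hash map: a fixed list of buckets, each a list of (key, value) pairs."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # The hash function decides which bucket a key lands in.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:           # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))  # collision or new key: chain onto the bucket

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

m = BucketHashMap()
m.put("a", 1)
m.put("a", 3)   # overwrites
m.put("b", 2)
```

With few buckets and many keys the chains grow long and lookups degrade toward O(n), which is exactly why real implementations resize.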
I love the idea of Skiena's book. I don't love his editor. The second edition has an error almost every other page. The third edition isn't much better.
I also want to recommend Steve Skiena's book. His math section in particular was my own 'ah hah!' moment where I connected math back to the code I wrote. It really unlocks the rest of the book.
In the syllabus [1] they state that you need to be able to obtain above a grade C on "Problem Set 0" [2] before you take this course.
In my experience teaching 1st year undergrad algorithms, I'd be surprised if the average 1st year student that has just passed an "Introduction to algorithms" course would even be able to solve Problem Set 0.
However I don't think we should delay teaching algorithms until several years into the CS university curriculum. It's too important and central a topic to miss out on. We really need an intro algorithms curriculum that can teach both the basics (as in Problem Set 0) and the algorithms and data structures in the textbook.
MIT's discrete math course [1] is listed as a prerequisite for the algorithms course so Problem Set 0 doesn't seem unreasonable in that case.
Requiring an intro programming class and a discrete math class won't require you to delay teaching algorithms for "several years". Maybe just a semester or two.
Universities can also start teaching basic data structures and some algorithms in the programming classes. For example Stanford teaches some in CS106B Programming Abstractions [2] and then goes into more detail in CS161 Design and Analysis of Algorithms [3].
Pretty common stuff that advanced kids learn in high school in the US[1]. I'd imagine MIT students can breeze through the questions in Problem Set 0.
[1] Or so I saw in the bay area. Taking AoPS courses and even harder and broader content is pretty standard for kids in the bay area, for good or bad. It’s like there are two countries in the US. One has kids who take 10+ APs by grade 10. The other has schools that have median GPA 0.6. Sigh…
The problem set covers the material of 6.0001 and 6.042. I guess you can take both in the first semester and do algorithms course in the second semester.
Here is my latest life hack that I have been using. Pick a major from MIT and look at their degree program to form a basic knowledge graph of the major. Find the courses on https://ocw.mit.edu/ and study on your own. Usually, taking 1-2 classes gives you great insight into any topic, so that you can at least collaborate better with the experts of those topics in a team environment.
Yup, decided to repost it, since someone replied with a link that even helped me further last time, so I was hoping to start another convo :) which seems like I have successfully achieved since this even sparked a longer thread this time, which I enjoyed reading.
Yup, recognized it as soon as I saw it here because I distinctly remember thinking, when I read the first one, "hmm, I'm not sure I'd call this a 'hack'"
> Pick a major from MIT and see their degree program to form a basic knowledge graph of the major
This was exactly how I taught myself, but I used CMU as my guide, lol. It is not hard to find good quality material, many universities have them open, but this curriculum is hard to 'graph' when you don't know where to start and what is an actual logical way to organize it.
While good materials are easy to find, it'd be much easier if all assignments and solutions were available the way they are in this MIT course. I find a lot of value in verifying my solutions, or comparing to other valid solutions.
Cheating is a major problem, but I think the benefits (at least societally) would greatly outweigh the costs. I know sometimes courses don't change enough to make releasing solutions to assignments from past years a viable compromise.
I made one once! That's a story for another time, but lessons learned:
- Elite schools like MIT are cesspools of crime and corruption, at least at the top. By "crime," I don't mean metaphorical "bad stuff" -- I mean actual, genuine, bona fide scary stuff like the movies. MIT's endowment is $20M per faculty member, and if you're in control of $20B in endowment with that much slush and that little oversight, it draws the wrong people.
- Don't accept money from elite schools. You're gonna get drawn into deeper shit than you want to know about.
- Less elite schools are more honest.
- If you do get in too deep, sign an NDA and non-disparage, and make sure you're connected to powerful friends. The calculus was: (1) if bad stuff happens to me, they hit front page of NY Times (powerful friends) (2) If we're both left in peace, I'm signalling I'm probably not going to expose them (NDA+non-disparage).
What I found is that there's much better content overall one tier down, although without MIT's PR budget. You gotta know where and how to find it. If all of that were aggregated, that'd be golden.
- but "took" more courses than necessary for my self-learning (I think I read through all of Logic / Language courses, insanely fascinating topic).
- I would not recommend a self-learner to follow it rigorously. Like if you are a working programmer, look for material that will help you at your work. Idea is to apply the knowledge.
- Some courses have assignments shared, so that's great if you have the time to check them out too. Definitely great for their database courses.
No, because I dislike videos as a source of information. They usually have absolute top tier lecture notes freely available. Like the Constructive Logic course.
Here's a comment I left on a different thread a few months ago, which you may find useful if you're looking for CS contents:
---
There are a lot of book recommendations, but I would not focus on books if I wanted to get the equivalent of a solid CS education. Instead I would work through university lecture slides, assignments, and exams -- basically fast forward through a CS undergrad leveraging what you already know.
I'm partial to CMU's CS syllabus for obvious reasons, but I find that it's also one of the most open and available resources on the web; i.e. not locked in on an intranet, etc.
With pre-existing background in software development the basic syllabus is more than doable in 3-4 months. There are two sequences below: programming and theoretical fundamentals; you can do them in parallel.
15-213: Intro to Computer Systems. I hesitate to recommend this; 90% of the value in this course is in the labs and assignments, so it's difficult to do on one's own, but I would at least go through the slides and try to work through exams. https://www.cs.cmu.edu/~213/schedule.html
Since you are interested in VR, etc, you should probably look into the Computer Graphics courses. Note that the undergrad and grad courses are combined; the only difference is the expectations:
I did a couple of MIT projects in some of their graduate classes (OS, distributed systems) which was a lot of work. At least one week full time for each of them. And that's just the lab, I barely looked at the lectures (which were based on key research articles on the field).
But looking at the first classes of each topic to have a broad view seems like a good idea too.
I wish I could go back to school. CS is changing so fast, it's impossible to keep up to date within your free time.
> I wish I could go back to school. CS is changing so fast,
Is it, really? I got an undergrad degree 25 years ago, and looking at the requirements for the same at a few different universities, they appear largely the same, other than a few extra courses on ML. You still have pretty much the same math courses, theory courses, algorithm/OS/network/database courses, etc. The languages and tools have changed, but the fundamentals have not.
I got my degree 25 years ago too. At that time, a lot of things which are pervasive now barely existed. Virtual machines, clouds, smartphones, javascript, CI/CD... Fundamentals haven't changed but the tools/languages/systems we need to understand to do our jobs have changed a lot. The fundamentals are important but they are a fraction of the things we need to know.
I have also come to the same conclusion, and I believe it would be a very good idea to create a program that graphs a topic's "dependency courses", perhaps with some must-read books for each subject.
It is not such a great idea, the dependency courses. Most prerequisites are kind of overblown for a self-learner, IME. If you're applying the knowledge, and/or you don't have time to go through them like a college student, prerequisites serve a different function for you: the least amount that lets you understand the main topic you want to apply. That might mean literally skimming a paper or reading a subchapter of a chapter in some other book; you are not there for the prerequisites.
I talk about applying the knowledge a lot because it is a wonderful constraint, it makes it so that you don't accidentally read things cover to cover, but do like a depth first search into the subject instead.
The authors that can write a book such that you can understand it without previous knowledge are wonderful, but it is heavily dependent on the subject. It’s hard to talk about differential equations without first knowing calculus.
It is not possible, and I didn't mean to imply that. There'll be endless prereqs most of the time.
Just that if you need to understand differential equations, and you don't know calculus, the answer in this context isn't reading Spivak from start to finish (here I am assuming a working programmer / self-learner that wants to apply the knowledge - constrained by time and application much more heavily than a student of mathematics for example).
Could you please expand on what you mean by applying knowledge? But sure, I don’t believe reading a book from cover to cover is necessary in most cases, and I seldom do so with scientific books.
I firmly believe you need a software problem / idea to go with your learning. When you run into problems fixing / making what you want, search for what might help you.
Like, most algorithm courses teach a fairly abstract form of programming (writing lines of code), whereas a working programmer is a software engineer constrained by time and resources. For example, this algorithms course teaches you to generalize your solutions, but that's not always a realistic, or even desirable, goal in software engineering; you might find that out if you implement one of the algorithms to do something for you in a small program.
Applying what you've learned in some software project of yours constrains you nicely such that you can't waste your time reading stuff from cover to cover.
I've spent a lot of time learning algorithms, and learning that certain ones exist without learning their precise details (for future use). They have never gotten used; maybe I used a Bloom filter once.
It's had some value in knowing when a certain algorithm is inappropriate, but otherwise it's been a massive waste. Surely things can be better than this?
Job ads bleat for "creative, intelligent people with a knowledge of algos + data structures", so why doesn't that knowledge get used?
You can do light study without having to do MIT-level exam questions (but you should still do some easy problems), and that can get you familiar enough to recognize the patterns and give you the vocabulary to google things.
For people uninterested in studying CS itself, this has just become a job signal to work towards.
I don't study things for the love of it (well, I suppose I do) but to apply it, for what value is a skill if not applied?
I learnt these things because they should be useful. But employers would rather throw big data software stacks and a ton of expensive hardware at things than use their brains.
CS and business programming should NOT be two disjoint subjects[1].
I have always wondered if there would be value in a certification or course track in just Algorithms and Algorithm Design. It's such an intricate but very important field that it can have its own curriculum and be separated from CS. Well, not necessarily separated, but can be considered independently of other CS fields even though it is still part of a CS degree.
There are tons of applications of competency with Algorithms beyond CS. I hate that a person who has great competency with this kind of stuff has to do and be evaluated on the whole CS bundle as if that is only where it matters.
It takes broader knowledge to be useful at depth: for example, understanding that the performance of memory isn't linear, memory coherency for multiprocessor systems, and, in recent times, distributed systems.
To me, it means understanding the basics of data structures and algorithms, and knowing when and how to apply them to solve problems.
A couple of examples:
Seeing a sorted list of items and knowing how to find an item in that list most efficiently (binary search) [0].
Looking at a map of cities and the roads connecting them (including distances between cities) and knowing that you can use a graph to represent the cities and use Dijkstra's algorithm [1] to find the shortest path between any two cities.
Essentially, being able to look at a problem and see how that problem can be mapped to (and solved by) a specific type of algorithm, then being able to implement that algorithm to solve the problem.
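The first example above maps to only a few lines of code. A minimal Python sketch of binary search over a sorted list, halving the search range each step for O(log n) lookups:

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent.

    Assumes sorted_items is sorted in ascending order.
    """
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1

binary_search([1, 3, 5, 7, 9], 7)   # finds index 3
```

(Python's stdlib `bisect` module does the same search; writing it out once is what cements the pattern.)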
So, if you have an array A and need more space, then allocate a larger array B, copy over the contents of array A to array B, and continue with array B.

A guess is that in nearly all cases a better solution would be a tree where the array subscripts are used as keys and the array elements as leaves.

Maybe the main reason to include dynamic arrays is to do some applied math to analyze by how much bigger array B should be than array A.
(2) Heaps

I like heaps. At one point in the software of my startup, I had to search through maybe 1 million numbers and end up with the, say, 20 largest. For that I programmed a heap, and it has worked out great.

There are versions of the heap algorithm that are better on locality of reference and when the heap is carried mostly on slow, secondary storage.
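That top-20-of-a-million search is a classic bounded min-heap pattern. The original code isn't shown, so this is only a sketch of the general technique, using Python's heapq: keep a min-heap of the k largest seen so far and evict its smallest element whenever a bigger value comes along, for O(n log k) overall.

```python
import heapq

def k_largest(numbers, k):
    """Return the k largest values in numbers, in descending order."""
    heap = []  # min-heap holding the k largest values seen so far
    for x in numbers:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of our current top-k: swap it in.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)

k_largest([5, 1, 9, 3, 7, 8], 3)   # the three largest: 9, 8, 7
```

(heapq.nlargest does the same job in one call; the explicit loop shows why the heap only ever holds k items.)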
(3) Balanced Binary Search Trees

AVL (Adelson-Velskii, Landis) trees are in the Knuth reference, and they are terrific. An alternative is red-black trees. One of those two is likely the key to the .NET collection classes, and my startup uses two instances for a simple, light, fast key-value store instead of Redis.
(4) Hash Tables

Those are also in Knuth. Hashing usually leaves me in doubt due to its various possible problems. But better still sometimes is perfect hashing, as I recall also in Knuth.

For hashing in general, a good step forward is in

Ronald Fagin, Jurg Nievergelt, Nicholas Pippenger, H. Raymond Strong, "Extendible hashing: a fast access method for dynamic files", ACM Transactions on Database Systems, ISSN 0362-5915, Volume 4, Issue 3, September 1979, pages 315-344.

We used that in an AI (artificial intelligence) product we shipped.
(5) Sorting

Knuth covers heap sort and shows that it meets the Gleason bound for sorting by comparing pairs of keys.
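As a quick illustration of heap sort's mechanics (not Knuth's in-place array formulation, just the idea), a sketch using Python's heapq: heapify in O(n), then pop the minimum n times at O(log n) each, for O(n log n) total, matching that comparison-sort lower bound.

```python
import heapq

def heap_sort(items):
    """Sort via a binary heap: O(n) heapify, then n pops of O(log n) each."""
    heap = list(items)
    heapq.heapify(heap)                  # rearrange into heap order, O(n)
    return [heapq.heappop(heap) for _ in range(len(heap))]

heap_sort([3, 1, 4, 1, 5])   # ascending order
```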
(6) Graph Searching

Looking at their lecture notes, it appears that they mean versions of shortest paths on networks.

These are all fairly simple except for minimum-cost single-commodity network flows, where each arc has a maximum flow and also a cost per unit of flow. The problem is linear programming, and the classic simplex algorithm applies and takes on an especially simple form: a basic solution corresponds to a spanning tree of arcs.

Some good news is that if the arc capacities are all integers and if you start the algorithm with an integer solution, then the simplex algorithm maintains an integer solution and will terminate with one.

It is tempting to see that "good news" as a case of progress in NP-complete integer linear programming.
For such content, I recommend

Mokhtar S. Bazaraa and John J. Jarvis, Linear Programming and Network Flows, ISBN 0-471-06015-1, John Wiley and Sons, New York, 1977.
(7) Dynamic Programming

That can be a big subject but does not have to be. I got a good introduction from an expert in about 90 seconds while my cab was waiting to take me to the airport. I ended up writing my Ph.D. dissertation in dynamic programming. For an easy introduction, you can have fun, like eating from the appetizer plate at Thanksgiving, say, a half hour at a bite, from

Stuart E. Dreyfus and Averill M. Law, The Art and Theory of Dynamic Programming, ISBN 0-12-221860-4, Academic Press, New York, 1977.
One of the amazing advantages of dynamic programming is how well it handles randomness: then you have stochastic optimal control, Markov decision theory, etc.

The flavor of dynamic programming in computer science can be a bit different, e.g., applied to searching for some string A in some string B.
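A standard instance of that string-comparison flavor is edit distance, sketched minimally here: fill a table where each entry depends only on three earlier entries, the hallmark of the CS style of DP.

```python
def edit_distance(a, b):
    """Minimum insertions, deletions, and substitutions to turn a into b."""
    m, n = len(a), len(b)
    # dist[i][j] = edit distance between a[:i] and b[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i               # delete all of a[:i]
    for j in range(n + 1):
        dist[0][j] = j               # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # delete from a
                             dist[i][j - 1] + 1,        # insert into a
                             dist[i - 1][j - 1] + cost) # match / substitute
    return dist[m][n]

edit_distance("kitten", "sitting")   # the textbook answer is 3
```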
Maybe another purpose of the course is to get everyone all wound up and fired up about the question of P versus NP. My recommendation is to try quite hard to ignore that question.
You should read about dynamic arrays more carefully. They have amortized O(1) insertion which is better than a tree, and the data is contiguous in memory which gives it better cache locality than a tree. They are one of the most popular data structures.
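The doubling scheme behind that amortized O(1) bound can be sketched in a few lines (illustrative only; real implementations also over-allocate less aggressively and handle shrinking):

```python
class DynamicArray:
    """Sketch of a doubling dynamic array built on a fixed-size backing list."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity

    def append(self, item):
        if self.size == self.capacity:
            # Doubling means the total copying work over n appends is
            # n/2 + n/4 + ... < n, so each append is amortized O(1).
            self.capacity *= 2
            new_data = [None] * self.capacity
            for i in range(self.size):
                new_data[i] = self.data[i]
            self.data = new_data
        self.data[self.size] = item
        self.size += 1
```

Ten appends starting from capacity 1 trigger copies at sizes 1, 2, 4, and 8, leaving capacity 16: the copies get rarer exactly as they get more expensive.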
Parts of your post also seem to me to be quite boastful and low-value: (paraphrasing) "the course takes itself very seriously", "why spend so much time teaching so little material", "these topics are mostly old; just read Knuth", and "dynamic programming is easy; I learned it in 90 seconds and then did my PhD in it".
I had a revision of that post, but I had it ready only just after the end of the 2-hour window for revisions.

The course pressed hard on the students to devote, what was it, 4 hours of time a week in group sessions, with more hours in independent study. That's asking a lot from the students.
In response I had a lesson and purpose in that post: (A) That collection of fundamental algorithms hasn't changed very fast; it was much the same 40 years ago. (B) Nearly all the algorithms are quite simple, and each one can be learned quickly, including the derivation of its big-O performance, and coded, running, and tested in 2 hours or so. (C) I mentioned Knuth v.3 as a reference: tough to improve on that Knuth volume as a reference for such algorithms. (D) For hashing, network flows (graph search), and dynamic programming I gave really good references, tough to compete with any of them. I used some of my experience to illustrate (A)-(D).

That lesson should be some quite good news for any HN readers considering studying the algorithms.
> Parts of your post also seem to me to be quite boastful and low-value:

No, I just used some of my experience to give examples of my points.

> You should read about dynamic arrays more carefully. They have amortized O(1) insertion ....

I saw all that. You get the O(1) property only with some assumptions and some math derivations, and I mentioned the math.
Two obvious problems: (1) It is trivial for any of us to think of applications where the new allocations and copying would be wildly wasteful. (2) For the assumptions, we will rarely have enough information to have much confidence in our math derivation.

We can also think of (3): the reallocations, when there are a lot of them, will create problems for the memory management and garbage collection.

Sure, any of us can think of niche situations where (a) we do a few reallocations and (b) then go for hours, ..., months with no more reallocations and with the advantages of arrays.

Dynamic arrays don't belong on a list of Best Algorithms. Again, my guess is that the interest of the course in dynamic arrays is an opportunity to do the math derivations.
The people that MIT course is intended for are maybe good HN readers, so an issue for HN readers is: should they devote a lot of time to that course? We review movies, restaurants, etc., and we might also review courses.

Your attack on my review was mostly an attack on me: you resented my post because I mentioned some of my background, and in response you attacked me. Instead, make a good contribution of your own, maybe as a review of that MIT course.

I'll state the basic lesson again: the algorithms in that course are nearly all quite good but old, with some really good old references, and can be learned quickly, say, the whole course in a few weekends.
Thanks for taking the time to read and respond. I admit the second paragraph of my post was a bit aggressive and I was on the fence about posting it. I don't have a problem with you sharing your background but the parts I mentioned previously came off in a certain way to me.
I found your initial argument on dynamic arrays dismissive because you admitted you had never heard of it, then implied that they don't make sense as if to justify why you had never heard of them. I find that intellectually dishonest and it really ticked me off; it's just confirming one's own bias. I still find your argument a bit dismissive although we can agree to disagree. It's not a case of worthiness to be on a list of best algorithms or of fancy math derivations. They are widely used in practice, are O(1) for many operations, work well with caches, and are worth studying for that reason.
As for making a "good contribution of my own" by reviewing the course, I don't feel the need. It's a standard undergrad algorithms course of the kind that most CS students would take. I don't think there's any value in reviewing the syllabus when they all tend to cover the same material.
I probably won't reply again, so (sincerely) have a good day. I realize you feel attacked, but if you're going to opine on something then other people might opine on your opinion. You don't hesitate in your writing style, so I didn't either. I do apologize if I made it too personal. I read some things that I couldn't let slide.
I'm not sure why you feel the word "algorithms" needs the word "math" in the same sentence. I feel math is implied in the word. Like, how we don't need to say "mathematical statistics" or "mathematical calculus".
Anyway, first line of the course description gives you what you are asking for:
> This course is an introduction to mathematical modeling of computational problems
I agree with your point; just wanted to mention that mathematical statistics is a thing. It's usually a name given to textbooks or courses where statistics is built up from foundations in probability theory and theorems are stated and proven, vs. introductory statistics courses that take a more decidedly applied orientation and usually also touch lightly on subjects like experimental design and data collection.
Skiena also has video lectures for the book: https://www3.cs.stonybrook.edu/~skiena/373/videos/