Can anyone recommend a good intro to machine learning that teaches the building blocks of the math side of things? It's fascinating as a topic, but there is such a large prerequisite learning curve that it seems out of reach for those not as strong in math.
That might just be the reality of it, but I'm hoping there might be a better introduction (even something super simple, like a Codecademy equivalent).
I was in a similar situation a year ago. Two things fixed my problems:
I took a course on Linear Algebra (Bretscher's book up to chapter 9) and a Probability course (Ross' book up to chapter 6) and did a great many problems by hand on paper. I just finished an ML course (Bishop's book and Jordan's book), mixed grad+undergrad, which was 80% problems with pen and paper and 20% coding up something algorithmically trivial but mathematically challenging, and I don't think I would've been able to pick up the additional math along the way without these two great books and their many exercise problems behind me.
I read layman's explanations of ML concepts a year ago and got nowhere in terms of implementing, debugging, or improving upon established techniques myself. Now I can solve problems I saw a year ago and thought "no one can do this."
My advice is to take the gateway drugs first: Probability (Ross <- I love this book!) and Lin Alg (I like Bretscher much better than Strang, but not everyone agrees with me :). Take a course in real life (for a grade and a transcript) at a competitive university if possible; nothing makes you study thoroughly like a gun to your head.
Sometimes the paper can be thinner. Sometimes it's in black and white and the US edition is in color. The international edition is almost always softcover and the cover may be in Chinese (for instance). The problem sets may be in differing order.
Many of these things are described in the comments. I almost exclusively buy international textbooks for home reference if available. The price difference and the relatively small quality difference make it a no-brainer. If you are doing it for a class, though, find a friend with the overpriced version for homework.
My guess is that you won't find any course that explains all the prerequisite math. It's probably more useful to build a solid foundation in probability theory (and therefore calculus) before going on.
For machine learning, a good place to start is Andrew Ng's course on Coursera:
It is a bit of a jump, but it is a great course that presents the field of machine learning and explains the mathematical and statistical underpinnings in a systematic way.
I just finished the Coursera course by Andrew Ng. It was great. The only hand-waving with the math was where calculus was necessary. You can take some extra time to do that work yourself if you like, but you won't be missing the underpinnings of why things work statistically. The introduction to neural networks was what finally gave me that aha moment.
It is a very self-contained course that is quite easy to follow. You can skip the programming exercises if you don't have the time.
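To give a taste of the kind of math the course makes concrete: its first algorithm is gradient descent for linear regression, which fits in a few lines of plain Python. This sketch is my own, not taken from the course materials:

```python
# Minimal batch gradient descent for one-variable linear regression,
# minimizing mean squared error. No libraries needed.

def gradient_descent(xs, ys, lr=0.01, steps=5000):
    """Fit y ~ w*x + b by gradient descent on the squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/2n) * sum((w*x + b - y)^2) w.r.t. w and b
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noise-free points on the line y = 2x + 1; the fit should recover it.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(round(w, 2), round(b, 2))
```

The course derives exactly those gradient formulas by hand before you ever touch code, which is where the "underpinnings" come from.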
For anyone interested in more about the specific math of neural networks, http://www.iro.umontreal.ca/~bengioy/dlbook has a couple good introductory chapters that give overviews of most of the necessary topics for NNs, but also provides additional resource suggestions if you need more in-depth info on a certain subject.
There is a large mathematical foundation behind machine learning that isn't taught to most computer science students. The concepts that built machine learning are more often found in engineering and mathematics.
It's not easy to learn, especially if you are not strong in math, but if you want an intuitive understanding of how machine learning works, I would recommend learning a combination of probability, linear algebra, and formal theories of computation (abstract machines).
This book is really what opened the gate for me by tying the content together:
It doesn't cover machine learning in a lot of detail (there is one dogs-and-cats example), but J. Nathan Kutz's book is an excellent introduction to the math behind this and other data modeling techniques, with lots of hands-on examples. The best thing about this book is that it is a broad survey of the maths you need in Data Science: you get a feel for how various subjects fit together at a high level with practical examples, then you can branch out to other sources to learn more about specific methods.
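For a concrete example of how directly linear algebra shows up in this kind of data modeling (my own illustration, not from the book): fitting a least-squares line is just solving the normal equations, one NumPy call away.

```python
# Least-squares line fit via the normal equations (A^T A) w = A^T y.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])  # roughly y = 2x + 1 with noise

A = np.column_stack([x, np.ones_like(x)])  # design matrix [x, 1]
w, b = np.linalg.solve(A.T @ A, A.T @ y)   # solve the normal equations
print(f"slope={w:.2f}, intercept={b:.2f}")
```

Once you see regression as a linear algebra problem, a lot of the other methods in the survey (PCA, SVD-based techniques) follow the same pattern.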
I have found Andrew Ng's math handouts for CS229 at Stanford (not Coursera) the best. They give the necessary introduction while abstracting away details irrelevant to the topic at hand.
This is a book from MIT Press, not quite finished yet but getting there, and it might be a good source of material for the linear algebra and probability pieces! The core material is Deep Learning:
http://www.iro.umontreal.ca/~bengioy/dlbook/
I think this has to do with learning styles, but I've found that working on real problems (like those on Kaggle) is a better way to learn machine learning than reading text books. When I'm working on problems, it becomes evident what I don't know. Then I'm able to intelligently go through the books and learn the relevant bits. When I start with the math, I tend not to remember anything because I have no foundation to attach the math knowledge to.
A very interesting requirement. Thanks for noting it.
I'm setting up a blog on Artificial Intelligence (which inherently includes Machine Learning) focused on the contents of the AIMA book (by Stuart Russell and Peter Norvig):
http://www.metacademy.org/ is a great source. It tells you what all the prerequisites are for everything as well as where you can learn them and what their prerequisites are and so on.
Does anyone have the book? Having looked through the ToC on Amazon, there are a few topics that interest me and it seems to be more in-depth than these lectures. But as it is from 1997 (and doesn't appear to have been updated), I'm concerned it will be a bit out of date.
I read this cover to cover for my ML course at Imperial College London in the UK. While not an easy read, reviewing the same topics a few times did make you understand the fundamentals better. AbeBooks sometimes has it going for £20 (~$30). The exercises were a bit tricky, as often the answers weren't attainable by simply following the book, and you ended up needing to consult other material.
I used this book in my Machine Learning course last spring at Georgia Tech. I wouldn't consider it out of date. It is missing a few topics like SVMs that we covered, but otherwise it's a good introduction.
It's still a good introduction to the principles: how problems like regression, classification, and reinforcement learning are defined; concepts like overfitting, bias-variance tradeoff, etc.; some general classes of algorithms and how to analyze them.
The age mainly affects its usefulness as an off-the-shelf guide to applied ML, because some of the currently best performing general-purpose algorithms aren't mentioned [1]. It also spends quite a bit of time on algorithms now considered mainly of historical interest, like version spaces.
So imo its main current usefulness is as a foundational text, which it's quite good for. It helps that it's also well written and understandable.
[1] A recent empirical analysis found that random forests and support vector machines seem to perform most consistently well at classification tasks, neither of which is in this book. http://jmlr.org/papers/v15/delgado14a.html
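For the curious, here's a rough sketch of how you'd compare those two families yourself, assuming scikit-learn is installed (the dataset and parameters are my own arbitrary choices, not from the paper, which benchmarks 179 classifiers on 121 real datasets):

```python
# Compare a random forest and an RBF-kernel SVM by cross-validated
# accuracy on a synthetic nonlinear two-class problem.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.25, random_state=0)

results = {}
for name, clf in [
    ("random forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("SVM (RBF kernel)", SVC(kernel="rbf")),
]:
    results[name] = cross_val_score(clf, X, y, cv=5).mean()  # 5-fold accuracy
    print(f"{name}: {results[name]:.2f}")
```

On a toy problem like this both typically score well; the paper's point is that they stay near the top consistently across many real datasets.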
As far as I can tell there are lots of tiny fragments each about 1-2 seconds long rather than one video file.
It's actually not just a video, but a specially designed player that shows the lecture video and the slides in sync, along with a bunch of bookmarks so you can jump to specific slides.
Try it in Chrome. At least on the Mac, there's a dedicated but cut-down version, which does not require SilverLight. It's based on FlowPlayer and it also seems to work with Flash disabled.
I am checking on my iPad. I don't get a prompt to install Silverlight; instead I get redirected to some page with a session error telling me to contact support :-/
However, with the exception of one or two of those courses, these are nothing but introductory courses. And while this is great material to have access to, it grants almost no access to the wealth of knowledge at CMU.
I have a "pre-existing working knowledge of probability, linear algebra, statistics and algorithms", but can't he be more precise? I mean, at what level of difficulty is that math used?
This is a PhD-level course at CMU, pretty heavy on math. You can just take a look at the material to gauge the difficulty. If you are not sure, you should probably try easier ones, such as Andrew Ng's on Coursera.
I am assuming he means at an undergraduate level. If you need a probability and statistics book for a refresher, I can recommend "All of Statistics" by Wasserman.
Presumably this means at the level of an introductory undergraduate course in each of those subjects (the kind of courses targeted at first- or second-year science/engineering undergraduates). You could try working through the material, and if anything is beyond you, it shouldn't be too hard to find the appropriate textbook and catch up.