A lot of easy to digest content in this! Always great to see quality free material to help more people pick up machine learning and get involved solving problems using data science.
One thing I didn't see covered in depth here was feature engineering, which is the process of preparing your raw data for the machine learning algorithms. They cover it briefly in the chapter on "Practical Considerations", but anyone looking to apply ML in the real world should look into feature engineering more on their own.
One resource I recommend (and I am biased) is a Python library for automated feature engineering called Featuretools (https://github.com/featuretools/featuretools/). It can help when your raw data is still too granular for modeling or is spread across multiple tables. We have several demos you can run yourself to apply it to real datasets here: https://www.featuretools.com/demos.
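For a taste, here's a minimal sketch using the mock customer data that ships with the library. Exact parameter names vary between Featuretools versions, so treat this as illustrative rather than canonical:

```python
import featuretools as ft

# Bundled mock customer data: several related tables (customers,
# sessions, transactions) already wired together as an EntitySet.
es = ft.demo.load_mock_customers(return_entityset=True)

# Deep feature synthesis: automatically generate aggregation and
# transform features at the customer level.
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_entity="customers",  # newer releases renamed this to target_dataframe_name
    max_depth=2,
)
print(feature_matrix.head())
```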
With all of these online courses around, I'm just curious: is there an ML course that teaches you how to choose the right model? Say, for example, the right number of layers/nodes for a neural net? As a newcomer doing one of these machine learning MOOCs in my free time, it seems to me that it's about chucking a load of parameters into a black box and hoping for the best.
Usually you throw everything at the wall and see what sticks. There are some obvious trends: if you're doing image classification, a convnet will beat just about anything else. There are problems which are particularly well suited to, e.g., random forests, and cases where you might want to use a simpler model which will run faster at the expense of a bit of accuracy.
In terms of neural nets, people have more or less done the hard work for you. Typically you'll want to take an empirically well-performing network like VGG, ResNet, Inception, etc., and then re-train the top few layers. There is ongoing work in the field to try to learn the structure of the network as well.
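For anyone wondering what "re-train the top few layers" looks like in practice, here's a rough Keras sketch. The head sizes and the 10-class output are my own arbitrary choices, not from any particular paper:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Pretrained convolutional base, without its original classifier head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained features

# New trainable head for an assumed 10-class problem.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(...) on your own images; only the head's weights update
```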
If you look at Kaggle, the vast majority of winning entries use ensembles. In a recent example, the Iceberg or Ship challenge, the winning team used a bit of feature engineering and apparently over 100 ensembled CNNs:
> We started with a CNN pipeline of 100+ diverse models which included architectures such as customized CNNs, VGG, mini-GoogLeNet, several different Densenet variations, and others. Some of these included inc_angle after the fully connected layers and others didn’t.
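Probability averaging is the simplest version of what they describe. A hedged sketch with two scikit-learn learners standing in for "100+ diverse models":

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two deliberately different model families.
models = [LogisticRegression(max_iter=1000),
          RandomForestClassifier(random_state=0)]

# Average each model's class probabilities, then take the argmax.
probs = np.mean([m.fit(X_tr, y_tr).predict_proba(X_te) for m in models], axis=0)
preds = probs.argmax(axis=1)
print(f"ensemble accuracy: {(preds == y_te).mean():.3f}")
```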
> Usually you throw everything and see what sticks.
Most practitioners start with the simplest possible learner, then gradually and thoughtfully increase model complexity while paying attention to bias/variance. This is far from a "kitchen sink" approach.
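A quick sketch of that workflow, with tree depth standing in for "model complexity" and a convenient built-in dataset:

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)
depths = list(range(1, 11))  # the complexity knob we sweep

train_scores, val_scores = validation_curve(
    DecisionTreeRegressor(random_state=0), X, y,
    param_name="max_depth", param_range=depths, cv=5,
)
for d, tr, va in zip(depths, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    # High train but low validation score => variance; low on both => bias.
    print(f"depth={d}: train R^2={tr:.2f}, val R^2={va:.2f}")
```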
Certainly that's what most sensible practitioners do. I somewhat doubt that most people follow this to the letter every time.
It's a nice theory, and it works intuitively with models where you can ramp up complexity easily (like neural nets). It's less obvious if you have a "simple" problem that might be solved with a number of techniques. In that situation I don't see why you would be criticised for trying, say, any of SVM, random forest, logistic regression and naive Bayes and comparing them. Pretty much the only way you can categorically say that your method is better than the others is by trying those other methods.
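Concretely, that comparison is only a few lines with scikit-learn (the dataset and model settings here are just placeholders):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
candidates = {
    "svm": make_pipeline(StandardScaler(), SVC()),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "naive bayes": GaussianNB(),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```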
The simple approach actually came up in the iceberg challenge. The winning team won because they bothered to plot the data. It turned out that the satellite incidence angle was sufficient to segment a large number of icebergs with basically 100% reliability. So they simply thresholded that angle range and trained a bunch of models to solve the more complicated case when there might be a ship.
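The shape of that trick, with entirely made-up numbers and synthetic data (I don't have their actual thresholds): handle the "easy" region of feature space with a plain rule, and reserve the heavy models for the hard cases.

```python
import numpy as np

rng = np.random.default_rng(0)
inc_angle = rng.uniform(30.0, 46.0, size=1000)  # stand-in for the satellite feature

easy = inc_angle > 43.0        # hypothetical band where (say) everything is an iceberg
preds = np.zeros(1000, dtype=bool)
preds[easy] = True             # rule-based call for the easy cases
hard = ~easy                   # only these rows go to the ensemble of models
print(f"{easy.mean():.0%} of rows resolved by the threshold alone")
```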
This was one of the things Andrew Ng really hammers on in the coursera course. This alongside separating out the training set from the cross validation set for tuning parameters went a long way to dispel some of the "magic" in how you iterate towards a sensible model.
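In code, that separation is just a held-out split that's used only for comparing settings, never for fitting (the candidate C values below are arbitrary):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
# The validation set exists solely to compare hyperparameter settings.
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

best = None
for C in [0.01, 0.1, 1, 10]:
    score = SVC(C=C).fit(X_tr, y_tr).score(X_val, y_val)
    if best is None or score > best[1]:
        best = (C, score)
print(f"best C={best[0]}, validation accuracy {best[1]:.3f}")
```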
There is no “best”. A common thing to do is to fit a ton of models and then choose the best based on out-of-sample/CV performance, or a mix of top performing models.
I quickly skimmed a few chapters and like what I see: the language is very friendly and the examples are easy to understand, but there is enough rigor to make this a good introductory book on the subject. Nice!
I have not yet found a course that covers machine learning and also explains the maths needed to understand it, and this course is no exception. It's extremely difficult for a programmer with no solid footing in maths to follow. Any help is much appreciated.
True, very true, and you'll hit that language barrier in no time. The courses mostly provide hands-on experience and don't explain, say, what standard deviation means.
But this is a good news/bad news kind of thing. Bad news: you need some statistics education to make sense of what your computer is telling you. Good news: that mathematics isn't "high-grade" mathematics involving integrals and other such stuff (as far as I can see, anyway).
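For what it's worth, standard deviation is one of those ideas that's quicker to see in code than in prose: it's the typical distance of the data from its mean.

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean = x.mean()                            # 5.0
std = np.sqrt(((x - mean) ** 2).mean())    # 2.0: root of the mean squared deviation
print(mean, std)
```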
After I finish those I'll move on to Kirill Eremenko's A-Z courses on machine learning, AI and deep learning. I've found that even though they teach cool tricks and cover some basics before going into detail, the basics part didn't contain enough information for me as a new student. So I feel that if you have a proper background in stats and Python data analysis, you can skip the parts I mentioned and go straight to the A-Z courses.
Thanks for your insights. I was wondering if one also needs a footing in vector calculus / linear algebra / integrals and other such words I have no idea about.
Also Kirill Eremenko has 38 courses listed. In what order should one take them?
I sympathize, but do keep in mind that it would be very difficult to teach all the math at the same time. Imagine trying to run a course in French on analytic philosophy. If your students didn't know much French coming in, you'd be in a tough spot.
I understand. I see so many people joining the ML / AI bandwagon and I wonder if so many people really understand the maths behind it or is it as simple as calling a function?
Wow, this is a lot more in-depth than I was expecting. I started out with the O'Reilly book, Hands-On Machine Learning [1]. There's some overlap, but I'd say both this course and the O'Reilly book are well worth your time if you're starting out with machine learning.
[1] https://amzn.to/2kKaNiQ
There is some truth to this. Many jobs in machine learning are all bark and no bite. The company may even have created a machine learning team solely out of hype, and simply equates machine learning with making d3.js visualizations or maintaining Spark jobs that tabulate summary statistics.
Yet these jobs will still require you to do outrageous things during the interviews, like deriving the full backpropagation formulas for a 3-layer MLP network, or explain some esoteric issue with vanishing gradients or offer from memory a bunch of time complexity info about the SVM fitting algorithm, etc.
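(For the curious, that first ask boils down to a chain-rule pass like this numpy sketch. Sigmoid activations and squared-error loss are my own assumptions here, purely for illustration:)

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                        # toy input batch
y = rng.integers(0, 2, size=(8, 1)).astype(float)  # toy binary targets
W1, W2, W3 = (rng.normal(size=s) for s in [(4, 5), (5, 3), (3, 1)])

# Forward pass through three sigmoid layers.
a1 = sigmoid(X @ W1)
a2 = sigmoid(a1 @ W2)
yhat = sigmoid(a2 @ W3)

# Backward pass: chain rule, layer by layer, for L = 0.5 * sum((yhat - y)**2).
d3 = (yhat - y) * yhat * (1 - yhat)
d2 = (d3 @ W3.T) * a2 * (1 - a2)
d1 = (d2 @ W2.T) * a1 * (1 - a1)
gW3, gW2, gW1 = a2.T @ d3, a1.T @ d2, X.T @ d1     # weight gradients
```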
They demand far more impressive knowledge of machine learning in the interview than the job will actually exercise once you're hired. Most positions will fundamentally make your skills atrophy.
As a result, a lot of people resort to writing blog posts or courses about their experiences implementing toy models, studying trade-offs between approaches, analyzing publicly available data sets.
In part it’s to help pad a resume and look relevant for getting hired. In part it’s to build or exercise skills that the person’s day job won’t actually enable them to use. And in part to try to get your name out there and associated with a hyped up field.
I'm not questioning your analysis of the behavioral psychology of the herd, but in this particular example, Prof. Hal Daume of the University of Maryland and Microsoft Research has been associated with the field (though not particularly neural networks) since way before the current hype.
I totally agree for this specific link. Many well-respected and established people in machine learning write course materials.
I was only responding to the parent comment regarding why random blog posts, articles, and course materials seem so widespread and constant in machine learning overall.
It's the second time I've seen you whine about job interviews. If you're still bitter more than a few days after the rejection, you'd better start the attitude-adjustment process. Learn from your mistakes and move on.
Your comment is significantly uncivil and seems to indicate that you take a capricious attitude towards others.
I deny that any part of pointing out the hypocrisy, inefficiency and myopia of tech hiring in general, including specific parts in machine learning, is "whining."
Rather, speaking clearly about how ridiculous it is happens to be progress, however small, toward correcting glaring social wrongs in our myopic industry.
And finally, none of this is connected to my own job. I manage a team of machine learning engineers at a large enterprise company. My general certainty that our industry gets this completely backwards comes from my own experiences earlier in my career, from some of the battles I have had to fight to protect my current team members, and from some of the horror stories that my team members and the job candidates I've interviewed have shared with me.
Between this topic and the abject failure of open-plan offices, I cannot think of any more worthwhile tech industry topics to continuously and unrelentingly "whine" about, and feel that such "whining" is in fact a civic responsibility for any mature critical thinker, and indeed I am quite proud of it, and hope to keep it up with ebullience.
There are a lot of courses about almost every topic of importance. Have you ever looked at the number of books that teach something as simple as mx+b? The number of courses that teach about the different planets in the solar system? The number of courses that teach about Shakespeare?
Lots of people learn in a lot of different ways. The more courses available for a single topic, the greater the likelihood that a learner will find one that matches their learning style.
Writing courses is also a great way for someone to internally reinforce what they've learned, and to expose what they still need to know more about.
I'm not complaining about or attacking the authors in any way; it's very cool to see a lot of material on any subject.
Yes, I've googled other subjects as well. Sometimes I'm very surprised by how many tutorials and materials are out there, even on very narrow topics.
Back in my university days I tried to build a telescope myself. I found a lot of tutorials and books on the subject, even on how to make your own parabolic mirrors. The Internet is f*kin awesome -.-
I just wondered if people with good AI/ML knowledge can find jobs.