Mathematics of Machine Learning (2016)

mykeliu · on Sept 1, 2017

Good God that pie chart.

edit: Should've probably elaborated. Not only is a pie chart usually ineffective at conveying information, but these percentages are totally arbitrary. There's not really a specific explanation for why linear algebra is weighted more than multivariate calculus and algorithms & complexity combined, not to mention the fact that linear algebra and multivariate calculus in machine learning overlap to a large degree, and they both feed into algorithms & complexity as well.

And then on top of that, the 3D and shading seem totally unnecessary, but that's just me.

nxc18 · on Sept 2, 2017

I agree. The degree to which that pie chart is both hideous (we can do much better in 2003, oh wait, 2017) and uninformative+dishonest makes me not want to trust IBM with understanding any of my data (which is what machine learning is all about).

Heck, the fact that the humans at IBM can't classify it as dated makes me suspicious of their ability to make a machine classify anything.

IBM has been making a lot of marketing efforts around trying to own 'big data' and machine learning. Depending on who you ask (typically people entrenched in the dying mainframe space), IBM already is the predominant choice in cloud. If they're going to rely on marketing exclusively, they should at least shell out for pretty graphics.

trendia · on Sept 1, 2017

I'm not sure why you're criticizing that 3D pie chart -- big companies pay lots of money to consultants from IBM to get a chart that good.

thinkr42 · on Sept 2, 2017

Surely there's a Chart.ly (or alternatively 314Chart.io) that is in the works somewhere - disrupting this market using tensor flow deep learning networks to produce the best pie charts possible.

jwdunne · on Sept 2, 2017

It's still early stages. I'd recommend a multivariate test on the best pie chart colours, following Google's astounding results from their search link colour test.

mykeliu · on Sept 1, 2017

You're right; perhaps my envy got the best of me.

laichzeit0 · on Sept 2, 2017

I stopped reading when I saw that chart and came here to comment. Lo and behold what is the top comment..

It could have been worse though. It could have been a 3D donut chart.

Btw there are well studied reasons for why this particular chart is a terribly bad idea for displaying quantitative information [1].

[1] http://slideplayer.com/slide/5143061/16/images/9/Human+Visua...

thinkr42 · on Sept 2, 2017

Absolutely agree. This weighting is pointless considering the interdependent nature of the items being weighted.

Houshalter · on Sept 2, 2017

You might enjoy /r/dataisugly.

dmix · on Sept 2, 2017

Someone should create a wiki to collect these types of math books/resource recommendations.

Basically a definitive list of math for AI/CS with subpages for the various branches (ML, NLP, haskell-esque typed FP, etc) and focused on self-learning or even hobbyist entertainment instead of focused on being good for formal university classes - which is hard to discern from just browsing Amazon reviews which are mostly full of anecdotes from people's old college days.

I remember when I started down the relearning math rabbit hole last year and found so many threads on HN via the search feature recommending different math books in each.

It also doesn't help that there are 100 math books written each year thanks to the backwards university-fueled incentivize systems to write new ones each year.

I ended up spending a ton of time hunting down the best ones for each subject. Which always seems like a great opportunity for optimization if someone takes a crack at it.

Although once you get past the basics of math I've found a good general rule is to get one the Dover [1] math book series for the particular subject. These were written largely in the <1990s but are almost always still relevant and always my favourites. And notably frequently far more succinct than the university professor ones.

[1] https://www.amazon.com/s/field-keywords=dover+math

rawnlq · on Sept 2, 2017

Wiki didn't catch on for whatever reason, but github "Awesome ____" lists did:

https://github.com/sindresorhus/awesome

You can find one for machine learning:

https://github.com/josephmisiti/awesome-machine-learning

auvrw · on Sept 2, 2017

i suggest that it's more important to collect results (theorems) than texts. course descriptions on university sites would be a good thing to scrape (and and NLP >>?) for both.

WARN: the following is ranty.

it turns out that, no matter where you find them, the results in math are always the same. Hahn-Banach is Hahn-Banach, whether you read Folland or Rudin.

moreover, language and notation are very consistent across sources, and math tends to be very "optimized" for /human/ learning. i haven't read much ML theory (yet?), but my (uninformed) gripe about it so far is that the math feels "noisy" in much the same way that Java is often called a "noisy" language: things are often expressed using the minimal mathematical machinery (e.g. multivariate calculus rather than geometry for gradients). as a result, i find the ML stuff imposes a lot of cognitive overhead to decipher equations that could be written more simply.

so while the ML community might challenge the programming community to learn more maths on a regular basis, as someone who studied math in school and has historically worked as a non-ML developer, i would challenge the ML community to develop the theory in such a way that it encourages a more intuitive understanding of mathematics rather than the bare minimum (multivar calc + linear algebra) machinery.

----

for all this i shall plug one (perhaps lesser-known?) multivar calc text that i found recently and i think looks pretty good:

Casper Goffman's _Calculus of Several Variables_

Houshalter · on Sept 2, 2017

Metacademy might be what you are looking for.

It's easy to find textbook recommendations just by searching "best textbook for x". It's true most people just recommend the books that are the most well known and they are familiar with. It's hard to side by side comparison two textbooks. But it does filter out the garbage which there is a lot of.

Best advice is to look at the table of contents and see what it covers and doesn't cover.

ketralnis · on Sept 2, 2017

> Someone should

By all means, go for it!

ghufran_syed · on Sept 2, 2017

Come on folks, that's not particularly charitable. Of course the numbers on the pie chart are arbitrary, the whole article is clearly presented as the author's opinion. I still found it more useful than just saying "everything is important".

nerdponx · on Sept 2, 2017

It's not that the numbers are arbitrary, it's that the chart flagrantly implements the 3 biggest "don't ever do this" items taught in data visualization courses, books, and articles. In an article aimed at novice data scientists, no less!

It would be like reading an article about what an aspiring engineer needs to know about lower-level systems programming, and then coming across a snippet of bubble sort written in the ugliest Perl you can imagine.

Myrmornis · on Sept 2, 2017

What on Earth's wrong with it? It's perfectly clear to normal people. And please, don't reference Tufte. People who go round criticizing data visualizations and quoting Tufte are like audiophiles -- no-one else notices or cares.

BeetleB · on Sept 2, 2017

>What on Earth's wrong with it?

The usual things that are wrong with pie charts:

Ask yourself: How useful would the chart be had the numbers not been provided? Why did he have to give the actual figures? Perhaps because it would be hard to gauge areas? Especially for a 3-D pie chart where you are not looking at it head on. The 25% is not much bigger than the 15% (it should be 66% bigger). And with that projection, it's hard to tell.

In fact, the 25% looks like one fourth of the pie chart - the angle it makes is 90 degrees. That's good and as it should be. Looking at Multivariate calculus and Algorithms combined, it really looks like 90 degrees to me too. But it's not 25%.

What value is added by padding blank spaces between all the slices of the chart? Seriously - how does that help at all? Imagine looking at a pie chart head on (i.e. in 2-D), and then seeing they decided to expand the circle and add spaces. Why?

Now imagine presenting this as a simple bar chart. It would be clearer, and the pie chart shows nothing more than what a simple bar chart would. The bar chart would be less confusing.

f00_ · on Sept 2, 2017

Why does it use green for both Probability and Algorithms? Why does it use red for both Linear Algebra and Others? Why is it 3d?

Thin font is hard to read

Myrmornis · on Sept 2, 2017

> Why does it use green for both Probability and Algorithms? Why does it use red for both Linear Algebra and Others?

I don't know very much about variations in human color perception, but to me at least, all the colors are clearly distinct (Algorithms&Complexity is a light blue (cyan perhaps), and Others is a sort of pink/purple (magenta perhaps).

> Why is it 3d?

Why not? Don't be so serious!

f00_ · on Sept 4, 2017

i may in fact be color blind

j_s · on Sept 2, 2017

Ask HN: What maths are critical to pursuing ML/AI? https://news.ycombinator.com/item?id=15116379 (135 comments, 4 days ago)

uptownfunk · on Sept 2, 2017

I know this is probably irrational but because ibm is associated with garbage Watson I don't take any of their links seriously anymore.

Houshalter · on Sept 2, 2017

Watson isn't garbage, it's just vastly over hyped by their marketing department. The technologies behind it are pretty impressive though.

fonnesbeck · on Sept 1, 2017

Stopped reading when I saw the 3d pie chart atrocity. I'm sure the rest is great.