shazeline's comments

One common approach is to look for the elbow in the curve <metric> vs K (number of clusters). This is essentially finding the number of clusters after which the rate of information gained/variance explained/<metric> slows. I believe it's possible to binary search for this point if you can assume the curve is convex.
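For illustration, here is a minimal sketch of one common way to locate the elbow: pick the K whose point lies farthest from the straight line (chord) joining the first and last points of the curve. This is a swapped-in heuristic, not necessarily the binary search mentioned above; `find_elbow` and the sample numbers are hypothetical, and the scores are assumed to have been computed already (e.g. within-cluster variance for each K).

```python
def find_elbow(ks, scores):
    """Return the K whose (K, score) point is farthest from the chord
    joining the first and last points of the curve.

    ks     -- candidate cluster counts, in increasing order
    scores -- the metric for each K (hypothetical, precomputed values)
    """
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (K, score) pairs of equal length")

    x0, y0 = ks[0], scores[0]
    dx = ks[-1] - x0
    dy = scores[-1] - y0
    if dx == 0 or dy == 0:
        raise ValueError("curve is flat or degenerate; no elbow to find")

    best_k, best_gap = ks[0], -1.0
    for k, s in zip(ks, scores):
        # Normalize both axes to [0, 1] so K and the metric are comparable.
        xn = (k - x0) / dx
        yn = (s - y0) / dy
        # After normalization the chord runs from (0, 0) to (1, 1), so the
        # distance to it is proportional to |yn - xn|.
        gap = abs(yn - xn)
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


if __name__ == "__main__":
    ks = [1, 2, 3, 4, 5, 6, 7, 8]
    scores = [100, 60, 30, 25, 22, 20, 19, 18]  # made-up example values
    print(find_elbow(ks, scores))
```

This version scans every K linearly. If the curve really is convex, the gap to the chord is concave in K and has a single peak, which is what would make a search over it possible instead of a full scan.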


Current UCLA student. I just sent a request. Waiting to see what happens.


Keep me updated please!


I think it is more accurate to say that data science isn't Kaggle. The process of taking a data set and fitting it to the most robust model is certainly machine learning.


I wrote a quick script to generate a list similar to (but not as deep or well-annotated as) Prof Palsberg's.

https://gist.github.com/shazeline/d9881d06be31a59a93d3

It's a BFS for Google Scholar seeded with Herbert Simon.
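As a rough sketch of the general idea (not the code in the gist), a depth-limited breadth-first traversal from a seed looks something like this. `bfs_crawl`, `get_neighbors`, and `max_depth` are hypothetical names; a real version would have `get_neighbors` fetch related authors from wherever the data lives, which is stubbed out with a toy graph here.

```python
from collections import deque


def bfs_crawl(seed, get_neighbors, max_depth=2):
    """Breadth-first traversal from `seed`, returning (node, depth) pairs
    in the order they were discovered.

    get_neighbors -- hypothetical callable: node -> iterable of related nodes
    max_depth     -- nodes at this depth are recorded but not expanded
    """
    visited = {seed}
    order = [(seed, 0)]
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in get_neighbors(node):
            if nxt not in visited:
                visited.add(nxt)
                order.append((nxt, depth + 1))
                queue.append((nxt, depth + 1))
    return order


if __name__ == "__main__":
    # Toy stand-in for real lookups; the names are placeholders.
    graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
    for name, depth in bfs_crawl("A", lambda n: graph.get(n, []), max_depth=2):
        print(depth, name)
```

The depth limit is what keeps such a list shallow: each extra level multiplies the number of lookups.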


postmaster


Yeah, trg2 would need to put all the edge case rules towards the beginning. That, or just put the basic rules at the beginning and have separate conditionals at the end to handle the edge cases.


Good call - thanks for the CR! :)


reminds me of commit logs from last night

http://www.youtube.com/watch?v=V44kscaJe3M

