How? Just get started working on a fun problem. A good place to start is keyword extraction. You don't need a PhD or expensive tools. All you need is some free time and a willingness to read some cool stuff.
Copy a few articles into text files and get working on implementing some of these methods until you have enough of an understanding to construct your own methods for the fun of it.
I played around with this a bit to develop https://www.findlectures.com, and now that I know what works and what doesn't there, I'm developing some NLP scripts to support my use cases.
I never thought about this particular use case. The subtitles for TED talks should be an ocean of info for you to extract keywords from :D Pretty neat site you've got there. I will be using it. Thanks!
I would say that a good first project in this field would be to implement something like Tf-Idf [0] for identifying keywords in a set of documents. I don't know where one can find current datasets for this, but I made WikiCorpusExtractor [1] to build sets of documents from Wikipedia.
The only thing one really needs is to count the frequency of words in each document and do some very simple math. Tf-Idf is still very relevant today and gives you a very good idea of how statistics are used in text mining.
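To make the "count and do very simple math" part concrete, here is a minimal Python sketch of one common Tf-Idf variant (relative term frequency times the log of inverse document frequency). The function names, the tokenizer regex, and the sample sentences are assumptions made for illustration; they are not taken from WikiCorpusExtractor or any particular library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {word: score} dict per document.

    tf  = occurrences of the word in the document / words in the document
    idf = log(number of documents / number of documents containing the word)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        scores.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


def top_keywords(documents, k=3):
    """Return the k highest-scoring words per document (ties broken alphabetically)."""
    return [
        [word for word, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for s in tf_idf(documents)
    ]


# Hypothetical sample corpus, just to have something to run.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang in the tree",
]
print(top_keywords(docs))
```

A word that appears in every document (like "the" here) gets an idf of log(1) = 0, so it drops out without needing a stopword list. That is the main reason this works better on a set of documents than plain counting does.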
I started even simpler than that. I started by just eliminating stopwords and counting the frequency of each word in the document itself. I did not use a set of documents, as the goal was for the algorithm to be used on the spot on a single block of text.
A few months later, after many iterations and a whole lot of testing, the algorithm can now extract super relevant keywords 90%+ of the time!
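For anyone who wants to try that simpler starting point, a rough Python sketch could look like this. It only covers the first step described above (drop stopwords, count what is left), not the later iterations. The tiny stopword list and the `extract_keywords` name are made up for the example, and a real stopword list would be much longer.

```python
import re
from collections import Counter

# A deliberately tiny, illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def extract_keywords(text, k=5):
    """Return the k most frequent non-stopword tokens as (word, count) pairs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


sample = "The parser reads the text and the parser counts words in the text."
print(extract_keywords(sample, k=2))
```

Since it only looks at one block of text, there is no corpus to compare against, so the stopword list is doing the job that idf would do in a multi-document setup.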
I wish I knew about the WikiCorpusExtractor. Thanks for the link!
Here's some good reading material:
https://www.facebook.com/notes/facebook-engineering/under-th...
https://www.researchgate.net/profile/Stuart_Rose/publication...
http://cdn.intechopen.com/pdfs/5338.pdf
https://arxiv.org/pdf/1603.03827v1.pdf
https://www.quora.com/Sentiment-Analysis-What-are-the-good-w...
http://hrcak.srce.hr/file/207669
http://nlp.stanford.edu/fsnlp/promo/colloc.pdf
https://arxiv.org/ftp/cs/papers/0410/0410062.pdf
http://delivery.acm.org/10.1145/1120000/1119383/p216-hulth.p...
Edit: Don't be deterred by the math formulas in these papers. They look far more complicated than they actually are.