
1) could probably be achieved reasonably well by extracting keywords from the sources, simplifying them ("app find friend" or "app sell item"), and generating tags from the result.
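A rough sketch of that keyword idea in Python, in case it helps. The stopword list, the `extract_tag` name, and the three-word cutoff are all my own assumptions for illustration, not anything from the project being discussed; a real version would want a proper stopword list and probably stemming.

```python
import re

# Hypothetical, deliberately tiny stopword list (assumption for this sketch).
STOPWORDS = {
    "a", "an", "the", "to", "that", "for", "your", "you",
    "and", "of", "in", "with", "lets", "helps", "is", "this",
}

def extract_tag(description, max_words=3):
    """Lowercase the text, drop stopwords, keep the first few words as a tag."""
    words = re.findall(r"[a-z]+", description.lower())
    keywords = [w for w in words if w not in STOPWORDS]
    return " ".join(keywords[:max_words])

if __name__ == "__main__":
    for text in ["An app that helps you find a friend",
                 "The app to sell your item"]:
        print(extract_tag(text))
```

Descriptions that reduce to the same tag can then be grouped together.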


1) You could also try using word embeddings for the distance measure. Afaik they work better than bag of words when each observation contains only a few words.
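To illustrate why, here is a toy comparison in Python. The 3-dimensional vectors in `TOY_EMBEDDINGS` are made up by hand purely for the demo (real embeddings would come from a pretrained model), and averaging word vectors is just one simple way to embed a short phrase. With bag of words, two short phrases that share no words get a similarity of zero, while the averaged vectors can still end up close.

```python
import math
from collections import Counter

# Made-up toy vectors for illustration only; NOT from any real model.
TOY_EMBEDDINGS = {
    "find":   [0.9, 0.1, 0.0],
    "meet":   [0.8, 0.2, 0.1],
    "friend": [0.1, 0.9, 0.0],
    "buddy":  [0.2, 0.8, 0.1],
    "sell":   [0.0, 0.1, 0.9],
    "item":   [0.1, 0.0, 0.8],
}

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def bow_similarity(a, b):
    """Cosine similarity between word-count vectors of two phrases."""
    ca, cb = Counter(a.split()), Counter(b.split())
    vocab = sorted(set(ca) | set(cb))
    return cosine([ca[w] for w in vocab], [cb[w] for w in vocab])

def embedding_similarity(a, b):
    """Cosine similarity between the averaged toy word vectors."""
    def mean_vector(text):
        vecs = [TOY_EMBEDDINGS[w] for w in text.split() if w in TOY_EMBEDDINGS]
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    return cosine(mean_vector(a), mean_vector(b))

if __name__ == "__main__":
    pairs = [("find friend", "meet buddy"), ("find friend", "sell item")]
    for a, b in pairs:
        print(a, "|", b, "| bow:", round(bow_similarity(a, b), 3),
              "| emb:", round(embedding_similarity(a, b), 3))
```

Both pairs score zero under bag of words because no words overlap, but only the "find friend" / "meet buddy" pair scores high with the toy embeddings.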



