Examples and best practices for building recommendation systems

emourkai · on Jan 18, 2019

this really should've been named "recommendations for building recommendation systems"

hoaphumanoid · on Jan 18, 2019

Hey, one of the authors here. We are planning to create some notebooks with recommendations about recommendation algos :-)

yueguoguo1024 · on Jan 19, 2019

LOL brilliant idea! just don’t hesitate to create an issue and follow that with a PR!

anjc · on Jan 18, 2019

If you're the author, I'm not sure that the definition of recall@k is correct.

>Recall@k is a metric evaluate how many items, in the recommendation list, are relevant (hit) in the ground-truth data.

hoaphumanoid · on Jan 18, 2019

thanks for the comment, would this definition be clearer: "Recall at k is the proportion of relevant items found in the top-k recommendations"?
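To make the distinction concrete, here is a minimal sketch in plain Python. The function names are made up for illustration and are not taken from the repo, and it assumes the recommendation list is ordered best first with no duplicates. Precision@k divides the hits by k, while recall@k divides the same hits by the number of relevant items:

```python
from typing import Hashable, Sequence, Set


def hits_at_k(recommended: Sequence[Hashable], relevant: Set[Hashable], k: int) -> int:
    """Count how many of the top-k recommended items are relevant."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return sum(1 for item in recommended[:k] if item in relevant)


def precision_at_k(recommended: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Share of the top-k recommendations that are relevant."""
    return hits_at_k(recommended, relevant, k) / k


def recall_at_k(recommended: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Share of all relevant items that show up in the top-k recommendations."""
    if not relevant:
        # Recall is undefined without relevant items; returning 0.0 is an assumption.
        return 0.0
    return hits_at_k(recommended, relevant, k) / len(relevant)


if __name__ == "__main__":
    recommended = ["a", "b", "c", "d"]   # ranked list for one user
    relevant = {"a", "d", "x"}           # ground-truth items for that user
    print(precision_at_k(recommended, relevant, k=2))
    print(recall_at_k(recommended, relevant, k=2))
```

With k=2 there is one hit ("a"), so precision is 1 out of 2 and recall is 1 out of 3: the numerator is the same and only the denominator changes.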

tandav · on Jan 19, 2019

My collection of useful links about recommender systems (dirty) https://gist.github.com/tandav/a2f87e91cb5c441c57657cceb788c...

tandav · on Jan 18, 2019

Sadly, ranking evaluation metrics are only implemented in pyspark.mllib (RDD API), not in pyspark.ml (DataFrame API).

Implicit feedback is also worth mentioning.

yueguoguo1024 · on Jan 19, 2019

That saddens me too: AFAIK pyspark.ml does not support ranking metrics (does it?). So we wrapped the RDD API and exposed DataFrames as the interface in our own implementation (`reco_utils`). Implicit feedback is covered in a notebook under the `staging` branch (https://github.com/Microsoft/Recommenders/blob/staging/noteb...). It will be in the next release (1st of Feb)!