I've been thinking about something like this for a while. My idea was to use dat...

mcphilip · on March 31, 2014

I started on something similar as a side project [1]. I decided to build a dataset from the AFI Top 100 Films list and persist it in Neo4j. The goal was to find interesting questions to answer with this dataset that couldn't easily be googled.

Most of my time so far has been spent gathering the dataset, but I do have a few example Cypher queries answering the following simple questions [2]:

1) What actors have appeared in the most AFI Top 100 films?

2) What are the genres of the top ten films?

3) Have any actors appeared in 2 or more of the top 25 films?
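
As a rough illustration of questions 1 and 3 outside the graph database, here is a minimal Python sketch. It is not the Cypher from the linked project; the `FILMS` data, the placeholder names, and both function names are made up for the example.

```python
from collections import Counter

# Hypothetical toy data: title -> (list rank, cast). Placeholder names only,
# not taken from the real AFI dataset.
FILMS = {
    "Film A": (1, ["Actor X", "Actor Y"]),
    "Film B": (2, ["Actor X", "Actor Z"]),
    "Film C": (30, ["Actor Y", "Actor Z", "Actor X"]),
}


def appearance_counts(films, max_rank=None):
    """Count how many films each actor appears in, optionally only up to max_rank."""
    counts = Counter()
    for rank, cast in films.values():
        if max_rank is None or rank <= max_rank:
            # set() guards against an actor being listed twice for one film.
            counts.update(set(cast))
    return counts


def repeat_actors(films, max_rank, min_films=2):
    """Actors appearing in at least min_films of the films ranked <= max_rank."""
    counts = appearance_counts(films, max_rank)
    return sorted(actor for actor, n in counts.items() if n >= min_films)


if __name__ == "__main__":
    print(appearance_counts(FILMS).most_common(3))  # question 1
    print(repeat_actors(FILMS, max_rank=25))        # question 3
```

With real data the same two functions would only need the dictionary swapped out; in a graph database the equivalent is a match on actor-to-film relationships with a count.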

I'm working on building a much larger dataset using a combination of Freebase and IMDb so that I have enough data to start exploring much more interesting questions (e.g. graph the frequencies of genres over the past 60 years; for a given film, find movies with the greatest overlap in genres, actors, and directors; generalize the n-degrees-to-bacon problem to work on any two actors; etc.).
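
One possible way to score "greatest overlap in genres, actors, and directors" is Jaccard similarity over the combined attributes. This is only a sketch under that assumption; the comment does not say which measure is intended, and the `CATALOG` data and function names are hypothetical.

```python
# Hypothetical toy catalog with placeholder names; not real film data.
CATALOG = {
    "Film A": {"genres": {"drama", "crime"}, "actors": {"Actor X", "Actor Y"},
               "directors": {"Director P"}},
    "Film B": {"genres": {"drama"}, "actors": {"Actor X"},
               "directors": {"Director P"}},
    "Film C": {"genres": {"comedy"}, "actors": {"Actor Z"},
               "directors": {"Director Q"}},
}


def features(film):
    """Flatten a film into a set of (kind, value) pairs so kinds never collide."""
    return {(kind, value) for kind, values in film.items() for value in values}


def overlap(a, b):
    """Jaccard similarity: shared features divided by all features."""
    fa, fb = features(a), features(b)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0


def most_similar(title, catalog):
    """Rank every other film by overlap with the given one, highest first."""
    target = catalog[title]
    scores = [(other, overlap(target, film))
              for other, film in catalog.items() if other != title]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(most_similar("Film A", CATALOG))
```

Weighting directors or genres more heavily than individual actors would be a natural variation; the unweighted version just keeps the example short.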

[1]https://github.com/mcphilip/film-graph

[2]http://htmlpreview.github.io/?https://github.com/mcphilip/fi...

rpicard · on March 31, 2014

Very cool. Thanks for posting this. The n-degrees-to-bacon problem is actually what made me think of this in the first place. It would be great to be able to plug in two actors and have it spit out the shortest path between them.
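
For what it's worth, the two-actor version can be sketched as a breadth-first search over a graph where actors link to the films they appeared in. The `CASTS` data below is made-up placeholder data and the function names are hypothetical; a graph database would typically offer a built-in shortest-path query instead.

```python
from collections import deque

# Hypothetical toy data: film -> cast. Placeholder names only.
CASTS = {
    "Film A": ["Actor W", "Actor X"],
    "Film B": ["Actor X", "Actor Y"],
    "Film C": ["Actor Y", "Actor Z"],
}


def build_graph(casts):
    """Bipartite adjacency map; nodes are tagged so a film and an actor
    with the same name cannot be confused."""
    graph = {}
    for film, cast in casts.items():
        for actor in cast:
            graph.setdefault(("actor", actor), set()).add(("film", film))
            graph.setdefault(("film", film), set()).add(("actor", actor))
    return graph


def shortest_path(casts, start, goal):
    """Return [actor, film, actor, ..., actor] linking start to goal, or None."""
    graph = build_graph(casts)
    src, dst = ("actor", start), ("actor", goal)
    if src not in graph or dst not in graph:
        return None
    parents = {src: None}  # doubles as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the start
                path.append(node[1])
                node = parents[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    path = shortest_path(CASTS, "Actor W", "Actor Z")
    print(" -> ".join(path))
    print("degrees of separation:", len(path) // 2)
```

Because every edge has the same cost, BFS is enough to guarantee a shortest path; the number of degrees is the number of films on the path.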

mg · on March 31, 2014

I'm doing something similar at http://www.movie-map.com

It's not based on IMDb, but on http://www.gnovies.com

LanceH · on March 31, 2014

I've contemplated the idea of taking soundtracks from movies and comparing them to a user's favorite tracks.

It wouldn't provide an accurate prediction of the best movie to watch, but it might come up with an indirect, quality pick that might otherwise never be seen.

rpicard · on March 31, 2014

Wouldn't it be awesome if it turned out to be a great indicator of movies they'd like though? That sounds like a fascinating experiment.

rpicard · on March 31, 2014

That's awesome! What is your algorithm like for determining the similarity? There are some good answers, but The Wolf of Wall Street is apparently pretty close to Frozen. ;)

rmc · on March 31, 2014

"Films similar to X" is definitely a good use case. But sometimes you can just go to the Amazon page for a DVD and look at "people who bought this also bought that" ;)