Show HN: QuoDB – Movie quote search engine based on subtitles

boomzilla · on Aug 5, 2014

Ah, some search relevance needs to be worked on :) No movie should rank higher than the Terminators for "I'll be back":

http://www.quodb.com/#search/i'll%20be%20back

NAFV_P · on Aug 5, 2014

I typed in a few ARNIE quotes:

  Terminator:
    "nothing clean right" - no results
    "fuck you, asshole" - one result for Terminator, but the phrase occurs twice in the film.
  Predator:
    "If it bleeds, we can kill it" - Predator came up, and a few others (interesting).
  Total Recall:
    "Sue me, dickhead" - it got that one.
  Commando:
    "You're a funny guy Sully, I like you. That's why I'm going to kill you last." - no results.

I'm probably gonna spend all night on this.

EDIT: reformatted.

cettox · on Aug 5, 2014

The query "asta la vista baby" does not return Terminator at all!

pelario · on Aug 5, 2014

For that you need auto correction; "hasta la vista baby" gives the correct result
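A minimal sketch of that kind of auto correction, using Python's standard `difflib`. The `KNOWN_QUOTES` list and the `suggest` helper are made up for illustration; nothing here is taken from QuoDB itself.

```python
import difflib

# Hypothetical index of known quote lines (assumption: a real engine
# would hold millions of subtitle lines, not three).
KNOWN_QUOTES = [
    "hasta la vista baby",
    "i'll be back",
    "if it bleeds we can kill it",
]

def suggest(query, known=KNOWN_QUOTES, cutoff=0.8):
    """Return the closest known quote, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(query.lower(), known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(suggest("asta la vista baby"))
```

Comparing against every stored line would not scale to a full subtitle database; the point is only that a misspelled query can be mapped to its nearest known phrase before searching.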

thisjepisje · on Aug 4, 2014

I think lots of us have had this idea, great to see it implemented.

superasn · on Aug 5, 2014

I've often wondered how a database like this could be used in other fields of programming, such as a text-to-speech engine[1], where the algorithm could use subtitles to guess the context of the conversation and produce better results.

[1] http://www.slate.com/articles/technology/technology/2009/03/...

stingraycharles · on Aug 5, 2014

I actually worked on this exact problem as an intern at our university. We used a huge corpus of communication (for example, we had access to all the emails ever sent internally at Enron).

We used this as the basis to train a speech-to-text engine by automatically correcting likely-wrong interpretations. "I go loo school" would be corrected to "I go to school", for example. It worked remarkably well.
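To make the idea concrete, here is a toy sketch of corpus-based correction using bigram counts. It is not the method described above, just one simple way such a correction could work; the tiny `CORPUS` and the helper names `train_bigrams`, `fit`, and `correct` are all invented for illustration.

```python
import difflib
from collections import Counter

# Toy stand-in for a large text corpus (hypothetical data).
CORPUS = [
    "i go to school every day",
    "we go to work by bus",
    "they go to school together",
]

def train_bigrams(sentences):
    """Count adjacent word pairs and collect the vocabulary."""
    counts, vocab = Counter(), set()
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        counts.update(zip(words, words[1:]))
    return counts, vocab

def fit(words, i, candidate, counts):
    """How often the candidate was seen next to its current neighbours."""
    left = counts[(words[i - 1], candidate)] if i > 0 else 0
    right = counts[(candidate, words[i + 1])] if i + 1 < len(words) else 0
    return left + right

def correct(sentence, counts, vocab):
    """Swap a word for a similar-looking one when that fits the context better."""
    words = sentence.lower().split()
    for i, word in enumerate(words):
        similar = difflib.get_close_matches(word, sorted(vocab), n=5, cutoff=0.3)
        best = max(similar, key=lambda c: fit(words, i, c, counts), default=word)
        if fit(words, i, best, counts) > fit(words, i, word, counts):
            words[i] = best
    return " ".join(words)

counts, vocab = train_bigrams(CORPUS)
print(correct("I go loo school", counts, vocab))
```

A real system would score acoustic similarity rather than spelling similarity and would use a far larger language model, but the principle is the same: prefer the candidate the corpus has seen in that context.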

These subtitles could serve as a basis too, but there are far bigger (and better?) collections of data available for training these machine learning engines.

haraball · on Aug 5, 2014

Could you recommend any of these data collections if they are open to the public?

law · on Aug 5, 2014

This is very likely the Enron corpus that was used: https://www.cs.cmu.edu/~./enron/

stingraycharles · on Aug 6, 2014

I can confirm that this is the corpus. I can also confirm that, even though the emails are all from mid-to-senior management, the writing style is very sloppy.

acangiano · on Aug 4, 2014

The first thing I searched for was "you look like shit" which is a very common remark in movies and shows.

http://www.quodb.com/#search/you%20look%20like%20shit

531 titles. Wow.

jnks · on Aug 5, 2014

I'm partial to "We've got company!"

http://www.quodb.com/#search/we've%20got%20company

"Incoming!" is the moral equivalent (and is much more popular), but it is less impressive since it's only one word.

http://www.quodb.com/#search/incoming!

oneeyedpigeon · on Aug 5, 2014

http://www.quodb.com/#search/is%20my%20middle%20name

jamespo · on Aug 5, 2014

Where are the movie cover thumbnail images from? I'd like a source for an idea I have.

adityar · on Aug 5, 2014

Suddenly Fight Club is in the same league as Ugly Betty...

http://www.quodb.com/#search/i%20want%20you%20to%20hit%20me%...

cafard · on Aug 5, 2014

Very neat. I queried for a line I remembered as "a story Englishmen tell when they're down in the mouth", and it corrected this to "Englishmen tell it [etc.]", identifying the movie as Beat the Devil.

krmmalik · on Aug 5, 2014

I typed in "finality" as the search term. There's a scene in which this word is used where Nick Nolte gives a speech to the Hulk. It only came up with results containing "finaLLy"(?)

wamatt · on Aug 5, 2014

Nice design; fast and functional too. Kudos!

So while the fuzzy matching is neat, sometimes it's handy to be able to perform an exact search as well.

Typically this is done using "quotation marks" around the search term(s).
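A rough sketch of how a search function could honor that convention. The `normalize` and `search` helpers and the sample lines are hypothetical, and the unquoted mode here is a simple "all words present" match standing in for whatever fuzzy matching the site really uses.

```python
import re

def normalize(text):
    """Lowercase and split into words, dropping punctuation except apostrophes."""
    return re.sub(r"[^a-z0-9' ]+", " ", text.lower()).split()

def search(query, lines):
    """Quoted query: exact phrase. Unquoted: every word present, in any order."""
    query = query.strip()
    exact = len(query) >= 2 and query[0] == '"' and query[-1] == '"'
    terms = normalize(query)
    if not terms:
        return []
    results = []
    for line in lines:
        words = normalize(line)
        if exact:
            # The terms must appear consecutively and in order.
            span = len(terms)
            hit = any(words[i:i + span] == terms
                      for i in range(len(words) - span + 1))
        else:
            hit = all(term in words for term in terms)
        if hit:
            results.append(line)
    return results

LINES = [
    "Screw you, pal.",
    "You can't screw this up.",
    "Are you screwing with me?",
]
print(search('screw you', LINES))
print(search('"screw you"', LINES))
```

With these sample lines, the unquoted query matches the first two lines, while the quoted one matches only the first.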

xerophtye · on Aug 5, 2014

Interesting case: I searched "screw you" and it also turned up results like "are you screwing with me?", "we would have been screwed...", etc.
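Results like these are what you would expect if words are reduced to a common stem before matching. That is only a guess about the cause, since nothing in the thread says how the site matches text. A deliberately crude sketch, with a made-up `crude_stem` helper and suffix list:

```python
# Suffixes to strip, checked in this order (assumption: a real engine
# would use a proper stemmer such as Porter's, not this short list).
SUFFIXES = ("ing", "ed", "s")

def crude_stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ("screw", "screwing", "screwed"):
    print(w, "->", crude_stem(w))
```

All three forms reduce to "screw", so a query and a subtitle line that differ only in inflection would be treated as a match.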

LukeShu · on Aug 5, 2014

Very cool!

What is it using localStorage for? Without dom.storage.enabled, it's just a blank white page with a footer.

fletchowns · on Aug 5, 2014

Is this legal?

golergka · on Aug 5, 2014

As someone who often hunts old movies for samples of random phrases, this is just so perfect. Thanks.

boristhespider · on Aug 5, 2014

Excellent. One useful feature would be the ability to sort search results by age, for example.

mileschet · on Aug 5, 2014

Congrats! Do you plan to include subtitles from other languages?

tonglil · on Aug 5, 2014

Waterboy: "this is some high quality h2o" yields nothing.

HugoDias · on Aug 5, 2014

OK, it really works! http://cl.ly/image/2b1l1s321123

d0100 · on Aug 5, 2014

This is halfway through what I wanted to do. Just add the movie's clip together with the quote and, boom, gold.

xerophtye · on Aug 5, 2014

One developer did something like that. She'd use subtitles to create GIFs of movie quotes, and made a utility for it! Will post if I can find the link.

mpeg · on Aug 5, 2014

http://quotacle.com/

Also posted to HN not too long ago

thisjepisje · on Aug 5, 2014

That should be easy, uploading every movie you can find.