Hacker News | legaloslotr's comments

(Opta's lead maintainer here)

Hey! That's a good question and something we need to document better. In short, Opta is a framework to build opinionated TF modules that work well together (and a CLI to deliver those modules to non-experts). A wide set of modules have been built by our team but it is very customizable and we expect folks to add their own or augment the existing ones with custom TF code.

Additionally, we augment vanilla TF modules with a variety of things that improve the scalability of a TF codebase - stuff like pre/post hooks, abstraction for creating environments and services, "better" state management, etc.


So, is this legal? As in, what imdbapi.com is doing or what my app is doing! Does imdb have a FAQ regarding apps somewhere?


I'm still trying to figure out what imdbapi is doing exactly. They cite Freebase and Wikipedia (at the bottom of their home page), but neither of those has IMDB ratings.

What data imdb makes available is here, along with references to their ToS: http://www.imdb.com/interfaces/

It seems the data IMDB makes available is nearly useless (or I'm misunderstanding it) and doesn't include ratings.

This implies IMDB API is scraping IMDB. Whether that's legal or not, I can't say. But I can say this: IMDB was created on the work of many users who licensed their efforts under terms that were, in my opinion, broken when IMDB was sold to Amazon, and thus I think scraping is moral.


Umm... well, there is this site, http://imdbapi.com, which does the scraping (or maybe they have the offline datasets!), and then you can get JSON replies from them.


Their source is shown as Freebase, which doesn't have anything to do with IMDB. It's freely usable data as long as it's attributed.

I suspect they may have trademark issues with their site name, but their data looks free and clear.

EDIT: Nirvana has a good point about the imdb ratings. That may not be legally obtained.


I use IMDBapi for one of my projects, and I can assure you that they are indeed scraping IMDB. The ratings, description, and poster I get back are identical to IMDB's. It is a great service and I have also donated to them, but as far as legality goes, I'm not sure how they comply with IMDB's terms and conditions.


In the footer:

  Source: Freebase, licensed under CC-BY
  Other content from Wikipedia, licensed under CC BY-SA


Yeah! That is a high-priority item in my todos! :)


Great.

Another feature that might be trickier to implement: like in web browsers, pressing the middle button could present a little gizmo that helps scrolling faster (I thought about it for five minutes and can't think of a better way to describe that thing).


Yeah.. this might be a bit trickier! Anyways, isn't the scrolling good enough with the mouse scroll wheel?


Well, I'm almost always on my Mac and use the trackpad, so I always feel a little awkward holding a mouse in my hand. But even so, I think if you have a large enough library (say 200 movies), it would take a lot to scroll to the end (I tried it right now: 40 seconds).


Btw, so MDB works fine on a Mac? Never tested it! Any Mac-specific comments/bugs/feature reqs?


Unfortunately I haven't tested it on a Mac yet (I didn't know it could work, as the Windows GUI seemed native). I use Mac all the time, but my movies are on my old Windows PC.

I'll test it later on and see if there are any bugs :)


Glad to hear that you liked it! It works for everything that IMDB supports! Right now, it tries to show the IMDB info for each episode separately. Maybe, later I will add an option to consider the series as a whole.


Yeah, mostly regex, plus some heuristics. It works most of the time and is pretty efficient given how simple the heuristics are. You can see the code here https://github.com/legalosLOTR/mdb/blob/master/MDB/DBbuilder... Check out the function get_movie_name!


Wouldn't this break a movie like 2012?

    #2 remove year and stuff following it
    filename = re.sub('\d\d\d\d.*', ' ', filename)


Yeah it totally would! Thanks for pointing this out. Btw, I am planning to use another library that has better detection rates - https://github.com/wackou/guessit
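
To make the failure concrete, here is a minimal sketch. It is not the code from the MDB repository: the helper names strip_year_naive and strip_year_safer, the YEAR pattern, and the sample filenames are all made up for illustration. The first function applies the substitution quoted above; the second is one possible workaround that only cuts at a 19xx/20xx number when it does not start the filename.

```python
import re

# Hypothetical pattern: a standalone four-digit number starting with 19 or 20.
YEAR = re.compile(r'(?<!\d)(?:19|20)\d\d(?!\d)')

def strip_year_naive(filename):
    # The substitution quoted above: replace the first four-digit run
    # and everything after it with a space.
    return re.sub(r'\d\d\d\d.*', ' ', filename).strip()

def strip_year_safer(filename):
    # Assumed workaround: cut at the first plausible year that is not at
    # the very start of the name, so a numeric title such as "2012" survives.
    for match in YEAR.finditer(filename):
        if match.start() > 0:
            return filename[:match.start()].rstrip(' ._-([')
    return filename

print(repr(strip_year_naive('2012.2009.720p.BluRay')))  # '' (title lost)
print(repr(strip_year_safer('2012.2009.720p.BluRay')))  # '2012'
print(repr(strip_year_safer('The.Matrix.1999.1080p')))  # 'The.Matrix'
```

This is still a heuristic: a file named just 2012.720p.BluRay has no second year to cut at, so the safer version returns it unchanged rather than cleaning it up. A dedicated parser such as guessit is probably the better long-term answer.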

