Ask HN: What APIs do you wish existed?
68 points by shafqat on Nov 21, 2009 | 153 comments
Are there APIs that you wish existed to help make your work easier? Help your startup? Or just allow you to build cool products?



- US & State Government tax information. The tax code is amazingly complex. It would be good if you could figure out the tax implications of various events. Better if they simplified the tax code to where this was possible. Moreover, data.gov so far is a disappointment. Maybe they'll improve it. But I'd love to dig in to purchasing and personnel costs for every branch of government.

- Good stock market historical tick data (and streaming data). Opentick tried for a time, but mostly you need to contact expensive services. This makes it hard to mess around.

- A good local event API. Lots of companies have tried, few have good results.

- LinkedIn: They've been promising an open contact book API since 2007, but have kept it closed. If they had an open API (that lets you actually store data/invite), it would mean a lot of sites would be building on them.

- My dream: The big scientific journals would require everyone publishing a paper to upload all relevant datasets to a central repository which could be queried.

- Anyone you have an account with (Financial firms, banks, vendors) would have a standard commerce API. Sure, today you can export stuff in Quicken, etc formats, but Mint had to do a big deal with Yodlee to get the data in a uniform, queryable way.

And, mad props to Microsoft for opening the Bing API with pretty good terms. Google used to have a search API and it had horrible terms. Then they decommissioned it. Who would have guessed that MS would be more developer-friendly than G?


Who would have guessed that MS would be more developer-friendly than G?

Anyone who's done Microsoft-based development? MS is incredibly developer-friendly compared to many platform vendors. It's a point of business strategy for them (if memory serves me, that's what Ballmer's infamous "developers, developers, developers, developers" thing was about in context).

Just don't try and mix in any technology from one of MS's competitors and you'll be fine.


True, they are developer friendly. .NET is a good platform. It's extensive and well documented. There is innovation in C# as well. But the purpose of all that goodness is to make you sell their licenses. Where Oracle has their salesforce, Microsoft has their developers to do the selling for them.

Unfortunately their licensing is from an era that has come and gone. Building software that uses things like Windows Server, SQL Server, SharePoint or Office means limiting your scale to what they call "Micro ISV". You provide that package to your client and 95% of your revenues go straight to Microsoft.

You can build on their stuff but you won't scale and you won't grow, not because their technology doesn't scale, but because their licensing doesn't scale.


You can build on their stuff but you won't scale and you won't grow, not because their technology doesn't scale, but because their licensing doesn't scale.

What about their licensing for Azure? Does that scale or is it more of the same?


I don't think it's entirely clear yet what direction Azure will take. The base offering using only .NET and Windows with simple storage is priced almost exactly the same as Google App Engine. It's difficult to compare to Amazon because the architecture is so different. Google and Azure (I believe) won't let you do any serious computation in memory whereas Amazon does support that very well.

However, look at SQL Azure. Microsoft thinks that their SQL offering is worth paying 66 times what you pay for Google's database (or Azure BLOBs & Tables), and that's for storage alone ($1 per GB, maxing out at 10GB right now). Add data transfer and you may get to 100 times.

Yes, SQL Server has vastly more features than Google's db, but does it make me 100 times more productive or profitable? I don't think so. If I need more SQL features I could run Postgres on Amazon or use Amazon's MySQL service for roughly 10% of what SQL Azure costs.

So if SQL Azure is the model of what's to come then I think it's indeed more of the same.


"My dream: The big scientific journals would require everyone publishing a paper to upload all relevant datasets to a central repository which could be queried."

I work on a site that's trying to do that for the learning sciences.

http://pslcdatashop.web.cmu.edu/

Most of the data so far is from various studies in the Pittsburgh Science of Learning Center, because the head of the PSLC can tell the researchers to put their data in there, but we'd like to convince others to share, too.

I also happen to be working on a web services API to this data at work right now.


"Google used to have a search API and it had horrible terms. Then they decommissioned it"

It always surprises me when people say this. Apparently it's not widely known that Google do make their search API available: http://code.google.com/apis/ajaxsearch/documentation/#fonje

(Before anyone says "that's just the AJAX API", please READ THE LINK, and scroll down to see the Java & PHP samples)


What quotas do they have, though? I remember people reporting being blocked from Google for "looking like a bot". So I thought the search API would only work if the requests came from lots of different IP addresses, as would be the case for use in AJAX applications.


Tax information would be HUGE. I would also like to see an API for general government records.

For example, if we had voting records, we could finally stop hearing all this "but you voted on -----" "no I didn't go back and check the record" "no look you voted on ----- which is basically the same" "shut up I served in the war" "yay war" stuff every election. We could just look it up and say "hey look you did". The Times "Congress API" is on its way to this, but last I checked, all it had was the attendance records.


My body, starting with my brain. Please provide clear documentation.


The UI layer of an app is more or less a (primitive) API into the human mind.

Though, I think you're right. The documentation is quite poor.


On a similar note, I remember reading a comment somewhere requesting a search interface for reality 'so I can find my damn keys when I lose them'.


If the index is broken then the brute force search is going to take too long.



Also a GNU screen implementation would be nice. ^A-D during certain interminable lectures, particularly.


Amen. I've always wanted a way to "diff" the different states that my body has been in.


I need cmd + z for my brain.


And fg


I would like:

  storeNewBrainMemory();
and

  saveBrainToDisk();


brain.pickle()


I think brain.serialize()

would be pretty cool...


thoughts are to writing as data structures are to serialization stream


I'm thinking that an electronic interface could greatly increase transfer speeds though.


Banks. Online payment processing is a mess.


Interestingly, since the failure of OFX, many many banks in North America prefer screen scraping because they can guarantee the web interface has accurate data since their actual customers use it. Conversely, these banks find supporting APIs to be highly fragile since no one is watching them.

I find that logic amazing, but considering how old the online banking software is and the high risk of changing it wholesale, I don't think they are in a position to fix it in the short term.


Free or low-cost tv schedule api. Currently one has to pay about $500 per market/region per month.


Not exactly an API (just data), but it's cheap (USD20/year):

http://www.schedulesdirect.org/


Too many restrictions on the use of the data:

http://www.schedulesdirect.org/sagreement


Check out http://www.cruxle.com. It recommends movies and TV shows available on TV you might love to watch. We are planning to open our TV guide recommendations via XML-based API. Please send us an email at info@cruxle.com, if you would like to access our APIs.


++


Generally: I wish practically everything had an open API. It's incredible what people can build when they have decent access to an API.

Specifically:

TuneCore, as it would make my startup idea a whole lot easier.

School registrations. I really wish there was some standardized API for universities. That'd make it possible to plug in the classes you want to take along with when you want to take them and get back a personalized schedule. As it is now (at least for my school), you pretty much have to write down on paper the classes you want to take along with when they're available and do it all yourself. I'd prefer something like this be standardized so one could make one website that would serve all universities and their students.
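
As a rough sketch of what such a site could do once it had the data: assume each class comes back as a list of sections with meeting days and start/end times. The `Section` type, its field names, and the sample catalog below are all made up for illustration, not any university's real format.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Section:
    course: str
    days: frozenset  # e.g. frozenset("MWF")
    start: int       # minutes since midnight
    end: int


def conflicts(a: Section, b: Section) -> bool:
    """Two sections clash if they share a day and their times overlap."""
    return bool(a.days & b.days) and a.start < b.end and b.start < a.end


def find_schedules(offerings: dict) -> list:
    """Return every pick of one section per course that has no clashes."""
    schedules = []
    for combo in product(*offerings.values()):
        clash = any(
            conflicts(a, b)
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if not clash:
            schedules.append(combo)
    return schedules


# Hypothetical catalog data.
offerings = {
    "CS101": [
        Section("CS101", frozenset("MWF"), 540, 590),
        Section("CS101", frozenset("TR"), 600, 675),
    ],
    "MATH201": [Section("MATH201", frozenset("MWF"), 540, 590)],
}
schedules = find_schedules(offerings)
```

Brute force like this is fine for a handful of classes; the hard part is getting the section data out of each school in the first place.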


+1 for school registrations. My school only provides READ access to the course catalog but not WRITE access to actually register for courses.

http://courses.illinois.edu/cis/2010/spring/schedule/index.h... http://courses.illinois.edu/cis/2010/spring/schedule/index.x...

I really hope they will provide one in the future.


Appliances: stove, oven, coffee maker, refrigerator. Various other household items: garage doors, locks, etc. This overly expensive, yet dumb, flat panel TV we just bought. Extending out to the driveway...my car. The energy meter on the side of the house (read-only access, obviously, just to keep the power co. happy). The most mundane stuff could use an API, if you ask me!


have you heard of http://www.pachube.com?


Gmail.

But more than a specific API, it would be cool if websites simply provided an XML (or JSON, etc.) version of every URL. E.g., http://news.ycombinator.com/item.xml?id=955077 would return the data in this page in XML format. This would be pretty simple to create (at least as a read-only API), would handle the situations where people resort to HTML scraping, and would effectively remove the need for API docs.
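
A minimal sketch of the client side of that convention, assuming a site actually followed it (as far as I know none is required to, and the `index` fallback for bare directory URLs is my own guess at what a site might do):

```python
from urllib.parse import urlsplit, urlunsplit


def data_url(page_url: str, fmt: str = "xml") -> str:
    """Add a format extension to the URL path, leaving the query intact."""
    parts = urlsplit(page_url)
    # Assumption: a bare "/" maps to "/index.<fmt>".
    path = parts.path.rstrip("/") or "/index"
    return urlunsplit(parts._replace(path=f"{path}.{fmt}"))


print(data_url("http://news.ycombinator.com/item?id=955077"))
# -> http://news.ycombinator.com/item.xml?id=955077
```

The point is that the client needs no documentation beyond "same URL, different extension".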


Gmail has APIs - POP3 and IMAP.


There's also RSS, but all of these provide access to mail functionality, while Gmail is a lot more than that. The uses I have in mind are closer to Google Labs' "Unsend", or having web service-like access to moving stuff around in Gmail, creating an alternate UI, etc.


Real-estate API (open homes, home sales by zip etc)


Funny, I came to say the same thing.

Definite business opportunity there. The two services I have found (in this case I'm looking for Texas data) both consider using an iframe "integration".


zillow has a pretty decent api


Zillow is a start but there is a ton of data that is tied behind MLS licensing and Realtor associations. That data should be free and public.


Yes. I got hired to integrate MLS data and man was that a nightmare. Many are surprisingly low-tech and almost all use different formats. There is a half-assed attempt at standards (RETS), but that's for accessing the data not the data itself. Not to mention getting and presenting the data legally--each MLS group has different rules about what you're allowed to display (and it can be incredibly restricting). tl;dr the project went down in flames.

I'd love to see Google get ahold of the data and make it available through Base.


I'm sure Google will get there eventually, as they're doing with all industries that traditionally make information inaccessible. Google just enhanced Google Scholar to be a major competitor to Westlaw, which should make a lot of frugal lawyers happy.


Sports statistics. NFL, MLB, NBA, etc.


I work for a sports data integration company, so you can get this data easily. Oh, you mean for free? Yeah, that's not gonna happen any time soon. Someone has to pay people to collect those statistics and then make them available.


A lot of people really enjoy keeping track of sports stats. Maybe some sort of a wiki would work well for this purpose?


You know how sportscasters always have some sort of weird statistic to pull out of their ass, like "Brett Favre has never lost a game, at home, with the temperature less than 34 degrees"? Is there some sort of API or query system so someone at the networks can throw this shit together out of the raw data in real time?


Realtime (i.e. in game stats) or just up to date player, team, league stats.

I think the leagues fight hard to disallow this, although they have recently lost a couple of high-profile lawsuits.


In 2007, MLB even tried to claim copyright on players' names and statistics.

http://arstechnica.com/tech-policy/news/2007/06/mlb-tries-to...


Again, the company I work for offers this kind of data but you have to pay for it. We have lots of clients who run fantasy baseball web sites and make use of our real-time stats feeds to update their games.


Do you ever have any issues with leagues about using their data? I figured something like this would be a good idea, to format stats etc, but thought I might get in trouble for using "their" stats inappropriately or whatever.


Link?


Sure. http://www.xmlteam.com

We offer stuff ranging from simple on-demand calls to our web service (you purchase credits, docs cost you X credits per access) to a Perl-based app that captures an XML feed and parses and inserts the data into a DB for you to use as you will at your end, usually MySQL but we support MS-SQL too.

We've got some pretty big clients: ESPN, USA Today are amongst them. We supplied Google with Olympic content during the Beijing Olympics.
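
For anyone curious what a capture-parse-insert flow like that looks like in general, here is a small sketch. It uses Python's `xml.etree.ElementTree` and `sqlite3` instead of Perl and MySQL, and the feed format, element names, teams, and table are all invented for illustration; they are not our actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Invented feed format, for illustration only.
FEED = """
<scores>
  <game id="g1" home="Team A" away="Team B" home_score="3" away_score="1"/>
  <game id="g2" home="Team C" away="Team D" home_score="0" away_score="2"/>
</scores>
"""


def load_feed(conn, xml_text):
    """Parse the feed and insert or update one row per <game> element."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS games ("
        "id TEXT PRIMARY KEY, home TEXT, away TEXT, "
        "home_score INTEGER, away_score INTEGER)"
    )
    rows = [
        (g.get("id"), g.get("home"), g.get("away"),
         int(g.get("home_score")), int(g.get("away_score")))
        for g in ET.fromstring(xml_text).iter("game")
    ]
    # Re-sending a game with the same id overwrites the earlier row.
    conn.executemany(
        "INSERT OR REPLACE INTO games VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_feed(conn, FEED)
```

Keying on the game id means a live feed can resend updated scores and the table always holds the latest values.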


Fascinating. Can I drop you an email? Didn't see it in your profile. I'm shafqat at newscred dot com.


An API into the postal system of every country.



Can you elaborate?

I've been building a little side thing called snailpad (http://www.snailpad.com) and have a beta API running for some of the paying customers at this point.


APIs into hospitals' medical data.


And schools: class registration


I'll start. Wish there was an API that:

1) Allowed me to find high quality, license free images easily.

2) Let me access LinkedIn data. I know LinkedIn has an API, but it's not open.


On flickr there is an option to search for commercially usable images. To do it from the API you need "prior arrangement" though.


Second that LinkedIn API request.


Try Photos8 for public domain photos


Disclaimer: I'm an amateur in every sense of the word but eager to learn. A Craigslist API could make for some really cool mashups but many of those could probably also be created using the RSS feeds they provide. Do you see any substantial advantage to a distinct Craigslist API?


There was a mashup that did something with Craigslist and photos, but Craigslist shut them down. I think it allowed you to view the listing with the photos in it (rather than needing to click on each listing to see the photos).


Hacker News API?



I would love to be able to get my upvotes (stories/comments separate) in an RSS feed somehow...

Not at all an API, but since we're on the subject.

I like reddit's format, where you can add .rss to almost any url and it returns the same result as an RSS feed


I can't imagine what you would need other than HN's RSS feed. What did you have in mind that could extend Hacker News?


Writing a Hacker News Android Client?

I think I will start without an API anyway. Most of the things I need to do should be possible with the RSS "API".


comments for a given thread, score of posts and comments, and user information.


Can't you do that already with a simple scrape?


scrape != API


"scrape != API"

No kidding. He's looking to get "comments for a given thread, score of posts and comments, and user information", which you can use an HTML parser (like Hpricot if you're using Rails) to retrieve any info you need.
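
To make the scraping route concrete, here is a rough Python sketch using only the standard library's `html.parser` (not Hpricot). The markup and the `comment` class name are made up for illustration and are not HN's actual HTML, which is exactly why this kind of code breaks when a page changes.

```python
from html.parser import HTMLParser


class CommentScraper(HTMLParser):
    """Collect the text of every <span class="comment"> element."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._depth = 0  # > 0 while inside a comment span

    def handle_starttag(self, tag, attrs):
        if self._depth:
            if tag == "span":
                self._depth += 1  # nested span inside a comment
        elif tag == "span" and ("class", "comment") in attrs:
            self._depth = 1
            self.comments.append("")

    def handle_endtag(self, tag):
        if self._depth and tag == "span":
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.comments[-1] += data


# Made-up markup for illustration.
page = (
    '<div><span class="comment">scrape != API</span>'
    '<span class="comment">No <i>kidding</i>.</span></div>'
)
scraper = CommentScraper()
scraper.feed(page)
scraper.close()
print(scraper.comments)  # -> ['scrape != API', 'No kidding.']
```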


Care to explain why I'm being downvoted?

EDIT: If you're going to downvote, please leave an explanation to teach me what I'm missing. I'd sincerely appreciate that. Thank you.


to teach me what I'm missing

You're being annoying. Nobody wants to screenscrape, it's awkward and fiddly and fragile and verbose and requires reverse engineering and is wasteful of bandwidth.

It really doesn't matter if you can get the information by screenscraping. The thread topic is "what would you like an API for?", not "scorn people about what they want an API for."

Care to explain why I'm being downvoted?

Because I want less of this sort of thing on HN. I want it to fall to the bottom of the page, and to discourage it in future. I cannot articulate precisely what it is that I want less of, it's a big vague fuzzy blob of things, some of which I even agree with, and your few posts here and above here are within it, so down they go.


Notification of replies to comments you make.


Subway. At the stations here they have displays that show how many minutes until the next train arrives (which is real-time data). It would be cool if they broadcast this information on the net, so you could see whether you need to hurry to the station or not.


Petition your local system to get on it. BART makes the data available which is pretty slick:

http://www.bart.gov/schedules/developers/index.aspx


I doubt there's an API, but you can see departure boards for most (London) tube stations online: http://www.tfl.gov.uk/tfl/livetravelnews/departureboards/tub...

They're actually more detailed than the boards in the stations as they show current train locations.


Same, but for ALL public transportation. The number of iPhone apps/SMS notifications that would pop up from it would be astounding.


a real-world grep. this might be solved in my lifetime, since books, notes, etc are all moving to digital. sometimes i just wish i could grep for x and it would find x in all my books, notebooks, etc.


I'm actually working on this problem. It's remarkable the amount of content we actually generate and the number of services/mediums we generate them on. Simply fetching all the data and hosting it centrally is a large enough problem, let alone indexing all of it.


IMDB's API


Have you tried TMDb? http://api.themoviedb.org/2.1/

I've played with it and it seems good enough.


TMDb is pretty decent (I use it to keep track of my personal movie collection [1]), but a lot of the data is missing/incomplete.

It's a whole lot better than nothing, but it's far from perfect.

[1] http://moviedb.samwarmuth.com


Freebase? It has data on most movies in IMDB. And an open API.


I really want an API that provides info on TV shows. Ever since I heard about the Tommyverse[1] I've wanted to build an application that automatically maps it and lets you play around with the connections between shows, and while it's technically possible to do it with the data IMDB does make available, you can't easily make it public and it's a real hassle.

[1] http://home.vicnet.net.au/~kwgow/crossovers.html


A solid global small business geo data API.

Fortunately there are a few different startups working on this, and there's a rudimentary way to hack it with the Google AJAX API, but there's still nobody who lets me simply punch in a name+city search and reliably get back the address and geocode metadata for any city in the world, without cache restrictions.

Google has it all but needs to open this data up better.


Centralized spam filter API with access to the Gmail spam filter, the Yahoo Mail spam filter, etc., to report spam and check emails for spam.


Why centralized? Why not just make it decentralized with peering?


An OCR API. The Google Docs OCR API is a good start, but it's tied pretty closely to GDocs. I'd like to post an image and get back text.


Is that even possible at the moment? Most software I've tried just isn't ready for something like that (granted, I haven't tried any of the commercial products). Tesseract is pretty good, but definitely not perfect.


The general answer to the character recognition problem is negative (currently not possible). However, several important subproblems are solvable. For example, if you know that a particular image came from a printed text, or from handwriting, or a (printed) form, etc, there are some very good solvers. Not perfect (neither is a human, in case of handwriting or a poor fax), but good to very good.

For further googling, see terms ICR, OCR, character recognition.

As a user, I have had very good results with Finereader. Have not tried Tesseract. Parascript was good with online character recognition, but that market is small, and I have not looked at them for a while (disclaimer: I used to work in a previous incarnation of the company).


"Not perfect" is good enough for me. There's a whole class of applications where being able to take a photo of some words and have at least a few of them understood would be useful. I'd like to be able to index against the words that can be understood, and present the original image as the search result.


You can do this with the Evernote API. It was designed for search, and does a good job with both print and handwriting.

http://www.evernote.com/about/developer/api/evernote-api.htm...
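
Independent of any particular service (this is not the Evernote API), the "index the words that were understood, return the original image" idea can be sketched as a plain inverted index. The OCR step is assumed to have already happened, and the file names and word lists below are hypothetical.

```python
from collections import defaultdict


def build_index(ocr_results: dict) -> dict:
    """Map each recognized word to the set of images it was found in."""
    index = defaultdict(set)
    for image, words in ocr_results.items():
        for word in words:
            index[word.lower()].add(image)
    return index


def search(index: dict, query: str) -> set:
    """Return images whose recognized words include every query term."""
    hits = [index.get(term.lower(), set()) for term in query.split()]
    return set.intersection(*hits) if hits else set()


# Hypothetical OCR output: only some words per photo were recognized.
ocr_results = {
    "sign.jpg": ["no", "parking"],
    "menu.jpg": ["coffee", "parking"],
}
index = build_index(ocr_results)
matches = search(index, "no parking")
```

Words the OCR missed simply never make it into the index, so imperfect recognition degrades recall rather than breaking the search.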


I think it would be a fantastic idea to be able to hold up a cameraphone, take a picture of a sign in a foreign country, send it to the cloud and have the text recognised and Google-translated.

"Not perfect" would be fine for many non-life-or-death signs.


College class schedules (as well as registration schedules). Currently, even the best are buggy, slow and cumbersome. I have to take certain classes at different schools in order to fulfill my degree requirements and keep working full time. So, I often need to search for a single class across multiple institutions for the time that fits.


I had the same problem, worked overtime (startup life), and took classes at two institutions as a way around the bureaucracy for pre-reqs. I ended up automating the class lookup process using Selenium on Firefox. But then they called me in because my account was making requests every second, and they were worried more students would try what I was doing and crash the system. (They DID tell me to keep checking for open spots; they never said not to use scripts.)

This was the only way I could get the classes I needed to graduate on time


Ouch, polling every second? How often did the data actually change?


Hulu, Clicker, Mixcloud, Comcast's "TV Everywhere", Spotify, Mog, iTunes store, Amazon VOD & mp3, etc.


A reverse lookup for finding all the shortened URLs associated with a given canonical URL.
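
The data structure behind it is simple; the hard part is getting the shorteners to feed it. A minimal sketch, with a hypothetical class name and placeholder URLs:

```python
from collections import defaultdict


class ReverseShortUrlIndex:
    """Maps a canonical URL to every short URL known to redirect to it."""

    def __init__(self):
        self._by_target = defaultdict(set)

    def record(self, short_url: str, canonical_url: str) -> None:
        """Remember that short_url resolves to canonical_url."""
        self._by_target[canonical_url].add(short_url)

    def lookup(self, canonical_url: str) -> set:
        """Return a copy of the known short URLs for canonical_url."""
        return set(self._by_target.get(canonical_url, ()))


# Placeholder URLs for illustration.
index = ReverseShortUrlIndex()
index.record("http://short.example/a1", "http://example.org/article")
index.record("http://tiny.example/zz", "http://example.org/article")
short_urls = index.lookup("http://example.org/article")
```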


Interesting... what would be the application of this? Any ideas you can share?


Lots of possibilities, but think backtweets.com as an example of something which would be much easier to accomplish if such an API existed.


We opened up an API for backtweets a while ago: http://backtweets.com/api

What would you build with it? Some people are using it in cool ways right now, e.g. hype machine uses it to power a twitter music chart.



Suggest/autocomplete. Well, it's not really enough to call it an API, since it's already just AJAX requests for JSON, but it'd be nice if the format was standardized and they all supported JSONP so that it can be cross-domain.
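
A sketch of what the server side of a standardized response might look like. The `[query, [completions]]` payload shape and the function name are assumptions for illustration, not any existing site's format; the callback check is there because echoing an arbitrary callback string back to the browser is unsafe.

```python
import json
import re

# Allow only dotted JavaScript identifiers as callback names.
_CALLBACK_RE = re.compile(r"^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$")


def suggest_response(query, suggestions, callback=None):
    """Return plain JSON, or JSONP when a callback name is supplied."""
    payload = json.dumps([query, suggestions])
    if callback is None:
        return payload
    if not _CALLBACK_RE.match(callback):
        raise ValueError("invalid callback name")
    return f"{callback}({payload});"


print(suggest_response("py", ["python", "pypi"], callback="showSuggestions"))
# -> showSuggestions(["py", ["python", "pypi"]]);
```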


Craigslist


I've always wanted to break into CL's space due to their utter disregard for the garden of web APIs growing up around them. They could be doing so much more with their service for the community, and I don't mean going the big time corporate route obsessed with short-term revenue.

There have been many CL clones (Kijiji), but none gained enough eyeballs to make an interoperating API work. I suspect these attempts failed because they didn't pay attention to the community and instead acted like closed corporations right out of the starting gate. I thought the CL killer would appear on Facebook, but that hasn't happened because of totally different dynamics in Facebook's ecosystem.

Actually, I'm surprised that I haven't seen someone on here posting a hacked-up API wrapper around Craigslist as a mini-startup.


Sorely needed. I can imagine pretty amazing mashups that revolve around your local community if Craigslist had an API. I'd imagine there would be a much bigger developer community than Twitter's because of the usefulness of their data and how important they are to local businesses.


An API that implements multiple tableview row editing in the iPhone SDK.


A semantic equivalent to an "#include" or "using" statement that looks for the library in a central repository instead of on the local disk, automatically managing caching and versioning.


maven?


Pandora - Music Genome Project. I've been working on a media player, AuraMP, for a number of years and I've really wanted to plug into sites like Pandora.



Correct, they do have an open api but it just doesn't always seem to have the ability or content that Pandora has. I would like best to use as many of the major providers as possible.


Music. I've yet to find a good API which lets me query information about artists, songs, etc.

And the good local events api someone mentioned would also be great.



> Music

http://musicbrainz.org/ is extremely good at this.


A concurrent REST API, written in Scheme, that supports a parallel/multicore environment and in turn compiles to C.


Fantasy Football APIs, particularly Yahoo's.


Wow, I forgot about this - couldn't agree more. Have been waiting for this for a long time.


MyFantasyLeague.com has a read/write API: http://football.myfantasyleague.com/2009/export


fantasy anything would be nice


Any and all data from the Federal Government. There cannot possibly be enough structured data coming from there.


Image processing API? http://urlimg.com


I would love to have a food nutrition info global read write api


An open LinkedIn API.


Wish granted. I think LinkedIn released their API to the public this week.


They did. Rock on!


mlb game data, within 24 hours of each game


Isn't that already done? You can even get raw Pitch F/X data.

http://gd2.mlb.com/components/game/mlb/year_2009/

Here's an example day's scoreboard in JSON:

http://gd2.mlb.com/components/game/mlb/year_2009/month_07/da...


I wrote a parser for NBA games that gets the data straight from Yahoo. Using Python and BeautifulSoup, it was a piece of cake. I would think MLB could work the same.


Yeah, HTML scraping ESPN.com is not very hard. I did it for college basketball once in a strange sort of database that attempted to predict March Madness brackets. It worked really well for teams that had encountered each other during the season... except only 2 or 3 pairs of teams had ever played each other during the regular season. Works much better on NBA and NFL games, where there are far fewer teams, they see each other more often, and the rosters are a little more static.


> Yeah, HTML scraping ESPN.com is not very hard.

I've seen a few people respond with this "you can just html scrape that." Sure you can HTML-scrape for the information, but the topic of this "Ask HN" is about what APIs you would like to see. Maybe he already HTML-scrapes ESPN.com, but would prefer that there was an official API for it, no?


Yeah, but I prefer real solutions, regardless of how ugly they are, rather than wishful thinking for so-called "elegant" solutions.


No offense, but isn't this the wrong discussion for you then?


Care to share? Sounds awesome.


Alright, will send you an email


iTunes Connect (Contains sales, payments, reviews, ratings for publishers with products in iTunes)


Google Tasks


This guy reverse engineered the Google Tasks AJAX calls and created a Perl module interface for it:

http://github.com/nickspacek/Net-Google-Tasks/

Works decently. Could easily be replicated in any language.


Stubhub & movie tickets


Health insurance


Traktor Pro


doWhatIMean("parse a bunch of names and sort")

This is from the same framework that has winTheGame() and doMyHomework()


and justMagicallyWork(), the call that papers over leaky abstractions, off-by-one errors, misunderstandings, logical errors in your thinking and just does whatever it is you were trying to do.

  >>> data = urlopen('example.org/lovelydata/2009/')
  YetAnotherException: [...]

It needs some kind of web form cookie login non-standard authentication. Aaargh, I can't spend any more time on this sub-project; it was only supposed to take two minutes!

  >>> import bigGuns
  Warning, Universe enters a fragile state. Tread carefully.
  >>> data = justMagicallyWork(urlopen('example.org/lovelydata/2009/'))
  Success. Cost: 12 Karma.
  >>> del bigGuns
  normality restored

Phew!


Wolfram Alpha's API?


Local movie times. It's stupid this isn't available from someone. The only thing it's going to do is increase the number of people going to see a movie.

Maybe there's one now but 3 years ago there wasn't one and I had to build a scraper for Yahoo movies to build the product I wanted.


There's one for the iPhone http://code.google.com/p/metasyntactic/wiki/NowPlaying

(From a Microsoft C# compiler/parser guy, no less).


Meta's an ex-Microsoftie -- he's currently at Google.


An API for a service that prints and snail mails documents cheaply and professionally in the Netherlands. That would let me build a cool online billing app.


I'd like an API that had a function similar to this:

     make_this_many_dollars_magically_appear_in_my_bank_account(1000000000);



