How Yahoo's Latest Acquisition Stole & Broke My Heart

coderintherye · on Oct 16, 2010

>"It turns out the client at the end of the long pipeline of invoices sold a diet pill, and young women were complaining on MySpace and forums that the pill sometimes caused leakage from their..."

So you thought that because you were a consultant, it didn't matter that you were aggregating people's personal posts from MySpace for the purposes of Big Pharma? Surely, we must draw the ethical line somewhere.

marshallk · on Oct 16, 2010

Well, it was aggregating complaints of side-effects, which is a good thing for the pharma company to keep track of if they're going to sell such a drug. But yeah, that's why I felt uncomfortable about working with them, didn't continue and now joke about it publicly. Plus it's odd.

bluesnowmonkey · on Oct 16, 2010

> It was beautiful, but people didn't want it, they didn't understand it. Because people are stupid.

I watched his video explaining Dapper and could only loosely follow what he was doing. It was clear that he was making an RSS feed of changes to a web page, but the process of selecting the dynamic elements of the target page was unclear to me. And several times he had to say things like, "Oh, it's confused now. We'll just fix that..."

So maybe you came to the right conclusion that the rest of the world is stupid. Or maybe Dapper was a little difficult to use, and the value proposition was a little vague, and it never really took off. Thousands of cool projects have met the same fate. That's just how it goes.

Keep your chin up. At least you got a cool sweatshirt.

andrewljohnson · on Oct 16, 2010

A company called "Fetch" is very similar to Dapper. Their tech is used to aggregate things like Dow Jones news stories.

They monetized by selling licenses to use their scraper. I wonder why Dapper couldn't do the same?

http://www.fetch.com/

danielnicollet · on Oct 16, 2010

Does anyone know of something equivalent to Dapper out there? I really wish there were, since I need this for a project. Thanks!

hartror · on Oct 16, 2010

Pythoneers have BeautifulSoup; it is fast, simple, and can deal with real-world HTML. I have used it for site scraping with great success.
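For a rough idea of what this kind of scraping looks like, here is a minimal sketch that pulls links out of sloppy markup. To keep it dependency-free it uses the standard library's html.parser rather than BeautifulSoup itself, so treat it as a stand-in for the same idea. The `LinkCollector` class, the `extract_links` helper, and the sample page are all made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs from anchor tags, tolerating sloppy markup."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside an anchor that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Hypothetical helper: return all (href, text) pairs found in html."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page with unclosed <li> and <p> tags, as real pages often have.
SAMPLE = """
<html><body>
<ul>
  <li><a href="/track/1">First Song</a>
  <li><a href="/track/2">Second <b>Song</b></a>
</ul>
<p>No link here
</body></html>
"""

links = extract_links(SAMPLE)
for href, text in links:
    print(href, text)
```

The parser never complains about the unclosed tags, which is the property that matters for real-world pages. A dedicated library would get you the same result in a few lines.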

Luyt · on Oct 16, 2010

lxml.html does a pretty good job too, and offers ElementTree and XPath querying. http://codespeak.net/lxml/lxmlhtml.html

Recently I used Beautiful Soup in a very simple program to scrape playlists from Soma.fm: http://www.michielovertoom.com/hobby/somafm-playlists/
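As a rough illustration of the ElementTree/XPath style of querying mentioned above, here is a sketch using the standard library's xml.etree.ElementTree, which supports only a limited XPath subset and requires well-formed input. lxml.html elements also expose a fuller `xpath()` method and cope with broken HTML, so this is an approximation of the idea and not lxml's API. The sample playlist table is made up.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed snippet; the stdlib parser rejects broken HTML.
SAMPLE = """
<html><body>
  <table id="playlist">
    <tr class="track"><td>10:01</td><td>First Song</td></tr>
    <tr class="ad"><td>10:05</td><td>Station break</td></tr>
    <tr class="track"><td>10:06</td><td>Second Song</td></tr>
  </table>
</body></html>
"""

root = ET.fromstring(SAMPLE.strip())

# ElementTree's XPath subset covers descendant search (.//) and
# attribute predicates ([@class='track']).
rows = root.findall(".//tr[@class='track']")

# Each row becomes a (time, title) tuple built from its <td> cells.
tracks = [tuple(td.text for td in row.findall("td")) for row in rows]

for time, title in tracks:
    print(time, title)
```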

yurylifshits · on Oct 16, 2010

http://webnumbr.com is like Dapper for numbers.

mnutt · on Oct 16, 2010

If you're pretty technically inclined and know your way around Firebug/WebKit Inspector, YQL (Yahoo Query Language) is very convenient. It lets you use CSS selectors to grab data and returns it in JSON or XML.

It's great for quick and dirty hacks, but the big question is how long Yahoo will allow it to stick around.

gtani · on Oct 16, 2010

There are (sort of) related (for-pay) web apps as well:

http://www.metafy.com/index.html

http://www.diffbot.com/howitworks

http://www.aignes.com/

http://sharedcopy.com/public/andthensome

http://webcache.googleusercontent.com/search?q=cache:L5dhj2w...

danielnicollet · on Oct 16, 2010

Thanks for all those great replies. I will now spend some time reviewing!!

andrewljohnson · on Oct 16, 2010

Fetch, but it's super expensive: http://www.fetch.com/

SudarshanP · on Oct 16, 2010

http://needlebase.com/ belongs to ITA Software, which is being acquired by Google. You can find some public datasets at https://pub.needlebase.com/

aditya · on Oct 16, 2010

Enjoy it while you can, until an uncaring market starves it to death and it turns into an ad network, for lack of viable alternatives.

Depressing as that is, perhaps the future holds promise. As the cost of building startups keeps trending down, maybe timing will stop mattering as much, and business model innovation (such as freemium) will save good technology from turning into ad networks?

Isn't this part of the skill required to build a successful business anyway? Your technology is only as good as the people selling it. Where would Google be if they hadn't hit on AdSense?

lanstein · on Oct 16, 2010

Fun fact: Jon is the person who told me about HN in the first place.