I think vaksel's comment "Mahalo is monetized through Adsense..." at least partly explains it. Other than that, is anyone lodging policy-violation complaints when they find dodgy pages?
A friend told me today that a classmate stole his iPhone. He went to the police, gave them the phone's UID, and got it back three days later.
Apparently, as he told me, the police contacted Apple, which gave them the exact position of the phone through its built-in GPS, so the police could just pick it up.
We have explicit partnerships with a large percentage of our content providers. We also have partnerships with some for full text. We continue to reach out to more sources every day - we're not in the business of harvesting feeds without permission.
The whole idea is that we allow people to write, edit, curate, and aggregate content. A simple analogy would be The Huffington Post. I realize people have strong opinions about HuffPo as well, but that's another debate.
Baidu already uses Hadoop's HDFS and MapReduce. They also support Hypertable. I would guess that they could probably put "something" together over time.