Facebook Shakes Hardware World With Own Storage Gear (wired.com)
82 points by kevin_morrill on Feb 24, 2012 | 24 comments



  “People who are really serious about software should make their own hardware.”
Alan Kay, http://www.folklore.org/StoryView.py?project=Macintosh&s...


This is a placeholder for the interesting story that will come in May at the Open Compute Summit. Until then, no real content.


I think this is a little unfair. Lots of people skim the comments before going to the article, and they'd assume you meant there's no content there (other than "Facebook will talk about it in May"), but the article does talk to a source or two at Facebook and does mention Rackspace's open "virtual I/O" work.

I thought it was a cool article.


Does anyone know any technical details behind this paragraph? Specifically, are they talking about a new kind of interconnect technology with low power over ~1m distance?

(Searching for "rackspace virtual I/O" was not so useful.)

"Rackspace is leading an effort to build a “virtual I/O” protocol, which would allow companies to physically separate various parts of today’s servers. You could have your CPUs in one enclosure, for instance, your memory in another, and your network cards in a third. This would let you, say, upgrade your CPUs without touching other parts of the traditional system. “DRAM doesn’t [change] as fast as CPUs,” Frankovsky says. “Wouldn’t it be cool if you could actually disaggregate the CPUs from the DRAM complex?”"


I don't think this would be good at all for most real workloads - you'd be taking the performance hit of having high-latency memory at all times. Even most hardcore NUMA vendors try to keep DRAM CPU-local, and writing high-performance software for NUMA generally involves ensuring that your data stays close to your CPU. Otherwise missing a branch or getting preempted by another task which flushes your cache lines becomes really, really expensive.
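
As a concrete illustration of "ensuring that your data stays close to your CPU", here is a minimal sketch using Linux's libnuma (my assumption that this is the kind of tuning meant; the 64 MiB buffer size is just a placeholder). It allocates memory on the node local to the core the thread is running on, which is exactly the locality a disaggregated DRAM pool would take away:

    /* Minimal NUMA-local allocation sketch (Linux + libnuma assumed).
     * Build with: gcc local_alloc.c -lnuma -o local_alloc */
    #define _GNU_SOURCE
    #include <numa.h>
    #include <sched.h>
    #include <stdio.h>

    int main(void) {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA is not available on this system\n");
            return 1;
        }

        int cpu  = sched_getcpu();         /* core this thread is running on */
        int node = numa_node_of_cpu(cpu);  /* that core's local memory node */

        size_t len = 64UL * 1024 * 1024;   /* placeholder: 64 MiB */
        void *buf = numa_alloc_onnode(len, node);  /* node-local DRAM */
        if (buf == NULL) {
            fprintf(stderr, "numa_alloc_onnode failed\n");
            return 1;
        }

        printf("CPU %d: allocated %zu bytes on local node %d\n", cpu, len, node);

        numa_free(buf, len);
        return 0;
    }

The whole point of this style of tuning is to keep memory zero or one hop away from the core using it; pulling DRAM out into a separate enclosure makes every access the remote case.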

I do think this could be useful for a memcached workload, though, in tandem with some smaller amount of fast CPU local memory - you could basically share "memory bricks" between CPUs, and swap CPUs out independently without evicting an entire system's worth of memcache.


Tomorrow: "In other news, it appears that Facebook has lost every picture hosted on its service."

More seriously though, there was an interesting article on Google's datacenters and how they customize their hard drives to accommodate their needs. There was another white paper on how, because of the massive scale of the data, cosmic rays quite often affect stored data. That survey also said that every three minutes a hard drive fails in one of their datacenters somewhere in the world.
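
For scale (my own back-of-envelope, not a figure from the survey): one drive failure every three minutes is roughly 175,000 failures per year, which at an assumed annualized failure rate of 2-5% would imply a fleet of several million drives:

    /* What "one drive failure every 3 minutes" implies about fleet size.
     * The 2-5% annualized failure rate (AFR) range is an assumption. */
    #include <stdio.h>

    int main(void) {
        const double minutes_per_year  = 365.0 * 24.0 * 60.0;
        const double failures_per_year = minutes_per_year / 3.0;  /* ~175,200 */

        const double afr_low  = 0.02;
        const double afr_high = 0.05;

        printf("failures per year: %.0f\n", failures_per_year);
        printf("implied fleet size: %.1f to %.1f million drives\n",
               failures_per_year / afr_high / 1e6,
               failures_per_year / afr_low  / 1e6);
        return 0;
    }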

Pretty neat stuff, but I can't find the source, sorry.



About that cosmic rays thing... wasn't that about DRAM?

[PDF] http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf


Ah damn, yes, you are right! Never mind then ;), still interesting though!


Thanks, I almost forgot about that paper! :D


"Now, Facebook has provided a new option for these big name Wall Street outfits. But Krey also says that even among traditional companies who can probably benefit from this new breed of hardware, the project isn’t always met with open arms. “These guys have done things the same way for a long time,” he tells Wired."

Maybe one reason is because they've been around long enough to know what happens with bleeding edge technology.

And as the (old) saying goes, "nobody ever got fired for buying IBM".

But the truth is that the reliability requirements, and the "shit hits the fan" fallout (and financial loss) if a Wall Street system goes down, are much greater for a traditional business system than if the same thing happens to a free service like Facebook, or to somebody's "Show HN: what I built this weekend" app.

So of course they are going to move slower. And they should. They have more to lose.


If Facebook transitions to its new hardware and the hardware begins to crash and burn and users start getting affected, so will Facebook... there are competitors who would love to have Facebook's lunch. So reliability is pretty up-there for Facebook too.


If Facebook is down for 2 hours, you're not suddenly going to sign up on MySpace. By contrast, other companies can lose millions of dollars in that same timespan!


Major stock exchanges have gone down for hours, and I've had trouble reaching my bank's web site for hours. Let this myth of the enterprise having any idea of what it's doing die.

And Facebook could lose millions of dollars in that time span, depending on what time of the day the downtime occurs. They make money from advertising, down-time means no clicking and no eyeballs.


Facebook regularly pisses off large swathes of its users and they stay anyway. Look at basically any feature they've launched that users subsequently tried to revolt over, or at the numerous privacy scandals.


I was at a conference last year where the SVP of Network Services for NYSE/Euronext (Andrew Bach) was talking about their growing data and bandwidth needs, and they are pushing the envelope. They need something very reliable obviously, but they're not sitting on their hands. Instead they're constantly pushing their vendors for less latency, more bandwidth and faster storage.


and inertia. It's thinking like this that causes smaller companies to outpace, outgrow, and eventually eclipse larger ones.


At the same time, Facebook seems to do a pretty good job of building reliable systems.


What are some examples of the hardware Facebook's leaving out, and that 'traditional' suppliers were insisting on leaving in? (cf. Peter Krey's quote)


Bezels; complicated, difficult-to-configure LOM (lights-out management) systems usually tied to proprietary vendor management software; fiddly components that are easy to optimize and build once but that introduce extra overhead over long-term maintenance (e.g. small screws on drive carriers, chassis that require tools, etc.)


I hope this means that they'll open source Haystack, but I assume it will only be the hardware designs.


Sorry for the joke, but do they include a "Like" button there?


It's fascinating to me how software companies like Google, Facebook, Apple, and others have had to push the hardware industry forward, because it often seems so loath to eat its own children.

I suspect there's a huge correlation in there to the net cost of changing hardware compared to changing software code (not to mention the related margins in the businesses).


This can be said for the energy and medical industries, too. We are smart enough to solve all the world's problems, but we won't do it until we are backed into a corner and need to.

Not including war and religion.



