Am I the only one wondering why Facebook hasn't implemented a compression backend in memcache, much like Reiser4 and ZFS have done?
They've made it very clear that they're RAM-limited (capacity in particular), so why not just have the processor compress and decompress memcache values on the way in and out with a fast, relatively lightweight compression algorithm?
You could even tune the algorithm to detect duplicate or similar data and coalesce it into atomic blobs that represent multiple informational objects.
It seems like their big cost is putting together machines with tons of RAM for their memcache clusters, so why not bring that cost down?
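To make the trade-off concrete, here's a minimal client-side sketch in PHP, assuming the pecl/memcached extension; the compressed_set/compressed_get wrappers are hypothetical names, and memcached itself offers nothing like this on the server side:

    <?php
    // Minimal sketch of the trade-off: burn a little CPU to compress values
    // before they sit in memcached RAM. Wrapper names are made up here;
    // memcached offers no server-side compression option.
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    function compressed_set(Memcached $mc, $key, $value, $ttl = 0) {
        // Level-1 zlib: fast, and still shrinks HTML-like data substantially.
        return $mc->set($key, gzcompress(serialize($value), 1), $ttl);
    }

    function compressed_get(Memcached $mc, $key) {
        $raw = $mc->get($key);
        return $raw === false ? false : unserialize(gzuncompress($raw));
    }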
I've wondered the same thing--compression would be enormously helpful to us since we're RAM-bound (even with tons of RAM) and store a lot of easily compressible HTML. Further, our memcached instances show almost no CPU load.
memcache clients already support client-side compression, which compresses the data before it goes over the network. It wouldn't make sense to move that to the server.
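For example, with the pecl/memcached extension it's a single client option (the option constant is real; the HTML payload below is just made-up filler):

    <?php
    // Client-side compression with the pecl/memcached extension: values above
    // the compression threshold are compressed by the client before they go
    // over the wire, and transparently decompressed on get().
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);
    $mc->setOption(Memcached::OPT_COMPRESSION, true);

    // Roughly the kind of payload being discussed: large, repetitive HTML.
    $largeHtml = str_repeat('<div class="story"><p>Hello, world.</p></div>', 1000);

    $mc->set('page:home', $largeHtml);   // stored compressed
    $html = $mc->get('page:home');       // decompressed on the way back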
If you are making an argument to recode your entire site from PHP to some other language, the answer is you just lost that argument.
This only works if execution time was a major part of the argument, and the site meets the conditions for benefiting from HipHop discussed in the article.
I was afraid the usefulness of HipHop would be this limited, given that it's no easy feat to create a PHP-to-C++ compiler that handles C library dependencies (of which PHP has a lot!) well.
BTW, it was the second time in a week that a product created incredible buzz in the HN community without anyone being able to try it out (the other was the iPad, of course), and I was amazed at the amount of well-informed opinion based on so little information.
PS: This blog is a good read if you are interested in Facebook architecture, scaling, and design issues in general.
We're in the process of opening the list and approving members to the group. Code will follow soon after any recent changes have been merged to the branch and we're sure that anything Facebook-specific is removed.
From the presentation, it sounded like it generates a C++ object for each user-created PHP object (1 to 1).
We will all know when they release the source.