Facebook's HipHop not fast enough

Facebook developers announced and released HipHop, a compiler that takes PHP code and outputs C++ code to later be compiled by g++. I'll note up front that I haven't seen the source code (still not on GitHub as I'm writing this) and am basing my conclusions only on their blog post, but I don't think there will be a "tremendous impact" from this project. It's basic cost/benefit analysis.

The benefits aren't nearly great enough. Only a 50% CPU reduction? Let's walk through the math of what that might mean for Facebook in terms of cost savings. Their post says they do 400B PHP impressions per month, which works out to about 150k/sec. That's a phenomenal number, and a challenge for any site to sustain. Facebook describes how PHP is used almost exclusively for the front end, while the back-end services are powered by Erlang, Java, Python, etc. So if we assume they've got the front and back fairly well separated and can scale the front end based entirely on CPU load, we can make up some numbers to guess their total cost.
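
The 400B-per-month figure converts to a per-second rate with simple arithmetic; a quick sketch (assuming a 30-day month, since the post doesn't say):

```python
# Convert Facebook's quoted monthly PHP impressions into a
# sustained requests-per-second rate (assumes a 30-day month).
impressions_per_month = 400e9
seconds_per_month = 30 * 24 * 3600      # 2,592,000 seconds

requests_per_sec = impressions_per_month / seconds_per_month
print(f"{requests_per_sec:,.0f} req/sec")   # ~154k/sec, i.e. roughly 150k/sec
```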

In my experience, even a poorly implemented PHP site can run around 20 requests per second on a single core. Assuming Facebook uses beefy 8-core machines, that would be 160/sec per box, and roughly 1,000 machines to serve their 150k/sec load. Given that their load probably peaks with US traffic, and that they probably need double capacity to handle a co-location failing, I could see that going up to 4,000 machines.

Facebook has over 30,000 machines, most of which I'm sure are used to handle photo data and their tremendous memcached layer, so 4,000 web-serving machines sounds about right. At Facebook's scale they can probably get a beefy 8-core machine relatively cheaply, but let's use $5k as a number; that means that saving 2,000 machines saves Facebook $10MM. That's real money, but 2,000 machines measured against their total fleet of 30k isn't that impressive. It's 7%.
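
The whole savings estimate is just arithmetic on assumed numbers; a minimal Python sketch makes the chain of assumptions explicit (every figure below is an assumption from the text, not a measurement):

```python
# Back-of-the-envelope server-count and cost-savings math.
req_per_core = 20                  # a poorly implemented PHP site, per core
cores_per_box = 8
peak_load = 150_000                # req/sec, from the traffic estimate

req_per_box = req_per_core * cores_per_box   # 160 req/sec per machine
base_machines = peak_load / req_per_box      # ~938; round up to ~1,000

provisioned = 1_000 * 4            # US-peak headroom plus colo-failure capacity
machines_saved = provisioned // 2  # a 50% CPU cut halves the front-end fleet
cost_per_box = 5_000

savings = machines_saved * cost_per_box
print(f"{machines_saved} machines saved -> ${savings / 1e6:.0f}MM")
print(f"share of 30k total fleet: {machines_saved / 30_000:.0%}")
```

Every step here is linear, which is why the conclusion is so sensitive to the 50% figure: a 10x speedup would shrink the fleet by thousands of machines instead of halving it.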

Further, if we expect this to catch on and have a "tremendous impact" on the world at large, it would have to be a big benefit for all the small PHP sites out there. I'd wager that almost all small sites are bottlenecked not on PHP CPU performance but on database performance. Even if it were PHP performance, for most small sites the savings would likely mean going from 30 machines to 15, which is almost unnoticeable. For this system to be really effective, I'd want to see a speedup of 10-100x, not merely 2x.

So I've laid out that the savings aren't that great. But they aren't nothing; a 2x speedup that saves Facebook $10MM is still notable. But what does it cost? I suspect the costs are pretty high and measured almost exclusively in terms of developer productivity. Let's set aside the man-year of development their lead developer spent on this technology. Facebook has to integrate this system into the development process, and there are two obvious ways.

1. Developers add a "compile" step to their development cycle and test using HipHop on their development boxes. This, to me, would represent a worst-case outcome (and from their blog post is almost certainly not what they do). One of the many development-speed benefits of scripted languages like PHP and Ruby comes from the fact that developers don't have a compile cycle. While that compile cycle seems trivial (how hard is it to interrupt your code/test cycle with a 2-minute compile?), in my experience its absence accounts for a large share of the benefit. Sure, dynamic typing is another major benefit, but I believe that if every Facebook developer now has to add a compile cycle, their overall productivity will plummet.

2. Alternatively (and more likely), they keep developers on an interpreted language on their dev boxes and move the compile step into the release process. They indicate this in their blog post by talking about developers using HPHPi, their experimental interpreter. The problem here is that you've now doubled your testing. Developers write their code and test it locally using HPHPi, or even the standard PHP install. After that, they have to compile it with HipHop and test it again to make sure the compiler didn't break anything. If the compiler were a mature technology, you could probably skip this step, but with an experimental compiler you simply can't trust that it will work on your code. Further, when you find an issue in production that you didn't anticipate, you're now not sure whether it's the compiler or the code that is broken. Given the difficulty of debugging live-site bugs that often come about due to circumstances that are hard to duplicate in the test environment, this seems like a disaster.

So we're left with a system that doesn't provide enough benefit for the costs that will be associated with it.  A neat toy, but if I were Facebook, I would value the developer productivity way over the $10MM I might be able to save in server costs.  What do you think?



Anonymous said...

It's 50% faster right now. They basically took their fate into their own hands (as Google did before).

And I'd expect they'll try to make it all the way to 90%, which is realistically possible. That still leaves it at 1:10 against native C. To get any further you need an extremely complicated JIT, with tracing and specialization.

This opens up all those doors that PHP dev community wasn't able to open up themselves.

It's a smart move.

Jack said...

1. You're calculating the cost of 'now', and we know very well that when it comes to software it's best served 'mature'. Takes time, but I can see the benefit in the long term.

2. It might not be beneficial for a small business to compile their website but it might be worth it for 'small' hosting companies or startups who are looking for open source 'tools' to resist spending 5k for a new machine. Every little bit helps!

3. Your analysis is quite hollow. It doesn't do any justice to the technical side.

Eric Boyd said...

Anonymous: if they can get to a 10x or greater speedup, then I get more interested. It's definitely a good area for them to look into, but I'm much more interested in areas like JITs that offer performance improvements that are completely below the surface of how the developers work.

Jack: I'm not sure what you mean by not doing justice on the technical side. I haven't seen the code or played with it for what it can do, so this is just my initial reaction. I'm honestly surprised that they only got a 50% reduction in CPU. I would have expected much closer to 90-95% reduction.

Josh Turmel said...

Keep in mind the 50% reduction comes from comparing HipHop to PHP with APC; for a first release I'd say this is pretty significant.

Anonymous said...

I think the biggest mistake you make about the costs is assuming that their technology (HipHop) is not mature. If it weren't mature, they wouldn't deploy it in production on 90% of their machines; they would use it on just a small subset of servers.

That said, in my opinion you're overestimating the impact it makes on testing. They wouldn't use it if the testing or developer cost were too big, and they can almost skip extra testing for the compiled version.
