Very cool that this uses the PyPy toolchain. I hope this becomes a trend, because it means we have one make-dynamic-languages-fast effort that will pay dividends across multiple languages.
Well, it doesn't surprise me as RPython is bloody amazing. I actually started doing something like HippyVM about six months ago for work, to explore if it was possible, and sure enough someone has gone way further than I did haha
I would like to mention my (just released) PHP preprocessor here, since it has a feature that suggests a number of micro-optimizations. People seem to care so much about performance, so why not spend a few minutes to improve source code in the first place?
As someone with a bit of experience scaling PHP, it seems to me that the main problem with the language is not its raw performance (which seems on a par with its cousins Python and Ruby) but rather its single-request model. For every request, mod_php creates a new interpreter, which has to do all the work to set up the framework: auto-loading the required classes and reading additional data every time. What's more, each request has to create a new database connection; the single-request model doesn't allow the use of a database pool. (This doesn't change fundamentally with handlers other than mod_php, such as php-fpm.)
With languages like Python and Ruby, not to mention Java, a single interpreter process or thread can handle many requests and can re-use the database connection. In PHP, you're forced to do all the set-up all over again for every request. This is particularly wasteful with frameworks like Laravel, which do a non-trivial amount of work setting up the application object and the IoC container.
Am I missing something here? I can't believe companies using PHP on a larger scale, like Facebook, discard and re-create the application environment and all DB connections (!) for every single request.
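To make the contrast above concrete, here is a minimal sketch in Python (not PHP) that only counts how often the expensive setup runs under each model. Everything in it is hypothetical: `FakeConnection`, `bootstrap_framework` and `LongLivedWorker` are stand-ins invented for illustration, not the API of mod_php, php-fpm or any framework.

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection."""
    opened = 0  # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.opened += 1

    def query(self, sql):
        return f"result of {sql}"


def bootstrap_framework():
    """Hypothetical stand-in for class loading and application setup."""
    return {"routes": {"/ping": "SELECT 1"}}


def handle_shared_nothing(path):
    # Shared-nothing model: rebuild the app and reconnect on every request.
    app = bootstrap_framework()
    conn = FakeConnection()
    return conn.query(app["routes"][path])


class LongLivedWorker:
    # Persistent model: bootstrap and connect once, then serve many requests.
    def __init__(self):
        self.app = bootstrap_framework()
        self.conn = FakeConnection()

    def handle(self, path):
        return self.conn.query(self.app["routes"][path])


def run_shared_nothing(n):
    """Serve n requests and report how many connections were opened."""
    FakeConnection.opened = 0
    for _ in range(n):
        handle_shared_nothing("/ping")
    return FakeConnection.opened


def run_long_lived(n):
    """Serve n requests from one worker and report connections opened."""
    FakeConnection.opened = 0
    worker = LongLivedWorker()
    for _ in range(n):
        worker.handle("/ping")
    return FakeConnection.opened
```

Under these assumptions the first runner opens one connection per request and the second opens one in total. Real deployments sit somewhere in between, since persistent connections and opcode caches reduce part of the per-request cost.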
But it's terrible for performance. The performance of trivial ajax requests is completely dominated by framework setup time, in my experience with Zend based code.
I think what the OP was getting at was that frameworks can be good - saving you a lot of DIY and building on (theoretically) good practices. But all that potential code reuse and possibly elegance is problematic in PHP because that has to be rebuilt on every single request. Arguably many Java frameworks are even more convoluted than, say, Zend Framework, but the Java platform means there's less of a performance price to pay for the bloat of a framework vs any PHP framework with similar 'bloat' (hierarchy of classes, etc).
"Trim that fat"
The "fat" is something which is useful to a lot of people for development; it's just a performance hindrance (and far more so than on other platforms).
That said, I've had some Java folks express some amazement at how fast PHP is, even when it's doing all the library loading and class instantiation on every page request. In multiple cases, they all expressed that they thought it would be far far slower, having grown used to "object creation is slow" in Java.
That's why you should host using FastCGI, so the PHP processes are long lived. You do pay the startup cost in every request of reconstructing the state (code and session data), but you get the benefit of request isolation. I don't know how other databases handle it, but Oracle supports reusing connections across PHP requests.
Facebook does recreate the state on every request, but they've got that down to a 10 ms bootstrap. In my own web services the bootstrapping before the business logic runs is below 20 ms. Most PHP frameworks load too much code up front though, which is why you'll see much worse request bootstrap performance in many cases.
The big benefit of request isolation is predictability. You know exactly what environment your code runs in because you construct it from scratch for every request.
It's not a problem, it's a trade-off. An architecture that doesn't keep state across requests trades off per-request performance for stability and scalability, because there is no long-term state on the web server to manage or migrate. CGI traded off too much performance because of the need to launch a process per request, whereas FastCGI keeps the process alive and throws away only the state, so I find the upside beats the downside in a lot of cases (sadly not all; I wish PHP were less opinionated here, like Java, where you can choose to have a stateless architecture but don't have to).
I have to admit our architecture is unconventional. It's custom code except for a few ZF1 parts like zend_db, zend_json_service and zend_validate. Even our session and auth code is all custom. The server we use is Zend Server (with Zend OPcache). We only have web services on the server; the UI is JavaScript.
The web service wrapper on the server has been cut down to the essence and does not use autoloading for the basic bootstrap. I've found that the PSR approach of a gazillion files each containing a small class and liberal use of autoloading is inherently slow; it is death by a thousand paper cuts due to the high degree of file access, memory consumption and running of all that constructor code. The code I write typically groups a web service in a handful of files, each containing a namespace with the bulk of the implementation in procedural logic (though I do use closures quite a bit). I typically use classes / objects only as types that encapsulate data, with the logic on the object limited to what is necessary for working with that type (e.g. I don't put a save method on an object but instead write a saveObject function which accepts the object as a parameter and passes back an array of errors).
I suppose a lot of people nowadays find that sort of coding style blasphemous, but it is easy to write, test and maintain, and fast to execute. I'm not against the heavy OO approach common in most web dev nowadays; there are times when I do use it and need it. But CRUD web services are very linear: validate the input, create a transaction, run a bunch of db operations, commit or roll back, and return status / errors. If something is linear, I prefer to see it implemented in a cohesive linear procedural style, instead of getting chopped up into lots of objects that in practice end up obfuscating what's happening.
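For readers who want to picture the style described above, here is a rough sketch in Python rather than PHP, using the standard-library sqlite3 module as a stand-in database. The `Customer` type, the `customers` table and the function names are all hypothetical; this is one possible reading of "data-only objects plus a save function that returns errors", not the poster's actual code.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    # Plain data type: no persistence logic lives on the object.
    name: str
    email: str


def validate_customer(customer):
    """Return a list of error strings; an empty list means valid."""
    errors = []
    if not customer.name.strip():
        errors.append("name is required")
    if "@" not in customer.email:
        errors.append("email is invalid")
    return errors


def save_customer(conn, customer):
    """Validate, write inside a transaction, and return a list of errors."""
    errors = validate_customer(customer)
    if errors:
        return errors
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if the block raises.
        with conn:
            conn.execute(
                "INSERT INTO customers (name, email) VALUES (?, ?)",
                (customer.name, customer.email),
            )
    except sqlite3.IntegrityError as exc:
        return [f"could not save: {exc}"]
    return []


def make_db():
    """Create an in-memory database with the hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (name TEXT NOT NULL, email TEXT NOT NULL UNIQUE)"
    )
    return conn
```

Calling `save_customer(make_db(), Customer("Ada", "ada@example.com"))` returns an empty list on success. The whole request path reads top to bottom: validate, open a transaction, write, commit or roll back, return errors.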
Thanks for sharing this. It's interesting that you can improve your performance so much by, essentially, getting rid of the automatic class-loader. On the other hand, the autoloader is one of the most important improvements to PHP in the last couple of years. It looks like PHP's recent improvements -- easier testing using dependency injection, ORMs, class loaders -- all have a pretty big cost in terms of performance.
Thanks for sharing your approach. I would love to see maybe an open source version of your framework to use as a starting point. I am starting work on customizing CodeIgniter to be a slim framework for serving a REST API, and your approach is going to be super helpful to learn from :)
The way of initializing and validating data objects is quite similar to the approach we have in our production code. The missing part is the web service layer that maps the type definitions to a service description (I use reflection of phpdoc comments on a web service class to generate json and soap bindings, similar to the reflection approach in that github project).
I've been meaning to pick it back up and implement both the web service and db layers to make it an all-purpose web dev framework, but I've had other side projects intervene.
They are: their design (not in the PLT sense, but in the "shared-nothing super simple startup" in PHP's case, and "only language in the browser" in JavaScript's case) is why they get this sort of attention.
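As a rough illustration of generating a service description by reflection, here is a sketch in Python. It swaps phpdoc comments for Python type hints and docstrings, and the `InvoiceService` class, its methods and the output shape are all hypothetical, so treat it as an analogy to the approach mentioned above rather than a port of it.

```python
import inspect
import json


class InvoiceService:
    """Hypothetical web service used only for this illustration."""

    def get_invoice(self, invoice_id: int) -> dict:
        """Fetch one invoice by its id."""
        return {"id": invoice_id}

    def list_invoices(self, customer: str, limit: int = 10) -> list:
        """List recent invoices for a customer."""
        return []


def type_name(annotation):
    """Turn a parameter annotation into a short type label."""
    if annotation is inspect.Parameter.empty:
        return "any"
    return getattr(annotation, "__name__", str(annotation))


def describe_service(cls):
    """Build a JSON-serializable description from signatures and docstrings."""
    methods = {}
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers and dunder methods
        params = [
            {
                "name": p.name,
                "type": type_name(p.annotation),
                "required": p.default is inspect.Parameter.empty,
            }
            for p in inspect.signature(func).parameters.values()
            if p.name != "self"
        ]
        methods[name] = {"doc": inspect.getdoc(func), "params": params}
    return {"service": cls.__name__, "methods": methods}


if __name__ == "__main__":
    print(json.dumps(describe_service(InvoiceService), indent=2))
```

A JSON or SOAP binding generator would then walk this dictionary instead of the class itself, which keeps the service code free of transport details.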
Apart from being bitter and using better languages personally, what can one do?
I suppose it's better off for the world if shitty languages are at least fast. Because I probably use something every day that depends on their performance :\.
You can make it easier to use better languages. For example, I've really enjoyed using yesod (Haskell) recently, but it doesn't have great documentation, especially for deploying. So that's something I'd like to work on.
I wonder what the claimed and "aimed for" compatibility with Zend is actually like. How much of the standard library is implemented? How about common extensions? Is the casting behaviour the same?
Performance is meaningless if your VM can't run people's code.
Looks like they've copied some of PHP's tests, yes. But being able to run run-tests.php and being compatible aren't quite the same thing, really. It's how many tests pass that counts.
Technical side-note: /Zend there appears to be just the tests for the Zend engine (i.e. from PHP's /Zend/tests), while the containing directory contains test folders copied from other parts of PHP (like the standard library)
We are working hard to be as close to Zend as possible, though we are not there yet. You can try to "translate" HippyVM and run it against the php-src suite.
"Basic" PHP compatibility was achieved which means that most of the syntax and stdlib from recent PHP is there. What we are doing right now is implementing modules (phar, fcgi, ...)
The original project at Facebook was called High-Performance PHP and given the acronym HPHP. Some clever soul then decided that HPHP should be pronounced HipHop. I guess HippyVM is just following in that trend.
Oh, I should have expanded on my question. I know of HHVM already; I was curious about the use of "Hip" in HHVM and in HippyVM, since it appeared that there must be some common heritage.
Thanks, that makes sense. I already assumed that HHVM just happened because it's a cool name. But then seeing the second use of the word made me think there might be something behind it
I'm not really familiar with the PyPy tooling and capabilities, but if this takes off, will there be a possibility of interop between RPython language implementations? It would be mind blowing to be able to write libs in different languages and have them working together.
We could always use the best one available regardless of the language.
They mention interop with Python code, but I was just wondering if this could go a bit further...
As you mention, we have an ongoing branch for an implementation of a HippyVM-Python bridge. We are hoping to achieve the goal that you described, though we are not sure if this will be the last or the first step on this path.
I got HHVM running wordpress on a relatively high traffic site. The difference is night and day and setup was simple. To have something faster is crazy :)
For those that can't find the repository, it can be found on GitHub (https://github.com/hippyvm/hippyvm). I can never seem to find the link to GitHub from the hippyvm page. It doesn't seem to be very active as far as emerging technology goes, but in my opinion it has good potential since it can build off existing language development tools that are continuously improving.
I can assure you that the project is under development: 490 commits, not impressive, but we are working here ;)
Our current problem, which we are trying to fix, is that we have two repos: public and private. We are hoping to solve this soon and move development to GitHub. Sorry for the confusion.
>It started off as a research project [...] for Facebook
Huh? Isn't that what HipHop is? Can someone elaborate on the matter? Is this a fork? Why aren't they just migrating their work back to HipHop? I mean, all this leads to is fragmentation.
Facebook had multiple research projects to develop better implementations of the PHP programming language. HippyVM is not a fork of HHVM, and indeed the two codebases are completely unrelated. Therefore "just migrating their work back to HipHop" is not possible; there is nowhere to go back to, since it didn't come from there.
From memory (I think one of the devs commented on HHVM or Hippy a while back), Hippy started after HHVM was already underway. Hippy was mostly a proof of concept, to see if PyPy would be a viable strategy for a PHP runtime. HHVM was more the logical follow-up to HipHop... I think being forked from the HipHop interpreter they were using for development.
Or even more precisely, the VM of its main (and de facto default) implementation needs love. Rubinius and JRuby are really VASTLY superior implementations but well, there's no specification so 50% of development time is spent in asinine catch up with an underspecified "standard" (MRI that is).
Really sad state of affairs and the #1 reason why Ruby isn't faster and more feature-packed.
haha jokes... well I see why fb wanted to create their own framework since the whole site is in php and it would take em a lot of time to translate it to something else (i think most nowadays is JS anyways) but srsly, why people are trying to make faster frameworks for php ... no idea. It doesn't get any faster due to its design. At least join one of the open source projects and help that.
Combine all those working on those open source frameworks for years and all together create something fast...
I'm a developer and PHP is in my toolbox when suitable. But the thing I can't understand is... why waste all this effort to make a slow, broken language fast, instead of using a faster language?
I mean, sure, it's impressive that this VM is 7 times faster than stock PHP, but Java is over 100 times faster than stock PHP.
What is the gain here? Not having to go out of your little shell and learn a new language?
Preexisting codebases, compatibility with great libraries and mature full-featured frameworks, progressive enhancement of infrastructures already in place, recycling of developers. You name it.
This is not about pleasing the bedroom programmer that can just "go out of his little shell and learn a new language". This is great news for any decent sized project or enterprise with infrastructure already running in PHP.
> compatibility with great libraries and mature full-featured frameworks
So as I said I've been doing PHP. And I have yet to see one of those unicorns in PHP-land. In fact, my first gut reaction was that you're being sarcastic up there.
If you already know PHP you should really check out frameworks like Symfony and Laravel (and Symfony Components, amazing reusable pieces to grow your own stuff). Also great tools like Composer for dependency management, PHPUnit, Behat, PHPSpec or Codeception for TDD/BDD. Same with ORMs like Doctrine or the Twig templating engine.
PHP has evolved greatly in the last years. There are tools for most of your needs and you can build stable and robust applications with them, and be excited about the future now that we are seeing efforts like HHVM or HippyVM, or even the internal changes being discussed for PHP 6.
You don't need a new "better" language to make new better products. PHP allows you to write perfectly nice, object oriented, clean code, no problem at all, that integrates easily with proven technologies. It also allows you to write a mess impossible to maintain. If somebody does the latter when they can do the former with no extra effort, well, then maybe it's fair to question if that person is a good programmer at all.
I'm familiar with Symfony, etc. Please don't sell it to me as "perfectly nice" code.
The reason most of the PHP code is crap is not because PHP makes it impossible, but because a popular crap language with a low initial learning curve tends to attract a lot of crap programmers.
I'd actually say PHP has only gotten better over time, and every new release is better than the previous one. But you have a crap base (which PHP maintains religiously as not to break BC), and a crap community, and the software they write is crap. It's just what it is. The odd exception here and there.
You can overcome a horrible language, but that's not the problem (plus I'd argue PHP isn't the worst out there).
No, it's the crappy community around it. It's hard to overcome the horribly low-quality libraries people write in PHP, because writing everything from scratch isn't practical in any language these days.
Composer is great and all (except for the lack of package signing, hey, who needs security, right?), but all Composer means is that you can install other people's PHP crap code more conveniently.
And while I wouldn't call the people behind Symfony "crap programmers", that framework is horribly overdesigned, slow, and comprised of an endless Inception of leaky abstractions.
It shows that a programmer can take all their experience and knowledge of design patterns, best practices and what not, and write a complete piece of turd.
Anyway I better run, because the modding of my posts here is showing I've angered the PHP fans.
Okay, I get where you are coming from. And I have to say that I agree with you on several of your points. But what I wanted to stress in my sort of "defence" of PHP is that, the fact is, lots of valid programmers use it, either for familiarity reasons, due to existing infrastructures or because they actually enjoy it (I do sometimes!). Complete rewrites are a mythical creature in the real world, so why not embrace, improve and push forward anything that tries to make PHP better in the future, instead of dismissing the whole technology as useless?
There are large PHP production applications out there that can't easily be rewritten overnight (e.g. Facebook's frontend); a better PHP parser is a great improvement for those cases.
As a PHP perf guy for several years, I have some weird questions to ask.
Is this an HTTP serving language VM or just a PHP as a language VM?
Because I spent years fixing the first part without ever really doing anything about the second part of that question.
If it can't listen on port 80, my interest drops a lot in the new VM - I'm not going to write scipy/numba.py problems in PHP, even if I could.