-----Original Message-----
From: wikitech-l-bounces@lists.wikimedia.org [mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of Ævar Arnfjörð Bjarmason
Sent: 01 March 2010 13:34
To: Wikimedia developers
Subject: Re: [Wikitech-l] hiphop progress
On Mon, Mar 1, 2010 at 10:10, Domas Mituzas midom.lists@gmail.com wrote:
Howdy,
Most of the code in MediaWiki works just fine with it (since most of it is mundane), but things like dynamically including certain files, declaring classes, eval() and so on are all out.
There are two types of includes in MediaWiki, ones I fixed for AutoLoader and ones I didn't - HPHP has all classes loaded, so AutoLoader is redundant. Generally, every include that just defines classes/functions is fine with HPHP; it is just some of MediaWiki's startup logic (Setup/WebStart) that depends on files being included in a certain order, so we have to make sure HipHop understands those includes.
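To illustrate the distinction (a hypothetical sketch, not actual MediaWiki code): an include with a literal path that only defines a class can be resolved by the compiler at build time, while a computed include path cannot.

<?php
// Fine for HipHop: literal path, and the included file only defines a
// class, so the compiler knows at build time what it pulls in.
require_once 'includes/Title.php';

// Problematic: the file name is computed at runtime, so the compiler
// cannot tell which classes or functions this brings into scope.
// ($wgDefaultSkin is just a stand-in for any runtime value here.)
$skin = $wgDefaultSkin;
require_once "skins/{$skin}.php";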
There was some different behavior with file including - in Zend you can say require("File.php"), and it will try the current script's directory, but if you do require("../File.php") - it will
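For what it's worth, the usual way to make such relative includes unambiguous in either engine is to anchor them to the including file's directory (an illustrative sketch; the exact behavioural difference being described is cut off above):

<?php
// Resolution of a parent-relative path can vary: under Zend it depends
// on include_path and the working directory rather than on this file.
require '../File.php';

// Anchoring the path to this file's directory is unambiguous everywhere;
// MediaWiki's entry points commonly use this dirname(__FILE__) pattern.
require dirname( __FILE__ ) . '/../File.php';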
We don't have any eval() at the moment, and actually there's a mode in which eval() works; people are just too scared of it.
We had some double class definitions (depending on whether certain components are available), as well as double function definitions (ProfilerStub vs Profiler).
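The problematic shape is roughly this (a simplified sketch of the ProfilerStub/Profiler situation, not the actual MediaWiki source): which definition of a function exists depends on runtime logic, so a whole-program compiler sees two candidate definitions for one name.

<?php
// Simplified StartProfiler-style logic ($profilingEnabled is a made-up
// flag): only one branch ever runs under Zend, but a whole-program
// compiler sees wfProfileIn()/wfProfileOut() defined both in
// Profiler.php (real) and in ProfilerStub.php (no-ops).
if ( $profilingEnabled ) {
    require_once 'includes/Profiler.php';
} else {
    require_once 'includes/ProfilerStub.php';
}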
One of the major problems is simply the still-incomplete function set that we'd need:
- session - though we could sure work around it by setting up our own Session abstraction, the team at facebook is already busy implementing full support
- xdiff, mhash - the only two calls to them are from DiffHistoryBlob - so getting that feature to work is mandatory for production, not needed for testing :)
- tidy - have to call the binary now (a sketch of that fallback is below)
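The tidy fallback mentioned above is roughly this shape (an illustrative sketch with a made-up cleanupHtml() helper, not MediaWiki's actual tidy handling): use the PHP extension when it is there, otherwise pipe the text through the tidy binary.

<?php
function cleanupHtml( $text ) {
    if ( class_exists( 'tidy' ) ) {
        // Normal Zend setup: the tidy extension is compiled in.
        $tidy = new tidy();
        $tidy->parseString( $text, array( 'show-body-only' => true ), 'utf8' );
        $tidy->cleanRepair();
        return tidy_get_output( $tidy );
    }
    // HipHop (for now): no extension, so shell out to the binary instead.
    $descriptors = array( 0 => array( 'pipe', 'r' ), 1 => array( 'pipe', 'w' ) );
    $proc = proc_open( 'tidy -quiet -utf8 --show-body-only yes', $descriptors, $pipes );
    fwrite( $pipes[0], $text );
    fclose( $pipes[0] );
    $out = stream_get_contents( $pipes[1] );
    fclose( $pipes[1] );
    proc_close( $proc );
    return $out;
}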
function_exists() is somewhat crippled, as far as I understand, so I had to work around certain issues there. There are some other crippled functions which we hit through the testing...
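A typical guard of the kind affected looks like this (hypothetical example; the specific call sites aren't named above) - whether the right branch is taken depends on function_exists() answering truthfully about extension functions:

<?php
// If function_exists() misreports what the runtime actually provides,
// the wrong branch gets taken (or compiled in).
if ( function_exists( 'xdiff_string_bdiff' ) ) {
    $diff = xdiff_string_bdiff( $oldText, $newText );
} else {
    $diff = false; // fall back to storing the full text (illustrative)
}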
It is quite fun to hit all the various edge cases in the PHP language (e.g. interfaces may have constants) which are broken in hiphop.
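That particular edge case is just ordinary PHP, e.g. (names made up):

<?php
// Legal PHP: an interface may declare constants, and implementing
// classes inherit them.
interface HistoryBlobLike {
    const MAX_ITEMS = 100;
}

class SimpleBlob implements HistoryBlobLike {
    public function capacity() {
        return self::MAX_ITEMS;
    }
}

echo SimpleBlob::MAX_ITEMS; // 100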
The good thing is having developers carefully reading/looking at those. Some things are still broken; some can be worked around in MediaWiki.
Some of the crashes I hit are quite difficult to reproduce - it is easier to bypass that code for now and come up with good reproduction cases later.
Even if it wasn't, hotspots like the parser could still be compiled with hiphop and turned into a PECL extension.
hiphop provides a major boost for actual mediawiki initialization too - while Zend has to reinitialize objects and data all the time, having all of that in the core process image is quite efficient.
One other nice thing about hiphop is that the compiler output is relatively readable compared to most compilers.
That especially helps with debugging :)
Meaning that if you need to optimize some particular function, it's easy to take the generated .cpp output and replace the generated code with something more native to C++ that doesn't lose speed because it needs to manipulate everything as a PHP object.
Well, that is not entirely true - if it manipulated everything as a PHP object (zval), it would be as slow and inefficient as PHP. The major cost benefit here is that it does strict type inference, and falls back to Variant only when it cannot come up with a decent type.
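As a rough illustration (made-up code, not from MediaWiki): the first function below can be inferred as string in, string out, while the second keeps reassigning its variable to different types, so the generated code has to fall back to Variant.

<?php
// Type-stable: inference can keep $name and the return value as native
// strings in the generated C++.
function makeGreeting( $name ) {
    return 'Hello, ' . $name . '!';
}

// Type-unstable: $value is an int, then an array, then a string, so it
// has to be represented as a Variant.
function describeAnswer( $asArray ) {
    $value = 42;
    if ( $asArray ) {
        $value = array( 'answer' => $value );
    } else {
        $value = (string)$value;
    }
    return $value;
}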
And yes, one can find the offending code that causes the expensive paths. I don't see manual C++ code optimizations as the way to go though - because they'd be overwritten by the next code build.
The case I had in mind is when you have, say, a function in the parser that takes a $string and munges it. If that turns out to be a bottleneck you could just get a char* out of that $string and munge it at the C level instead of calling the PHP wrappers for things like explode() and other PHP string/array munging.
That's some future project once it's working and those bottlenecks are found, though; I was just pleasantly surprised that hphp makes this relatively easy.
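The kind of function being described is something like this (a made-up example; the parser's real hot spots aren't named here) - pure string/array munging that the generated .cpp expresses as calls into the runtime's explode()/implode() equivalents:

<?php
// Splits on '|', trims each piece, drops empty pieces, and rebuilds the
// string - exactly the sort of code one could hand-rewrite against raw
// character buffers in the generated C++ if it proved to be a hot spot.
function normalizePipeList( $string ) {
    $parts = array_map( 'trim', explode( '|', $string ) );
    return implode( '|', array_filter( $parts, 'strlen' ) );
}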
I would think that getting hiphop to compile the regular expressions in preg_*() calls out to C++ (as re2c does) would be the idea.
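For example (illustrative only), a pattern like the one below is a compile-time constant, so in principle the compiler could emit a specialized matcher for it instead of making a runtime call into PCRE:

<?php
// Matches [[Target]] or [[Target|Caption]]; the pattern never changes,
// so it is a candidate for ahead-of-time compilation to native code.
if ( preg_match( '/^\[\[([^|\]]+)(?:\|([^\]]+))?\]\]$/', $text, $m ) ) {
    $target  = $m[1];
    $caption = isset( $m[2] ) ? $m[2] : $target;
}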
Jared