I think we should migrate MediaWiki to target HipHop [1] as its primary high-performance platform. I think we should continue to support Zend, for the benefit of small installations. But we should additionally support HipHop, use it on Wikimedia, and optimise our algorithms for it.
In cases where an algorithm optimised for HipHop would be excessively slow when running under Zend, we can split the implementations by subclassing.
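For illustration, such a split might look like the following sketch; the class names and the capability probe are hypothetical, not actual MediaWiki code:

```php
<?php
// Baseline implementation, tuned for Zend: one PCRE call does the work.
class TextScanner {
	public function scan( $text ) {
		preg_match_all( '/\[\[([^\]]+)\]\]/', $text, $m );
		return $m[1];
	}
}

// HipHop-oriented subclass: a tight character loop, which compiles to
// fast C++ but would be comparatively slow in the Zend interpreter.
class TextScannerHipHop extends TextScanner {
	public function scan( $text ) {
		$links = array();
		$offset = 0;
		while ( ( $start = strpos( $text, '[[', $offset ) ) !== false ) {
			$end = strpos( $text, ']]', $start + 2 );
			if ( $end === false ) {
				break;
			}
			$links[] = substr( $text, $start + 2, $end - $start - 2 );
			$offset = $end + 2;
		}
		return $links;
	}
}

// A factory picks the implementation at runtime; the probed function
// name is illustrative -- any HipHop-only function would do.
function getTextScanner() {
	return function_exists( 'hphp_get_thread_id' )
		? new TextScannerHipHop()
		: new TextScanner();
}
```

Callers would only ever see the factory, so the Zend path stays the default for small installations.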
I was skeptical about HipHop at first, since the road is littered with the bodies of dead PHP compilers. But it looks like Facebook is pretty well committed to this one, and they have the resources to maintain it. I waited and watched for a while, but I think the time has come to make a decision on this.
Facebook now write their PHP code to target HipHop exclusively, so by trying to write code that works on both platforms, we'll be in new territory, to some degree. Maybe that's scary, but I think it can work.
Who's with me?
-- Tim Starling
2011/3/28 Tim Starling tstarling@wikimedia.org:
Who's with me?
I don't really have a good idea of what would need to change to support HipHop, but if the changes aren't too intrusive I'm all for it.
If we decide to do this, we should also decide when to start and when we want to have HPHP support working (1.18? 1.19?).
Roan Kattouw (Catrope)
On 28/03/11 17:36, Roan Kattouw wrote:
2011/3/28 Tim Starling tstarling@wikimedia.org:
Who's with me?
I don't really have a good idea of what would need to change to support HipHop, but if the changes aren't too intrusive I'm all for it.
If we decide to do this, we should also decide when to start and when we want to have HPHP support working (1.18? 1.19?).
It depends on how many people are interested in it, and I'm not sure how much work there is to do. But as long as we're careful to maintain compatibility with Zend, we can work in trunk. Once it's ready, we can add it to the installation docs.
It should be ready for 1.19 at the latest. If it's not done by then, we should shelve the project.
-- Tim Starling
On 11-03-28 12:44 AM, Tim Starling wrote:
On 28/03/11 17:36, Roan Kattouw wrote:
2011/3/28 Tim Starling tstarling@wikimedia.org:
Who's with me?
I don't really have a good idea of what would need to change to support HipHop, but if the changes aren't too intrusive I'm all for it.
If we decide to do this, we should also decide when to start and when we want to have HPHP support working (1.18? 1.19?).
It depends on how many people are interested in it, and I'm not sure how much work there is to do. But as long as we're careful to maintain compatibility with Zend, we can work in trunk. Once it's ready, we can add it to the installation docs.
It should be ready for 1.19 at the latest. If it's not done by then, we should shelve the project.
-- Tim Starling
Sounds interesting...
Then again, I'm also interested in making Drizzle work, and in switching our skin system to a custom XML/HTML template system.
Maybe I'll try running HPHP myself in production in my upcoming project when it's ready in core.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Not sure this is the right place, but:
Are the Vector skin versions of the external link icons [1] on Commons anywhere? I couldn't find them when I looked, and I'm not entirely sure where to find them in order to upload [and I don't want to dupe upload things].
Sorry if this is the wrong place,
Douglas Gardner
[1] http://commons.wikimedia.org/wiki/Category:External_link_icons
2011/3/28 Douglas Gardner douglas.gardner@wikinewsie.org:
Not sure this is the right place, but:
Are the Vector skin versions of the external link icons [1] on Commons anywhere? I couldn't find them when I looked, and I'm not entirely sure where to find them in order to upload [and I don't want to dupe upload things].
I don't think so. There's no technical reason for them to be on Commons. If there's some community/policy/whatever reason to have them there, ask on a community/policy/whatever list, not a technical list.
The icons can be found at http://bits.wikimedia.org/skins-1.5/vector/images/external-link-ltr-icon.png and http://bits.wikimedia.org/skins-1.5/vector/images/external-link-rtl-icon.png
Roan Kattouw (Catrope)
Roan Kattouw wrote:
2011/3/28 Douglas Gardner douglas.gardner@wikinewsie.org:
Not sure this is the right place, but:
Are the Vector skin versions of the external link icons [1] on Commons anywhere? I couldn't find them when I looked, and I'm not entirely sure where to find them in order to upload [and I don't want to dupe upload things].
I don't think so. There's no technical reason for them to be on Commons. If there's some community/policy/whatever reason to have them there, ask on a community/policy/whatever list, not a technical list.
The icons can be found at http://bits.wikimedia.org/skins-1.5/vector/images/external-link-ltr-icon.png and http://bits.wikimedia.org/skins-1.5/vector/images/external-link-rtl-icon.png
Roan Kattouw (Catrope)
And they are also available out of the box on every MediaWiki install.
Two things:

(i) I'd really hope that subclassing would be very rare here. I don't think this will be much of an issue, though.

(ii) Also, it would be nice if developers could all have HipHop running on their test wikis, so that code that's broken on HipHop isn't committed in ignorance. The only problem is that, last time I checked, the dependency list for HipHop was considerable... and it isn't available for Windows yet. However, I believe Domas didn't need *too* many patches to get MediaWiki working, which suggests that writing code that compiles under HipHop won't be that difficult or error-prone. If there can be a small yet complete list of "things that only work in regular PHP", that might be an acceptable alternative to every dev running and testing HipHop.
Otherwise,
Tim Starling-2 wrote:
I think we should migrate MediaWiki to target HipHop [1] as its primary high-performance platform. I think we should continue to support Zend, for the benefit of small installations. But we should additionally support HipHop, use it on Wikimedia, and optimise our algorithms for it.
In cases where an algorithm optimised for HipHop would be excessively slow when running under Zend, we can split the implementations by subclassing.
I was skeptical about HipHop at first, since the road is littered with the bodies of dead PHP compilers. But it looks like Facebook is pretty well committed to this one, and they have the resources to maintain it. I waited and watched for a while, but I think the time has come to make a decision on this.
Facebook now write their PHP code to target HipHop exclusively, so by trying to write code that works on both platforms, we'll be in new territory, to some degree. Maybe that's scary, but I think it can work.
Who's with me?
-- Tim Starling
[1] https://github.com/facebook/hiphop-php/wiki/
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Mon, Mar 28, 2011 at 2:50 AM, Aaron Schulz aschulz4587@gmail.com wrote:
(ii) Also, it would be nice if developers could all have hiphop running on their test wikis, so that code that's broken on hiphop isn't committed in ignorance. The only problem is that, last time I checked, the dependency list for hiphop is very considerable.
I also don't know if they've actually merged the 32bit work into their mainline yet--I know a volunteer was working on it. If they're lacking 32bit support in the main release still, that might be a reason to hold off for now.
I've compiled HPHP before; the dependencies aren't really that bad (anymore), you just have to compile custom builds of libevent and libcurl.
I know nothing of trying to get it to work on Windows, probably would be a royal PITA without cygwin.
-Chad
On Sun, Mar 27, 2011 at 11:21 PM, Tim Starling tstarling@wikimedia.org wrote:
Facebook now write their PHP code to target HipHop exclusively, so by trying to write code that works on both platforms, we'll be in new territory, to some degree. Maybe that's scary, but I think it can work.
What happens when the feature lists start diverging, because Zend adds what it thinks would be useful and Facebook ignores that and adds what it thinks would be useful? Then we can't use any new features from either. Or are we sure Facebook is committed to maintaining long-term compatibility with Zend PHP?
On Mon, Mar 28, 2011 at 9:42 AM, Chad innocentkiller@gmail.com wrote:
I also don't know if they've actually merged the 32bit work into their mainline yet--I know a volunteer was working on it. If they're lacking 32bit support in the main release still, that might be a reason to hold off for now.
Why? People on 32-bit machines can just run Zend PHP.
On Mar 28, 2011, at 5:28 PM, Aryeh Gregor wrote:
... and Facebook ignores that and adds what it thinks would be useful?
Facebook already has features Zend does not:
https://github.com/facebook/hiphop-php/blob/master/doc/extension.new_functio...
Stuff like:
* Parallel RPC - MySQL, HTTP, ...
* Background execution, post-send execution, pagelet server, etc.
Domas
On 29/03/11 01:28, Aryeh Gregor wrote:
What happens when the feature lists start diverging, because Zend adds what it thinks would be useful and Facebook ignores that and adds what it thinks would be useful? Then we can't use any new features from either.
We can use features from both, using function_exists(), like what we do now with PHP modules.
If you compile PHP with no zlib, you can't compress anything, but the rest of MediaWiki still works. In the future we may use HipHop's parallel execution features. If you don't have HipHop, the work will be done in serial. I think such quandaries will be very rare.
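A minimal sketch of that pattern; hphp_parallel_fetch is a made-up stand-in for whatever parallel primitive HipHop actually exposes:

```php
<?php
// Fetch several resources, in parallel where the platform supports it,
// serially otherwise. The HipHop-only function name is hypothetical.
function fetchAll( array $paths ) {
	if ( function_exists( 'hphp_parallel_fetch' ) ) {
		return hphp_parallel_fetch( $paths );
	}
	// Fallback: a plain serial loop that works on any PHP.
	$results = array();
	foreach ( $paths as $path ) {
		$results[$path] = file_get_contents( $path );
	}
	return $results;
}
```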
-- Tim Starling
On Mon, Mar 28, 2011 at 10:47 AM, Tim Starling tstarling@wikimedia.org wrote:
We can use features from both, using function_exists(), like what we do now with PHP modules.
Well, yes, if there's some reasonable fallback. It doesn't work for features that are useless if you have to write a fallback, like various kinds of syntactic sugar. For example, the headline features in the PHP 5.3 release notes include namespaces, late static binding, lambda functions and closures, NOWDOC, a ternary operator shortcut, limited goto, and __callStatic. If Facebook didn't implement some of those features in HipHop by the time we could feasibly require PHP 5.3, we wouldn't be able to use them. (Some look really nice, like anonymous functions -- one of the things I really like about JavaScript.)
Granted, this sort of thing is rarely very essential, and maybe HipHop will keep up with all of PHP's new syntactic sugar. Overall, I'm all in favor of trying out HipHop on Wikimedia -- I was just wondering what would happen if HipHop doesn't incorporate all of PHP's new features over time.
On 29/03/11 09:40, Aryeh Gregor wrote:
On Mon, Mar 28, 2011 at 10:47 AM, Tim Starling tstarling@wikimedia.org wrote:
We can use features from both, using function_exists(), like what we do now with PHP modules.
Well, yes, if there's some reasonable fallback. It doesn't work for features that are useless if you have to write a fallback, like various kinds of syntactic sugar. For example, the headline features in the PHP 5.3 release notes include namespaces, late static binding, lambda functions and closures, NOWDOC, a ternary operator shortcut, limited goto, and __callStatic. If Facebook didn't implement some of those features in HipHop by the time we could feasibly require PHP 5.3, we wouldn't be able to use them. (Some look really nice, like anonymous functions -- one of the things I really like about JavaScript.)
Yes, that's true, and that's part of the reason I'm flagging this change on the mailing list. Domas says that the HipHop team is working on PHP 5.3 support, so maybe the issue won't come up. But yes, in principle, I am saying that we should support HipHop even when it means not using new features from PHP.
PHP 5.3 might be cool, but so is cutting our power usage by half (pun intended).
-- Tim Starling
On Mon, Mar 28, 2011 at 9:33 PM, Tim Starling tstarling@wikimedia.org wrote:
Yes, that's true, and that's part of the reason I'm flagging this change on the mailing list. Domas says that the HipHop team is working on PHP 5.3 support, so maybe the issue won't come up. But yes, in principle, I am saying that we should support HipHop even when it means not using new features from PHP.
PHP 5.3 might be cool, but so is cutting our power usage by half (pun intended).
Okay, then I'm all in favor.
On Tue, Mar 29, 2011 at 3:28 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
On Mon, Mar 28, 2011 at 9:33 PM, Tim Starling tstarling@wikimedia.org wrote:
Yes, that's true, and that's part of the reason I'm flagging this change on the mailing list. Domas says that the HipHop team is working on PHP 5.3 support, so maybe the issue won't come up. But yes, in principle, I am saying that we should support HipHop even when it means not using new features from PHP.
PHP 5.3 might be cool, but so is cutting our power usage by half (pun intended).
Okay, then I'm all in favor.
Plus, free C++ MediaWiki parser ;-)
Seriously, there should be a way to turn the entire package into a (huge) library; maybe transpile it and then replace the C++ code for index.php with a manually written library interface?
Offline readers, scientific analysis tools, etc. could profit massively from an always-current, fast C++ library...
Magnus
On 03/30/2011 09:22 AM, Magnus Manske wrote:
Plus, free C++ MediaWiki parser ;-)
Seriously, there should be a way to turn the entire package into a (huge) library; maybe transpile it and then replace the C++ code for index.php with a manually written library interface?
HipHop has a library generation feature. It even has an option to provide public interfaces with human-readable names.
Offline readers, scientific analysis tools, etc. could profit massively from an always-current, fast C++ library...
Speaking of "fast", I did a quick benchmark of the [[Barack Obama]] article with templates pre-expanded. It took 22 seconds in HipHop and 112 seconds in Zend, which is not bad, for a first attempt. I reckon it would do better if a few of the regular expressions were replaced with tight loops.
Also, browsing the generated source turns up silly things like:
if (equal(switch2, (NAMSTR(s_ss34c5c84c, "currentmonth")))) goto case_2_0;
if (equal(switch2, (NAMSTR(s_ss55b88086, "currentmonth1")))) goto case_2_1;
if (equal(switch2, (NAMSTR(s_ss0ccbf467, "currentmonthname")))) goto case_2_2;
if (equal(switch2, (NAMSTR(s_ss513d5737, "currentmonthnamegen")))) goto case_2_3;
if (equal(switch2, (NAMSTR(s_ss004d8db5, "currentmonthabbrev")))) goto case_2_4;
if (equal(switch2, (NAMSTR(s_ssf9584d41, "currentday")))) goto case_2_5;
71 string comparisons in total, in quite a hot function. A hashtable would probably be better.
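One way to get a hash lookup on both platforms is to replace the switch with an array-based dispatch; a sketch, not the actual Parser.php code:

```php
<?php
// Sketch: map variable names to handler methods through an array, which
// is a hashtable lookup under both Zend and HipHop, instead of a 71-way
// switch that the compiler turns into sequential string comparisons.
class MagicVariables {
	private static $handlers = array(
		'currentmonth'     => 'getCurrentMonth',
		'currentmonthname' => 'getCurrentMonthName',
		'currentday'       => 'getCurrentDay',
		// ... and so on for the remaining variables
	);

	public function getVariableValue( $name ) {
		if ( !isset( self::$handlers[$name] ) ) {
			return null;
		}
		$method = self::$handlers[$name];
		return $this->$method();
	}

	private function getCurrentMonth()     { return date( 'm' ); }
	private function getCurrentMonthName() { return date( 'F' ); }
	private function getCurrentDay()       { return date( 'j' ); }
}
```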
-- Tim Starling
On 05/04/11 04:47, Tim Starling wrote:
Speaking of "fast", I did a quick benchmark of the [[Barack Obama]] article with templates pre-expanded. It took 22 seconds in HipHop and 112 seconds in Zend, which is not bad, for a first attempt. I reckon it would do better if a few of the regular expressions were replaced with tight loops.
<snip>
I have imported into my local wiki the English [[Barack Obama]] article with all its dependencies. I cannot get it parsed under either a 256MB memory limit or a 1 minute execution time limit. HipHop helps, but there is still some highly broken code somewhere in our PHP source. No matter how many hacks we throw at bad code, the algorithm still needs to be fixed.
Also, browsing the generated source turns up silly things like:
if (equal(switch2, (NAMSTR(s_ss34c5c84c, "currentmonth")))) goto case_2_0;
if (equal(switch2, (NAMSTR(s_ss55b88086, "currentmonth1")))) goto case_2_1;
<snip>
71 string comparisons in total, in quite a hot function. A hashtable would probably be better.
As I understand it, HipHop is just a straight translator from PHP to C++ and does not actually enhance the code. Just like you would use Google Translate instead of Charles Baudelaire [1].
Your code above comes from Parser.php's getVariableValue(), which uses a long switch() structure to map a string to a method call. If you manage to find unoptimized code in the translated output, fix it in the PHP source code :-b
[1] French poet, probably best known in the UK/US for his translations of Edgar Allan Poe's works from English to French.
On 04/05/2011 04:45 PM, Ashar Voultoiz wrote:
On 05/04/11 04:47, Tim Starling wrote:
Speaking of "fast", I did a quick benchmark of the [[Barack Obama]] article with templates pre-expanded. It took 22 seconds in HipHop and 112 seconds in Zend, which is not bad, for a first attempt. I reckon it would do better if a few of the regular expressions were replaced with tight loops.
<snip>
I have imported into my local wiki the English [[Barack Obama]] article with all its dependencies. I cannot get it parsed under either a 256MB memory limit or a 1 minute execution time limit. HipHop helps, but there is still some highly broken code somewhere in our PHP source. No matter how many hacks we throw at bad code, the algorithm still needs to be fixed.
Let me know when you find that broken code. Try using the profiling feature from xdebug to narrow down the causes of CPU usage.
Also, browsing the generated source turns up silly things like:
if (equal(switch2, (NAMSTR(s_ss34c5c84c, "currentmonth")))) goto case_2_0;
if (equal(switch2, (NAMSTR(s_ss55b88086, "currentmonth1")))) goto case_2_1;
<snip>
> 71 string comparisons in total, in quite a hot function. A hashtable would probably be better.
As I understand it, HipHop is just a straight translator from PHP to C++ and does not actually enhance the code. Just like you would use Google Translate instead of Charles Baudelaire [1].
It's not just a translator, it's also a reimplementation of the bulk of the PHP core.
Your code above comes from Parser.php's getVariableValue(), which uses a long switch() structure to map a string to a method call. If you manage to find unoptimized code in the translated output, fix it in the PHP source code :-b
In Zend PHP, switch statements are implemented by making a hashtable at compile time, and then doing a hashtable lookup at runtime. The HipHop implementation is less efficient. So getVariableValue() is not broken, it's just not optimised for HipHop.
-- Tim Starling
On Tue, Apr 5, 2011 at 7:45 AM, Ashar Voultoiz hashar+wmf@free.fr wrote:
On 05/04/11 04:47, Tim Starling wrote:
> Speaking of "fast", I did a quick benchmark of the [[Barack Obama]] article with templates pre-expanded. It took 22 seconds in HipHop and 112 seconds in Zend, which is not bad, for a first attempt. I reckon it would do better if a few of the regular expressions were replaced with tight loops.
<snip>
I have imported into my local wiki the English [[Barack Obama]] article with all its dependencies. I cannot get it parsed under either a 256MB memory limit or a 1 minute execution time limit. HipHop helps, but there is still some highly broken code somewhere in our PHP source. No matter how many hacks we throw at bad code, the algorithm still needs to be fixed.
For comparison: WYSIFTW parses [[Barak Obama]] in 3.5 sec on my iMac, and in 4.4 sec on my MacBook (both Chrome 12).
Yes, it doesn't do template/variable replacing, and it's probably full of corner cases that break; OTOH, it's JavaScript running in a browser, which should make it much slower than a dedicated server setup running precompiled PHP.
So, maybe another hard look at the MediaWiki parser is in order?
Cheers, Magnus
For comparison: WYSIFTW parses [[Barak Obama]] in 3.5 sec on my iMac, and in 4.4 sec on my MacBook (both Chrome 12).
Try parsing [[Barack Obama]]; 4s spent on parsing a redirect page is quite a lot (albeit it has some vandalism). OTOH, my MacBook shows the raw wikitext pretty much immediately. The parser is definitely the issue.
Domas
2011/4/5 Magnus Manske magnusmanske@googlemail.com:
For comparison: WYSIFTW parses [[Barak Obama]] in 3.5 sec on my iMac, and in 4.4 sec on my MacBook (both Chrome 12).
Yes, it doesn't do template/variable replacing, and it's probably full of corner cases that break; OTOH, it's JavaScript running in a browser, which should make it much slower than a dedicated server setup running precompiled PHP.
Seriously, the bulk of the time needed to parse these enwiki articles is for template expansion. If you pre-expand them, taking care that also the templates in <ref>...</ref> tags get expanded, MediaWiki can parse the article in a few seconds, 3-4 on my laptop.
On Tue, Apr 5, 2011 at 3:30 PM, Paul Copperman paul.copperman@googlemail.com wrote:
2011/4/5 Magnus Manske magnusmanske@googlemail.com:
For comparison: WYSIFTW parses [[Barak Obama]] in 3.5 sec on my iMac, and in 4.4 sec on my MacBook (both Chrome 12).
Yes, it doesn't do template/variable replacing, and it's probably full of corner cases that break; OTOH, it's JavaScript running in a browser, which should make it much slower than a dedicated server setup running precompiled PHP.
Seriously, the bulk of the time needed to parse these enwiki articles is for template expansion. If you pre-expand them, taking care that also the templates in <ref>...</ref> tags get expanded, MediaWiki can parse the article in a few seconds, 3-4 on my laptop.
So is the time spent on the actual expansion (replacing variables), on getting the wikitext for n-depth template recursion, or on the parser functions?
2011/4/5 Magnus Manske magnusmanske@googlemail.com:
So is the time spent with the actual expansion (replacing variables), or getting the wikitext for n-depth template recursion? Or is it the parser functions?
Well, getting the wikitext shouldn't be very expensive as it is cached in several cache layers. Basically it's just expanding many, many preprocessor nodes. A while ago I did a bit of testing with my template tool on dewiki[1] and found that wikimedia servers spend approx. 0.2 ms per expanded node part, although there's of course much variation depending on current load. My tool counts 303,905 nodes when expanding [[Barack Obama]] so that would account for about 60 s of render time. As already said, YMMV.
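The back-of-envelope arithmetic checks out:

```php
<?php
// 303,905 preprocessor nodes at roughly 0.2 ms each.
$nodes = 303905;
$msPerNode = 0.2;
$seconds = $nodes * $msPerNode / 1000;
printf( "%.1f s\n", $seconds ); // roughly 60.8 s
```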
Paul Copperman
[1] http://de.wikipedia.org/wiki/Benutzer:P.Copp/scripts/templateutil.js, you can test it with javascript:void importScriptURI('http://de.wikipedia.org/w/index.php?action=raw&title=Benutzer:P.Copp/scripts/templateutil.js&ctype=text/javascript') and a click on "Template tools" in the toolbox
On Tue, Apr 5, 2011 at 9:02 AM, Magnus Manske magnusmanske@googlemail.com wrote:
Yes, it doesn't do template/variable replacing, and it's probably full of corner cases that break; OTOH, it's JavaScript running in a browser, which should make it much slower than a dedicated server setup running precompiled PHP.
To the contrary, I'd expect JavaScript in the most recent version of any browser (even IE) to be *much* faster than PHP, maybe ten times faster on real-world tasks. All browsers now use JIT compilation for JavaScript, and have been competing intensively on raw JavaScript speed for the last three years or so. There are no drop-in alternative PHP implementations, so PHP is happy sticking with a ridiculously slow interpreter forever.
Alioth's language benchmarks show JavaScript in Chrome's V8 (they don't say what version) as being at least eight times faster than PHP on most of its benchmarks, although it's slower on two:
http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=v8&...
Obviously not very scientific, but it gives you an idea.
On 5 April 2011 21:29, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
To the contrary, I'd expect JavaScript in the most recent version of any browser (even IE) to be *much* faster than PHP, maybe ten times faster on real-world tasks. All browsers now use JIT compilation for JavaScript, and have been competing intensively on raw JavaScript speed for the last three years or so. There are no drop-in alternative PHP implementations, so PHP is happy sticking with a ridiculously slow interpreter forever.
So if we machine-translate the parser into JS, we can get the user to do the work and everyone wins! [*]
(Magnus, did you do something like this for WYSIFTW?)
- d.
[*] if they're using a recent browser on a recent computer, etc etc, ymmv.
On Tue, Apr 5, 2011 at 10:07 PM, David Gerard dgerard@gmail.com wrote:
On 5 April 2011 21:29, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
To the contrary, I'd expect JavaScript in the most recent version of any browser (even IE) to be *much* faster than PHP, maybe ten times faster on real-world tasks. All browsers now use JIT compilation for JavaScript, and have been competing intensively on raw JavaScript speed for the last three years or so. There are no drop-in alternative PHP implementations, so PHP is happy sticking with a ridiculously slow interpreter forever.
So if we machine-translate the parser into JS, we can get the user to do the work and everyone wins! [*]
(Magnus, did you do something like this for WYSIFTW?)
Nope, all hand-rolled, just like my last 50 or so parser wannabe implementations ;-)
(this one has the advantage of a "dunno what this is, just keep the wikitext" fallback, which helped a lot)
Magnus
On 04/05/2011 05:47 AM, Tim Starling wrote:
Speaking of "fast", I did a quick benchmark of the [[Barack Obama]] article with templates pre-expanded. It took 22 seconds in HipHop and 112 seconds in Zend, which is not bad, for a first attempt. I reckon it would do better if a few of the regular expressions were replaced with tight loops.
Hmm, does HipHop precompile regexen?
On 04/05/2011 10:39 PM, Ilmari Karonen wrote:
On 04/05/2011 05:47 AM, Tim Starling wrote:
Speaking of "fast", I did a quick benchmark of the [[Barack Obama]] article with templates pre-expanded. It took 22 seconds in HipHop and 112 seconds in Zend, which is not bad, for a first attempt. I reckon it would do better if a few of the regular expressions were replaced with tight loops.
Hmm, does HipHop precompile regexen?
No. Its regex handling is the same as Zend's. It uses PCRE with a cache of compiled regexes, generated at runtime.
-- Tim Starling
On Mon, Mar 28, 2011 at 10:28 AM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
On Mon, Mar 28, 2011 at 9:42 AM, Chad innocentkiller@gmail.com wrote:
I also don't know if they've actually merged the 32bit work into their mainline yet--I know a volunteer was working on it. If they're lacking 32bit support in the main release still, that might be a reason to hold off for now.
Why? People on 32-bit machines can just run Zend PHP.
I meant that more for developers looking to help with the effort who might still be on a 32-bit system :)
-Chad
Tim Starling wrote:
I think we should migrate MediaWiki to target HipHop [1] as its primary high-performance platform. I think we should continue to support Zend, for the benefit of small installations. But we should additionally support HipHop, use it on Wikimedia, and optimise our algorithms for it.
In cases where an algorithm optimised for HipHop would be excessively slow when running under Zend, we can split the implementations by subclassing.
I was skeptical about HipHop at first, since the road is littered with the bodies of dead PHP compilers. But it looks like Facebook is pretty well committed to this one, and they have the resources to maintain it. I waited and watched for a while, but I think the time has come to make a decision on this.
Facebook now write their PHP code to target HipHop exclusively, so by trying to write code that works on both platforms, we'll be in new territory, to some degree. Maybe that's scary, but I think it can work.
Who's with me?
-- Tim Starling
I was expecting this the week HipHop hit. What would be required "to target HipHop"? How does that differ from targeting Zend?
On 03/29/2011 10:48 AM, Platonides wrote:
I was expecting this the week HipHop hit. What would be required "to target HipHop"? How does that differ from targeting Zend?
I've explored the issues and made some initial changes to my working copy. I'm now waiting for it to compile, and once it's tested, I'll commit it.
There is a list of things that differ here:
https://github.com/facebook/hiphop-php/blob/master/doc/inconsistencies
Unfortunately it seems to leave out the most important differences.
It seems incredible, and I'm hoping someone will correct me, but it seems that file inclusion has to be completely different in HipHop. Even the simplest script won't work. I put this in foo.php:
<?php
class Foo {
    static function bar() {
        print "Hello\n";
    }
}
?>
And this in test.php:
<?php
include 'foo.php';
Foo::bar();
?>
This gives "HipHop Fatal error: Cannot redeclare class Foo" at runtime. All classes which are compiled exist from startup, and trying to declare them produces this error. This means that it is no longer possible to mix class and function declarations with code we want to execute. My working copy has fixes for the most important instances of this, such as in Setup.php and WebStart.php.
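Concretely, that means a file like foo.php has to contain declarations only, with the executable statements moved elsewhere:

```php
<?php
// foo.php: declarations only, no top-level executable statements.
// Under HipHop the class is compiled in and exists from startup, so
// test.php reduces to just `Foo::bar();` with no include at all
// (under Zend, an autoloader or require_once supplies the class).
class Foo {
	static function bar() {
		print "Hello\n";
	}
}
```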
There are two exceptions to this. One is the interpreter. HipHop has an interpreter, which is used for eval() and for include() on a file with a fully-qualified path. We can use this to allow us to change LocalSettings.php without recompiling.
If you want to do include() and have it execute compiled code, you need to use a path which is relative to the base of the compiled code. My working copy has some functions which allow this to be done in a self-documenting way.
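Tim's helpers aren't committed yet, but the idea presumably looks something like this hypothetical sketch (the function name, the detection constant, and the use of $IP are all invented here for illustration):

```php
<?php
// Hypothetical sketch of a self-documenting include helper. Compiled
// code wants a path relative to the base of the source tree; the
// interpreter (used for LocalSettings.php etc.) wants an absolute path.
function compiledInclude( $relPath ) {
	global $IP; // base of the source tree, as in MediaWiki

	// Illustrative platform check -- the real detection may differ.
	if ( defined( 'HPHP_VERSION' ) ) {
		// Under HipHop, a relative path selects the compiled version.
		return include $relPath;
	}
	// Under Zend, resolve against the source tree as usual.
	return include "$IP/$relPath";
}
```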
The other exception is volatile mode, which unfortunately appears to be completely broken, at least in the RPMs that I'm using. It's so broken that calling class_exists() on a literal string will break the class at compile time, making it impossible to use, with no way to repair it. My working copy has a wrapper for class_exists() which doesn't suffer from this problem.
Another undocumented difference is that HipHop does not use php.ini or anything like it, so most instances of ini_get() and ini_set() are broken. The functions exist, but only have stub functionality. HipHop has its own configuration files, but they aren't like php.ini.
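So code that currently relies on ini_set() would need to verify the setting actually took effect, rather than trust the stub; a hypothetical sketch:

```php
<?php
// Set a runtime limit and confirm it took effect. Under HipHop,
// ini_set()/ini_get() exist but are stubs, so the check fails and the
// caller knows the real limit must be set in HipHop's own config.
function setMemoryLimit( $limit ) {
	ini_set( 'memory_limit', $limit );
	return ini_get( 'memory_limit' ) === (string)$limit;
}
```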
When I'm ready to write all this up properly, the following page will appear on mediawiki.org:
http://www.mediawiki.org/wiki/HipHop
-- Tim Starling
On Sun, Apr 3, 2011 at 5:38 PM, Tim Starling tstarling@wikimedia.org wrote:
There is a list of things that differ here:
https://github.com/facebook/hiphop-php/blob/master/doc/inconsistencies
Unfortunately it seems to leave out the most important differences.
Ain't that always the way ;)
[various more scary things mentioned]
When I'm ready to write all this up properly, the following page will appear on mediawiki.org:
Wheeeeee! So far it sounds like most of these are things we can work around reasonably sensibly, so mostly good news. Any remaining issues with 'scary reference stuff' like stub objects, or do those semantics actually already work for us?
-- brion
On 04/04/2011 12:11 PM, Brion Vibber wrote:
Wheeeeee! So far it sounds like most of these are things we can work around reasonably sensibly, so mostly good news. Any remaining issues with 'scary reference stuff' like stub objects, or do those semantics actually already work for us?
I'm not expecting any problems with stub objects.
One piece of good news that I neglected to mention is that MediaWiki appears to work almost unmodified under the HipHop command-line interpreter, hphpi. I did some page views and edits in it. I think this must be what Inez is doing, judging by his very short patch. If there were any problems with things like references, you'd expect them to show up there.
Stub objects would probably break with the compiler option "AllDynamic" off, but so would a lot of things. That's why it's on in the build scripts I've written.
-- Tim Starling
Hi Tim,
I have no problem running the foo.php and test.php that you sent in hphpi, and they also compile and run without any problems with hphp. What command exactly do you use to compile and then execute?
Inez
On Sun, Apr 3, 2011 at 5:38 PM, Tim Starling tstarling@wikimedia.org wrote:
On 03/29/2011 10:48 AM, Platonides wrote:
I was expecting this the week HipHop hit. What would be required "to target HipHop"? How does that differ from working under Zend?
I've explored the issues and made some initial changes to my working copy. I'm now waiting for it to compile, and once it's tested, I'll commit it.
There is a list of things that differ here:
https://github.com/facebook/hiphop-php/blob/master/doc/inconsistencies
Unfortunately it seems to leave out the most important differences.
It seems incredible, and I'm hoping someone will correct me, but it seems that file inclusion has to be completely different in HipHop. Even the simplest script won't work. I put this in foo.php:
<?php
class Foo {
    static function bar() {
        print "Hello\n";
    }
}
?>
And this in test.php:
<?php
include 'foo.php';
Foo::bar();
?>
This gives "HipHop Fatal error: Cannot redeclare class Foo" at runtime. All classes which are compiled exist from startup, and trying to declare them produces this error. This means that it is no longer possible to mix class and function declarations with code we want to execute. My working copy has fixes for the most important instances of this, such as in Setup.php and WebStart.php.
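Concretely, the fix is to keep declarations and executable statements in separate files. A minimal sketch of the split, assuming the behaviour described above (file names are illustrative):

```php
<?php
// Declarations-only file (e.g. Foo.php): under HipHop the compiled
// class exists from startup, so this file must never be include()d
// at runtime -- doing so triggers the redeclaration error.
class Foo {
    static function bar() {
        print "Hello\n";
    }
}

// --- separate entry-point file (e.g. run.php) ---
// Executable statements only; the compiled class is simply used,
// never redeclared.
Foo::bar();
```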
There are two exceptions to this. One is the interpreter. HipHop has an interpreter, which is used for eval() and for include() on a file with a fully-qualified path. We can use this to allow us to change LocalSettings.php without recompiling.
If you want to do include() and have it execute compiled code, you need to use a path which is relative to the base of the compiled code. My working copy has some functions which allow this to be done in a self-documenting way.
The other exception is volatile mode, which unfortunately appears to be completely broken, at least in the RPMs that I'm using. It's so broken that calling class_exists() on a literal string will break the class at compile time, making it impossible to use, with no way to repair it. My working copy has a wrapper for class_exists() which doesn't suffer from this problem.
Another undocumented difference is that HipHop does not use php.ini or anything like it, so most instances of ini_get() and ini_set() are broken. The functions exist, but only have stub functionality. HipHop has its own configuration files, but they aren't like php.ini.
When I'm ready to write all this up properly, the following page will appear on mediawiki.org:
http://www.mediawiki.org/wiki/HipHop
-- Tim Starling
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 04/05/2011 04:31 AM, Inez Korczynski wrote:
Hi Tim,
I have no problem running the foo.php and test.php that you sent in hphpi, and they also compile and run without any problems with hphp. What command exactly do you use to compile and then execute?
To compile:
hphp --target=cpp --format=exe --input-dir=. \
    -i class-test.php -i class-test-2.php \
    -i class-test-3.php -i define-test.php \
    -c ../compiler.conf --parse-on-demand=true \
    --program=test --output-dir=build --log=4
where ../compiler.conf is the configuration file I checked in to Subversion in maintenance/hiphop, and class-test*.php are the various class declaration tests.
To execute:
build/test -f class-test.php
etc.
-- Tim Starling
Tim Starling wrote:
On 03/29/2011 10:48 AM, Platonides wrote:
I was expecting this the week HipHop hit. What would be required "to target HipHop"? How does that differ from working under Zend?
I've explored the issues and made some initial changes to my working copy. I'm now waiting for it to compile, and once it's tested, I'll commit it.
Having to create a reflection class and look for exceptions just to check for class existence is really ugly. However, looking at https://github.com/facebook/hiphop-php/issues/314 it seems to be a declaration-before-use problem (even though class_exists shouldn't be declaring it). I suppose that we could work around that by including all classes at the beginning instead of using the AutoLoader, which shouldn't be needed for compiled code.
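For reference, the reflection-based workaround being discussed looks something like this (a sketch only; the function name is made up, and this is not necessarily the wrapper in Tim's working copy):

```php
<?php
// Sketch of the reflection-based existence check: the class name never
// appears as a literal argument to class_exists(), so HipHop's
// compile-time scan has nothing to latch onto.
function mwClassExists( $name ) {
    try {
        new ReflectionClass( $name );
        return true;
    } catch ( ReflectionException $e ) {
        return false;
    }
}

var_dump( mwClassExists( 'stdClass' ) );    // bool(true)
var_dump( mwClassExists( 'NoSuchClass' ) ); // bool(false)
```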
On 04/05/2011 07:31 AM, Platonides wrote:
Having to create a reflection class and look for exceptions just to check for class existence is really ugly. However, looking at https://github.com/facebook/hiphop-php/issues/314 it seems to be a declaration-before-use problem (even though class_exists shouldn't be declaring it). I suppose that we could work around that by including all classes at the beginning instead of using the AutoLoader, which shouldn't be needed for compiled code.
You can't include the class files for compiled classes; they all exist at startup, and you get a redeclaration error if you try. I explained this in the documentation page on mediawiki.org, which has now appeared.
http://www.mediawiki.org/wiki/HipHop
The autoloader is not used. It doesn't matter where the class_exists() is. HipHop scans the entire codebase for class_exists() at compile time, and breaks any classes it finds, whether or not the class_exists() is reachable.
-- Tim Starling
On 04/05/2011 02:42 AM, Tim Starling wrote:
On 04/05/2011 07:31 AM, Platonides wrote:
Having to create a reflection class and look for exceptions just to check for class existence is really ugly. However, looking at https://github.com/facebook/hiphop-php/issues/314 it seems to be a declaration-before-use problem (even though class_exists shouldn't be declaring it). I suppose that we could work around that by including all classes at the beginning instead of using the AutoLoader, which shouldn't be needed for compiled code.
You can't include the class files for compiled classes; they all exist at startup, and you get a redeclaration error if you try. I explained this in the documentation page on mediawiki.org, which has now appeared.
http://www.mediawiki.org/wiki/HipHop
The autoloader is not used. It doesn't matter where the class_exists() is. HipHop scans the entire codebase for class_exists() at compile time, and breaks any classes it finds, whether or not the class_exists() is reachable.
These both really sound like bugs in HipHop. I've no idea how hard it would be to fix them, but are we reporting them at least?
On 04/05/2011 10:43 PM, Ilmari Karonen wrote:
On 04/05/2011 02:42 AM, Tim Starling wrote:
You can't include the class files for compiled classes; they all exist at startup, and you get a redeclaration error if you try. I explained this in the documentation page on mediawiki.org, which has now appeared.
http://www.mediawiki.org/wiki/HipHop
The autoloader is not used. It doesn't matter where the class_exists() is. HipHop scans the entire codebase for class_exists() at compile time, and breaks any classes it finds, whether or not the class_exists() is reachable.
These both really sound like bugs in HipHop. I've no idea how hard it would be to fix them, but are we reporting them at least?
I reported the class_exists() thing. The fact that classes exist on startup is more of a design decision than a bug. I don't want to annoy the HipHop devs with frivolous bug reports; we need a lot of favours from them.
-- Tim Starling
On Sun, Mar 27, 2011 at 11:21 PM, Tim Starling tstarling@wikimedia.org wrote:
Facebook now write their PHP code to target HipHop exclusively, so by trying to write code that works on both platforms, we'll be in new territory, to some degree. Maybe that's scary, but I think it can work.
Who's with me?
*grabs a battle axe* I'm with you!
I went ahead and compiled hiphop last night on a fresh VM. Couple of notes for anyone trying to join us.
For those of you on Ubuntu or other flavors of Debian, the guide at [0] will pretty much walk you through it pain-free. One little gotcha: you need a libmemcached of at least 0.39, and the latest version in 10.04 and below is 0.31, so you'll either need to do a manual build, grab it from the newer repo, or go ahead and bite the bullet and upgrade. Oh, and run make from a screen and walk away for a while; it's not the fastest build ever.
I finished building around 1am last night, didn't get to the next stage yet.
I might try building on OS X today. I couldn't get it to work ~6 months ago, but those issues may well be resolved by now.
-Chad
[0] https://github.com/facebook/hiphop-php/wiki/Building-and-Installing-on-Ubunt...
On 03/30/2011 12:51 AM, Chad wrote:
For those of you on Ubuntu or other flavors of Debian, the guide at [0] will pretty much walk you through it pain-free. One little gotcha: you need a libmemcached of at least 0.39, and the latest version in 10.04 and below is 0.31, so you'll either need to do a manual build, grab it from the newer repo, or go ahead and bite the bullet and upgrade. Oh, and run make from a screen and walk away for a while; it's not the fastest build ever.
I saw that there are RPMs for CentOS, so I installed CentOS inside a chroot inside Ubuntu 10.10 x86-64. Surprisingly, this was quite easy. I put some notes at:
http://www.mediawiki.org/wiki/User:Tim_Starling/HipHop_in_CentOS_chroot
Of course, the downside is that you then have to work inside a chroot. It's probably tolerable if you use the bind mount for /home that schroot provides by default to store your files.
-- Tim Starling
On 3/30/11 12:26 AM, Tim Starling wrote:
On 03/30/2011 12:51 AM, Chad wrote:
For those of you on Ubuntu or other flavors of Debian, the guide at [0] will pretty much walk you through it pain-free. One little gotcha: you need a libmemcached of at least 0.39, and the latest version in 10.04 and below is 0.31, so you'll either need to do a manual build, grab it from the newer repo, or go ahead and bite the bullet and upgrade. Oh, and run make from a screen and walk away for a while; it's not the fastest build ever.
I saw that there are RPMs for CentOS, so I installed CentOS inside a chroot inside Ubuntu 10.10 x86-64. Surprisingly, this was quite easy.
Also, I've been told that there are VMs with HipHop already set up which would save the pain of compiling it yourself. I haven't tried any of them myself yet, but a quick Google search led me to this:
http://www.virtcloud.eu/?page=hiphop
And there are no doubt others.
Arthur
Hello,
I'm working on migration to HipHop at Wikia (we run on MediaWiki 1.16.2 with tons of our custom extensions and skins).
At this point I'm testing and benchmarking all the different use cases, and so far I haven't run into any serious problems. There were, however, memory corruptions when using DOMDocument (Preprocessor_DOM) under heavy load, but those have already been fixed.
Btw, I had to apply this patch http://pastebin.com/qJNcwp99 to the MediaWiki code to make it work (commenting out the preg_replace is just a temporary change).
Very likely our approach will be to mostly target HipHop (not Zend) in future development, since we want to switch our developers to working with HipHop as well.
Inez
On Sun, Mar 27, 2011 at 8:21 PM, Tim Starling tstarling@wikimedia.org wrote:
I think we should migrate MediaWiki to target HipHop [1] as its primary high-performance platform. I think we should continue to support Zend, for the benefit of small installations. But we should additionally support HipHop, use it on Wikimedia, and optimise our algorithms for it.
In cases where an algorithm optimised for HipHop would be excessively slow when running under Zend, we can split the implementations by subclassing.
I was skeptical about HipHop at first, since the road is littered with the bodies of dead PHP compilers. But it looks like Facebook is pretty well committed to this one, and they have the resources to maintain it. I waited and watched for a while, but I think the time has come to make a decision on this.
Facebook now write their PHP code to target HipHop exclusively, so by trying to write code that works on both platforms, we'll be in new territory, to some degree. Maybe that's scary, but I think it can work.
Who's with me?
-- Tim Starling
[1] https://github.com/facebook/hiphop-php/wiki/
I am surely not the only person who gets Blondie's "Rapture" stuck in their head whenever they see the topic for this thread?
No?
Ugh.
I need to get back to work on reviews and deployments and other such strategic things. I hope the work I've done on HipHop support is enough to get the project started.
Initial benchmarks are showing a 5x speedup for article parsing, which is better than we had hoped for. So there are big gains to be had here, both for small users and for large websites like Wikimedia and Wikia.
There's a list of things that still need doing under the "to do" heading at:
http://www.mediawiki.org/wiki/HipHop
I'll be available to support any developers who want to work on this.
-- Tim Starling
On Tue, Apr 5, 2011 at 12:58 AM, Tim Starling tstarling@wikimedia.org wrote:
I need to get back to work on reviews and deployments and other such strategic things. I hope the work I've done on HipHop support is enough to get the project started.
Initial benchmarks are showing a 5x speedup for article parsing, which is better than we had hoped for. So there are big gains to be had here, both for small users and for large websites like Wikimedia and Wikia.
There's a list of things that still need doing under the "to do" heading at:
http://www.mediawiki.org/wiki/HipHop
I'll be available to support any developers who want to work on this.
-- Tim Starling
This is something I'm interested in and hope to put a little more time into in the near future once the 1.17 release is behind us.
-Chad