I made a late checkin on the 1.4 branch today.
The problem was with Hooks.php, the event-handling mechanism for optional functionality and third-party extensions. wfRunHooks(), the interface for mainline code to raise events for hooks to handle, was using PHP's variable-argument syntax so you could give different parameter sets to different events:
wfRunHooks('Event1', $param1, $param2, $param3); # three params wfRunHooks('Event2', $param4, $param5); # just two params
The problem with this is that the var args syntax silently changes pass-by-reference semantics to pass-by-value semantics. So events that should allow modifying parameters do not. For example:
$wgEventHooks['Event3'][] = 'MyEvent3Handler'; # ... function MyEvent3Handler(&$str) { $str .= " and so on."; return true; } # ... $somestring = "Eggs, bananas, bacon"; # NOTE: OLD CALLING FORMAT if (wfRunHooks('Event3', &$somestring)) { echo $somestring; # prints "Eggs, bananas, bacon" }
After some consulting with PHP gurus, the best solution I could come up with was to pack the params in an array when passing them to wfRunHooks(). Now the last few lines are:
if (wfRunHooks('Event3', array(&$somestring))) { echo $somestring; # prints "Eggs, bananas, bacon and so on." }
Adding a reference to an array keeps its "reference-ness".
I wanted to make sure that the hooks technology was at least worth experimenting with in the 1.4 branch, and that the interface (from both sides) was relatively stable. Hooks need to be able to modify their parameters to be useful, so I made these changes.
I've done pretty extensive testing of all the points in 1.4 where wfRunHooks() is called, and it seems to be stable.
~Evan
Evan Prodromou wrote:
I made a late checkin on the 1.4 branch today.
I've reverted it for the moment. Could you check it into the mainline development branch please?
If it looks ok and doesn't break the actual called hook functions (or if it does, and we have some warning to fix them first), *then* go ahead check it into REL1_4.
-- brion vibber (brion @ pobox.com)
On Sat, 2005-12-03 at 13:41 -0800, Brion Vibber wrote:
Evan Prodromou wrote:
I made a late checkin on the 1.4 branch today.
I've reverted it for the moment. Could you check it into the mainline development branch please?
Sure, but it should really go out for 1.4 final, as the hooks architecture is read-only without it (counter its documentation).
If it looks ok and doesn't break the actual called hook functions (or if it does, and we have some warning to fix them first), *then* go ahead check it into REL1_4.
I've tested every single one of the hook events, both with and without the Syslog sample extension. This was in the REL1_4 branch; I think it probably makes better sense to test in the 1.4 branch than with the HEAD branch (since that's what's actually going out), but I'll do that too.
~Evan
Evan Prodromou wrote:
On Sat, 2005-12-03 at 13:41 -0800, Brion Vibber wrote:
If it looks ok and doesn't break the actual called hook functions (or if it does, and we have some warning to fix them first), *then* go ahead check it into REL1_4.
I've tested every single one of the hook events, both with and without the Syslog sample extension. This was in the REL1_4 branch; I think it probably makes better sense to test in the 1.4 branch than with the HEAD branch (since that's what's actually going out), but I'll do that too.
Unless you're checking in something that will never ever be supported in the future, or is a hack specifically for a problem in the 1.4 branch that does not exist in the next version, it needs to go in HEAD.
If it doesn't go into HEAD, it will not be there in the next version. It would be pretty lame if we fix a bunch of bugs in 1.4 and have them come back in 1.5, wouldn't it? :)
I generally prefer to see things checked in on HEAD *first* and *then* on REL1_4. If you prepare the code on both and check them in a few minutes apart in the other order, that's fine too; I do that pretty frequently. But if somebody makes a change out of the blue on the release branch, and many hours layer there's no sign of it on the development branch, big warning signs flash in my head and my inclination is to pull it pending further work -- if it's not ready for the development branch, it's probably not ready for production.
Keep in mind that we're pulling updates from REL1_4 (manually, but frequently) on our live wikis. It's not a testing branch; it's in production.
-- brion vibber (brion @ pobox.com)
On Sun, 2005-13-03 at 06:55 -0800, Brion Vibber wrote:
Unless you're checking in something that will never ever be supported in the future, or is a hack specifically for a problem in the 1.4 branch that does not exist in the next version, it needs to go in HEAD.
If it doesn't go into HEAD, it will not be there in the next version. It would be pretty lame if we fix a bunch of bugs in 1.4 and have them come back in 1.5, wouldn't it? :)
I thought things worked differently.
It's a common version-control practise to make a stable branch before a release, and do only bug fixes, stabilizing changes, and release-related mods (version number, release notes, etc.) in that branch. Then, after the release milestone and hoopla, at your leisure you merge all changes back into HEAD as one big packet. You can also continue to make bug fixes in the stable branch and merge each group of changes back to HEAD after a minor release.
I'm not saying MediaWiki has to do it that way; I'm just saying that that's a pretty common thing to do.
I generally prefer to see things checked in on HEAD *first* and *then* on REL1_4.
I don't think that's a good idea. Not that my opinion on the subject matters, but here's my thinking: if you're doing stabilizing bug fixes, you should write and test them for the stable environment, not for a moving-target development environment.
Also, merging between branches is an inherently destabilizing action; one should inject that entropy from the stable branch into the unstable branch, not the other way around. In other words, you should pay the price in the dev branch, not in the stable branch.
But, hey: you da boss. I'll dev in HEAD then port to RELX_Y in the future, if I need to. And I'll merge this changeset from HEAD to REL1_4 in a day or two.
~Evan
Evan Prodromou wrote:
It's a common version-control practise to make a stable branch before a release, and do only bug fixes, stabilizing changes, and release-related mods (version number, release notes, etc.) in that branch. Then, after the release milestone and hoopla, at your leisure you merge all changes back into HEAD as one big packet.
In my experience, merging anything as one big packet is a big pain in the butt. :) Better to merge the changes up into HEAD when they're still fresh in your mind.
-- brion vibber (brion @ pobox.com)
On Sat, 2005-12-03 at 13:41 -0800, Brion Vibber wrote:
Evan Prodromou wrote:
I made a late checkin on the 1.4 branch today.
I've reverted it for the moment. Could you check it into the mainline development branch please?
Done.
If it looks ok and doesn't break the actual called hook functions (or if it does, and we have some warning to fix them first), *then* go ahead check it into REL1_4.
OK, I've checked the change in to HEAD, and I've re-tested to my own satisfaction (ran over all the places in the code where events are raised, trying to vary as many things as I could). I've also added a bug report:
http://bugzilla.wikimedia.org/show_bug.cgi?id=1694
I'll merge back to REL1_4 after a few days if I don't hear screams of bloody murder anywhere. Let me know if you want to do some more formal acceptance criteria.
~Evan
This is the same problem I encountered back in December when I tried using the hooks to insert extra information into the template's context
http://mail.wikipedia.org/pipermail/wikitech-l/2004-December/026533.html
In that email, I suggested using a named array rather than a simple array to pass the parameters. Do you think this might provide more clarity?
Also, what about including this actual setupTemplate processing hook that I proposed?
/Jonah
On Sat, 2005-03-12 at 09:46 -0500, Evan Prodromou wrote:
--=-HwYMtKXwosVZ95m4MJOb Content-Type: text/plain Content-Transfer-Encoding: quoted-printable
I made a late checkin on the 1.4 branch today.
The problem was with Hooks.php, the event-handling mechanism for optional functionality and third-party extensions. wfRunHooks(), the interface for mainline code to raise events for hooks to handle, was using PHP's variable-argument syntax so you could give different parameter sets to different events:
wfRunHooks('Event1', $param1, $param2, $param3); # three params wfRunHooks('Event2', $param4, $param5); # just two params
The problem with this is that the var args syntax silently changes pass-by-reference semantics to pass-by-value semantics. So events that should allow modifying parameters do not. For example:
$wgEventHooks['Event3'][] =3D 'MyEvent3Handler'; # ... function MyEvent3Handler(&$str) { $str .=3D " and so on."; return true;
} # ... $somestring =3D "Eggs, bananas, bacon"; # NOTE: OLD CALLING FORMAT if (wfRunHooks('Event3', &$somestring)) { echo $somestring; # prints "Eggs, bananas, bacon" }
After some consulting with PHP gurus, the best solution I could come up with was to pack the params in an array when passing them to wfRunHooks(). Now the last few lines are:
if (wfRunHooks('Event3', array(&$somestring))) { echo $somestring; # prints "Eggs, bananas, bacon and so on." }
Adding a reference to an array keeps its "reference-ness".
I wanted to make sure that the hooks technology was at least worth experimenting with in the 1.4 branch, and that the interface (from both sides) was relatively stable. Hooks need to be able to modify their parameters to be useful, so I made these changes.
I've done pretty extensive testing of all the points in 1.4 where wfRunHooks() is called, and it seems to be stable.
~Evan
--=20 Evan Prodromou evan@bad.dynu.ca
--=-HwYMtKXwosVZ95m4MJOb-- _______________________________________________ Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
On Sat, 2005-12-03 at 17:31 -0500, Jonah Bossewitch wrote:
This is the same problem I encountered back in December when I tried using the hooks to insert extra information into the template's context
Thanks for reporting the problem; that's one of the reasons it's been fixed.
In that email, I suggested using a named array rather than a simple array to pass the parameters. Do you think this might provide more clarity?
It's a good idea, but I don't think it'll improve clarity. I think the current format is easier to write and read than having named params. Viz:
# current wfRunHooks('Event', array($param1, &$param2, $param3)); # ... function MyHookFunction($param1, &$param2, $param3) { $param2 = $param1 + $param3; return true; }
versus
# proposed wfRunHooks('Event', array('param1' => $param1, 'param2' => &$param2, 'param3' => $param3)); # ... function MyHookFunction($args) { $args['param2'] = $args['param1'] + $args['param3']; return true; }
I also don't think that named-arg passing makes defining the event parameters any simpler; you just have to look up the names of the parameters rather than their position. Either way, there's a documentation requirement that won't go away.
And wfRunHooks() itself would be slightly more complicated.
Also, what about including this actual setupTemplate processing hook that I proposed?
I'll try to get to it, but it's a separate issue. I'd also like to set output events at a really fine grain.
~Evan
On Sun, 13 Mar 2005 10:14:49 -0500, Evan Prodromou evan@bad.dynu.ca wrote:
I also don't think that named-arg passing makes defining the event parameters any simpler; you just have to look up the names of the parameters rather than their position. Either way, there's a documentation requirement that won't go away.
This is just a comment, in that you've done the hard work of coding the thing, so far be it from me to criticise from such limited experience, but I think there are advantages to named arguments:
* it makes *altering* the event parameters in the future easier to cope with - if you have named parameters they can always remain as meaning the same thing; if an extra one is added, it gets an extra name; if one is lost, functions which didn't use it anyway will still work. With un-named params, you might be able to get away with adding an extra one on the end, but you certainly couldn't safely remove one, and adding one in the middle because it was a more logical ordering would also be a no-no.
* if a function only needs to refer to one of the params, it's kind of annoying to define it as something like myFunction($ignore1, $ignore2, $ignore3, [etc,] $the_arg_we_want). Kind of a minor point, but definitely more convenient with named args, where you'd just never mention the others, even though they'd been passed.
* Similarly, people might mistakenly think that the name of the arguments is important, or even find themselves with garbled results because they thought the name of the arguments was *sufficient*. But I guess people *that* new to programming are going to have multiple problems anyway, and will need someone to call on for advice who'd see through that.
In that email, I suggested using a named array rather than a simple array to pass the parameters. Do you think this might provide more clarity?
It's a good idea, but I don't think it'll improve clarity. I think the current format is easier to write and read than having named params.
I guess its partially a matter of opinion, but I still think that the second example, where you pull out the parameters you care about by name is the most clear.
As Rowan pointed out, this matters the most when you want to alter the parameters getting passed through. When you find yourself requiring a new parameter, all of a sudden the coherent ordering and groupings of parameters makes no sense, and with positional arrays, your only option would be to tack something new on the end. This inevitably makes the positional ordering haphazard and illogical over time.
Named parameters are really common in both Perl and Python:
http://www.unix.org.ua/orelly/perl/perlnut/ch10_04.htm http://docs.python.org/tut/node6.html#SECTION006720000000000000000 (actually built into the language as part of the syntax)
but for whatever reason, not so much in php.
I think there are many arguments out there for their advantages over positional arrays.
And wfRunHooks() itself would be slightly more complicated.
Really? I would think it would make wfRunHooks simpler. No more var arg processing. A fixed, identical signature for every hook function:
myHookFunction ($parameterArray)
and that array would be the only thing that wfRunHooks would ever need to pass through to the hook function.
Its just more work on the receiving end of each function to unpack the parameterArray.
Maybe I am missing something...
Also, what about including this actual setupTemplate processing hook that I proposed?
I'll try to get to it, but it's a separate issue. I'd also like to set output events at a really fine grain.
Well even introducing at least one hook lets someone add new variables to the context of the template. At the moment there is a ton of code you need to carry around if you just want to add even one new variable to a template. The "cleanest" way I have found to do this without hooks, is by overriding buildNavUrls in my subclassed skin, but I end up having to copy/paste a ton of code just to add a few more variables to the templates context.
Thanks, Jonah
On Sun, 13 Mar 2005 17:16:58 -0500, Jonah Bossewitch mrenoch@phantomcynthetics.com wrote:
Its just more work on the receiving end of each function to unpack the parameterArray.
Indeed, there's no real need for the *programmer* to worry about unpacking the array - as far as a simple function is concerned, the variables can just be used as ever. PHP may have to deal with it differently, but from a coders point of view, it's:
function foo($a, $b, $c) { doMagicStuffWith $b; }
versus
function bar($params) { doMagicStuffWith $params['b']; }
On Sun, Mar 13, 2005 at 05:16:58PM -0500, Jonah Bossewitch wrote:
It's a good idea, but I don't think it'll improve clarity. I think the current format is easier to write and read than having named params.
I guess its partially a matter of opinion,
Let me say that I understand that there could be some value in using a named-arg array. I understand that it would make adding new params less invasive, although I think that that's a less-than-frequent problem that can be handled other ways (another event, or adding params at the end).
We disagree which calling method would be better. Can we agree that named args are not _sufficiently_better_ that we should change this part of the software just before a release? Or to keep event hooks out of the release altogether? Because that's where this conversation is going, and I really don't like that direction.
~Evan
Sure, having the hooks with arrays is definitely better than not having them at all. They won't be easy to change later, as people will have begun to rely on them the way they are, which is why its risky to rush into them.
If you have the system redone as arrays, it would be really minimal to refactor that into named arrays. I would certainly be willing to contribute to the effort. Either way.
Wouldn't life be dandy if there were units tests written to reassure us that everything was still working the same before and after the change. That's what I should really be working on.
Anyway, as long as the references work, I'll drop the issue. I think having the hooks be part of the release is important, as it will help the mediawiki software grow in different directions. Let me know how I can help.
/Jonah
----- Original Message ----- From: "Evan Prodromou" evan@bad.dynu.ca To: "Wikimedia developers" wikitech-l@wikimedia.org Sent: Monday, March 14, 2005 11:22 AM Subject: Re: [Wikitech-l] Late checkin on 1.4 branch
On Sun, Mar 13, 2005 at 05:16:58PM -0500, Jonah Bossewitch wrote:
It's a good idea, but I don't think it'll improve clarity. I think the current format is easier to write and read than having named params.
I guess its partially a matter of opinion,
Let me say that I understand that there could be some value in using a named-arg array. I understand that it would make adding new params less invasive, although I think that that's a less-than-frequent problem that can be handled other ways (another event, or adding params at the end).
We disagree which calling method would be better. Can we agree that named args are not _sufficiently_better_ that we should change this part of the software just before a release? Or to keep event hooks out of the release altogether? Because that's where this conversation is going, and I really don't like that direction.
~Evan
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
wikitech-l@lists.wikimedia.org