I have noticed that several files independently compute the value of $IP. For example, maintenance/commandLine.inc and includes/WebStart.php both calculate its value. One would expect this value to be computed in one place only and used globally. The logical place is LocalSettings.php.
Sprinkling the computation of $IP all over the place is just looking for trouble. At some point the code used to make this computation may diverge and you will have bugs introduced. My first reaction to this problem was to wonder why these files didn't just require LocalSettings.php. However, since it is a fairly complex file doing so might not be desirable because: 1) there are values in LocalSettings.php that would interfere with values in these files, 2) there is some ordering problem that might occur, or 3) there are performance considerations.
If it isn't possible and/or desirable to replace the distributed computation of $IP with require_once('LocalSettings.php'), then I suggest breaking LocalSettings into two parts, say LocalSettingsCore.php and LocalSettingsNonCore.php (I am sure someone can come up with better names). LocalSettingsCore.php would contain only those calculations and definitions that do not interfere with the core MW files. LocalSettingsNonCore.php would contain everything else now in LocalSettings.php. Obviously, the first candidate for inclusion in LocalSettingsCore.php is the computation of $IP. Once such a separation is carried out, files like maintenance/commandLine.inc and includes/WebStart.php can require_once('LocalSettingsCore.php') instead of independently computing $IP.
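To make the proposal concrete, a sketch (file names as suggested above; the contents are illustrative, mirroring what the autogenerated LocalSettings.php already does):

<?php
// LocalSettingsCore.php (sketch) - path logic only, no site configuration
if ( defined( 'MW_INSTALL_PATH' ) ) {
    $IP = MW_INSTALL_PATH;
} else {
    $IP = dirname( __FILE__ );
}

A consumer would then replace its private computation with a single line:

require_once( 'LocalSettingsCore.php' );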
dan nessett wrote:
My first reaction to this problem was to wonder why these files didn't just require LocalSettings.php.
No, the reason is because LocalSettings.php is in the directory pointed to by $IP, so you have to work out what $IP is before you can include it.
Web entry points need to locate WebStart.php, and command line scripts need to locate maintenance/commandLine.inc. Then either of those two entry scripts can locate the rest of MediaWiki.
-- Tim Starling
--- On Mon, 8/10/09, Tim Starling tstarling@wikimedia.org wrote:
No, the reason is because LocalSettings.php is in the directory pointed to by $IP, so you have to work out what $IP is before you can include it.
Fair enough, but consider the following.
I did a global search over the phase3 directory and got these hits for the string "$IP =" :
.../phase3/config/index.php:30: $IP = dirname( dirname( __FILE__ ) );
.../phase3/config/index.php:1876: $IP = MW_INSTALL_PATH;
.../phase3/config/index.php:1878: $IP = dirname( __FILE__ );
.../phase3/includes/WebStart.php:61: $IP = getenv( 'MW_INSTALL_PATH' );
.../phase3/includes/WebStart.php:63: $IP = realpath( '.' );
.../phase3/js2/mwEmbed/php/noMediaWikiConfig.php:11: $IP = realpath(dirname(__FILE__).'/../');
.../phase3/LocalSettings.php:17: $IP = MW_INSTALL_PATH;
.../phase3/LocalSettings.php:19: $IP = dirname( __FILE__ );
.../phase3/maintenance/language/validate.php:16: $IP = dirname( __FILE__ ) . '/../..';
.../phase3/maintenance/Maintenance.php:336: $IP = strval( getenv('MW_INSTALL_PATH') ) !== ''
So, it appears that $IP computation is occurring in 6 files. In addition, $IP is adjusted by the relative place of the file in the MW source tree (e.g., in validate.php, $IP is set to dirname( __FILE__ ) . '/../..';) Adjusting paths according to where a file exists in a source tree is fraught with danger. If you ever move the file for some reason, the code breaks.
Why not isolate at least $IP computation in a single function? (Perhaps breaking up LocalSettings.php into two parts is overkill, but certainly cleaning up $IP computation isn't too radical an idea.) Of course, there is the problem of locating the file that contains this function. One approach is to recognize that php.ini already requires potential modification for MW use. Specifically, the path to PEAR must occur in 'include_path'. It would be a simple matter to add another search directory for locating the initialization code.
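For example, the relevant php.ini line might become (paths illustrative):

include_path = ".:/usr/share/pear:/path/to/mediawiki"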
Or maybe there is a better way of locating MW initialization code. How it's done is an open issue. I am simply arguing that computing the value of $IP by relying on the position of the PHP file in a source tree is not good software architecture. Experience shows that this kind of thing almost always leads to bugs.
On Tue, Aug 11, 2009 at 10:59 AM, dan nessett dnessett@yahoo.com wrote:
Why not isolate at least $IP computation in a single function?
The problem with "putting it in a single function" is you still have to find where that function is to begin with (I'd assume either GlobalFunctions or install-utils would define this). At which point you're back to the original problem: defining $IP yourself so you can find this.
Yes, we should probably do this all a little more cleanly (at least one unified style would be nice), but constructing it manually is pretty much a given for anything trying to find an entry point, as Tim points out.
-Chad
--- On Tue, 8/11/09, Chad innocentkiller@gmail.com wrote:
The problem with "putting it in a single function" is you still have to find where that function is to begin with (I'd assume either GlobalFunctions or install-utils would define this). At which point you're back to the original problem: defining $IP yourself so you can find this.
I'm probably missing something since I have only been programming in PHP for about 4 weeks, but if you set include_path in php.ini so it includes the root of the MW tree, put a php file at that level that has a function (or a method in a class) that returns the MW root path, wouldn't that work? For example, if you modified include_path in php.ini to include <pathname to MW root>, added the file MWInit.php to the MW root directory and in MWInit.php put a function MWInit() that computes and returns $IP, wouldn't that eliminate the necessity of manually figuring out the value of $IP [each place where you now compute $IP could require_once('MWInit.php') and call MWInit()]?
Of course, it may be considered dangerous for the MW installation software to fool around with php.ini. But, even if you require setting the MW root manually in php.ini::include_path (abusing the php namespace disambiguation operator here) that would be an improvement. You should only have to do this once and could upgrade MW without disturbing this binding.
On Tue, Aug 11, 2009 at 12:00 PM, dan nessett dnessett@yahoo.com wrote:
If you set include_path in php.ini so it includes the root of the MW tree, put a php file at that level that has a function (or a method in a class) that returns the MW root path, wouldn't that work?
That requires you setting the include path, at which point you _STILL_ have to figure out where $IP is to begin with. $IP stands for "include path" and is the path to the wiki root where index.php, etc. resides. We need to set this value in include_path, not expect it to already be there.
Nor can we expect users to edit php.ini to add MW to their include path just to avoid us constructing $IP manually.
-Chad
* dan nessett dnessett@yahoo.com [Tue, 11 Aug 2009 09:00:50 -0700 (PDT)]:
If you set include_path in php.ini so it includes the root of the MW tree, put a php file at that level that has a function (or a method in a class) that returns the MW root path, wouldn't that work?
Sorry for interrupting the conversation, but not everyone has root rights to change php.ini freely. (MediaWiki is sometimes used on shared hosting.) I'd rather define a shell variable for that ($MW_INSTALL_PATH). Dmitriy
Brion Vibber brion@wikimedia.org wrote:
Unless there's some reason to do otherwise, I'd recommend dropping the $IP from the autogen'd LocalSettings.php and pulling in DefaultSettings.php from the level above. (Keeping in mind that we should retain compat with existing LocalSettings.php files that are still pulling it.)
Better, but what about /config/index.php, noMediaWikiConfig.php, validate.php and Maintenance.php? Having only two different places where $IP is computed is a definite improvement (assuming you fix the 4 files just mentioned), but it still means the code in WebStart.php and commandLine.inc is file position dependent. If this is the best that can be done, then that is that. However, it would really be better if all position dependent code could be eliminated.
Dan
On Tue, Aug 11, 2009 at 1:40 PM, dan nessett dnessett@yahoo.com wrote:
Having only two different places where $IP is computed is a definite improvement (assuming you fix the 4 files just mentioned), but it still means the code in WebStart.php and commandLine.inc is file position dependent.
FWIW, commandLine.inc should be considered the same as Maintenance.php for these purposes, as the latter is slated to replace the former over time.
-Chad
"dan nessett" dnessett@yahoo.com wrote in message news:271560.93058.qm@web32504.mail.mud.yahoo.com...
However, it would really be better if all position dependent code could be eliminated.
To be honest, I don't know why there is more than one entry point at all. Surely it would be better if all requests went through index.php:
<?php
// index.php - the only MW entry point.
// First, do all stuff that is always required.
$IP = realpath( dirname( __FILE__ ) );
include( $IP . "/LocalSettings.php" );
// ...and any other actions that should be performed in all cases...

if ( !isset( $_SERVER ) || !array_key_exists( 'REQUEST_METHOD', $_SERVER ) ) {
    include( $IP . "/maintenance/commandLine.inc" );
    return;
}

// Rest of index.php as current.
?>
Any maintenance scripts would then include("../index.php") instead of commandLine.inc, and voila - all code-duplication at the various entry points has been removed.
- Mark Clements (HappyDog)
On 20/08/2009, at 10:29 AM, Mark Clements (HappyDog) wrote:
To be honest, I don't know why there is more than one entry point at all.
That's totally bizarre, why not just have both index.php and command-line scripts include another file, like, say, Setup.php?
... Which is what we already do (although it's done from commandLine.inc, because that has some useful stuff for command-line scripts in it).
What problem are you trying to solve here?
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
"Andrew Garrett" agarrett@wikimedia.org wrote in message news:DD4407E7-E181-49E9-A097-4B5B5B6D0BF8@wikimedia.org...
That's totally bizarre, why not just have both index.php and command-line scripts include another file, like, say, Setup.php?
Because then both files have to have a piece of code which is able to **find** Setup.php. Leaving us exactly where we started.
--HM
On 22/08/2009, at 12:51 AM, Happy-melon wrote:
"Andrew Garrett" agarrett@wikimedia.org wrote in message news:DD4407E7-E181-49E9-A097-4B5B5B6D0BF8@wikimedia.org...
That's totally bizarre, why not just have both index.php and command- line scripts include another file, like, say, Setup.php?
Because then both files have to have a piece of code which is able to **find** Setup.php. Leaving us exactly where we started.
Yes, this is where we started because this is the status quo. What I was describing is how it's done now.
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
--- On Fri, 8/21/09, Andrew Garrett agarrett@wikimedia.org wrote:
Yes, this is where we started because this is the status quo. What I was describing is how it's done now.
Is maintaining the status quo really desirable? Look at the extensions directory. It currently has ~400 extension sub-directories. If you wanted to reorganize this so there is some sort of logic to it (e.g., first cut - put command line extensions in one directory and hook based extensions in another) you would have to change a lot of code so the computation of $IP by some extensions is correct.
How about this. When you run php from the command line, there is a flag -d defined as: "foo[=bar] Define INI entry foo with value 'bar'". For those extensions that run from the command line you could require callers to include a value for the php "include_path". This value would be the value in php.ini appended with the value of a directory containing the simple MWInit.php file provided in my message of Aug. 11, 9:57AM. Command line extensions could then call MWInit() to get MW root.
I just tried this and it worked. It would fix the problem for at least command line utilities.
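For example (paths illustrative):

$ php -d include_path=".:/usr/share/pear:/path/to/dir-containing-MWInit.php" someUtility.php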
Dan
dan nessett wrote:
For those extensions that run from the command line you could require callers to include a value for the php "include_path". This value would be the value in php.ini appended with the value of a directory containing the simple MWInit.php file provided in my message of Aug. 11, 9:57AM.
How's that better than MW_CONFIG_PATH environment variable?
--- On Sat, 8/22/09, Platonides Platonides@gmail.com wrote:
How's that better than MW_CONFIG_PATH environment variable?
My understanding is that the administrators of certain installations cannot set environmental variables (I am taking this on faith, because that seems like a very, very restricted site). What I suggested takes no work at all by installers or wiki administrators. MWInit.php (or whatever people want to name it) is part of the distribution. When you run a command line utility you have to be able to type something like "php <some name>.php". If you can do that, you should be able to type "php -d include_path=<path including the directory where MWInit.php is located> <some name>.php".
If I am wrong and for some sites even that is not possible, then I am not sure how you would use command line utilities at all.
Dan
On 23/08/2009, at 2:17 AM, dan nessett wrote:
When you run a command line utility you have to be able to type something like "php <some name>.php". If you can do that, you should be able to type "php -d include_path=<path including the directory where MWInit.php is located> <some name>.php".
$ MW_INSTALL_PATH=/var/wiki/mediawiki php maintenance/update.php
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
--- On Sun, 8/23/09, Andrew Garrett agarrett@wikimedia.org wrote:
$ MW_INSTALL_PATH=/var/wiki/mediawiki php/maintenance/update.php
I don't understand the point you are making. If an MW administrator can set environmental variables, then, of course, what you suggest works. However, Brion mentions in his Tues, Aug 11, 10:09 email that not every MW installation admin can set environmental variables and Aryeh states in his Tues, Aug. 11, 10:09am message that some MW administrators only have FTP access to the installations they manage. So, as I understand it some administrators cannot use the tactic you describe.
An important issue is whether these admins have access to command line utilities at all. If not, then the use of file position dependent code in command line utilities can be eliminated by substituting:
$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) die();
for (taken from dumpHTML.php):
$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) {
    $IP = dirname( __FILE__ ) . '/../..';
}
This works if only admins who can set environmental variables can execute MW command line utilities.
If there are administrators who can execute command lines, but cannot set environmental variables (e.g., they are confined to use a special shell), then what I suggested in the previous email eliminates file position dependency. That is, the command line would be:
php -d include_path=<include_path in php.ini>:<directory to MWInit.php> <utility>.php
If an admin can execute php <utility>.php, he should be able to execute the prior command.
Dan
--- On Sun, 8/23/09, dan nessett dnessett@yahoo.com wrote:
In my last email, I quoted Andrew Garret:
$ MW_INSTALL_PATH=/var/wiki/mediawiki php/maintenance/update.php
This was incorrect. I fumbled some of the editing in my reply. What he proposed was:
$ MW_INSTALL_PATH=/var/wiki/mediawiki php maintenance/update.php
Dan
On Sun, Aug 23, 2009 at 8:48 PM, dan nessett dnessett@yahoo.com wrote:
So, as I understand it some administrators cannot use the tactic you describe.
If they can run commands on the command line, then they can use environment variables. If they can't, then your suggestion doesn't help.
If there are administrators who can execute command lines, but cannot set environmental variables (e.g., they are confined to use a special shell)
There aren't. That would make no sense.
--- On Sun, 8/23/09, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
If they can run commands on the command line, then they can use environment variables. If they can't, then your suggestion doesn't help.
If there are administrators who can execute command lines, but cannot set environmental variables (e.g., they are confined to use a special shell)
There aren't. That would make no sense.
Thanks for clarifying the situation. Given this information I suggest changing all code in command line utilities of the form:
$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) {
    $IP = dirname( __FILE__ ) . '/../..';
}
to:
$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) {
    echo "Error. The environmental variable MW_INSTALL_PATH must be set to the root of the MW distribution. Exiting.\n";
    die();
}
This would eliminate file position dependent code from the command line utilities, making them easier to maintain (i.e., they can be moved in the distribution without breaking them).
Dan
Why skip trying to find the location? If MW_INSTALL_PATH is already missing, what have we got to lose from trying to guess the location? The vast majority of people don't screw with the default structure, so it should be just fine.
-Chad
On 24/08/2009, at 12:19 PM, dan nessett wrote:
This would eliminate file position dependent code from the command line utilities, making them easier to maintain (i.e., they can be moved in the distribution without breaking them).
Yes, if you change this, they will not break if they are moved within the distribution. Instead, they will be broken regardless of whether or not they have been moved.
That's just plain stupid.
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
On 8/23/09 9:48 PM, dan nessett wrote:
--- On Sun, 8/23/09, Andrew Garrett agarrett@wikimedia.org wrote:
$ MW_INSTALL_PATH=/var/wiki/mediawiki php/maintenance/update.php
I don't understand the point you are making. If an MW administrator can set environmental variables, then, of course, what you suggest works. However, Brion mentions in his Tues, Aug 11, 10:09 email that not every MW installation admin can set environmental variables and Aryeh states in his Tues, Aug. 11, 10:09am message that some MW administrators only have FTP access to the installations they manage.
For a little background --
There are basically two ways to get at 'mediawiki stuff':
1) Through the web (PHP run via web server) -- this is the main MediaWiki user interface.
2) On the command line (PHP run from a login shell) -- the various maintenance scripts.
Everybody can do the web, but configuration may be limited:
* environment variables can be set in web server configuration, but this might not be available in a limited shared hosting environment
* environment variables can be set with 'putenv' within a PHP script... but this may be disabled in some shared hosting environments.
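Where putenv() is available, that option is just a couple of lines in a front-end wrapper (a sketch; the install path is illustrative, and the wrapper becomes the one place it is hard-coded):

<?php
// wrapper sketch: export the path for anything that reads MW_INSTALL_PATH
putenv( 'MW_INSTALL_PATH=/var/www/mediawiki' );
require( '/var/www/mediawiki/index.php' );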
Some folks also have no command-line shell access to their server, in which case they can't run any of the maintenance scripts -- so have no place to set an environment variable there either.
-- brion
"Brion Vibber" brion@wikimedia.org wrote in message news:4A92DDBB.50300@wikimedia.org...>
Some folks also have no command-line shell access to their server, in which case they can't run any of the maintenance scripts -- so have no place to set an environment variable there either.
I have this problem on several of my hosts; I'd actually say it's "most" or at least "many" that have to hack around with the installer to do updates (or run schema changes using phpMyAdmin, which is a Bad Thing). A web-based interface for running maintenance scripts is an essential step, but that's hampered by the fact that currently scripts actually *check* and die if they appear on the web interface (presumably for security). Things like extensions/Maintenance are hackish ways to circumvent that by copy-and-pasting the code from each script you want to run; I have run scripts in the past in that way, but it's really nasty.
We do need a web-based maintenance interface and especially a web-based updater; IIRC there is already some work on this? Would having a single access point to MediaWiki aid this? I expect it would.
--HM
I just tried the following two approaches. I created the file MWInit.php in my .../phase3 directory with the following code:
<?php
function MWInit() {
    $IP = strval( getenv( 'MW_INSTALL_PATH' ) ) !== '' ? getenv( 'MW_INSTALL_PATH' ) : realpath( dirname( __FILE__ ) );
    return $IP;
}
I then modified include_path in php.ini to include my ../phase3 directory. In phase3/maintenance I created the file GetIP.php with the following code:
<?php
require_once('MWInit.php');
$IP = MWInit();
?>
This worked. $IP was set to the correct value.
However, Dmitriy Sintsov makes the excellent point that not everyone can change php.ini. So, I then put MWInit.php into the PEAR directory and modified it as follows:
<?php
function MWInit() {
$IP = "/Volumes/Macintosh 2nd HD/DataBucket/Wiki Work/Code Development/LatestTrunkVersion/phase3";
    return $IP;
}
I removed my ../phase3 directory from include_path in php.ini and once again ran GetIP.php. $IP was returned successfully.
The question may arise, how do you get the value of the MW root into MWInit.php in the first place? This can occur on MW installation. Of course, there may be a problem getting MWInit.php into the PEAR directory, but I assume that is an easier task than modifying php.ini. However, if even this is a problem, then we can do what Dmitriy suggests, set an environmental variable to the MW root path.
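For instance, the installer (config/index.php) could emit the file with the detected root baked in; a sketch, with the PEAR path illustrative:

<?php
// installer-side sketch: generate MWInit.php at install time
$root = dirname( dirname( __FILE__ ) );  // config/ sits directly under the MW root
$code = "<?php\nfunction MWInit() {\n    return " . var_export( $root, true ) . ";\n}\n";
file_put_contents( '/usr/share/pear/MWInit.php', $code );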
As I pointed out before, there may be better ways of automatically getting the value of $IP. However, I don't think it requires any manual effort.
* dan nessett dnessett@yahoo.com [Tue, 11 Aug 2009 09:57:28 -0700 (PDT)]:
As I pointed out before, there may be better ways of automatically getting the value of $IP. However, I don't think it requires any manual effort.
MW_INSTALL_PATH should already work and it's very useful for command line scripts. For some extensions, like Semantic MediaWiki, which has its own complex maintenance scripts, it's very much recommended to set this variable (so the extension can easily figure out the path). It allows running from any directory; there is no need to cd to the maintenance dir. The one problem I've had (a rare case): when I was looking at the then-new MW 1.15 and wasn't yet ready to upgrade the whole farm from v1.14 (a long task), I installed MW 1.15 to a separate directory instead of upgrading MW 1.14, and the new one (1.15) didn't work because BOTH MediaWikis were reading the same MW_INSTALL_PATH, which pointed to the farm dir! Though that's a rare case. Dmitriy
On Tue, Aug 11, 2009 at 12:57 PM, dan nessett dnessett@yahoo.com wrote:
However, Dmitriy Sintsov makes the excellent point that not everyone can change php.ini. So, I then put MWInit.php into the PEAR directory and modified it as follows:
Um, I'm assuming the PEAR directory is not writable to the user of an average shared webhost. Such users commonly have only FTP access to some subdirectory of the web root. Even if it was writable, it's not reasonable to require users to copy files into several different directories if there's not a clear benefit. You should be able to untar in your web root, run config/index.php, and have everything work.
The question may arise, how do you get the value of the MW root into MWInit.php in the first place? This can occur on MW installation.
Then you're doing almost exactly the same thing we're doing now, except with MWInit.php instead of LocalSettings.php. $IP is normally set in LocalSettings.php for most page views. Some places still must figure it out independently in either case, e.g., config/index.php.
Of course, there may be a problem getting MWInit.php into the PEAR directory, but I assume that is an easier task than modifying php.ini.
It's not, compared to not having to modify anything outside your web root.
However, if even this is a problem, then we can do what Dmitry suggests, set an environmental variable to the MW root path.
How?
--- On Tue, 8/11/09, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
Then you're doing almost exactly the same thing we're doing now, except with MWInit.php instead of LocalSettings.php. $IP is normally set in LocalSettings.php for most page views. Some places still must figure it out independently in either case, e.g., config/index.php.
I want to avoid seeming obsessed by this issue, but file position dependent code is a significant generator of bugs in other software. The difference between MWInit.php and LocalSettings.php is if you get the former into a directory that PHP uses for includes, you have a way of getting the root path of MW without the caller knowing anything about the relative structure of the code distribution tree hierarchy. As you pointed out previously, the reason you need to compute $IP before including/requiring LocalSettings is you don't know where it is.
Dan
On Tue, Aug 11, 2009 at 1:50 PM, dan nessett dnessett@yahoo.com wrote:
The difference between MWInit.php and LocalSettings.php is if you get the former into a directory that PHP uses for includes, you have a way of getting the root path of MW without the caller knowing anything about the relative structure of the code distribution tree hierarchy.
But you still have to figure out where MWInit is, which is basically the same problem.
-Chad
On Tue, 11 Aug 2009, dan nessett wrote:
The difference between MWInit.php and LocalSettings.php is if you get the former into a directory that PHP uses for includes, you have a way of getting the root path of MW without the caller knowing anything about the relative structure of the code distribution tree hierarchy.
Placing it in the include path could make it hard to run more than one version of the MW code on the same server, since both would probably find the same file and one of them would likely end up using the other one's $IP.
Another way of putting it is, is it really better to hard-code the absolute position of the MW root rather than its position relative to the files in it?
lw
* lee worden wonder@riseup.net [Tue, 11 Aug 2009 11:14:26 -0700 (PDT)]:
Placing it in the include path could make it hard to run more than one version of the MW code on the same server, since both would probably find the same file and one of them would likely end up using the other one's $IP.
Very true, I've had once such case.
Another way of putting it is, is it really better to hard-code the absolute position of the MW root rather than its position relative to the files in it?
I've heard that relative paths in require() and include() can be tricky for nested inclusions in PHP (they work correctly only for a few levels of nesting when paths go "up" and "down"). Dmitriy
--- On Tue, 8/11/09, lee worden wonder@riseup.net wrote:
Placing it in the include path could make it hard to run more than one version of the MW code on the same server, since both would probably find the same file and one of them would likely end up using the other one's $IP.
That is a good point. However, I don't think it is insurmountable. Callers to MWInit() could pass their own path (which they can get by calling realpath( '.' )). In a multi-MW environment MWInit() could disambiguate the root path by searching the provided path against those of all installed root paths.
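Roughly (a sketch; assumes the install-time code records every root in MWInit.php):

<?php
// MWInit.php sketch for a multi-wiki host: return the root that contains the caller
function MWInit( $callerPath ) {
    $roots = array( '/var/www/wiki1', '/var/www/wiki2' );  // written at install time
    foreach ( $roots as $root ) {
        if ( strpos( $callerPath, $root ) === 0 ) {
            return $root;
        }
    }
    return false;  // caller is not under any known installation
}

A caller would then use $IP = MWInit( realpath( '.' ) );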
Another way of putting it is, is it really better to hard-code the absolute position of the MW root rather than its position relative to the files in it?
Well, I think so. Hardcoding the absolute position of the MW root occurs at install time. Using file position dependent code is a development time dependency. Files are not moved around once installed, but could be moved around during the development process. So, bugs that depend on file position are normally not caused by installation activity, but rather by development activity.
Dan
On 8/11/09 12:45 PM, dan nessett wrote:
So, bugs that depend on file position are normally not caused by installation activity, but rather by development activity.
What is being discussed here are particularly offensive areas of MediaWiki - however if you really get to know MediaWiki you will likely find tons of these things everywhere... So are we proposing a specific change that will provide a solution to a problem, or just nit-picking?
I ask because I'm wondering if I should ignore this thread or not (and others are probably wondering the same) - and I'm sort of feeling like this is becoming one of those threads where the people discussing things spend more time and effort battling each other than just fixing the code.
- Trevor
--- On Tue, 8/11/09, Trevor Parscal tparscal@wikimedia.org wrote:
Not to worry. I've given up on this issue, at least for the moment.
Dan
On Tue, Aug 11, 2009 at 1:50 PM, dan nessett dnessett@yahoo.com wrote:
I want to avoid seeming obsessed by this issue, but file position dependent code is a significant generator of bugs in other software.
It has never generated any bugs for us that I can recall. A poorly-written testing framework that no one ever used anyway doesn't count.
The difference between MWInit.php and LocalSettings.php is if you get the former into a directory that PHP uses for includes, you have a way of getting the root path of MW without the caller knowing anything about the relative structure of the code distribution tree hierarchy. As you pointed out previously, the reason you need to compute $IP before including/requiring LocalSettings is you don't know where it is.
If the user has root access, then we can install MediaWiki into a definite location. If they don't, we have to figure out the location, because it could be anywhere. We can't centralize this code because to do so we'd have to already know where the include path is. It's really very simple; there's no way out of it. The problem is only a few lines of code duplicated, it's really not very serious.
On 8/11/09 9:57 AM, dan nessett wrote:
The question may arise, how do you get the value of the MW root into MWInit.php in the first place? This can occur on MW installation. Of course, there may be a problem getting MWInit.php into the PEAR directory, but I assume that is an easier task than modifying php.ini. However, if even this is a problem, then we can do what Dmitriy suggests, set an environmental variable to the MW root path.
You shouldn't assume you can set environment variables either. :)
Really, MediaWiki is built to assume it's being run from the base directory where its files are found.
I'm not sure there's a compelling reason to even have $IP set in LocalSettings.php anymore; the base include path should probably be autodetected in all cases, which is already being done in WebStart.php and commandLine.inc, the web and CLI initialization includes based on their locations in the file tree.
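(Roughly, per the WebStart.php lines in the grep output earlier in the thread:

$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) {
    $IP = realpath( '.' );
}

with commandLine.inc doing the equivalent relative to maintenance/.)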
Unless there's some reason to do otherwise, I'd recommend dropping the $IP from the autogen'd LocalSettings.php and pulling in DefaultSettings.php from the level above. (Keeping in mind that we should retain compat with existing LocalSettings.php files that are still pulling it.)
Having $IP in there is a frequent source of upgrade problems as people move their filesystem paths around and then discover that their new MediaWiki install is trying to load files from the old one in another directory.
-- brion
--- On Tue, 8/11/09, Brion Vibber brion@wikimedia.org wrote:
I'm not sure there's a compelling reason to even have $IP set in LocalSettings.php anymore; the base include path should probably be autodetected in all cases, which is already being done in WebStart.php and commandLine.inc, the web and CLI initialization includes based on their locations in the file tree.
I started this thread because two of the fixes in the patch for bug ticket 20112 (those for Database.t and Global.t) move the require of LocalSettings.php before the require of AutoLoader.php. This is necessary because AutoLoader.php eventually executes: require_once("$IP/js2/mwEmbed/php/jsAutoloadLocalClasses.php").
This is a perfect example of how file position dependent code can introduce bugs. If $IP computation is eliminated from LocalSettings.php, then both of these tests will once again fail. The tests in phase3/t/inc are not executed as the result of a web request or through a command line execution path that includes maintenance/commandLine.inc. They normally are executed by typing at a terminal: "prove t/inc -r" or, e.g., "prove t/inc/Global.t". "prove" is a TAP protocol consumer that digests and displays the results of the tests, which are TAP protocol producers.
So, eliminating $IP computation from LocalSettings would require the development of new code for these tests. That would mean there would be 4 places where $IP is computed: WebStart.php, commandLine.inc, /config/index.php and the t tests. Not good.
On 8/11/09 11:33 AM, dan nessett wrote:
I started this thread because two of the fixes in the patch for bug ticket 20112 (those for Database.t and Global.t) move the require of LocalSettings.php before the require of AutoLoader.php. This is necessary because AutoLoader.php eventually executes: require_once("$IP/js2/mwEmbed/php/jsAutoloadLocalClasses.php").
These scripts should simply be updated to initialize the framework properly instead of trying to half-ass it and load individual classes.
-- brion
--- On Tue, 8/11/09, Brion Vibber brion@wikimedia.org wrote:
These scripts should simply be updated to initialize the framework properly instead of trying to half-ass it and load individual classes.
I agree, which is why I am trying to figure out how to consolidate the tests in /tests/ and /t/. [The example I gave was to illustrate how bugs can pop up when you use code that depends on the position of files in a distribution tree, not because I think the tests are in good shape. The bug fixes are only intended to make these tests available again, not to declare them finished.]
I could use some help on test system architecture - you do wear the systems architect hat :-). It doesn't seem right to use WebStart.php to initialize the tests. For one thing, WebStart starts up profiling, which doesn't seem relevant for a test. That leaves commandLine.inc. However, the t tests stream TAP protocol text to "prove", a Perl script that normally runs them. I have no way of running these tests through prove because my IDE doesn't support Perl, so if I changed the tests to require commandLine.inc, it would be hard to debug any problems.
I researched other TAP consumers and didn't find anything other than prove. I was hoping that one written in PHP existed, but I haven't found anything. So, I am in kind of a bind. We could just dump the t tests, but at least one (Parser.t, which runs parserTests) is pretty useful. Furthermore, TAP has an IETF standardization effort and phpunit can produce TAP output. This suggests that TAP is a good candidate for test system infrastructure.
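For reference, the producer side of TAP is trivial; a .t file just prints a plan and one line per test, something like (a sketch):

<?php
// minimal TAP producer: "prove" consumes exactly this kind of output
echo "1..2\n";
echo "ok 1 - first check passed\n";
echo "not ok 2 - second check failed\n";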
So, what are your thoughts on this?
Dan
On Tue, Aug 11, 2009 at 3:35 PM, dan nessett dnessett@yahoo.com wrote:
So, what are your thoughts on this?
To be perfectly honest, I'm of the opinion that tests/ and t/ should be scrapped and it should all be done over, properly.
What we need is an easy and straightforward way to write test cases, so people are encouraged to write them. Right now, nobody understands wtf is going on in tests/ and t/, so they get ignored and the /vast/ majority of the code isn't tested.
What we need is something similar to parser tests, where it's absurdly easy to pop new tests in with little to no coding required at all. Also, extensions having the ability to inject their own tests into the framework is a must IMHO.
-Chad
On 11 August 2009 at 22:03, Chad wrote:
To be perfectly honest, I'm of the opinion that tests/ and t/ should be scrapped and it should all be done over, properly.
+1, we could maybe write our own test system based on the new Maintenance class, since we already have some test scripts in /maintenance/ (cdb-test.php, fuzz-tester.php, parserTests.php, preprocessorFuzzTest.php and syntaxChecker.php). Porting tests such as the parser tests to PHPUnit is a pain, since it has no native way to write a test suite that has an unknown number of tests to run.
Alexandre Emsenhuber (ialex)
--- On Tue, 8/11/09, Alexandre Emsenhuber alex.emsenhuber@bluewin.ch wrote:
+1, we could maybe write our own test system based on the new Maintenance class [...] Porting tests such as the parser tests to PHPUnit is a pain, since it has no native way to write a test suite that has an unknown number of tests to run.
Rewriting parserTests as PHPUnit tests would be a horrible waste of time. parserTests works and it provides a reasonable service. One problem, however, is how do we fix the parser? It seems to be a pretty complex code system (when I ran a MacGyver test on parserTests, 141 files were accessed, most of which are associated with the parser). I have been thinking about this, but those thoughts are not yet sufficiently clear to make public.
On the other hand, taking the parserTests route and doing all of our own test infrastructure would also be a good deal of work. There are tools out there (phpunit and prove) that are useful. In my view creating a test infrastructure from scratch would unnecessarily waste time and resources.
Dan
On 11 August 2009 at 22:51, dan nessett wrote:
In my view creating a test infrastructure from scratch would unnecessarily waste time and resources.
My idea is to move the "backend" of ParserTest (parserTests.txt file processing, result reporting, ...) and the TestRecorder stuff to something like a MediaWikiTests class that extends Maintenance, and to move the rest into a file in /maintenance/tests/ (to be created), re-using the backend so that test files keep the same format, but a test's input could be raw PHP code (a bit like PHP core's tests), with a new config variable like $wgParserTestFiles but for this kind of test. This mostly concerns the actual tests in /tests/ and /t/inc/. We can also port cdb-test.php, fuzz-tester.php, preprocessorFuzzTest.php and syntaxChecker.php to this new system and then create a script in /maintenance/ that runs all the tests in /maintenance/tests/. This also allows uploading all the tests to CodeReview, not only the parser tests. A benefit is that we can get rid of /tests/ and /t/.
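Roughly (a sketch; the class and method names are illustrative, the entry point being the execute() method that Maintenance subclasses implement):

<?php
// sketch of the proposed base class, living in /maintenance/tests/
require_once( dirname( __FILE__ ) . '/../Maintenance.php' );

abstract class MediaWikiTests extends Maintenance {
    public function execute() {
        foreach ( $this->getTestFiles() as $file ) {
            $this->runTestFile( $file );  // parse the file, run each case, record results
        }
    }

    abstract protected function getTestFiles();
    abstract protected function runTestFile( $file );
}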
Alexandre Emsenhuber (ialex)
--- On Tue, 8/11/09, Alexandre Emsenhuber alex.emsenhuber@bluewin.ch wrote:
My idea is to move the "backend" of ParserTest (parserTests.txt file processing, result reporting, ...) and the TestRecorder stuff to something like a MediaWikiTests class that extends Maintenance [...] A benefit is that we can get rid of /tests/ and /t/.
One of the beauties of open source development is that he who does the work wins the prize. Of course, I am sure senior developers have discretionary power over what goes into a release and what does not. But if you want to do the work, go for it (says the guy [me] who just joined the group).
However, I think you should consider the following:
* parserTests is designed to test parsing, which is predominantly a text manipulation task. Other parts of MW do not necessarily provide text processing markers that can be used to decide whether they are working correctly or not.
* Sometimes testing the action of a module requires determining whether a series of actions provide the correct behavior. As far as I am aware, parserTests has no facility to tie together a set of actions into a single test.
For example, consider two MW files in phase3/includes: 1) AutoLoader.php and 2) Hooks.php. In AutoLoader, the method loadAllExtensions() loads all extensions specified in $wgAutoloadClasses. It takes no parameters and has no return value. It simply walks through the entries specified in $wgAutoloadClasses and, if the class specified as the key exists, executes a require of the file specified in the value. I don't see how you would specify a test of this method using the syntax of parserTests.txt.
In Hooks.php, there is a single function, wfRunHooks(). It looks up hooks previously set and calls user code for them. It throws exceptions under certain error conditions, and testing it requires setting a hook and seeing if it is called appropriately. I don't see how you could describe this behavior with parserTests.txt syntax.
Of course, you could create new syntax and behavior for the parserTests software components, but that duplicates a lot of work that other infrastructure has already done. For example, see the set of assertions for phpunit (http://www.phpunit.de/manual/2.3/en/api.html#api.assert.tables.assertions).
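For instance, the wfRunHooks() case seems expressible as an ordinary PHPUnit test; a minimal sketch (PHPUnit 3.x style, assuming the MediaWiki environment has already been initialized, e.g. via commandLine.inc, and using a made-up hook name):

<?php
class HooksTest extends PHPUnit_Framework_TestCase {
    public $hookRan = false;

    public function testHookIsCalled() {
        global $wgHooks;
        // Register a handler for an invented hook name, then fire it.
        $wgHooks['HooksTestHook'][] = array( $this, 'recordHook' );
        wfRunHooks( 'HooksTestHook', array( 'some argument' ) );
        $this->assertTrue( $this->hookRan, 'Registered hook was not called' );
    }

    public function recordHook( $arg ) {
        $this->hookRan = true;
        return true; // a handler must return true so later handlers still run
    }
}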
Dan
On Tue, Aug 11, 2009 at 6:39 PM, dan nessett dnessett@yahoo.com wrote:
For example, consider two MW files in phase3/includes: 1) AutoLoader.php and 2) Hooks.php. In AutoLoader, the method loadAllExtensions() loads all extensions specified in $wgAutoloadClasses. It takes no parameters and has no return value. It simply walks through the entries specified in $wgAutoloadClasses and, if the class specified as the key exists, executes a require of the file specified in the value. I don't see how you would specify a test of this method using the syntax of parserTests.txt.
In Hooks.php, there is a single function, wfRunHooks(). It looks up hooks previously set and calls user code for them. It throws exceptions under certain error conditions, and testing it requires setting a hook and seeing if it is called appropriately. I don't see how you could describe this behavior with parserTests.txt syntax.
Neither of these needs to be tested directly. If AutoLoader breaks, then some other class won't load, and the tests for that class will fail. If wfRunHooks() fails, then some hook won't work, and any test of that hook will fail.
I think what's needed for decent test usage for MediaWiki is:
1) Some test suite is picked. PHPUnit is probably fine, if it runs out of the box and doesn't need some extra module to be installed.
2) The test suite is integrated into CodeReview with nag e-mails for broken tests.
3) A moderate number of tests are written for the test suite. Existing parser tests could be autoconverted, possibly. Maybe someone paid could be assigned to spend a day or two on this.
4) A new policy requires that everyone write tests for all their bug fixes and enhancements. Commits that don't add enough tests will be flagged as fixme, and reverted if not fixed.
(4) is critical here.
While we're at it, it would be nice if we instituted some other iron-clad policies. Here's a proposal:
* All new functions (including private helper functions, functions in JavaScript includes, whatever) must have function-level documentation that explains the purpose of the function and describes its parameters. The documentation must be enough that no MediaWiki developer should ever have to read the function's source code to be able to use it correctly. Exception: if a method is overridden which is already documented in the base class, it doesn't need to be documented again in the derived class, since the semantics should be the same. (See the example docblock after this list.)
* All new classes must have high-level documentation that explains their purpose and structure, and how you should use them. The documentation must be sufficient that any MediaWiki developer could understand why they might want to use the class in another file, and how they could do so, without reading any of the source code. Of course, developers would still have to read the function documentation to learn how to use specific functions. There are no exceptions, but a derived class might only need very brief documentation.
* All new config variables must have documentation explaining what they do in terms understandable to end-users. They must describe what values are accepted, and if the values are complicated (like associative arrays), must provide at least one example that can be copy-pasted. There are no exceptions.
* If any code is changed so as to make a comment incorrect, the comment must be updated to match. There are no exceptions.
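As a concrete illustration of the first rule, a docblock at roughly this level of detail would qualify. The function itself is invented for the example:

/**
 * Replace every occurrence of a magic word, written as {{WORD}}, in the
 * given wikitext with the given value. Operates on the raw text only: no
 * parsing and no database access, so it is cheap to call in a loop.
 *
 * @param string $text  Wikitext to transform
 * @param string $word  Magic word name, e.g. "CURRENTYEAR"
 * @param string $value Replacement value
 * @return string The transformed wikitext
 */
function replaceMagicWord( $text, $word, $value ) {
    return str_replace( '{{' . $word . '}}', $value, $text );
}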
Or whatever. We have *way* too few high-level comments in our code. We have entire files -- added quite recently, mind you, by established developers -- that have no or almost no documentation on either the class or function level. We can really do better than this! If we had a clear list of requirements for comments in new code, we could start fixme'ing commits that don't have adequate comments. I think that would be enough to get people to add sufficient comments, for the most part. Without clear rules, though, backed up by the threat of reverting, I don't think the situation will improve here.
(Wow, this kind of turned into a thread hijack. :D)
On Tue, Aug 11, 2009 at 8:10 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
[...]
On the whole "new code" front: can we all /please/ remember that we're writing PHP5 here? Visibility on all new functions and variables should also be a must.
-Chad
--- On Tue, 8/11/09, Chad innocentkiller@gmail.com wrote:
Neither of these needs to be tested directly. If AutoLoader breaks, then some other class won't load, and the tests for that class will fail. If wfRunHooks() fails, then some hook won't work, and any test of that hook will fail.
I will pass on commenting about these for the moment because:
I think what's needed for decent test usage for MediaWiki is: [...] (4) is critical here.
All good stuff, especially (4) - applause :-D.
While we're at it, it would be nice if we instituted some other iron-clad policies. Here's a proposal: [...] Without clear rules, though, backed up by the threat of reverting, I don't think the situation will improve here.
Wonderful stuff - more applause.
(Wow, this kind of turned into a thread hijack. :D)
Who cares. It needs to be said.
On the whole "new code" front: can we all /please/ remember that we're writing PHP5 here? Visibility on all new functions and variables should also be a must.
OK, I must admit I didn't understand that, probably because I'm new to PHP. Can you make this more explicit?
Dan
On Tue, Aug 11, 2009 at 8:29 PM, dan nessett dnessett@yahoo.com wrote:
[...]
OK, I must admit I didn't understand that, probably because I'm new to PHP. Can you make this more explicit?
Dan
This is slightly OT and I don't want to hijack the main point (getting our shit together for testing), but...
PHP4 did not have proper object-oriented support. There was no concept of visibility; all member variables and functions were public. PHP5 fixed this by adding the familiar public/protected/private structure, as well as a bunch of other things that a proper OO language should be able to do.
My point is that not declaring visibility on new code (i.e., leaving everything public) is poor form. It usually doesn't take very long to make the decision about what scope to put a function or variable in, and it can always be changed later if the choice was wrong.
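For anyone else new to PHP5, the distinction in one glance (the example class is invented):

<?php
class Example {
    public $title;           // readable and writable from anywhere
    protected $cache;        // this class and its subclasses only
    private $dirty = false;  // this class only

    // Bare PHP4-style declarations ("var $x", "function f()") are
    // implicitly public, which is the habit being discouraged here.
    public function save() {
        $this->dirty = false;
    }

    private function resetCache() {
        $this->cache = null;
    }
}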
-Chad
On Wed, Aug 12, 2009 at 2:38 AM, Chad innocentkiller@gmail.com wrote:
My point is that not declaring visibility on new code (i.e., leaving everything public) is poor form. It usually doesn't take very long to make the decision about what scope to put a function or variable in, and it can always be changed later if the choice was wrong.
Well, you can always go up (private>protected>public), but not necessarily go down without possibly breaking things you are unaware of.
Bryan
"Bryan Tong Minh" bryan.tongminh@gmail.com wrote in message news:fd5886130908120137h13b0aad9xdd2f05c8a362cc4e@mail.gmail.com...
On Wed, Aug 12, 2009 at 2:38 AM, Chad innocentkiller@gmail.com wrote:
My point is that not declaring visibility on new code (i.e., leaving everything public) is poor form. It usually doesn't take very long to make the decision about what scope to put a function or variable in, and it can always be changed later if the choice was wrong.
Well, you can always go up (private>protected>public), but not necessarily go down without possibly breaking things you are unaware of.
True - and not specifying a scope means the visibility defaults to public. Therefore requiring a scope to be declared can only be a good thing.
- Mark Clements (HappyDog)
On 12/08/2009, at 1:16 AM, Chad wrote:
On the whole "new code" front: can we all /please/ remember that we're writing PHP5 here? Visibility on all new functions and variables should also be a must.
I honestly don't think this is that important. We should err on the side of public, not private or protected.
Very frequently, I've had to change visibility on functions because somebody hasn't anticipated a case where a class function might be usefully called from outside that class.
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
On Wed, Aug 12, 2009 at 8:31 AM, Andrew Garrett agarrett@wikimedia.org wrote:
I honestly don't think this is that important. We should err on the side of public, not private or protected.
Very frequently, I've had to change visibility on functions because somebody hasn't anticipated a case where a class function might be usefully called from outside that class.
When writing Maintenance, I tried to make very conscious decisions about what visibility to put on each method. For clearly defined public interfaces, I made the methods public. Some of the setup functions don't need accessing and I probably don't want children implementing them, so I made them private. I think making careful decisions when coding can lead to well-written classes with proper visibility. If a particular method/variable needs to become public at a later point, it can easily be changed.
My main point is not so much "make things private" as "at least look like we've been writing code in the last 4 years".
-Chad
On Wed, Aug 12, 2009 at 8:31 AM, Andrew Garrett agarrett@wikimedia.org wrote:
I honestly don't think this is that important. We should err on the side of public, not private or protected.
Very frequently, I've had to change visibility on functions because somebody hasn't anticipated a case where a class function might be usefully called from outside that class.
On the other hand, I've often wanted to refactor code but been unable to because the functions were public or protected and I didn't want to break any third-party extensions that might call them. If a function is private and could usefully be public, then just make it public, no problem.
The only people who are harmed by too many private functions are people without commit access, who can't just make the functions public. Since these are also the only people whose code could be broken by changing a public function, though -- if the code is committed, then the one who changes the public function can just change the caller -- you can make a good case that it's better to make everything public, and let extension authors rely on the public functions if they want without any guarantee that the functions will remain stable. They'd probably prefer that to having the function private and not being able to even try to use it.
Generally speaking, I always make my functions private if I don't explicitly intend to make them part of the class interface for third-party use. But I can see the other side too, so I don't know if a firm policy is the best course of action.
On Wed, Aug 12, 2009 at 9:24 AM, Victor Vasiliev vasilvv@gmail.com wrote:
What comment would you expect for "public function getArticleText()"?
Well, I can't write a very good comment since I've never read that function or even any significant part of that class until now. (I'm looking at RawPage::getArticleText() here.) But maybe something like:
/**
 * Returns the parsed HTML corresponding to this page for action=raw output,
 * and outputs some HTTP headers.
 *
 * @return string Parsed HTML suitable for echoing
 */
private function getArticleText() {
The documentation is cursory because it's really just a helper method that's designed to be called in exactly one place, so anyone who wants to use it is already making changes to RawPage and will as likely as not want to read and/or change the source of the method.
If you don't want to make it private for some reason, then I'd add @private to the documentation and put a note saying what method you actually want to call. It has various side effects and RawPage-specific logic that make it very unlikely to be useful to any other classes.
If you were referring to a hypothetical public method in, say, Revision, then I'd expect the comment to include all information that callers might like to know, such as:
* Is the text parsed or unparsed? If parsed, what method should be used to get unparsed text? If unparsed, what method to get parsed text?
* What's the performance impact like? Does this do any batching or caching? Does every call result in a database query? Does it call the parser? Does it use memcached? Is it possible that this will return outdated text, or will it definitely be up-to-date?
* Is any kind of permissions checking done? Will this retrieve text for revisions that are deleted, oversighted, rev_deleted, or otherwise hidden? Does it check whether the current user has read permission?
* What happens if there's an error, like the revision doesn't exist? Is an exception thrown, or some magic value returned? What sorts of errors should we expect to see? Can we assume there are no errors? E.g., a low-level function that always returns the text even if deleted, and where the revision object was created with a constructor that verified that the revision was legit, might never return an error unless the database couldn't be reached or was manually tampered with (in which case the request should probably just die anyway).
The answers don't have to be lengthy, and they can refer to documentation of other methods within reason. But they *should* be there, in a form where most MediaWiki developers can understand them. I think all those questions could be answered in about three or four lines, say, depending on what the answers are. I really have to go now, or else I'd commit a good comment for Revision::get(Raw)Text() to show you what I mean.
In fact, Revision::getText() already has 11 lines of comments. Revision::getRawText() has only 2. I think both could be improved a bit.
By the way, is there a policy which describes where we should use get*/set* methods and where we should use public fields?
I always use private member variables; some other developers always use public member variables. It's pretty much the same story as with public/private methods.
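For reference, the two styles in question side by side (the class is invented for illustration):

<?php
class ExampleRevision {
    // Style 1: a public field, read and written directly by callers.
    public $comment;

    // Style 2: a private field behind get*/set* accessors, so the class
    // can later add validation or lazy loading without touching callers.
    private $text;

    public function getText() {
        return $this->text;
    }

    public function setText( $text ) {
        $this->text = $text;
    }
}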
Aryeh Gregor wrote:
Or whatever. We have *way* too few high-level comments in our code. [...]
What comment would you expect for "public function getArticleText()"?
By the way, is there a policy which describes where we should use get*/set* methods and where we should use public fields?
--vvv
Brion Vibber brion@wikimedia.org wrote:
Starting about a week ago, parser test results are now included in code review listings for development trunk:
http://www.mediawiki.org/w/index.php?title=Special:Code/MediaWiki/path&p... Ftrunk%2Fphase3
Regressions are now quickly noted and fixed up within a few revisions -- something which didn't happen when they were only being run manually by a few folks here and there.
Is this the sort of thing you're thinking of?
-- brion
Yes. Absolutely. Sight is critical for action, and running parserTests on each revision in the development trunk is a good first step toward improving code quality.
Dan
--- On Tue, 8/11/09, Chad innocentkiller@gmail.com wrote:
To be perfectly honest, I'm of the opinion that tests/ and t/ should be scrapped and it should all be done over, properly.
What we need is an easy and straightforward way to write test cases, so people are encouraged to write them. Right now, nobody understands wtf is going on in tests/ and t/, so they get ignored and the /vast/ majority of the code isn't tested.
What we need is something similar to parser tests, where it's absurdly easy to pop new tests in with little to no coding required at all. Also, extensions having the ability to inject their own tests into the framework is a must IMHO.
There is a way to easily add tests, but it requires some community discipline. phpunit has a --skeleton command (actually two variations of it) that automatically generates unit tests (see http://www.phpunit.de/manual/current/en/skeleton-generator.html). All developers have to do is add assertions (which have the appearance of comments) to their code and call phpunit with the --skeleton flag. If you want even more hand-holding, Netbeans will do it for you.
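For example (the class is invented; the @assert annotation is what the skeleton generator consumes):

<?php
class Calculator {
    /**
     * @assert (0, 0) == 0
     * @assert (2, 3) == 5
     */
    public function add( $a, $b ) {
        return $a + $b;
    }
}

Running "phpunit --skeleton-test Calculator" would then generate a CalculatorTest class containing one test method per @assert line.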
This is all wonderful, but there are problems:
* Who is going to go back and create all of the assertions for existing code? Not me (at least not alone). This is too big a job for one person. In order for this to work, you need buy in from at least a reasonable number of developers. So far, the number of developers expressing an interest in code quality and testing is pretty small.
* What motivation is there for those creating new code to do the work to add assertions with good code coverage? So far I haven't seen anything in the MW code development process that would encourage a developer to do this. Without some carrots (and maybe a few sticks) this approach has failure written all over it.
* Even if we get a bunch of Unit tests, how are they integrated into a useful whole? That requires some decisions on test infrastructure. This thread begins the discussion on that, but it really needs a lot more.
* MW has a culture problem. Up to this point people just sling code into trunk and think they are done. As far as I can tell, very few feel they have any responsibility for ensuring their code won't break the product. [Perhaps I am being unkind on this. Without any testing tools available, it is quite possible that developers want to ensure the quality of their code, but don't have the means of doing so.]
I realize these observations may make me unpopular. However, someone has to make them. If everyone just gets mad, it doesn't solve the problem. It just pushes it out to a time when it is even more serious.
Dan
On 8/11/09 1:39 PM, dan nessett wrote:
- MW has a culture problem. Up to this point people just sling code into trunk and think they are done. As far as I can tell, very few feel they have any responsibility for ensuring their code won't break the product. [Perhaps I am being unkind on this. Without any testing tools available, it is quite possible that developers want to ensure the quality of their code, but don't have the means of doing so.]
Starting about a week ago, parser test results are now included in code review listings for development trunk:
http://www.mediawiki.org/w/index.php?title=Special:Code/MediaWiki/path&p...
Regressions are now quickly noted and fixed up within a few revisions -- something which didn't happen when they were only being run manually by a few folks here and there.
Is this the sort of thing you're thinking of?
-- brion
Hi,
On 2009-08-11, dan nessett wrote:
There is a way to easily add tests, but it requires some community discipline. phpunit has a --skeleton command (actually two variations of it) that automatically generates unit tests (see http://www.phpunit.de/manual/current/en/skeleton-generator.html). All developers have to do is add assertions (which have the appearance of comments) to their code and call phpunit with the --skeleton flag. If you want even more hand-holding, Netbeans will do it for you.
This is all wonderful, but there are problems:
- Who is going to go back and create all of the assertions for existing code? Not me (at least not alone). This is too big a job for one person. In order for this to work, you need buy in from at least a reasonable number of developers. So far, the number of developers expressing an interest in code quality and testing is pretty small.
This doesn't need to be done straight away; provided all new code has proper unit tests built for it, the old stuff will get done eventually, or when it breaks.
- What motivation is there for those creating new code to do the work to add assertions with good code coverage? So far I haven't seen anything in the MW code development process that would encourage a developer to do this. Without some carrots (and maybe a few sticks) this approach has failure written all over it.
We don't do unit tests properly yet, that's probably why. If there is a decent infrastructure in place then it might be possible to encourage developers to do this.
- Even if we get a bunch of Unit tests, how are they integrated into a useful whole? That requires some decisions on test infrastructure. This thread begins the discussion on that, but it really needs a lot more.
We have started integrating parser tests into code review, so something like this could be done for the unit tests so they won't stagnate, etc.
- MW has a culture problem. Up to this point people just sling code into trunk and think they are done. As far as I can tell, very few feel they have any responsibility for ensuring their code won't break the product. [Perhaps I am being unkind on this. Without any testing tools available, it is quite possible that developers want to ensure the quality of their code, but don't have the means of doing so.]
I completely agree; blatant syntax errors are regularly being checked in, and now that there is a post-commit review process, very few developers actually get someone else to look over their patches before committing them, which is far from ideal IMO.
I realize these observations may make me unpopular. However, someone has to make them. If everyone just gets mad, it doesn't solve the problem. It just pushes it out to a time when it is even more serious.
I don't think you have anything to worry about; you bring up some valid points that are, more often than not, ignored.
Dan
Please can you properly break your lines in e-mail though, to 73(?) characters per line - it should be possible to specify this in your client.
On Tue, Aug 11, 2009 at 2:55 PM, Robert Leverington robert@rhl.me.uk wrote:
Please can you properly break your lines in e-mail though, to 73(?) characters per line - it should be possible to specify this in your client.
That's a joke, right? If you are using pine on a monitor built in the 1980s then you are fully expected to BYOLB (Bring Your Own Line Breaks).
2009/8/11 Brian Brian.Mingus@colorado.edu:
On Tue, Aug 11, 2009 at 2:55 PM, Robert Leverington robert@rhl.me.uk wrote:
Please can you properly break your lines in e-mail though, to 73(?) characters per line - it should be possible to specify this in your client.
That's a joke, right? If you are using pine on a monitor built in the 1980s then you are fully expected to BYOLB (Bring Your Own Line Breaks).
^X to word-wrap the para, isn't it?
- d.
--- On Tue, 8/11/09, Robert Leverington robert@rhl.me.uk wrote:
Please can you properly break your lines in e-mail though, to 73(?) characters per line - it should be possible to specify this in your client.
I'm using the web interface provided by Yahoo. If you can point me in the right direction for setting up Yahoo to do this, I'll be happy to (I've done this manually on this message).
Dan
On 8/11/09 1:03 PM, Chad wrote:
To be perfectly honest, I'm of the opinion that tests/ and t/ should be scrapped and it should all be done over, properly.
What we need is an easy and straightforward way to write test cases, so people are encouraged to write them. Right now, nobody understands wtf is going on in tests/ and t/, so they get ignored and the /vast/ majority of the code isn't tested.
What we need is something similar to parser tests, where it's absurdly easy to pop new tests in with little to no coding required at all. Also, extensions having the ability to inject their own tests into the framework is a must IMHO.
If we don't want to use straight PHPUnit or similar -- where you're writing PHP source code with a tiny bit of structure -- we could either adapt another surrounding structure or make up our own (scarrry!)
Note that PHP's own test suite uses a file-per-test-case structure which is similar to the individual chunks of our parser test definition file, with delimited sections listing the name, some options, the source to run, and the expected output...
php-5.2.5/tests/strings/bug22592.phpt:
--TEST--
Bug #22592 (Cascading assignments to strings with curly braces broken)
--FILE--
<?php
$wrong = $correct = 'abcdef';

$t = $x[] = 'x';

var_dump($correct);
var_dump($wrong);

$correct[1] = '*';
$correct[3] = '*';
$correct[5] = '*';

// This produces the wrong result:
$wrong[1] = $wrong[3] = $wrong[5] = '*';

var_dump($correct);
var_dump($wrong);

?>
--EXPECT--
string(6) "abcdef"
string(6) "abcdef"
string(6) "a*c*e*"
string(6) "a*c*e*"
It might be pretty handy actually to use a similar structure for all our tests, whether run through the parser tester or as PHP code; we could provide a standard setup/teardown environment (eg an implied commandLine.inc include and switch over to standard options and test database) so the code segment doesn't need to be a standalone PHP file.
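Purely as a sketch, a chunk in such a shared format might read like this (the section names are invented, blending the parser-test and .phpt styles):

!! test
Title round-trip
!! php
$t = Title::newFromText( 'sandbox' );
var_dump( $t->getText() );
!! expect
string(7) "Sandbox"
!! end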
Note that some of the parser test cases rely on setup such as saving some pages into the test database; tests such as a search test could also require custom database setup or the use of mock objects, and we should consider this in planning out potential new test system requirements.
-- brion
Chad innocentkiller@gmail.com wrote:
[...] What we need is something similar to parser tests, where it's absurdly easy to pop new tests in with little to no coding required at all. [...]
Is it really so much more difficult to write
| TestAddArticle ("Main Page", "blah blah");
than
| !! article
| Main Page
| !! text
| blah blah
| !! endarticle
or
| TestParser ("External links: trail",
|             "Linktrails should not work for external links: [http://example.com link]s",
|             "<p>Linktrails should not work for external links: <a href=\"http://example.com\" class=\"external text\" rel=\"nofollow\">link</a>s\n</p>");
than
| !! test
| External links: trail
| !! input
| Linktrails should not work for external links: [http://example.com link]s
| !! result
| <p>Linktrails should not work for external links: <a href="http://example.com" class="external text" rel="nofollow">link</a>s
| </p>
| !! end
I think the motivation for using standard techniques is to lower the bar for newcomers to the code and not make them master yet another ("fairly self-explanatory") syntax.
Tim
On Wed, Aug 12, 2009 at 6:40 PM, Tim Landscheidt tim@tim-landscheidt.de wrote:
[...]
I think the motivation for using standard techniques is to lower the bar for newcomers to the code and not make them master yet another ("fairly self-explanatory") syntax.
I don't disagree here. I don't really care what the particular format is. My main point (on which I think we agree) was that it needs to be standardized, easy to learn and use, and flexible enough to cover the variety of tests we could potentially want.
-Chad
On 8/11/09 12:35 PM, dan nessett wrote:
I could use some help on test system architecture - you do wear the systems architect hat :-). It doesn't seem right to use WebStart.php to initialize the tests. For one thing, WebStart starts up profiling, which doesn't seem relevant for a test. That leaves Command.inc.
commandLine.inc would be correct, as they are command-line, not web scripts.
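In other words, a command-line test entry point of that era would begin with the usual boilerplate; a sketch (the final line is just a smoke test):

<?php
// Locate and include commandLine.inc relative to this script. After this
// line the full MediaWiki environment (settings, autoloader, database
// connection) is available, with no web-style profiling started.
require_once( dirname( __FILE__ ) . '/commandLine.inc' );

echo wfGetDB( DB_SLAVE )->getServerVersion() . "\n";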
-- brion
"dn" == dan nessett dnessett@yahoo.com writes:
dn> suggest breaking LocalSettings into two parts
I've always wanted a public part, say linked from Special:Version, "view public initialization settings", with a standard URL, title=Special:PublicLocalSettings perhaps, and all without needing to install an extension.
And a private part, containing passwords, secret spam filters expressions etc.
E.g., take your average hook. One can let users see its name on Special:Version, but to let users see its content... well, one needs to do something like
$ perl -nlwe 'print if /#START PUBLIC/../#END PUBLIC/' \
    LocalSettings.php > PublicLocalSettings.txt
often, and link to that. I.e., _you are not making being open source easy_.
If one could easily see a wiki's public LocalSettings portion, one could quickly answer many messages here on this list with "No wonder, Clodsworth, you have set $wgNorfblatz='maybe', which is an illegal value".
And on http://www.mediawiki.org/wiki/Manual:Wiki_family#Ultimate_minimalist_solutio... I could provide an up to the minute link 'see the real settings behind that wiki', without needing to cut and paste, nor use the above perl solution.
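Nothing stops a wiki from doing the split by hand today; a sketch (the private file name is invented):

<?php
# LocalSettings.php -- everything in this file is safe to publish.
$wgSitename = "Example Wiki";
# ... hooks, spam filter names, other public configuration ...

# Passwords, secret keys and private spam expressions live in a second
# file that is never published.
require_once( dirname( __FILE__ ) . '/PrivateSettings.php' );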
On Wed, Aug 12, 2009 at 1:42 PM, jidanni@jidanni.org wrote:
"dn" == dan nessett dnessett@yahoo.com writes:
dn> suggest breaking LocalSettings into two parts
I've always wanted a public part, say linked from Special:Version, "view public initialization settings", with a standard URL, title=Special:PublicLocalSettings perhaps, and all without needing to install an extension.
And a private part, containing passwords, secret spam filters expressions etc.
E.g., take your average hook. One can let users see its name on Special:Version, but to let users see its content... well one needs to do something like
$ perl -nlwe 'print if /#START PUBLIC/../#END PUBLIC/' \ LocalSettings.php > PublicLocalSettings.txt
often, and link to that. I.e., _you are not making being open source easy_.
If one could easily see a wiki's public LocalSettings portion, one could quickly answer many messages here on this list "No wonder, Clodsworth, you have set $wgNorfblatz='maybe' which is an illegal value".
And on http://www.mediawiki.org/wiki/Manual:Wiki_family#Ultimate_minimalist_solutio... I could provide an up to the minute link 'see the real settings behind that wiki', without needing to cut and paste, nor use the above perl solution.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
We just tried to do the opposite by getting rid of AdminSettings. If wikis want to segment their LocalSettings to have public and private parts, they're more than welcome to. I for one won't commit any code that encourages splitting LocalSettings.
-Chad
2009/8/12 jidanni@jidanni.org:
often, and link to that. I.e., _you are not making being open source easy_.
This has absolutely nothing to do with MediaWiki being open source or not. It's not like there are closed-source wikis out there of equivalent quality that don't have this nuisance.
On topic: I agree with Chad that it's trivial to segment LocalSettings yourself if you really want to.
Roan Kattouw (Catrope)
"RK" == Roan Kattouw roan.kattouw@gmail.com writes:
RK> 2009/8/12 jidanni@jidanni.org:
often, and link to that. I.e., _you are not making being open source easy_...
RK> This has absolutely nothing to do with MediaWiki being open source
...for hook authors.