I realize that mediawiki-l is a good list for reporting bugs in MediaWiki; however, this posting addresses organizational elements of the 1.7 release that I feel need to be taken seriously by the Wikimedia Foundation, along with some changes in your processes. The statements that follow are the result of my 20+ years of experience as 1) a CEO of high tech companies, 2) a Chief Scientist of high tech companies, 3) a Wikipedia developer, and 4) the Wikigadugi Language Project leader. Wikimedia can take or leave the comments -- they are for your benefit and consideration.
1. Great that MediaWiki 1.7.1 is released.
2. It has dependencies on PHP 5.0.0 (5.1.4). PHP 5 is a bug-infested experimental release which does not have consistent installs on the commercial Linux distributions, including Red Hat ES 4, Suse 10, etc.; it crashes on Apache, is poorly documented, has a slew of defects, and is difficult to install and configure.
3. Wikitext and MediaWiki have no dependencies for "wikitext" markup on PHP versions, so requiring PHP upgrades is unnecessary and thoughtless, and should be stopped. When you allow the developers of a technology to operate in a vacuum without accountability to customers and end users, the projects get driven by the community ego and not by end goals to promote the technology to the public.
4. The biggest failure of community-driven programs and open source is a lack of accountability to end users and customers.
What should happen in the future:
1. MediaWiki needs to post releases against commercial Linux distros and not a "roll your own" with a patchwork of patches and fixes. As it stands, MediaWiki 1.7.1 is not installable or usable on RedHat ES4 or FC5 without major patching and recompiling -- steps outside the skills of most folks who need to use Wikipedia content.
2. Wikimedia needs to enforce MediaWiki releases to allow seamless importing and use of XML dumps and content.
My 2 cents.
I currently spend about 40% of my time getting around MediaWiki problems with Wikipedia XML dumps, and 60% of my time actually working on Native language translations. The Uto-Aztecan language is being added at present as the Ute and Unita tribes have joined the WikiGadugi project. It would be nice if the MediaWiki folks would give more thought to ensuring their software works properly before posting releases, and the Wikimedia Foundation needs to enforce compatibility between its published XML dumps and make certain they work properly with released MediaWiki versions. As it stands, the current setup almost appears anti-competitive by design to prevent folks from creating wikis of Wikipedia content (though I do not believe this is intentional on the part of Wikimedia).
The "we are community driven" excuse only works if the community is also the customer and end consumer of Wikipedia content, which it is not. Anything we can do to make all of our jobs easier and more pleasant is appreciated.
</WP:SOAP>
Jeff
Hi Jeff,
Making MediaWiki usable by 3rd parties was Brion's initiative, but it never produced an image of MediaWiki with the same feature set as the one we run on the live site.
Of course, the extensions used on the live site are available to download, but I'm not sure everyone can have, say, math support on every hosting account. Nor mono with our lucene daemon.
On the other hand, we always run a version that is not even released (right now the site runs on 1.8), and there are always features on Wikimedia servers that do not exist in rolled-out packages.
I don't think php5 is such a menace as you've portrayed: big shops (such as Y!) use it, extensions are more developed for php5 than for php4, its OO abilities allow more elegant code (and mediawiki is OO'ish), etc.
php5 packages are available for many Linux distributions (some have them in 'testing' branches, though) - I've been running php5 RPM on my Fedora (even FC3) boxes for quite a while.
If you think that distributions should have better support for MediaWiki, you can ask distributions to do that. They started doing mediawiki packages themselves anyway ;-)
Our primary mission is to run the website; our secondary mission is to provide software and best practices. And those, surprisingly, include php5 in the list.
Domas
Domas Mituzas wrote:
Making MediaWiki usable by 3rd parties was Brion's initiative, but it never produced an image of MediaWiki with the same feature set as the one we run on the live site.
Of course, the extensions used on the live site are available to download, but I'm not sure everyone can have, say, math support on every hosting account. Nor mono with our lucene daemon.
On the other hand, we always run a version that is not even released (right now the site runs on 1.8), and there are always features on Wikimedia servers that do not exist in rolled-out packages.
While I know there are plenty of logistical issues involved, I think from the point of view of the Foundation's charitable mission this is somewhat of a problem, because the point of producing Wikipedia is not just to put up an encyclopedia at *.wikipedia.org, but to produce encyclopedia articles that are Free and reusable by anyone. If the articles we distribute are only fully renderable by unreleased software or software that's particularly complex to set up and configure, that discourages third-party reuse.
Things like the Lucene search are less of an issue, I think; they're website infrastructure useful for browsing the encyclopedia, but not software that's needed to render the articles correctly.
-Mark
Delirium wrote:
While I know there are plenty of logistical issues involved, I think from the point of view of the Foundation's charitable mission this is somewhat of a problem, because the point of producing Wikipedia is not just to put up an encyclopedia at *.wikipedia.org, but to produce encyclopedia articles that are Free and reusable by anyone. If the articles we distribute are only fully renderable by unreleased software or software that's particularly complex to set up and configure, that discourages third-party reuse.
And if that were true, I might agree.
But PHP 5 is neither unreleased nor particularly complex to set up. It's been in production for years, is widely available on thousands of free and for-pay hosting services, and can be installed on any home computer or server in a matter of minutes using any number of package installers.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
Delirium wrote:
While I know there are plenty of logistical issues involved, I think from the point of view of the Foundation's charitable mission this is somewhat of a problem, because the point of producing Wikipedia is not just to put up an encyclopedia at *.wikipedia.org, but to produce encyclopedia articles that are Free and reusable by anyone. If the articles we distribute are only fully renderable by unreleased software or software that's particularly complex to set up and configure, that discourages third-party reuse.
And if that were true, I might agree.
Brion, thanks for offering to help.
Brion, it is true. The XML dumps are not renderable except on the unreleased MediaWiki. Take a look at the following. I am running 1.6.7 with all of the extensions installed against the 20060702 XML dumps. The English and Cherokee versions are both broken. Also, SVG images are not rendered on 1.6.7 AT ALL from these dumps due to a bug in GetHeight and GetWidth in Image.php.
http://www.wikigadugi.org/wiki/Cherokee_Language
http://www.wikigadugi.org/wiki/Ute_Tribe
http://www.wikigadugi.org/wiki/Back_To_The_Future
Jeff
But PHP 5 is neither unreleased nor particularly complex to set up. It's been in production for years, is widely available on thousands of free and for-pay hosting services, and can be installed on any home computer or server in a matter of minutes using any number of package installers.
-- brion vibber (brion @ pobox.com)
foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l
Jeff V. Merkey wrote:
Brion, it is true. The XML dumps are not renderable except on the unreleased MediaWiki. Take a look at the following. I am running 1.6.7 with all of the extensions installed against the 20060702 XML dumps. The English and Cherokee versions are both broken. Also, SVG images are not rendered on 1.6.7 AT ALL from these dumps due to a bug in GetHeight and GetWidth in Image.php.
http://www.wikigadugi.org/wiki/Cherokee_Language
These pages are most likely affected by the difference in parser treatment of embedded HTML in templates when HTML Tidy is engaged.
You can install HTML Tidy (I'm fairly certain current Fedora versions include a package, if not it's in Extras) and enable its use on your wiki. Unfortunately the bug (that the behavior is different) hasn't been resolved yet; fortunately it should be easy enough for you to work around by configuring your site appropriately.
If the image bug isn't fixed in 1.6.8 already, let me know and I'll merge it for 1.6.9.
(By the way, as mentioned the other day we're pretty offtopic for Foundation-L.)
-- brion vibber (brion @ pobox.com)
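The workaround Brion describes (installing HTML Tidy and enabling it on the wiki) could be sketched as the following lines added to an existing LocalSettings.php. This is only a sketch: the binary path is an assumption for a typical Fedora install and is not given anywhere in the thread.

```php
// Added to an existing LocalSettings.php (which already starts with <?php).

// Run parser output through HTML Tidy, so that embedded HTML in templates
// is cleaned up the same way it is on the Wikimedia sites.
$wgUseTidy = true;

// Location of the tidy executable. ASSUMPTION: adjust to wherever your
// distribution's package installs it.
$wgTidyBin = '/usr/bin/tidy';
```

With Tidy disabled, malformed HTML inside templates reaches the browser as-is, which may explain pages that render on Wikipedia but appear broken on a local import.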
Brion Vibber wrote:
Jeff V. Merkey wrote:
Brion, it is true. The XML dumps are not renderable except on the unreleased MediaWiki. Take a look at the following. I am running 1.6.7 with all of the extensions installed against the 20060702 XML dumps. The English and Cherokee versions are both broken. Also, SVG images are not rendered on 1.6.7 AT ALL from these dumps due to a bug in GetHeight and GetWidth in Image.php.
http://www.wikigadugi.org/wiki/Cherokee_Language
These pages are most likely affected by the difference in parser treatment of embedded HTML in templates when HTML Tidy is engaged.
You can install HTML Tidy (I'm fairly certain current Fedora versions include a package, if not it's in Extras) and enable its use on your wiki. Unfortunately the bug (that the behavior is different) hasn't been resolved yet; fortunately it should be easy enough for you to work around by configuring your site appropriately.
If the image bug isn't fixed in 1.6.8 already, let me know and I'll merge it for 1.6.9.
(By the way, as mentioned the other day we're pretty offtopic for Foundation-L.)
Well, the technical discussion is certainly off topic. The discussion about requiring XML dumps to work on released MediaWiki versions is something Wikimedia folks need to think about and be aware of, since it's clear folks were under the assumption that the XML dumps work "as is", which we are seeing is not the case.
It's clear to me that everyone is on the same page here on this point, and the thread was useful to establish this for everyone. Let's move the discussion offline and you and I can dialogue about it.
Jeff
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
Delirium wrote:
While I know there are plenty of logistical issues involved, I think from the point of view of the Foundation's charitable mission this is somewhat of a problem, because the point of producing Wikipedia is not just to put up an encyclopedia at *.wikipedia.org, but to produce encyclopedia articles that are Free and reusable by anyone. If the articles we distribute are only fully renderable by unreleased software or software that's particularly complex to set up and configure, that discourages third-party reuse.
And if that were true, I might agree.
But PHP 5 is neither unreleased nor particularly complex to set up. It's been in production for years, is widely available on thousands of free and for-pay hosting services, and can be installed on any home computer or server in a matter of minutes using any number of package installers.
I was speaking of the use of unreleased versions of MediaWiki and complex/undocumented extension setups (which extensions *are* enabled on Wikipedia anyway? I know LaTeX and Cite are; is that all?). PHP 5 is fine of course.
-Mark
On 7/11/06, Delirium delirium@hackish.org wrote:
I was speaking of the use of unreleased versions of MediaWiki and complex/undocumented extension setups (which extensions *are* enabled on Wikipedia anyway? I know LaTeX and Cite are; is that all?). PHP 5 is fine of course.
-Mark
http://en.wikipedia.org/wiki/Special:Version
Delirium wrote:
I was speaking of the use of unreleased versions of MediaWiki and complex/undocumented extension setups (which extensions *are* enabled on Wikipedia anyway? I know LaTeX and Cite are; is that all?). PHP 5 is fine of course.
Here are the extensions that affect content rendering currently in use:
Cite/Cite.php ParserFunctions/ParserFunctions.php
A few wikis also have: intersection/DynamicPageList.php
Additionally enabled and used in the edittools but not generally in content: CharInsert/CharInsert.php
Math and SVG rendering are also in use on Wikipedia, so should be enabled if taking our current content straight.
$wgLegalTitleChars should have "+" appended to it for compatibility. This will most likely be added to the default in the next revision, unless I forget about it again.
You can find a fuller list of extensions at Special:Version.
-- brion vibber (brion @ pobox.com)
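For anyone importing the dumps, the settings listed in the message above might translate into a LocalSettings.php fragment along these lines. It is a minimal sketch, assuming the extensions have been unpacked under the wiki's extensions/ directory using the same relative paths Brion gives. SVG rendering and the DynamicPageList extension are left out, since the thread gives no configuration details for them.

```php
// Added to an existing LocalSettings.php (which already starts with <?php
// and defines $IP as the MediaWiki install path).

// Extensions that affect content rendering.
require_once( "$IP/extensions/Cite/Cite.php" );
require_once( "$IP/extensions/ParserFunctions/ParserFunctions.php" );

// Used in the edit tools rather than in article content.
require_once( "$IP/extensions/CharInsert/CharInsert.php" );

// Math rendering; the texvc helper and a LaTeX toolchain must be
// installed separately for this to produce output.
$wgUseTeX = true;

// Allow "+" in page titles so that titles from the dumps are accepted.
$wgLegalTitleChars .= '+';
```

The last line appends to the default value rather than redefining it, so it keeps working if a later release changes the default character set.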
Brion Vibber wrote:
Delirium wrote:
I was speaking of the use of unreleased versions of MediaWiki and complex/undocumented extension setups (which extensions *are* enabled on Wikipedia anyway? I know LaTeX and Cite are; is that all?). PHP 5 is fine of course.
Here are the extensions that affect content rendering currently in use:
Cite/Cite.php ParserFunctions/ParserFunctions.php
A few wikis also have: intersection/DynamicPageList.php
Additionally enabled and used in the edittools but not generally in content: CharInsert/CharInsert.php
Math and SVG rendering are also in use on Wikipedia, so should be enabled if taking our current content straight.
$wgLegalTitleChars should have "+" appended to it for compatibility. This will most likely be added to the default in the next revision, unless I forget about it again.
You can find a fuller list of extensions at Special:Version.
-- brion vibber (brion @ pobox.com)
Brion,
I downloaded 1.6.8 last night and the SVG rendering was still broken. I can hack the script and get it working, but I would rather wait for you to post a 1.6.9 release. Also, the HTML Tidy suggestion is not an option -- we should fix our bugs rather than use an external program to clean up the mess left by the bugs :-).
I got 1.7 installed; however, the page rendering is still trashed. Would you like an account on wikigadugi.org to poke around in my database and MediaWiki setup to see if there's a problem there or for some insights? I am open to giving you one to help me determine how best to solve the issues of "as is" XML dumps so other folks supporting Wikimedia won't encounter this problem again.
Jeff
Domas Mituzas wrote:
Hi Jeff,
Making MediaWiki usable by 3rd parties was Brion's initiative, but it never produced an image of MediaWiki with the same feature set as the one we run on the live site.
Thank you for the thoughtful and informative response.
Of course, the extensions used on the live site are available to download, but I'm not sure everyone can have, say, math support on every hosting account. Nor mono with our lucene daemon.
Yes, I have all of these and they work.
On the other hand, we always run a version that is not even released (right now the site runs on 1.8), and there are always features on Wikimedia servers that do not exist in rolled-out packages.
This needs to be reviewed. Running 1.8 on your site is not an issue, but creating XML dumps which are incompatible with released versions is. This is what needs to be addressed.
I don't think php5 is such a menace as you've portrayed: big shops (such as Y!) use it, extensions are more developed for php5 than for php4, its OO abilities allow more elegant code (and mediawiki is OO'ish), etc.
php5 packages are available for many Linux distributions (some have them in 'testing' branches, though) - I've been running php5 RPM on my Fedora (even FC3) boxes for quite a while.
php 5 crashes and is very unstable for a production server without patching and a lot of effort to get it working.
If you think that distributions should have better support for MediaWiki, you can ask distributions to do that. They started doing mediawiki packages themselves anyway ;-)
Our primary mission is to run the website; our secondary mission is to provide software and best practices. And those, surprisingly, include php5 in the list.
Well, I would assume that the mission should be to make Wikipedia's content pervasive. One simple fix to this is to provide XML dumps which work with released MediaWiki versions. As it stands, from an external perspective, the current model appears anti-competitive and monopolistic (though, as I said before, I do not believe this is intentional).
Please consider providing an XML branch which is compatible with released MediaWiki versions. This is not something that should be negotiable or optional -- it should be mandated as a requirement for publishing dumps.
Jeff
Domas
Jeff V. Merkey wrote:
This needs to be reviewed. Running 1.8 on your site is not an issue, but creating XML dumps which are incompatible with released versions is. This is what needs to be addressed.
The XML dump format has not changed since 1.5, and should remain backwards compatible for the foreseeable future.
For wiki-agnostic republishing purposes, we're considering making more easily used HTML dumps available.
If you want the wikitext source, though, you do need to keep in mind that extensions, bug fixes, etc will continue to evolve along with Wikimedia's needs, so the exact rendering of given input text may necessarily change a bit over time.
If you have a specific concern, please let us know.
-- brion vibber (brion @ pobox.com)
Oooh! HTML dumps!
I'll certainly be looking forward to that!
On 7/10/06, Brion Vibber brion@pobox.com wrote:
Jeff V. Merkey wrote:
This needs to be reviewed. Running 1.8 on your site is not an issue, but creating XML dumps which are incompatible with released versions is. This is what needs to be addressed.
The XML dump format has not changed since 1.5, and should remain backwards compatible for the foreseeable future.
For wiki-agnostic republishing purposes, we're considering making more easily used HTML dumps available.
If you want the wikitext source, though, you do need to keep in mind that extensions, bug fixes, etc will continue to evolve along with Wikimedia's needs, so the exact rendering of given input text may necessarily change a bit over time.
If you have a specific concern, please let us know.
-- brion vibber (brion @ pobox.com)
Jeff V. Merkey wrote:
- Has Dependencies on PHP 5.0.0 (5.1.4). php 5 is a bug infested experimental release which does not have consistent installs on the commercial Linux distributions, including Red Hat ES 4, Suse 10, etc, crashes on apache, is poorly documented, and has a slew of defects, is difficult to install and configure.
PHP 4 is infested with bugs too, the difference is that the PHP devs refuse to backport fixes for them. PHP 5 has had 18 releases now, over the course of 3 years, and is widely available on commercial web hosts.
It's not the first time we've required PHP updates, we dropped support for PHP 4.1.x and then later 4.2.x.
What should happen in the future:
- MediaWiki needs to post releases against commercial Linux distros and not a "roll your own" with a patchwork of patches and fixes. As it stands, MediaWiki 1.7.1 is not installable or usable on RedHat ES4 or FC5 without major patching and recompiling -- steps outside the skills of most folks who need to use Wikipedia content.
- Wikimedia needs to enforce MediaWiki releases to allow seamless importing and use of XML dumps and content.
I wasn't aware that the linux distros weren't including PHP 5 yet. Here's Wikimedia's install script in case it helps:
http://noc.wikimedia.org/~tstarling/install-php
It should work on FC 3, 4 and maybe 5.
My 2 cents.
I currently spend about 40% of my time getting around MediaWiki problems with Wikipedia XML dumps, and 60% of my time actually working on Native language translations. The Uto-Aztecan language is being added at present as the Ute and Unita tribes have joined the WikiGadugi project. It would be nice if the MediaWiki folks would give more thought to ensuring their software works properly before posting releases, and the Wikimedia Foundation needs to enforce compatibility between its published XML dumps and make certain they work properly with released MediaWiki versions. As it stands, the current setup almost appears anti-competitive by design to prevent folks from creating wikis of Wikipedia content (though I do not believe this is intentional on the part of Wikimedia).
The "we are community driven" excuse only works if the community is also the customer and end consumer of Wikipedia content, which it is not. Anything we can do to make all of our jobs easier and more pleasant is appreciated.
MediaWiki is primarily written for Wikimedia, development has always been driven by Wikimedia's needs. It's no coincidence that we dropped PHP 4 support not long after we upgraded the Wikimedia cluster to PHP 5. MediaWiki is not "community driven" if you take the community to be the external sysadmin community. It is driven by the community of Wikimedia users.
I don't know what your problem with the XML dumps is, but I'm sure if there is a problem with the format then it can be fixed. If you tell us what it is, of course.
Nod to Brion and Domas, I seem to be repeating some of the points they have made.
-- Tim Starling
Thanks for the script. This helps and I appreciate it.
Jeff