Just a reminder of work in progress and general background, for those who might be commenting without being aware of present work...
First, in MediaWiki 1.5 we've made a major schema change, intended to reduce the number of changes to data rows that have to be made and to slim down the amount of data that has to be pulled per-row when scanning non-bulk-text metadata.
Specifically, the 'cur' and 'old' tables are being split into 'page', 'revision', and 'text'. Lists of pages won't be trudging through large page text fields, and operations like renames of heavily-edited pages won't have to touch 15000 records. This will also give us the potential to move the bulk text to a separate replicated object store to keep the core metadata DBs relatively small and limber (and cacheable).
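As a rough illustration of why the split helps, here is a minimal sketch using Python's built-in sqlite3 module. The column names are simplified assumptions made for the example, not the definitive 1.5 schema; the point is only that listing pages never reads bulk text and that a rename updates a single row no matter how many revisions exist.

```python
import sqlite3

# In-memory database; the columns below are a simplified, assumed layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (
    page_id     INTEGER PRIMARY KEY,
    page_title  TEXT NOT NULL UNIQUE,
    page_latest INTEGER              -- rev_id of the current revision
);
CREATE TABLE revision (
    rev_id      INTEGER PRIMARY KEY,
    rev_page    INTEGER NOT NULL,    -- points at page.page_id
    rev_text_id INTEGER NOT NULL,    -- points at text.old_id
    rev_comment TEXT
);
CREATE TABLE text (
    old_id   INTEGER PRIMARY KEY,
    old_text TEXT NOT NULL           -- the bulk wikitext lives only here
);
""")


def add_revision(conn, title, body, comment=""):
    """Store one edit: bulk text, revision metadata, and the page pointer."""
    conn.execute("INSERT OR IGNORE INTO page (page_title) VALUES (?)", (title,))
    page_id = conn.execute(
        "SELECT page_id FROM page WHERE page_title = ?", (title,)
    ).fetchone()[0]
    text_id = conn.execute(
        "INSERT INTO text (old_text) VALUES (?)", (body,)
    ).lastrowid
    rev_id = conn.execute(
        "INSERT INTO revision (rev_page, rev_text_id, rev_comment) VALUES (?, ?, ?)",
        (page_id, text_id, comment),
    ).lastrowid
    conn.execute(
        "UPDATE page SET page_latest = ? WHERE page_id = ?", (rev_id, page_id)
    )
    conn.commit()
    return rev_id


def rename_page(conn, old_title, new_title):
    """Rename a page; returns the number of rows that had to be changed."""
    cur = conn.execute(
        "UPDATE page SET page_title = ? WHERE page_title = ?",
        (new_title, old_title),
    )
    conn.commit()
    return cur.rowcount


# A heavily edited page: 1000 revisions, one page row.
for i in range(1000):
    add_revision(conn, "Main Page", "x" * 1000, "edit %d" % i)

# Listing pages scans only the small 'page' table.
titles = [row[0] for row in conn.execute("SELECT page_title FROM page")]
rows_touched = rename_page(conn, "Main Page", "Front Page")
print(titles, rows_touched)
```

In the old layout the title would be repeated on every old revision row, so the same rename would have to rewrite all of them.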
Talk to icee in #mediawiki if interested in the object store; he's working on a prototype for us, to use for image uploads and potentially bulk text storage.
Second, remember that each wiki's database is independent. It's very likely that at some point we'll want to split out some of the larger wikis to separate master servers; aside from localizing disk and cache utilization, this could provide some fault isolation in that a failure in one master would not affect the wikis running off the other master.
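The fault-isolation argument can be sketched in a few lines of Python. The wiki names, host names, and the lookup table are all hypothetical and made up for illustration; nothing here reflects an actual server layout.

```python
# Hypothetical assignment of wikis to master servers (illustrative names only).
MASTER_FOR_WIKI = {
    "enwiki": "db-master-a",
    "dewiki": "db-master-b",
    "jawiki": "db-master-b",
}
DEFAULT_MASTER = "db-master-a"  # assumed home for every wiki not split out


def master_for(wiki):
    """Return the master server that holds this wiki's independent database."""
    return MASTER_FOR_WIKI.get(wiki, DEFAULT_MASTER)


def affected_wikis(failed_master, wikis):
    """List the wikis that lose write access when one master fails."""
    return sorted(w for w in wikis if master_for(w) == failed_master)


all_wikis = ["enwiki", "dewiki", "jawiki", "frwiki"]
print(affected_wikis("db-master-b", all_wikis))
```

Because each wiki's database is independent, the mapping is a plain per-wiki lookup, and a failure of one master leaves the wikis on the other master untouched.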
Third, we're expecting to have at least two new additional data centers soon in Europe and the US. Initially these are probably going to be squid proxies since that's easy for us to do (we have a small offsite squid farm in France currently in addition to the squids in the main cluster in Florida) but local web server boxen pulling from local slaved databases at least for read-only requests is something we're likely to see, to move more of the load off of the central location.
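A minimal sketch of that kind of routing, in Python, might look like the following. The host names are invented, and treating GET and HEAD requests as the read-only ones is an assumption made purely for the example.

```python
# Hypothetical hosts; a real deployment would name and discover these differently.
CENTRAL_MASTER = "master.central.example"
LOCAL_REPLICAS = {
    "europe": "replica.europe.example",
    "us-west": "replica.us-west.example",
}


def choose_database(http_method, datacenter):
    """Send read-only requests to a local replica, everything else to the master."""
    read_only = http_method.upper() in ("GET", "HEAD")  # assumption for this sketch
    if read_only and datacenter in LOCAL_REPLICAS:
        return LOCAL_REPLICAS[datacenter]
    # Writes, and reads from sites without a local replica, go to the center.
    return CENTRAL_MASTER


print(choose_database("GET", "europe"))
print(choose_database("POST", "europe"))
```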
Finally, people constantly bring up the 'PostgreSQL cures cancer' bugaboo. 1.4 has experimental PostgreSQL support, which I'd like to see as a first-class supported configuration for the 1.5 release. This is only going to happen, though, if people pitch in to help in testing, bug fixing, and of course make some benchmarks and failure mode tests compared to MySQL! If you ever want Wikimedia to consider switching, the software needs to be available to make it work and it needs to be demonstrated as a legitimate improvement with a feasible conversion.
Domas is the PostgreSQL partisan on our team and wrote the existing PostgreSQL support. If you'd like to help you should probably track him down; in #mediawiki you'll usually find him as 'dammit'.
-- brion vibber (brion @ pobox.com)
On Fri, 18 Mar 2005 22:33:19 -0800, Brion Vibber brion@pobox.com wrote:
> [...]
Thank you very much for this update. I just have some questions.
Where will deleted articles be? Tables archive_page, archive_revision, archive_text? Will it just be like the current version?
At what stage (beta, rc, etc) of 1.5 will you be using it on Wikimedia sites? Will you wait for 1.4.0 first?
Is a Bugzilla search for the new features possible? Can you provide a link or some instructions?
Will you bring up http://test.wikipedia.org again soon?
Is the HEAD branch in CVS the one which will become REL1_5? (I don't understand CVS very much)
Is that branch currently usable without major developer knowledge? Might I find anything really cool if I tried to use it?
On Fri, 18 Mar 2005 22:33:19 -0800, Brion Vibber brion@pobox.com wrote:
> ... in MediaWiki 1.5 we've made a major (database) schema change
I'd like to know whether the new schema is a permanent solution to "Delete + undelete cycle doesn't preserve old_id" (http://bugzilla.wikimedia.org/show_bug.cgi?id=603).
Is the schema change already done?
Then I can finally publish my patch for http://bugzilla.wikipedia.org/show_bug.cgi?id=536 ("For pages on watchlist save last version number") and http://bugzilla.wikipedia.org/show_bug.cgi?id=804 ("Last-visited revision repository").
Tom
Thomas Gries wrote:
> On Fri, 18 Mar 2005 22:33:19 -0800, Brion Vibber wrote:
>> ... in MediaWiki 1.5 we've made a major (database) schema change
> I'd like to know whether the new schema is a permanent solution to "Delete + undelete cycle doesn't preserve old_id" (http://bugzilla.wikimedia.org/show_bug.cgi?id=603).
> Is the schema change already done?
Yes, that change has been made. At this point revision IDs should indeed be preserved across deletion/undeletion.
-- brion vibber (brion @ pobox.com)
Tomer Chachamu wrote:
> Thank you very much for this update. I just have some questions.
> Where will deleted articles be? Tables archive_page, archive_revision, archive_text? Will it just be like the current version?
At the moment it uses an archive table only slightly modified from 1.4 (it also includes a stored revision ID so that this can be restored on undeletion). However, I expect to change this soon to make it easier to, e.g., show the existence of deleted items in page contributions, or individual revisions in a page's history which have been removed for copyvio etc. A rev_deleted flag is likely; I'm not quite sure yet how best to handle it on the page side.
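To illustrate the idea of storing the revision ID in the archive so it survives an undelete, here is a small Python/sqlite3 sketch. The table layout and column names are simplified assumptions for the example only, not the actual archive table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- AUTOINCREMENT keeps SQLite from handing out a deleted revision's ID again.
CREATE TABLE revision (
    rev_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    rev_page    INTEGER NOT NULL,
    rev_comment TEXT
);
-- Assumed archive layout: the original revision ID is stored alongside the row.
CREATE TABLE archive (
    ar_rev_id  INTEGER NOT NULL,
    ar_page    INTEGER NOT NULL,
    ar_comment TEXT
);
""")


def delete_page(conn, page_id):
    """Move all revisions of a page into the archive, keeping their IDs."""
    with conn:  # one transaction: copy, then remove
        conn.execute(
            "INSERT INTO archive (ar_rev_id, ar_page, ar_comment) "
            "SELECT rev_id, rev_page, rev_comment FROM revision WHERE rev_page = ?",
            (page_id,),
        )
        conn.execute("DELETE FROM revision WHERE rev_page = ?", (page_id,))


def undelete_page(conn, page_id):
    """Restore archived revisions under their original revision IDs."""
    with conn:
        conn.execute(
            "INSERT INTO revision (rev_id, rev_page, rev_comment) "
            "SELECT ar_rev_id, ar_page, ar_comment FROM archive WHERE ar_page = ?",
            (page_id,),
        )
        conn.execute("DELETE FROM archive WHERE ar_page = ?", (page_id,))


def revision_ids(conn, page_id):
    """Return the revision IDs currently visible for a page."""
    rows = conn.execute(
        "SELECT rev_id FROM revision WHERE rev_page = ? ORDER BY rev_id", (page_id,)
    )
    return [row[0] for row in rows]


# Page 7 gets revisions 1, 2 and 4; page 8 gets revision 3.
with conn:
    for page_id in (7, 7, 8, 7):
        conn.execute(
            "INSERT INTO revision (rev_page, rev_comment) VALUES (?, ?)",
            (page_id, "edit"),
        )

ids_before = revision_ids(conn, 7)
delete_page(conn, 7)
ids_while_deleted = revision_ids(conn, 7)
with conn:  # an unrelated edit happens while page 7 is deleted
    conn.execute("INSERT INTO revision (rev_page, rev_comment) VALUES (8, 'edit')")
undelete_page(conn, 7)
ids_after = revision_ids(conn, 7)
print(ids_before, ids_while_deleted, ids_after)
```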
> At what stage (beta, rc, etc) of 1.5 will you be using it on Wikimedia sites?
Generally we've declared "beta" at the point at which we're willing to roll out the software on some of our production wikis for live testing.
> Will you wait for 1.4.0 first?
I'm pretty sure 1.4.0 will be out first. :) Just finishing up some language file tweaks etc.
> Is a Bugzilla search for the new features possible? Can you provide a link or some instructions?
You can check the RELEASE-NOTES, if anyone's updated them. Probably not. ;)
> Will you bring up http://test.wikipedia.org again soon?
I'm planning to set up a 1.5 test wiki on my personal site soon. I took down the old test.wikipedia.org because we've had a history of having what are in fact security holes checked in on the dev branch and not noticed for a while, and I really don't want that running on our production servers at this stage.
> Is the HEAD branch in CVS the one which will become REL1_5? (I don't understand CVS very much)
Right; at some point we'll create a REL1_5 branch which will freeze the state of the trunk.
> Is that branch currently usable without major developer knowledge? Might I find anything really cool if I tried to use it?
It's still going through some upheaval, so I wouldn't recommend it.
-- brion vibber (brion @ pobox.com)
Thanks a lot for all this information!
Brion Vibber wrote:
>> Is the HEAD branch in CVS the one which will become REL1_5? (I don't understand CVS very much)
> Right; at some point we'll create a REL1_5 branch which will freeze the state of the trunk.
Do you have a rough plan how long implementations of new features will be accepted before 1.5 goes into strict bug-fixing mode?
Thanks,
MaPhi
Hello,
On Sunday 20 March 2005 11:13, Brion Vibber wrote:
> MaPhi Werner wrote:
>> Do you have a rough plan how long implementations of new features will be accepted before 1.5 goes into strict bug-fixing mode?
> Perhaps as long as a month.
Will v1.4 be released as a final before that? I am just asking because I am holding off on updating an (internal) wiki: I don't want to update it to a beta/rc version only to find the stable version released the next day :)
(Maybe it would be a good idea to put a roadmap or "planned release date" on www.mediawiki.org)
Best wishes,
Tels
PS: Brion, and everybody else, thanx for all your work, very much appreciated!
-- Signed on Sun Mar 20 13:06:24 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
I'm a Wei-wei-wei-wei-wei-wow-wow-wow-wow-wow-wow-wizzzaahrd...
Tels wrote:
> Maybe it would be a good idea to put a roadmap or "planned release date"
Hello,
On Sunday 20 March 2005 11:59, Thomas Gries wrote:
> Tels wrote:
>> Maybe it would be a good idea to put a roadmap or "planned release date"
Duh! :) (A more prominent link at www.mediawiki.org would be a good idea, though)
Best wishes,
Tels
-- Signed on Sun Mar 20 12:35:13 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"People who are rather more than six feet tall and nearly as broad across the shoulders often have uneventful journeys. People jump out at them from behind rocks then say things like, 'Oh. Sorry. I thought you were someone else.'" -- Terry Pratchett
Hello,
On Sunday 20 March 2005 12:35, Tels wrote:
> [...]
Btw, roadmap says:
"Note: This page is out of date and was never very accurate."
I think it would be a very good idea to have a fixed release date for version X and then strictly stick to it instead of having prolonged "beta" seasons.
The situation is a bit like with Perl 5.8.0 - it got pushed farther and farther away and meanwhile everybody shipped v5.6.x because 5.8.0 was still labeled as experimental/beta/not ready yet.
With Perl v5.8.x there are now fixed release deadlines and they are strictly adhered to, which has resulted in quite a few released versions - currently 5.8.7 is at the door.
That's IMHO much better than having v1.4 "nearly ready" since 2004-12-03 :)
Best wishes,
Tels
-- Signed on Sun Mar 20 12:38:09 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"Laugh and the world laughs with you, snore and you sleep alone." -- Unknown
Dear Tels,
> roadmap says: "Note: This page is out of date and was never very accurate."
I just wanted to prompt you and the other developers to update this page, if there is time.
> I think it would be a very good idea to have a fixed release date for version X and then strictly stick to it instead of having prolonged "beta" seasons.
Strict planning is very difficult because of the many issues within this multi-site international development.
Tels wrote:
> With Perl v5.8.x there are now fixed release deadlines and they are strictly adhered to, which has resulted in quite a few released versions - currently 5.8.7 is at the door.
Remember we've been doing two to three major version releases a year, which is not absurdly slow. Fashionably 'aggressive' projects like Gnome, Fedora Core, etc. are generally targeting similar 6-month release intervals, which people sometimes complain are too fast. :)
mediawiki-stable: 2003-08-29
1.1.0: 2003-12-08
1.2.0: 2004-03-24
1.3.0: 2004-08-12
1.4.0: 2005-03-20 [approx]
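As a rough check of the "two to three releases a year" figure, the gaps between the dates listed above can be computed with a short Python sketch. The only inputs are the dates from the list; the 1.4.0 date is the approximate one given there, so the last interval is approximate as well.

```python
from datetime import date

# Release dates as listed above; 1.4.0 is marked approximate in the source.
RELEASES = [
    ("mediawiki-stable", date(2003, 8, 29)),
    ("1.1.0", date(2003, 12, 8)),
    ("1.2.0", date(2004, 3, 24)),
    ("1.3.0", date(2004, 8, 12)),
    ("1.4.0", date(2005, 3, 20)),
]


def release_intervals(releases):
    """Return the number of days between each pair of consecutive releases."""
    dates = [released for _, released in releases]
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]


intervals = release_intervals(RELEASES)
mean_days = sum(intervals) / len(intervals)
releases_per_year = 365.25 / mean_days

print(intervals)                    # days between consecutive releases
print(round(releases_per_year, 1))  # average pace over the whole period
```

The average gap comes out at a little under five months, which is in line with the stated pace.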
> That's IMHO much better than having v1.4 "nearly ready" since 2004-12-03 :)
Well, I wouldn't say it's been "nearly ready" since December 3. A *lot* of important bug fixes got done over the next three months, fixes which happened because we rolled it out into use in an environment where it would get a lot of stress testing and where we could deploy our fixes immediately at a point where we wouldn't dare declare it ready for the public. :)
Many people don't keep up with releases well once they've set up their sites, so I'm pretty leery about releasing a new 'stable' version too early.
However, I had hoped to get a formal release out a bit earlier than this (and would have done so last month, had I not run into security bugs I wanted to do further testing and investigation on; that led me to put out a release candidate instead of a final release). I'll go ahead and push out 1.4.0 tomorrow; I'm pretty happy with the current state.
-- brion vibber (brion @ pobox.com)
Hello,
On Sunday 20 March 2005 14:49, Brion Vibber wrote:
> [...]
Sorry if I sounded "pushy" - I didn't actually want to complain about a slow release cycle - I merely wanted a sort of schedule for when 1.4 is expected (e.g. next day, next week or next year :)
If one knows the next stable release is 6 months away, one can update now and enjoy the features; if it is due next week, we don't need to bother upgrading to RC1, because by the time the upgrade is done, 1.4 final is already out :o)
I know it is hard to release stuff, because it is never as ready as one wants it :o)
Thanx again,
Best wishes,
Tels
-- Signed on Sun Mar 20 14:51:37 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"Hi, hi, hitotai, space taxi to the sky"
On Mar 19, 2005 10:33 PM, Brion Vibber brion@pobox.com wrote:
> I'm planning to set up a 1.5 test wiki on my personal site soon. I took down the old test.wikipedia.org because we've had a history of having what are in fact security holes checked in on the dev branch and not noticed for a while, and I really don't want that running on our production servers at this stage.
Brion: Are you still planning to do this? When?
To download from CVS and set stuff up is a lot of effort just to see what's new. Of course, if somebody could *tell* me, or link to an appropriate Bugzilla query or log, that would be good too.
Tomer Chachamu wrote:
> On Mar 19, 2005 10:33 PM, Brion Vibber brion@pobox.com wrote:
>> I'm planning to set up a 1.5 test wiki on my personal site soon.
> Brion: Are you still planning to do this? When?
Yes, when I get around to it...
> To download from CVS and set stuff up is a lot of effort just to see what's new. Of course, if somebody could *tell* me, or link to an appropriate Bugzilla query or log, that would be good too.
Most of the changes are internal schema changes and thus not really visible from the outside.
-- brion vibber (brion @ pobox.com)
> To download from CVS and set stuff up is a lot of effort just to see what's new. Of course, if somebody could *tell* me, or link to an appropriate Bugzilla query or log, that would be good too.
You could read this: http://cvs.sourceforge.net/viewcvs.py/*checkout*/wikipedia/phase3/RELEASE-NO...
wikitech-l@lists.wikimedia.org