Hi DanB,
Comments inline:
On Thu, Jan 12, 2012 at 8:31 AM, Daniel Barrett <danb(a)vistaprint.com> wrote:
As MediaWiki 1.19 is getting ready, I'd like to offer information on how
MediaWiki 1.18.0 was the most difficult MW upgrade I've ever been through.
Some background: my team administers an internal wiki at a major company
with ~2000 users, over 100 extensions (many of them custom/unreleased), and
100K articles. I've been upgrading MW regularly since 1.11 - every release
and patch - and have never had this much trouble before, mainly because of
extensions that broke in 1.18. The typical MW upgrade takes me a day or
two including regression-testing our extensions. But 1.18 has taken me
weeks and I'm still not done.
Ugh...sorry to hear that.
This message is meant to be constructive & helpful, not blameful: it's
quite possible that every issue was "our fault" for not keeping up on
exactly which functions & globals were being deprecated, etc. I'd just like
to describe what kinds of things broke for a reasonably active wiki run by
well-meaning people, and to document how we fixed them.
This is very helpful, thank you.
I understand your frustration on a lot of these points, and I hope we can
do better in future releases. A lot of the problems you point out here are
issues where we broke backwards compatibility without really good reason to
do so. It's a tough balance, because we also want to reduce our technical
debt, but I think we're probably too haphazard in our approach to nuking
and modifying interfaces.
There are a few things that folks like yourself can do to avoid these
surprises:
1. Look through your logs for deprecation warnings now and when you get
1.18 fully running.
2. Start testing 1.19 (trunk) *now* rather than waiting for the release.
You may be able to catch a gratuitous interface change while there's still
time to revert it, saving yourself the trouble of updating your code and
saving others from going through the breakage you're experiencing now.
3. Release the source code to your extension, either directly on our site,
or on github/gitorious/wherever in anticipation of being able to mirror
your work in our shiny new Git repo. Our devs are generally pretty good
about updating extensions that are checked into our repository.
4. If you can't release the source for whatever reason, help write unit
tests for the APIs that matter to you, so that you can track when they
break or are changed.
5. If you don't have time to help write unit tests, help identify those
APIs you'd like to see have unit tests. I don't know if we have a central
place to collect "most wanted unit tests", but I'm sure something like that
could be started if you're interested in participating at that level.
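To make point 1 a bit more concrete, here's a rough sketch of a script that tallies deprecation warnings from a debug log (assuming you have one enabled, e.g. via $wgDebugLogFile). The line format it matches ("Use of X is deprecated" / "Use of X was deprecated ...") is my assumption about what the warnings look like, the function name countDeprecations is made up, and the sample log lines are invented for illustration, so adjust the pattern to whatever your logs actually contain.

```javascript
// Tally deprecation warnings per function name in a MediaWiki debug log.
// ASSUMPTION: warnings contain "Use of <name> is deprecated" or
// "Use of <name> was deprecated ..."; change the pattern if yours differ.
function countDeprecations(logText) {
  const pattern = /Use of (\S+) (?:is|was) deprecated/;
  const counts = new Map();
  for (const line of logText.split(/\r?\n/)) {
    const match = pattern.exec(line);
    if (match) {
      counts.set(match[1], (counts.get(match[1]) || 0) + 1);
    }
  }
  // Return [name, count] pairs, most frequent first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Invented sample lines, only to show the shape of the input.
const sampleLog = [
  'Use of Xml::hidden is deprecated.',
  'some unrelated debug output',
  'Use of MessageCache::addMessage is deprecated.',
  'Use of Xml::hidden is deprecated.'
].join('\n');

for (const [name, count] of countDeprecations(sampleLog)) {
  console.log(count + ' x ' + name);
}
```

In practice you'd read the real log with Node's fs.readFileSync and pass its contents in; the functions at the top of the list are the ones to fix first.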
I vaguely remember some of the changes you outline below, and I think some
of them even stung us during the 1.18 deploy. I'm interested in
understanding better why these changes were made.
More inline:
1. The global variable $action disappeared, breaking a bunch of our
extensions. I switched to $wgRequest->getVal('action').
I'll assume Chad is correct that this was never intended to be a stable
global.
2. The removal of Xml::hidden() caused one of our extensions to
break. I switched to Xml::input(..., array('type' => 'hidden')).
This one bit us during the 1.18 release cycle, and it looks like we fixed
it for ourselves:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97784
...but forgot to also put it back in 1.18 (and trunk for that matter).
Aaron made that fix in the middle of the rather hectic 1.18 deployment
cycle, so I can see why we missed it, but it's still a shame. Given that
this will probably also bite us in 1.19, we should probably backport to
trunk, and REL1_18 for those people that haven't upgraded yet.
3. A few of our older extensions were not ported to ResourceLoader yet
and were adding JS and CSS via $wgOut->add... calls. They worked in 1.17
and all broke in 1.18. I ported them to use ResourceLoader, but this is not
a good solution yet because of bug 31676 (the 32-stylesheet limit of IE,
https://bugzilla.wikimedia.org/show_bug.cgi?id=31676) which IMHO is a
very serious time-bomb waiting to explode. I hope it makes it into
"1.19wmf deployment" as planned.
Is this all versions of IE?
4. Some of our parser tag extensions had a bug, in that they didn't
return a value in the tag callback. (These tags had no visual display.)
This didn't cause problems in 1.17 and earlier, but in 1.18.0 it caused a
UNIQ.....QINU string to render on the page. I fixed our extensions to
return the empty string, and the problem went away.
Yup, it's going to be difficult for us to make MediaWiki releases
bug-for-bug compatible on extensions.
5. The removal of $wgMessageCache->addMessage() broke many extensions,
some ours and some from mediawiki.org like SimpleForms. Some fixes just
required use of the i18n file. Our more difficult issue was that we were
injecting system messages into articles to add tracking categories. On
advice from this list (thanks!), we used code patterned after
Parser::addTrackingCategory() to inject categories and it works fine,
actually much better than what we had.
I see the change here:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/81027
...but it doesn't look like there was much in the way of mailing list
discussion about this, and I also don't see that the README was updated
when this change was made.
Chad claims this was on a clear path to deprecation, but I'm not so sure it
was, based on the history of this page:
http://www.mediawiki.org/wiki/Manual:$wgMessageCache
It looks like it was marked for deprecation here:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/52503
...which means it was deprecated starting in 1.16. That's old, but not
ancient history. Ubuntu 10.04 ships with MediaWiki 1.15, and that's the
current LTS version of Ubuntu. While I understand all of the arguments for
downloading our tarball rather than using the installed package, it's
reasonable to expect that someone would have the same version of MediaWiki
that ships with their LTS box, regardless of where they got it. There may
be a better litmus than this, but this one seems to be a good compromise
between what seems to be the expectation around these parts ("1.16? that's
o-o-o-ld") and other more conservative, less predictable distros (e.g. RHEL
and Debian) that are commonly used in production environments.
This bug appears to be one consequence of this:
https://bugzilla.wikimedia.org/show_bug.cgi?id=32962
I'm not saying that we can never break backwards compatibility with
whatever reasonable litmus we choose, but that we should do so much more
reluctantly than we currently do it. There should be a tangible,
compelling feature (or complicated bug fix) that results from the breakage,
not merely cleanup.
6. The removal of ts_makeSortable() from wikibits.js threw off a bunch
of our JavaScript: we were using the function to sort on a different column
than the first one on render, and in extensions that create tables within
dialogs. We left the problem unfixed until I can understand the new jQuery
UI way of doing things (jquery.ui.sortable.js).
Yup, we ended up getting hit with some table sorting bugs, too, some of
which may not be solved on our wikis.
7. Nearly 100% of our customizations to WikiEditor 1.17 broke in
1.18. We had followed the documented rules on mediawiki.org, using
extensions, ResourceLoader, etc., and everything worked in 1.17.
Nevertheless in 1.18, toolbars and menus disappeared in IE. Menus appeared
multiple times instead of once in Firefox. JavaScript objects in one module
became undefined in others, even with proper dependencies. Some of these
issues are still not worked out, but most were fixed by a variety of
changes.
The dust is probably still settling on ResourceLoader in many ways, so
it's not too surprising that there are problems here. It may be that there
were some changes that could have been postponed or done in a more
backwards-compatible way, but without getting into the details, I can't say
that confidently.
The fact that these are problems in a specific browser is indicative of
problems that may just be tough to avoid. We've found it's hard enough
making sure core functionality works between releases in a cross-browser
way, let alone trying to make sure arbitrary developer modifications also
work.
8. Our MediaWiki:common.js stopped running on the login page. I
realize this was a security fix; it just took me by surprise. Fixed by
writing a custom extension using the hook UserLoginForm to inject the few
lines of JS we needed, and I'm evaluating other non-JS solutions for more
security.
Yeah, those are always going to be ugly.
9. The addHandler() function in JavaScript does not seem to work in
IE8 anymore. We worked around this by using jQuery's "bind" function.
Is there a bug for this problem?
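For anyone else who hits this, the workaround might look roughly like the sketch below. The helper name bindEvent and the '#wpSave' selector are just illustrative choices of mine, not anything from DanB's code; the only real API used is jQuery's .bind(eventName, handler). The jQuery function is passed in as a parameter so the helper doesn't rely on a global.

```javascript
// Attach an event handler via jQuery's .bind() rather than addHandler().
// `$` is the jQuery function (passed in explicitly), `target` is a selector
// or DOM element, `eventName` is e.g. 'click', `handler` is the callback.
function bindEvent($, target, eventName, handler) {
  return $(target).bind(eventName, handler);
}

// Usage on a wiki page, guarded so the snippet is harmless where jQuery
// is not loaded. '#wpSave' is only an example selector.
if (typeof jQuery !== 'undefined') {
  bindEvent(jQuery, '#wpSave', 'click', function () {
    // custom behaviour goes here
  });
}
```

Going through jQuery means the cross-browser event plumbing is jQuery's problem rather than ours, which is presumably why it sidesteps the IE8 issue.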
At this point, our test wiki is stable and I am not anticipating any
further large issues, so we should roll out in the next two weeks or so.
Glad to hear this is still on track!
Thanks for reading, and I hope this helps someone,
Very helpful, and I hope this results in a good conversation.
Rob