New subject: Link table updates (was Re: Longterm software strategy)

9 May 2005

There are a couple possibilities here. One is to rewrite MediaWiki in
some other language entirely, say Python or C# or whathaveyou.

The other is to keep refactoring the PHP codebase (and it's been much
changed since you left it, Lee) and, optionally, rewrite particular
hotspots in another language.

A possible middle road is to rewrite the core wiki engine to a separate
daemon, and adapt the existing PHP user interface to call into it for
much of the backend work that actually touches data.

The disadvantages of a big rewrite are the disadvantages that all
rewrites carry: a long lead time, getting behind in the game as we lose
useful code and break things again that had been solved, not to mention
introducing brand new exciting bugs. Even the phase 3 rewrite, which was
helpful in many areas, broke a lot of features, reintroduced previously
solved bugs, and set back localization work by many months.

Possible advantages of a complete or core rewrite include performance
enhancements (a JITed or precompiled language could perform better on
some tasks than bytecode-interpreted PHP) and a change in server
architecture.

In particular, PHP tends to impose an architecture where each request is
served by an entirely new script invocation: you have to build any
information up from scratch on each hit, and sharing things like
localization tables between invocations is kind of hard. In another
language it might be easier to run MediaWiki as a standlone server,
which can keep shared data in memory and use it transparently from each
thread.

On the other hand, it's not clear that that's the biggest burden, and we
would still have to synchronize changes in cached data between multiple
servers. Even within PHP we could probably make better use of local
shared memory caches if necessary.

Parsing and rendering speed, as Tim has said, really *is* something we
can throw hardware at. A faster rewritten or plug-in parser might let us
do more with the same hardware, but scaling it is straightforward either
way.

Our biggest, nastiest burden is with internal communications: the
database changes too much. We have to wait on things getting sent,
received, applied, and copied around, and lagging databases send
everything into the toilet fast.

We ask the database too often for information that hasn't changed. We
make a lot of roundtrips, making it hard to make efficient use of
remotely hosted web servers. We make huge writes that produce huge lag
in the replication updates, and then we talk to the master too much
because that lagging replication breaks basic things like page views in
annoying ways.

The new schema is designed to help with these things by minimizing what
has to get written to the database. (We still need to fix up the links
tables to really do this right.) A separate cache-purging daemon could
further help in dealing with things like template invalidation in a
consistent way in the background. Better distinguishing between requests
that need to be absolutely current and requests where it's ok to load a
page that's 30 seconds out of date could make much better use of the
slave servers.

As for the code architecture within MediaWiki itself: originally it was
really really hackish, with lots of cut-n-pasted code and SQL queries
embedded in the middle of HTML output generation. Over time we've been
refactoring things bit by bit to a cleaner interface where backend code
is centralized and consistent, and frontend code can concentrate on UI.
This isn't fully done, but it's still proceeding.

If there's a really strong reason for a rewrite, then we should start
planning a (cleanly implemented, *compatible*, complete) rewrite, making
use of the existing parser tests and other tests in existence and yet to
be written to make sure it'll really be able to take over. If we do, we
should probably target the end of this year or early next year for
taking it live. (No need to rush though; if we're going to rewrite, the
point is to plan ahead and do it right.)

I'm not totally convinced that there is a really strong reason, though.

-- brion vibber (brion @ pobox.com)