When I suggested a state-machine-based parser, people laughed. I
guess history counts for something, after all.
The best word to describe the current codebase (and the
technologies used) is probably "sophomoric." This is not to belittle those
responsible for phase3, but like you say, it's a hack. When you're
forced to scale faster than the abilities of the handful of people on
tap, you can't always do everything right, and everyone involved has
done a bang-up job of ensuring that everything works to this day with
practically zero downtime.
Personally (and this almost always incites a riot when proposed in a
public forum), I'd like to see MediaWiki rewritten in a language other
than PHP, or at least have the critical parsing functions moved to
something more efficient than calling preg_replace inside recursive
functions (that is to say, anything). Yes, people have proposed this
since the beginning of time, to the point that developers habitually
roll their eyes and crack jokes about "the benefits of [language of
the week]" at the mere suggestion, but Wikimedia (read: Wikipedia)
won't scale automagically.
I understand a real object store is in the works, and if done right,
that's a huge hurdle overcome. Table-based SQL servers can only scale
so far, especially when you're using the same one you used for your
high school bulletin board project. Special care has to be paid to
issues like replication, of course, or the ultimate benefit is nil or
negative.
There are myriad issues to consider, and I'd like to see everything on
the table. The WMF development cliche has long been "you can't just
keep throwing resources at the problem," but mantras don't solve
problems by themselves.
Flame away, guys.
--
main(){char*s="2)$%\3404(%\3407!,253\312",i=0;
while(i["]&&down==!up&&s["]&&putchar('@'+(i++)
[s]));return!i;} //
http://www.austinhair.org/