Of course we can split en: (by namespace, by first letter, by whatever you want).
Sure. Such splitting assumes that we'd be able to do joins across multiple servers. Wiki has quite a lot of relations inside, and even having interwiki is quite a PITA. On the other hand, if we 'split' and use other database engines, we might not use relational databases at all. Those were designed for relations; we need storage.
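To make the join problem concrete, here is a rough sketch, not anything we actually run. It assumes a made-up scheme where pages are placed on one of four shards by the first letter of the title, and each shard keeps the outgoing links of its own pages. The names shard_for, Shard, save_page and what_links_here are all hypothetical, and titles are assumed to be non-empty.

```python
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_for(title, num_shards=NUM_SHARDS):
    """Pick a shard from the first letter of the title (made-up scheme)."""
    return ord(title[0].lower()) % num_shards


class Shard:
    """Stands in for one database server holding a slice of the wiki."""

    def __init__(self):
        self.pages = {}                # title -> text
        self.links = defaultdict(set)  # source title -> target titles


shards = [Shard() for _ in range(NUM_SHARDS)]


def save_page(title, text, targets):
    """Store the page and its outgoing links on the page's own shard."""
    shard = shards[shard_for(title)]
    shard.pages[title] = text
    shard.links[title] = set(targets)


def what_links_here(target):
    """A join-like query: link rows live with their source pages,
    so every shard has to be asked and the results merged in the
    application instead of in a single SQL join."""
    sources = []
    for shard in shards:
        for source, targets in shard.links.items():
            if target in targets:
                sources.append(source)
    return sorted(sources)
```

Saving a page touches one shard, but a "what links here" style query fans out to all of them, and that is the part a single relational server would otherwise do for us.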
What was the purpose of this dismissive remark? It is clearly better than Wikipedia's current state, judging from the server speed and the frequency of database error messages.
It has a different task to deal with. If we had to deal with wiki pages as separate entities without too many connections with the outside world, having more async information paths, etc., we could work with far fewer servers. But it's a collaboration platform, and it requires what collaboration needs: real-time information management. And sure, some more async stuff should come in the future, but that could mean moving lots of tasks from relational databases to various event brokers, etc.
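A minimal sketch of what "more async" could look like, under loud assumptions: an in-process queue.Queue stands in for a real event broker, and pages, link_index, save_page and start_worker are hypothetical names, not MediaWiki code. The edit itself is stored synchronously, while the derived link data is updated later by a worker.

```python
import queue
import threading

events = queue.Queue()  # stands in for an event broker (assumption)
pages = {}              # primary data: title -> text
link_index = {}         # derived data, updated asynchronously


def save_page(title, text, targets):
    """Store the edit right away; defer the link bookkeeping."""
    pages[title] = text
    events.put(("links_changed", title, list(targets)))


def worker():
    """Consume events until a None sentinel arrives."""
    while True:
        event = events.get()
        if event is None:
            events.task_done()
            break
        kind, title, targets = event
        if kind == "links_changed":
            link_index[title] = targets
        events.task_done()


def start_worker():
    """Start the consumer thread and return it."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The trade-off is visible even here: between save_page returning and the worker catching up, link_index is stale, which is exactly the kind of delay a real-time collaboration platform has to be careful with.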
And yes, we could serve an HTML dump with extreme performance. There'd be no database errors, and you'd love the speed. A single P4 may handle 4000 reqs/s :) That's far surpassing our current cluster speed ;-)
Domas
wikitech-l@lists.wikimedia.org