Interesting, if what you say is true. I assumed that limits and indexes were
handled by the primary MySQL engine, not at the data container level. I also
noted from another email about data dumps that Wikipedia was moving to
filesystem-based article storage - it makes sense to extend the
message-handling-from-file code to cover articles too in the future. This
would leave the DB primarily as an index.
Something that occurred to me that might be a good idea (TM) is to link up to
Subversion for the file repositories. This would have big space benefits, as
only deltas would be stored, and it would give a more powerful view of the
data. Link this up with file-based articles and you have the potential to
(relatively) easily produce a standalone wiki engine that could work
remotely, much as devs do. I guess the benefit for Wikipedia is slight, so it
probably wouldn't happen in MW...
On 1/13/07, Domas Mituzas <midom.lists(a)gmail.com> wrote:
Alex,
I was reading about MySQL's Falcon engine (which appears to have reached an
alpha branch), and was wondering if anyone had tried it with MediaWiki, and
for that matter the whole Wikipedia dataset. Curious to know how it behaves,
and how much efficiency the compressed storage engine gets...
Falcon isn't suitable, for now, for running the Wikipedia dataset. It doesn't
have covering indexes (all reads hit data rows), it doesn't have the
'ORDER BY ... LIMIT' optimization, it is hungry for filesorts, etc.
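To make the covering-index point concrete, here is a small sketch using SQLite's in-memory engine purely to illustrate the concept (the table and column names echo MediaWiki's revision table, but the schema here is simplified and hypothetical, and SQLite is of course not InnoDB or Falcon):

```python
import sqlite3

# Simplified, hypothetical stand-in for MediaWiki's revision table.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE revision (
    rev_id INTEGER PRIMARY KEY,
    rev_page INTEGER,
    rev_timestamp TEXT)""")

# A composite index on (rev_page, rev_timestamp) "covers" the query below:
# every column the query touches lives in the index, so no data row is
# ever read, and the ORDER BY ... LIMIT can walk the index in order
# instead of sorting the whole result set.
conn.execute(
    "CREATE INDEX page_timestamp ON revision (rev_page, rev_timestamp)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT rev_timestamp FROM revision "
    "WHERE rev_page = 1 ORDER BY rev_timestamp DESC LIMIT 10"
).fetchall()
for row in plan:
    # The plan should report a search USING COVERING INDEX, with no
    # separate sort step.
    print(row[-1])
```

An engine without covering indexes (Falcon, per the above) would have to fetch each data row anyway, and without the ORDER BY ... LIMIT optimization it would sort the full result before discarding all but ten rows.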
MediaWiki's primary engine is InnoDB; some sites may attempt to use MyISAM,
though that isn't well supported...
For some wikipedias, though, BLACKHOLE seems to be the best engine MySQL has
produced.
Domas
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/wikitech-l