Interesting, if what you say is true. I assumed that limits and indexes were handled by the primary MySQL engine, not at the data-container level. I also noted from another email about data dumps that Wikipedia was moving to filesystem-based article storage; it makes sense to extend the message-handling-from-file code to cover articles as well in the future. This would leave the DB primarily as an index.
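Something like this is what I have in mind, just a back-of-the-envelope sketch with made-up paths and a made-up schema (not MediaWiki's real tables), using SQLite purely for illustration:

# Hypothetical sketch of "DB as an index": revision text lives on the
# filesystem, the database only maps revisions to file paths.
# Directory name and schema are invented for the example.
import os
import sqlite3

TEXT_DIR = "article-text"          # hypothetical storage directory
os.makedirs(TEXT_DIR, exist_ok=True)

db = sqlite3.connect("index.db")
db.execute("""CREATE TABLE IF NOT EXISTS revision (
                  rev_id INTEGER PRIMARY KEY,
                  page_title TEXT,
                  text_path TEXT)""")

def save_revision(rev_id, title, text):
    # Article text goes to a flat file; the DB row is just a pointer.
    path = os.path.join(TEXT_DIR, "%d.txt" % rev_id)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    db.execute("INSERT INTO revision VALUES (?, ?, ?)", (rev_id, title, path))
    db.commit()

def load_revision(rev_id):
    # Look up the pointer in the index, then read the file.
    (path,) = db.execute(
        "SELECT text_path FROM revision WHERE rev_id = ?", (rev_id,)).fetchone()
    with open(path, encoding="utf-8") as f:
        return f.read()

save_revision(1, "Falcon_(storage_engine)", "Falcon is a MySQL storage engine...")
print(load_revision(1))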
Something that occurred to me that might be a good idea (TM) is to link up to Subversion for the file repositories. This would have big space benefits, as only deltas would be stored, and it would give a more powerful view of the data. Link this up with file-based articles and you have the potential to (relatively) easily produce a standalone wiki engine that could work remotely, much as devs do. I guess the benefit for Wikipedia is slight, so it probably wouldn't happen in MW...
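Very roughly, the kind of thing I'm picturing (a sketch only: the working-copy path and file naming are invented, and no such hook exists in MediaWiki today):

# Rough sketch of the Subversion idea: each saved edit of an article is
# written to a file inside an existing svn working copy and committed,
# so the repository stores deltas between revisions rather than full copies.
import os
import subprocess

WORKING_COPY = "/srv/wiki-articles"    # assumed to be an svn checkout already

def save_revision(title, text, comment):
    path = os.path.join(WORKING_COPY, title.replace("/", "_") + ".wiki")
    is_new = not os.path.exists(path)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    if is_new:
        subprocess.check_call(["svn", "add", path])
    # One commit per edit; the repository stores the change as a delta.
    subprocess.check_call(["svn", "commit", "-m", comment, path])

def get_revision(title, svn_rev):
    # Old revisions come back with 'svn cat -r'; no full copies kept locally.
    path = os.path.join(WORKING_COPY, title.replace("/", "_") + ".wiki")
    return subprocess.check_output(
        ["svn", "cat", "-r", str(svn_rev), path]).decode("utf-8")

The nice part is that 'svn cat -r' hands back any old revision without the wiki itself ever keeping full copies of every version.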
On 1/13/07, Domas Mituzas <midom.lists@gmail.com> wrote:
Alex,
I was reading about MySQL's Falcon engine (which appears to have reached an alpha branch), and was wondering if anyone had tried it with MediaWiki, and for that matter with the whole Wikipedia dataset. Curious to know how it behaves, and how much of an efficiency gain the compressed storage engine gets...
Falcon isn't suitable for running the Wikipedia dataset for now. It doesn't have covering indexes (all reads hit the data rows), it doesn't have the 'ORDER BY ... LIMIT' optimization, it is hungry for filesorts, etc.
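(To make the covering-index point concrete, here is a toy illustration. It is not Falcon or even MySQL, just SQLite's query planner, and the schema is invented for the example; the point is only the difference between a query the index can answer by itself and one that has to touch the data rows as well.)

# Toy demo of a covering index and an index-satisfied ORDER BY ... LIMIT.
# Column names loosely resemble a revision table but are illustrative only.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE revision (
                  rev_id INTEGER PRIMARY KEY,
                  rev_page INTEGER,
                  rev_timestamp TEXT,
                  rev_len INTEGER)""")
db.execute("CREATE INDEX page_timestamp ON revision (rev_page, rev_timestamp)")

def plan(sql):
    # Last column of EXPLAIN QUERY PLAN rows is the human-readable detail.
    return [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql)]

# Every column referenced is in the index: the engine answers from the
# index alone ("COVERING INDEX"), and the index order already satisfies
# ORDER BY ... LIMIT, so there is no sort step and no data-row read.
print(plan("""SELECT rev_timestamp FROM revision
              WHERE rev_page = 42
              ORDER BY rev_timestamp DESC LIMIT 50"""))

# rev_len is not in the index: every index hit also has to fetch the
# underlying data row, which is the situation Falcon is in for all reads.
print(plan("""SELECT rev_timestamp, rev_len FROM revision
              WHERE rev_page = 42
              ORDER BY rev_timestamp DESC LIMIT 50"""))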
MediaWiki's primary engine is InnoDB; some sites may attempt to use MyISAM, though that isn't well supported...
For some Wikipedias, though, BLACKHOLE seems to be the best engine MySQL has produced.
Domas