Hi Rob,
If you only need to use MyISAM for full-text matching, then use it for
the "searchindex" table, and use InnoDB for the other tables.
Why should I use InnoDB in this case? I do not plan to use transactions. The Wikipedia tables are only processed and matched on a single computer with very few (if any) concurrent connections.
Leaving transactions aside, what are the main advantages of InnoDB? Is it faster, more stable, more space-efficient, or better suited to large files and databases like the Wikipedia dumps?
Alex, http://www.meshine.info (very interested in your answers..)
Alexander Hölzel CEO EUTROPA AG
============================ EUTROPA Aktiengesellschaft Oelmüllerstrasse 9, D-82166 Gräfelfing, Tel 089 87130900, Fax 089 87130902 ============================
On 8/7/07, MESHine Team <alerts@meshine.info> wrote:
Leaving transactions aside, what are the main advantages of InnoDB? Is it faster, more stable, more space-efficient, or better suited to large files and databases like the Wikipedia dumps?
A lot more stable. Even if you don't use transactions, an individual query that is only partially completed (for whatever reason) will be automatically rolled back rather than leaving the database in an inconsistent state. This means that table corruption is theoretically impossible as long as a) there are no bugs in InnoDB and b) there are no bugs in the OS/hardware (e.g. a disk write cache flush reporting success when the data hasn't actually been flushed).
This is especially nice for large tables, because on MyISAM you have to run REPAIR TABLE if/when a table gets corrupted. That can take a couple of hours for a table of just a few million rows, and if the repair isn't optimal (i.e. it's using the keycache instead of sorting), it can take a couple of days. Even when repairing by sorting, the time scales linearly with table size at best, and more likely as n log n (it is, after all, a sort). So I'm guessing days and days even for repair by sorting with Wikimedia-sized tables.
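As a rough illustration of that scaling argument, here is a minimal Python sketch that extrapolates a measured repair time to a larger table under either a linear or an n log n growth model. The function name is hypothetical, and the baseline (2 hours for 3 million rows) and the target row count are assumed stand-ins, not measurements. Real repair times depend heavily on hardware, indexes, and server configuration.

```python
import math


def extrapolate_repair_hours(base_rows, base_hours, target_rows, model="linear"):
    """Scale a measured repair time to a larger table under a growth model.

    model="linear": time grows in proportion to the row count.
    model="nlogn":  time grows like n * log(n), as for a comparison sort.
    """
    if base_rows <= 1 or target_rows <= 1:
        raise ValueError("row counts must be greater than 1")
    ratio = target_rows / base_rows
    if model == "linear":
        return base_hours * ratio
    if model == "nlogn":
        return base_hours * ratio * (math.log(target_rows) / math.log(base_rows))
    raise ValueError(f"unknown model: {model!r}")


if __name__ == "__main__":
    # Assumed baseline: "a couple of hours" for "a few million rows".
    base_rows, base_hours = 3_000_000, 2.0
    # Assumed target size, chosen only for illustration.
    target_rows = 300_000_000
    for model in ("linear", "nlogn"):
        hours = extrapolate_repair_hours(base_rows, base_hours, target_rows, model)
        print(f"{model}: about {hours / 24:.1f} days")
```

The n log n estimate is always the larger of the two for a bigger table, so the linear figure is best read as a lower bound.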
So yeah, I've just learned all that the hard way on the forum I run. ;) Of course, the disadvantage of InnoDB is that it's somewhat slower when there is little lock contention. When there is a lot of contention, it's obviously much faster due to its more granular (row-level rather than table-level) locking.
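For the mixed-engine setup suggested at the top of the thread (MyISAM only for "searchindex", InnoDB for everything else), a minimal Python sketch that generates the corresponding ALTER TABLE statements might look like this. The helper name and the example table list are assumptions for illustration. Check the output against your actual schema before running anything, since converting a large table rebuilds it and can take a long time.

```python
def engine_statements(tables, fulltext_tables=("searchindex",)):
    """Return one ALTER TABLE statement per table.

    Tables listed in fulltext_tables stay on MyISAM (for full-text
    matching); every other table is switched to InnoDB.
    """
    statements = []
    for table in tables:
        engine = "MyISAM" if table in fulltext_tables else "InnoDB"
        statements.append(f"ALTER TABLE `{table}` ENGINE={engine};")
    return statements


if __name__ == "__main__":
    # Illustrative table list (an assumption); substitute the tables
    # that actually exist in your database.
    for statement in engine_statements(["page", "revision", "text", "searchindex"]):
        print(statement)
```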
wikitech-l@lists.wikimedia.org