On Sat, 12 Jul 2003, Timwi wrote:
- BLOBs that store article text are combined in the same table as
meta-data (e.g. date, username of a change, change summary, minor flag, etc.). This is bad because variable-length fields like BLOBs negatively affect the performance of reading the table.
How much of a difference does this make when we're usually taking single rows found via an index?
One table, 'userproplist', stores the possible user properties (userprops), and another table, 'userprop', stores what user has what userprop with what value. This way all that is needed for adding a new userprop is adding a single row to the 'userproplist' table.
Potentially interesting...
- BLOBs that fulfill the same function (article text) are scattered
across two tables (cur and old).
Yeah, this is bad mojo. They should be combined in one 'revisions' table.
- You are using a 'recentchanges' table which, I presume, gets also
updated with every edit. This, I assume is the idea behind it, allows the 'Recent Changes' page to quickly grab the most recent changes without having to find them elsewhere in the DB.
There were two reasons for this: first, cur and old being separate and no good way in MySQL to read them together. Second, old mysql couldn't do an inverse sort on an index, so sorting by reverse timestamp was, well, suckage. :) This is remedied already using a hackish 'inverse_timestamp' column, which itself is no longer really needed on mysql 4 which can sort descending on indexes.
Recentchanges would be irrelevent with cur and old combined as they ought to be, plus a key in the revisions table for "first edit" so we know which to mark as "new".
- Store translated website text, so translators don't have to dig
through PHP code and submit a file to the mailing list.
We certainly could do this, though there are performance concerns whenever the idea is brought up. Caching the strings in shared memory may alleviate this.
There's been some talk of adapting the translation system we use at some of Esperanto cxe Interreto's sites, such as http://lernu.net/, so the interface and source-file-scanner doesn't have to be written from scratch. I haven't looked at the table structure used, but I imagine it's a fairly straightforward language-key-string triplet set.
- A global table for bidirectional inter-wiki links. People should not
have to add the same link to so many articles.
There's an experimental table for interwiki links, but it's not entirely the best setup. It's questionable whether bidirectional is really right, though, as there's not always a 1:1 matchup between articles.
Are you all still convinced that adapting the current code to all these radical changes is easier than rewriting it all from scratch? :-)
Yes, certainly.
-- brion vibber (brion @ pobox.com)