RE: [Wikitech-l] Database schema: let's start the fun
by Constans, Camille (C.C.)
>
>Ashar Voultoiz wrote:
>> Hello !
>>
>> Some time ago there was a lot of discussion about a database schema
>> change. I am wondering if that's a dead-end horse or if we can
>> eventually start the discussion again and eventually get the changes
>> in for version 1.4?
>
>The main holdup so far has been the knowledge that we'll need to convert
>the current Wikipedias, so any change needs to be constrained to reduce
>downtime.
>
>The biggest changes that we'd like to do are:
>
>* Put current and old revisions together, to avoid having to juggle data
>between cur and old all the time and to allow a consistent revision
>numbering scheme (permalinks for current revision as well as the olds)
Personally, I don't like the idea of putting old and cur together; it will make it more difficult to set up a dump job :)
>* Separate page information like the title from the individual revision
>data, so page renames of longstanding pages don't bog down the server
>touching thousands of rows
Not really needed if 1 row = 1 page (old+cur)?
>* Separate the actual *text* out of the revision information table so
>that contribs, history, etc don't pull so much junk into memory.
Yes, separate it; that will improve the speed of the special pages :)
>And possibly:
>* Add language-specific title sort key field
Is it possible?
>* Add case-insensitive 'canonical title match' field
For disambiguation?
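
To make the proposed split easier to discuss, here is a rough sketch of a page / revision / text layout, using Python's sqlite3 module only because it runs anywhere. All table and column names (page, revision, text_store, page_latest, ...) are assumptions made up for illustration, not an agreed schema, and SQLite stands in for the real MySQL servers.

```python
import sqlite3

# Hypothetical three-table layout; every name here is illustrative only.
SCHEMA = """
CREATE TABLE page (
    page_id     INTEGER PRIMARY KEY,
    page_title  TEXT NOT NULL UNIQUE,
    page_latest INTEGER              -- rev_id of the current revision
);
CREATE TABLE text_store (
    text_id   INTEGER PRIMARY KEY,
    text_body TEXT NOT NULL          -- the bulky wikitext lives only here
);
CREATE TABLE revision (
    rev_id        INTEGER PRIMARY KEY,
    rev_page      INTEGER NOT NULL REFERENCES page(page_id),
    rev_text      INTEGER NOT NULL REFERENCES text_store(text_id),
    rev_user      TEXT NOT NULL,
    rev_timestamp TEXT NOT NULL
);
"""


def add_revision(db, page_id, user, timestamp, body):
    """Store the text, record the revision, and mark it as the page's latest."""
    text_id = db.execute(
        "INSERT INTO text_store (text_body) VALUES (?)", (body,)
    ).lastrowid
    rev_id = db.execute(
        "INSERT INTO revision (rev_page, rev_text, rev_user, rev_timestamp) "
        "VALUES (?, ?, ?, ?)",
        (page_id, text_id, user, timestamp),
    ).lastrowid
    db.execute(
        "UPDATE page SET page_latest = ? WHERE page_id = ?", (rev_id, page_id)
    )
    return rev_id


def rename_page(db, page_id, new_title):
    """Rename a page; returns the number of rows touched."""
    cur = db.execute(
        "UPDATE page SET page_title = ? WHERE page_id = ?", (new_title, page_id)
    )
    return cur.rowcount


def history(db, page_id):
    """List revisions of a page without reading any revision text."""
    return db.execute(
        "SELECT rev_id, rev_user, rev_timestamp FROM revision "
        "WHERE rev_page = ? ORDER BY rev_id",
        (page_id,),
    ).fetchall()


def demo():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    page_id = db.execute(
        "INSERT INTO page (page_title) VALUES (?)", ("Old_title",)
    ).lastrowid
    for day in range(1, 4):
        add_revision(db, page_id, "someone", f"200409{day:02d}000000", "x" * 1000)
    touched = rename_page(db, page_id, "New_title")
    return touched, history(db, page_id)


if __name__ == "__main__":
    print(demo())
```

In this sketch a rename updates a single page row however many revisions exist, and the history query never reads text_store, which is the effect the two "separate" items in the list are after.
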
>Now as you can imagine changing the cur and old tables is about the most
>traumatizing thing we could do to the servers. :)
Well, doing it on ariel should be fast; the main problem is space :) We don't currently have enough to create a big temporary table.
>A 'minimal effort' change could be to move all cur revisions into old,
>then rebuild a 'page' table from the cur data and let 'old' be a
>revisions table. I believe Jamesday has been experimenting with seeing
>how slow/fast this kind of conversion might be...
I think all changes should be tested first on suda, and all queries we use should be benchmarked.
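
As a starting point for that kind of test, here is a toy sketch of the 'minimal effort' conversion with a timer around it. It assumes heavily simplified cur and old tables (the real ones have many more columns), a hypothetical page table with page_id, page_title and page_latest, and SQLite in memory instead of MySQL, so the timing it reports says nothing about the real servers.

```python
import sqlite3
import time


def make_toy_db(pages=100, old_per_page=5):
    """Build simplified stand-ins for the cur and old tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE cur (cur_id INTEGER PRIMARY KEY, cur_title TEXT UNIQUE,
                          cur_text TEXT, cur_timestamp TEXT);
        CREATE TABLE old (old_id INTEGER PRIMARY KEY, old_title TEXT,
                          old_text TEXT, old_timestamp TEXT);
        CREATE INDEX old_title_idx ON old (old_title);
    """)
    for p in range(pages):
        title = f"Page_{p}"
        db.execute(
            "INSERT INTO cur (cur_title, cur_text, cur_timestamp) VALUES (?, ?, ?)",
            (title, "current text", "20040901000000"),
        )
        db.executemany(
            "INSERT INTO old (old_title, old_text, old_timestamp) VALUES (?, ?, ?)",
            [(title, f"old text {i}", f"200408{i + 1:02d}000000")
             for i in range(old_per_page)],
        )
    db.commit()
    return db


def convert(db):
    """Move cur into old, rebuild page from cur, and return elapsed seconds."""
    start = time.perf_counter()
    with db:  # commit on success, roll back on error
        db.execute(
            "CREATE TABLE page (page_id INTEGER PRIMARY KEY, "
            "page_title TEXT UNIQUE, page_latest INTEGER)"
        )
        # Step 1: every current revision becomes one more row in old.
        db.execute(
            "INSERT INTO old (old_title, old_text, old_timestamp) "
            "SELECT cur_title, cur_text, cur_timestamp FROM cur"
        )
        # Step 2: one page row per cur row. The rows copied in step 1 have the
        # highest old_id for their title, so MAX(old_id) is the latest revision.
        db.execute(
            "INSERT INTO page (page_id, page_title, page_latest) "
            "SELECT cur_id, cur_title, "
            "(SELECT MAX(old_id) FROM old WHERE old_title = cur_title) FROM cur"
        )
        # Step 3: cur is now redundant; old acts as the revisions table.
        db.execute("DROP TABLE cur")
    return time.perf_counter() - start


if __name__ == "__main__":
    toy = make_toy_db()
    print(f"conversion took {convert(toy):.4f} s on the toy data")
```

Note that step 1 appends to old in place instead of building a second copy, so the extra space needed is roughly the size of cur; whether that holds on MySQL would have to be checked on the test server.
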
shaihulud