Brion Vibber wrote:
On Sat, 12 Jul 2003, Timwi wrote:
- BLOBs that store article text are combined in the same table as
meta-data (e.g. date, username of a change, change summary, minor flag, etc.). This is bad because variable-length fields like BLOBs negatively affect the performance of reading the table.
How much of a difference does this make when we're usually taking single rows found via an index?
Hm. Good point. I haven't thought about it in that much detail -- I've only been "taught" this from experience on LJ. I think it has something to do with hard disk seeks and stuff -- very technical. Regardless, though, it should be clear that at least *updating* a table with variable-length fields is quite a lot more complex than updating one without.
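To make the proposed split concrete, here is a minimal sketch in Python with sqlite3. The table and column names (revision_meta, revision_text and so on) are invented for illustration and are not the existing schema. SQLite does not use fixed-length rows, so this shows only the layout, not the performance effect under discussion.

```python
import sqlite3

# Hypothetical layout: small metadata columns in one table, the
# variable-length article text in a second table keyed by the same id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revision_meta (
        rev_id    INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        username  TEXT NOT NULL,
        timestamp TEXT NOT NULL,
        summary   TEXT NOT NULL DEFAULT '',
        is_minor  INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE revision_text (
        rev_id INTEGER PRIMARY KEY REFERENCES revision_meta(rev_id),
        body   BLOB NOT NULL
    );
""")


def save_revision(conn, title, username, timestamp, summary, is_minor, body):
    """Insert the metadata row first, then the text row under the same id."""
    cur = conn.execute(
        "INSERT INTO revision_meta (title, username, timestamp, summary, is_minor) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, username, timestamp, summary, int(is_minor)),
    )
    rev_id = cur.lastrowid
    conn.execute(
        "INSERT INTO revision_text (rev_id, body) VALUES (?, ?)",
        (rev_id, body.encode("utf-8")),
    )
    conn.commit()
    return rev_id


def recent_changes(conn, limit=50):
    """List recent edits without ever touching the text table."""
    return conn.execute(
        "SELECT rev_id, title, username, summary, is_minor "
        "FROM revision_meta ORDER BY rev_id DESC LIMIT ?",
        (limit,),
    ).fetchall()


def load_text(conn, rev_id):
    """Fetch the article text for one revision, or None if it does not exist."""
    row = conn.execute(
        "SELECT body FROM revision_text WHERE rev_id = ?", (rev_id,)
    ).fetchone()
    return row[0].decode("utf-8") if row else None
```

Listings such as Recent Changes would read only revision_meta; the text table is touched only when a page is actually displayed or edited.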
- Store translated website text, so translators don't have to dig
through PHP code and submit a file to the mailing list.
We certainly could do this, though there are performance concerns whenever the idea is brought up. Caching the strings in shared memory may alleviate this.
It's worked perfectly on LJ ever since it was introduced. Additionally, I'm keeping MemCacheD in mind while thinking this all through. It would probably keep all the text in memory all the time.
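As a rough illustration of the caching idea, here is a sketch in Python. A plain dict stands in for MemCacheD, and StringStore is a hypothetical stand-in for the database table; neither reflects LJ's actual code.

```python
class StringStore:
    """Hypothetical stand-in for the database table of translated strings."""

    def __init__(self, rows):
        self._rows = dict(rows)  # {(lang, key): text}
        self.reads = 0           # counts simulated database reads

    def fetch(self, lang, key):
        self.reads += 1
        return self._rows.get((lang, key))

    def update(self, lang, key, text):
        self._rows[(lang, key)] = text


class CachedStrings:
    """Look strings up in the cache first and hit the store only on a miss."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # plain dict standing in for a memory cache

    def get(self, lang, key):
        cache_key = (lang, key)
        if cache_key in self.cache:
            return self.cache[cache_key]
        text = self.store.fetch(lang, key)
        if text is None:
            text = key  # untranslated: fall back to showing the key itself
        self.cache[cache_key] = text
        return text

    def set(self, lang, key, text):
        """Write through to the store and drop the stale cache entry."""
        self.store.update(lang, key, text)
        self.cache.pop((lang, key), None)
```

After the first request for a given string, later requests are served from memory, and an edit to a string invalidates only that one entry.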
There's been some talk of adapting the translation system we use at some of Esperanto cxe Interreto's sites, such as http://lernu.net/, so the interface and source-file-scanner don't have to be written from scratch.
Well, one of the problems I find with that system is that it is too easily vandalisable. If you vandalise a single Wiki page, that doesn't matter too much, because it can be reverted within a few minutes. But if you vandalise, say, the wording of the "Edit this page" link, it would affect everybody who visits Wikipedia in the minutes it takes someone to revert it.
I can see several ways of doing this:
* Restrict access to assorted people. Of course, this is un-wiki-like and not a whole lot better than modifying LanguageXY.php.
* Make changes in the translatable strings not take effect until they have been kept unchanged for 24 hours. Some trusted few could be given the privilege to be able to change the strings directly (in case, for example, a vandalism goes unnoticed for 24 hours).
However, for this to work, the changes in the translation system should appear on the Recent Changes pages along with the Wiki pages. Perhaps they *should* be Wiki pages with their own namespace (String:XYZ?), which in turn would deviate from the concept that lernu.net uses.
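The 24-hour rule from the second option could be sketched roughly like this in Python. The revision tuple layout and the function name are assumptions made up for the example; "trusted" marks an edit by one of the privileged users, which takes effect immediately.

```python
from datetime import datetime, timedelta

QUARANTINE = timedelta(hours=24)


def effective_text(revisions, now, quarantine=QUARANTINE):
    """Return the string that should be displayed at time `now`.

    `revisions` is a list of (timestamp, text, trusted) tuples sorted
    oldest first. A revision goes live if a trusted user made it, or if
    it stood unchanged for the whole quarantine period.
    """
    live = None
    for i, (ts, text, trusted) in enumerate(revisions):
        if ts > now:
            break
        # The revision stood until the next edit, or until now if it is the latest.
        replaced_at = revisions[i + 1][0] if i + 1 < len(revisions) else now
        replaced_at = min(replaced_at, now)
        if trusted or replaced_at - ts >= quarantine:
            live = text
    return live


# Usage: a vandalised string that is reverted within minutes never goes live.
t0 = datetime(2003, 7, 12, 12, 0)
history = [
    (t0, "Edit this page", True),                       # initial string
    (t0 + timedelta(days=2), "VANDALISED", False),
    (t0 + timedelta(days=2, minutes=5), "Edit this page!", False),
]
shown = effective_text(history, t0 + timedelta(days=2, hours=1))
```

In this sketch visitors keep seeing the last string that passed the rule while newer edits wait out their 24 hours.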
I haven't looked at the table structure used, but I imagine it's a fairly straightforward language-key-string triplet set.
I don't know about lernu.net either, but as for LJ, it's a little bit more complex than that. If you're interested, LJ's database is here: http://www.livejournal.com/doc/server/ljp.dbschema.ref.html The tables beginning with "ml_" are the ones pertaining to the translation system.
- A global table for bidirectional inter-wiki links. People should not
have to add the same link to so many articles.
There's an experimental table for interwiki links, but it's not entirely the best setup. It's questionable whether bidirectional is really right, though, as there's not always a 1:1 matchup between articles.
Colour me ignorant, but why shouldn't there always be a 1:1 matchup? Maybe there isn't now, but articles certainly can (and perhaps they should) be changed to comply. Or do you know of a particularly striking example where it should not?
Oh, by the way. How would you prefer to do the conversion from the old database to the new? Myself, I thought perhaps we could have the software do this whenever an article is edited by a user. This way, we don't have to take Wikipedia down for the time it takes to convert the entire database.
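Roughly, the convert-on-edit idea could look like the following Python sketch. The class and its in-memory dicts are hypothetical stand-ins for the old and new tables, not actual code from the software.

```python
class LazyMigratingStore:
    """Reads fall back to the old format; each edit moves one article over."""

    def __init__(self, old_rows):
        # Old format: text and metadata together in a single row per title.
        self.old = dict(old_rows)
        # New format: metadata and text kept in separate tables.
        self.new_meta = {}
        self.new_text = {}

    def read(self, title):
        """Prefer the new tables, falling back to the old row if unmigrated."""
        if title in self.new_text:
            return self.new_text[title]
        row = self.old.get(title)
        return row["text"] if row else None

    def save_edit(self, title, text, username, summary):
        """Write the edit in the new format and retire the old row."""
        self.new_meta[title] = {"username": username, "summary": summary}
        self.new_text[title] = text
        self.old.pop(title, None)

    def unmigrated(self):
        """Titles that nobody has edited since the switch."""
        return sorted(self.old)
```

One consequence visible in the sketch: articles that are never edited stay in the old format, so either the read path keeps its fallback or a slow background pass converts the remainder.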
Are you all still convinced that adapting the current code to all these radical changes is easier than rewriting it all from scratch? :-)
Yes, certainly.
Okay then. I'll take your word for it and learn some PHP. I'll create a preliminary SQL table-creation script for all this tomorrow. Which is really today, but I should really go to bed first...
Good night, Timwi