I read the MySQL book, and it says we shouldn't use replication for both reading and writing.
We have to set up one master server, used only for writing, and slave servers for reading. We don't need a permanent connection between the master and the slaves; that's fun!
But as far as I understood, we need to lock the master for each new slave, and this could take time with the gigabytes of data we have....
-----Original Message----- From: Tim Starling [mailto:ts4294967296@hotmail.com] Sent: 22 September 2003 09:24 To: wikipedia-l@wikipedia.org Subject: [Wikipedia-l] Re: Mirroring (was Re: Fundraising)
Tim Starling wrote:
We could even set up a full read-write server on the other side of the world, and redirect users to a different domain name as they arrive, based on their location. Users could specify their preferred mirror in their user preferences. We could even make larousse the default for logged-in users; that way most edits go to a web server which is close to the master DB.
Actually, it's not really mirroring then. What would you call it? A distributed cluster?
-- Tim Starling.
Wikipedia-l mailing list Wikipedia-l@Wikipedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
Constans, Camille (C.C.) wrote:
I read the MySQL book, and it says we shouldn't use replication for both reading and writing.
We have to set up one master server, used only for writing, and slave servers for reading. We don't need a permanent connection between the master and the slaves; that's fun!
Sorry, I was kind of writing in my own special language again. You're quite right CC, in fact that's basically what I meant.
My idea is to have multiple web servers, each with their slave database. There will be one master database, presumably remaining on pliny for the time being. When a web server needs to read the database, it reads its own local copy. When it needs to write, it contacts pliny (whether pliny is in the next rack or on the other side of the world), and sends its updates. Pliny will send all updates to the slaves.
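The routing rule in that paragraph can be sketched in a few lines. This is only an illustration of the idea, not MediaWiki code; the host names ("pliny" for the master, the local slave on each web server) are taken from the thread, and the function itself is hypothetical.

```python
# Sketch of the read/write split: every web server reads from its own
# local slave copy, and sends all writes to the single master (pliny).

MASTER = "pliny"           # the one master database: all writes go here
LOCAL_SLAVE = "localhost"  # each web server's own replica, used for reads

def pick_server(query: str) -> str:
    """Route a query: writes go to the master, reads to the local slave."""
    verb = query.strip().split()[0].upper()
    if verb in ("INSERT", "UPDATE", "DELETE", "REPLACE"):
        return MASTER
    return LOCAL_SLAVE

print(pick_server("SELECT cur_text FROM cur WHERE cur_title='Foo'"))
print(pick_server("UPDATE cur SET cur_text='...' WHERE cur_id=1"))
```

The point is that the expensive cross-ocean round trip only happens on the write path, which is a small fraction of total queries.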
Larousse may also send some read queries to pliny, if pliny isn't too busy.
But as far as I understood, we need to lock the master for each new slave, and this could take time with the gigabytes of data we have....
We already do this for backup. We would just have to record where in the binary log each backup occurs. Then we would copy the backup dump, set up the database, inform the slave of where it's up to in the binary log, and then start the replication.
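The bootstrap procedure above amounts to pairing each backup dump with the binary-log coordinates at which it was taken, then pointing the new slave at those coordinates. A rough sketch, with illustrative names (the `CHANGE MASTER TO` / `START SLAVE` statements are real MySQL syntax, but the helper functions and the log file name are made up):

```python
# Sketch: record where in the binary log a backup was taken, then
# build the statement a new slave would run to resume from that point,
# so the master never needs to be locked again for the new slave.

def record_backup_position(binlog_file: str, binlog_pos: int) -> dict:
    """Pair a backup dump with the binary-log coordinates it reflects."""
    return {"file": binlog_file, "pos": binlog_pos}

def start_slave_from(backup: dict) -> str:
    """Build the MySQL statements the new slave runs to start replicating."""
    return (
        f"CHANGE MASTER TO MASTER_LOG_FILE='{backup['file']}', "
        f"MASTER_LOG_POS={backup['pos']}; START SLAVE;"
    )

bkp = record_backup_position("pliny-bin.042", 107)
print(start_slave_from(bkp))
```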
Andre Engels wrote:
Would it not be better to keep the mirrors read-only, and have them redirect to the master for write-access? To have writing in several places causes significant overhead in avoiding edit conflicts and such.
I'm not too worried about edit conflicts, due to Brion's great new detection method. It's the "and such" that has me slightly anxious -- i.e. stuff that we haven't thought of yet. At the moment, I don't know of any particular overhead other than network lag.
It will be slower to write from a remote server, which is why I suggested editors be redirected to larousse when they log in. About 90% of editing activity is performed by logged in users.
-- Tim Starling
After writing some articles on animals I noticed two things which I would like to discuss. I would like to know how I can help to improve them.
1) Wikipedia seems to duplicate a lot of information, which could be avoided.
2) There currently seems to be no way to change information in more than one article at once.
I would like to give a few examples to illustrate this:
Ad 1: Most articles on animals use some kind of table, which is duplicated for each article. Also, many articles may use the same references, which are duplicated on many pages. The same is seen in a lot of other articles, and it is a huge waste of hard disk space and performance.
Ad 2: when I write many articles about closely related species, I may use one article as a template and copy parts of it. If I discover a typo in this copied part, I have to change it for each article by hand. Another example would be if we decide to change the background colour of the table used by most articles on animals. This may be no problem for a few articles, yet with hundreds or thousands of articles it is a huge waste of human resources.
The solution would probably be to use metadata. I tried to discuss this on Wikipedia, but the discussion was moved to feature requests. An argument against the use of metadata and templates was that it would make it harder to contribute to Wikipedia, an argument which I do not really understand.
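The template idea can be sketched concretely: store the shared table (or reference) once, and expand a marker into each article at render time, so a single fix propagates to every article that uses it. The `{{name}}` syntax and the code below are purely illustrative assumptions, not a description of any existing wiki feature.

```python
# Sketch: shared content stored once, expanded into articles on render.
# Fixing a typo in TEMPLATES fixes it in every article at once.

TEMPLATES = {
    "animal_table": "|| Kingdom: Animalia || Colour: #eeeeee ||",
}

def expand(text: str) -> str:
    """Replace each {{name}} marker with the stored template body."""
    for name, body in TEMPLATES.items():
        text = text.replace("{{" + name + "}}", body)
    return text

article = "'''Red fox'''\n{{animal_table}}\nThe red fox is a mammal..."
print(expand(article))
```

Changing the background colour in `TEMPLATES["animal_table"]` would then change it for hundreds of articles in one edit, which is exactly the kind of saving in human effort described above.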
So my real questions are: who is currently working on the database schema for Wikipedia, and who decides whether it would be useful to change this schema?
Cheers,
Jurriaan