Hi Taku,
we *know* that the site is slow. And we *know* that it's not because our server is too small. Our problems are database/lock related, and putting stuff on an even bigger server will not help much when dealing with O(n^x) problems. What we need to figure out is:
- When are our tables/rows locked and why (this behavior has changed drastically with the recent update to InnoDB).
- When are our queries using indices and when are they not, and why (MySQL index behavior can be very hard to predict).
Solving these two problems should make Wikipedia very fast. If we cannot optimize some queries, we need to think about making them simpler, or caching them. (Also note that MySQL now supports subqueries, which we don't use yet.) We are dealing with *particular* queries in *particular* situations that make Wikipedia slow.
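The second question above (when does a query use an index?) can be checked empirically by asking the database for its query plan. The sketch below is only an illustration: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN, and the table `article` with columns `title` and `body` is hypothetical, not the actual Wikipedia schema.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the textual description of the step.
    return [row[-1] for row in rows]


def uses_index(conn, sql, params=()):
    """True if any plan step mentions an index (wording varies by SQLite version)."""
    return any("INDEX" in step for step in query_plan(conn, sql, params))


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("CREATE INDEX idx_article_title ON article (title)")

by_title = "SELECT id FROM article WHERE title = ?"   # equality on the indexed column
by_body = "SELECT id FROM article WHERE body LIKE ?"  # substring match, no usable index

for sql, params in ((by_title, ("MySQL",)), (by_body, ("%lock%",))):
    print(sql, "->", query_plan(conn, sql, params))
```

Running the same check against each slow query, one at a time, is a cheap way to find the *particular* queries that fall back to a full scan.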
Oh, I see. But are you only talking about searching? I don't think MySQL can be the bottleneck for simply displaying pages. Without simply extending server capacity, is it possible to sustain the increase in traffic? (Fortunately or unfortunately, Wikipedia still seems to be growing.)
Not practical. Too many queries require access to a single centralized article database, even an index alone won't suffice. Think of stuff like Most wanted, Orphaned pages etc. Besides, it won't make things any faster because our problem is not a too small server.
I am not sure I understand you. However stupid the algorithm is, an increase in capacity always makes the site faster, if not . Think of a brute-force algorithm. I am not talking about adding just 2 or 3 more servers but possibly hundreds of servers. Maybe I am wrong, because I am still not sure how to implement my idea.
Surely we can't have the servers that Google or Amazon have. The strength of Wikipedia is its democratic structure. Why don't we employ it for hosting too?
Maybe my idea is not practical. If so, can you tell me why?
Bonjour!
The bars drawn with ---- are good for separating completely different parts of a text, but we might also like to use small bars (for example, 100 pixels wide) to separate paragraphs.
What do you think about ----X----, where X is a width in pixels or a percentage of the screen width?
Wikipedia : ----100---- HTML : <hr align="center" width="100">
Wikipedia : ----50%---- HTML : <hr align="center" width="50%">
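To make the proposed mapping concrete, here is a minimal sketch of how a single line could be translated. It is not how the Wikipedia software parses wikitext; the function name `render_rule` and the regular expression are hypothetical, and it handles only the two forms shown above plus the existing plain ----.

```python
import re

# ---- followed by a pixel width or a percentage, followed by ----
_SIZED_RULE = re.compile(r"^----(\d+%?)----$")


def render_rule(line):
    """Translate a horizontal-rule line to HTML; return other lines unchanged."""
    stripped = line.strip()
    match = _SIZED_RULE.match(stripped)
    if match:
        # Proposed syntax: the captured value becomes the width attribute.
        return '<hr align="center" width="{}">'.format(match.group(1))
    if stripped == "----":
        # Existing syntax: a full-width rule.
        return "<hr>"
    return line


for sample in ("----100----", "----50%----", "----", "ordinary text"):
    print(sample, "=>", render_rule(sample))
```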
Could you please add this feature? Where is the best place to discuss Wikipedia syntax?
Aoineko
On Wed, 2003-01-29 at 00:28, Guillaume Blanchard wrote:
Bonjour!
The bars drawn with ---- are good for separating completely different parts of a text, but we might also like to use small bars (for example, 100 pixels wide) to separate paragraphs.
What do you think about ----X----, where X is a width in pixels or a percentage of the screen width?
Wikipedia : ----100---- HTML : <hr align="center" width="100">
Wikipedia : ----50%---- HTML : <hr align="center" width="50%">
Could you please add this feature? Where is the best place to discuss Wikipedia syntax?
It's an interesting idea, but misguided in terms of interface principles (which isn't your fault, since it's been a problem with HTML since the introduction of Mosaic).
If you desire to format articles in a particular style, that should be done with stylesheets, not by explicitly declaring visual elements.
In other words, we should first set stylistic standards for using horizontal rules in Wikipedia articles, then worry about implementation.
In general, for many (hopefully self-evident) reasons, Wikipedia style tends to the minimalistic.
Bonjour Guillaume!
What do you think about ----X----, where X is a width in pixels or a percentage of the screen width?
Could you please give an example of an article in which you need this feature?
I'm not at all against adding new wiki tags, but only if we really need them. "Keep it simple!" is our motto :-)
(And the lack of article design features leads to our nice and unique Wikipedia corporate design. Someone in the heise.de article discussion said our site wouldn't be attractive enough for vandals - he meant it literally and as a compliment.)
Kurt
Thanks, Cunctator & Kurt,
In fact, I just started using it today :oP You can see an example in our Bistro (equivalent to the English "village_pump"). http://fr.wikipedia.org/wiki/Wikip%E9dia:Le_Bistro
I understand the need to keep the syntax as simple as possible. I don't want to replace the current ---- with a more complex syntax, but I would be happy to be able to use both a full bar (----) AND a small bar. If you prefer, we can use only one small bar, written <--> (without parameters):
Wiki: <--> HTML: <hr align="center" width="33%">
or <---->
Aoineko
On Wed, 2003-01-29 at 06:28, Guillaume Blanchard wrote:
Bonjour!
The bars drawn with ---- are good for separating completely different parts of a text, but we might also like to use small bars (for example, 100 pixels wide) to separate paragraphs.
Sorry, but that's a bad idea. One rule of good design is only to use design elements - like fonts, italics etc. - when necessary. Adding more formatting "fluff" will make some people use it excessively, just like the smiley pictures on message boards or colors in email clients. This will make Wikipedia more confusing and chaotic. It's bad enough with the normal <hr>; its use should be limited as much as possible.
Regards,
Erik
I didn't ask for a design class ;o) I will use small bars in the French Wikipedia, and if you don't want to create a WikiSyntax for this I will keep using HTML, no problem for me. But I don't understand your argument that giving Wikipedians more features will make Wikipedia more confusing and chaotic... (the same argument as for anchors). You don't trust Wikipedians? Perhaps you ran into trouble on the English Wikipedia that we haven't seen on the French one. But even with just the basic features, it is possible to create fully unreadable articles ;o) I can foresee the answer, but... can't we add features for one Wikipedia in particular (the French one, for example) if all those Wikipedians agree?
Aoineko
FOKUS - Fraunhofer Insitute for Open Communication Systems Project BerliOS - http://www.berlios.de
Wikitech-l mailing list Wikitech-l@wikipedia.org http://www.wikipedia.org/mailman/listinfo/wikitech-l
On Thu, 2003-01-30 at 02:36, Guillaume Blanchard wrote:
But I don't understand your argument that giving Wikipedians more features will make Wikipedia more confusing and chaotic... (the same argument as for anchors). You don't trust Wikipedians?
When it comes to design, no. Wiki tends to develop very chaotically (just look at the many different ways we disambiguate pages), and I would like to see more standards, and perhaps templates, long before new formatting tricks.
But even with just the basic features, it is possible to create fully unreadable articles ;o)
Indeed. That's why we don't need more.
I can foresee the answer, but... can't we add features for one Wikipedia in particular (the French one, for example) if all those Wikipedians agree?
No way, all Wikipedias run on the same codebase. But you can try to convince people other than me to support your syntax. I think it's a bad idea.
Regards,
Erik
Erik Moeller wrote:
No way, all Wikipedias run on the same codebase.
And I think this is really important.
But you can try to convince other people than myself to support your syntax. I think it's a bad idea.
The best place to convince other people to support a proposed new syntax is wikipedia-l, rather than wikitech-l, I'd like to add. Wikitech-l is for technical issues, but new syntax falls into the realm of policy issues.
--Jimbo
Thank you for your answers. I understand the need for international harmonization at the code level. But each language has its own typographic rules, and perhaps those rules will need specific technical features. How do you plan to handle that? As for my own problem (being able to use a full bar and a small bar), I will not bother you any more ;o) We can do it with HTML, so if the French Wikipedians agree we will keep using it. Cheers,
Aoineko
PS: Where are the discussions about the WikiSyntax (tables, ...) taking place?
On Thu, Jan 30, 2003 at 10:53:51AM +0100, Erik Moeller wrote:
On Thu, 2003-01-30 at 02:36, Guillaume Blanchard wrote:
I can foresee the answer, but... can't we add features for one Wikipedia in particular (the French one, for example) if all those Wikipedians agree?
No way, all Wikipedias run on the same codebase. But you can try to convince other people than myself to support your syntax. I think it's a bad idea.
We will have to make formatting language-specific in the future anyway, to get CJK, right-to-left and other non-Latin languages right.
--- Guillaume Blanchard gblanchard@arcsy.co.jp wrote:
You don't trust Wikipedians? Perhaps you ran into trouble on the English Wikipedia that we haven't seen on the French one. But even with just the basic features, it is possible to create fully unreadable articles ;o) I can foresee the answer, but... can't we add features for one Wikipedia in particular (the French one, for example) if all those Wikipedians agree?
I don't agree :-)))
On Wed, 2003-01-29 at 06:05, Takuya Murata wrote:
Oh, I see. But are you only talking about searching? I don't think MySQL can be the bottleneck for simply displaying pages.
Displaying a Wikipedia page is far from simple. The pages are stored in the table as wikitext, not as HTML, and are rendered dynamically. Rendering a Wikipedia article requires, for example, looking up all the links contained in it and determining whether the linked pages exist.
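As a rough illustration of why rendering touches the database, the sketch below collects the link targets from a piece of wikitext and checks their existence in a single query. It is not the actual Wikipedia code: the `article` table, the `link_status` function, and the simplified [[Target]] / [[Target|label]] pattern are assumptions made for the example, and sqlite3 stands in for MySQL.

```python
import re
import sqlite3

# Simplified link pattern: [[Target]] or [[Target|label]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def link_status(conn, wikitext):
    """Map each linked title in wikitext to True (page exists) or False."""
    titles = sorted({m.group(1).strip() for m in LINK.finditer(wikitext)})
    if not titles:
        return {}
    # One batched query instead of one query per link.
    placeholders = ",".join("?" for _ in titles)
    rows = conn.execute(
        "SELECT title FROM article WHERE title IN (%s)" % placeholders, titles
    ).fetchall()
    found = {row[0] for row in rows}
    return {title: title in found for title in titles}


# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (title TEXT PRIMARY KEY)")
conn.execute("INSERT INTO article (title) VALUES ('MySQL')")

print(link_status(conn, "See [[MySQL]] and [[InnoDB|the new engine]]."))
```

Even in this batched form, every page view costs at least one lookup proportional to the number of links, which is why the queries behind rendering matter.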
Without simply extending server capacity, is it possible to sustain the increase of traffic?
Actually, we had a *decrease* in traffic in the last month due to the Google hiccups. We should be able to cope with much higher traffic if we optimize our queries. Note that pure bandwidth is not a problem; the database tarball downloads are very fast. Ask Brion for the server specs and be impressed.
I am not sure I understand you. However stupid the algorithm is, an increase in capacity always makes the site faster,
1) If the algorithm doesn't scale linearly, expanding your server linearly will gain you almost nothing. If an increase in edits by a factor of 10 leads to a 100-fold decrease in performance, you need to stop buying hardware and look at what you're doing wrong. Of course, if you keep optimizing and you don't gain anything, you need to think about buying hardware, but this is not the case here.
2) You cite Google as an example of a huge centralized database, which is untrue. Google is actually an example of a highly distributed database, using >10,000 Linux servers. When the index is updated, it takes a while for the index updates to propagate to all those servers, even though they're essentially in the same building and using high bandwidth connections. The "Google dance": http://www.wikipedia.org/wiki/Google
With a distributed architecture that is actually hosted in different locations with different bandwidth, and with updates not coming from the inside but from the outside *all the time*, in addition to our highly complex queries, this kind of sync operation would be virtually impossible to do properly, unless you move much stuff to a central server, in which case you gain very little by distributing.
Trust me, wiki is *very* hard to decentralize. It's a nice idea, but it will take years until it happens. You need an architecture like Freenet ( http://freenetproject.org ), only scalable (which Freenet is not), plus SQL-like query support.
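Point 1 above can be put into numbers. The sketch below assumes, purely for illustration, that total work grows as load raised to some exponent and that added hardware is used perfectly (a generous assumption); the function names are hypothetical. With an exponent of 2, the case described above, 10 times the load costs 100 times the work, so even 100 times the hardware absorbs only 10 times the load.

```python
def relative_cost(load_factor, exponent):
    """Work required when load grows by load_factor and cost scales as load**exponent."""
    return load_factor ** exponent


def supported_load(hardware_factor, exponent):
    """Load growth that hardware_factor times more capacity can absorb."""
    return hardware_factor ** (1.0 / exponent)


# Quadratic case: exponent 2.
print(relative_cost(10, 2))
print(supported_load(100, 2))
```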
Regards,
Erik
Erik Moeller wrote:
Trust me, wiki is *very* hard to decentralize. It's a nice idea, but it will take years until it happens. You need an architecture like Freenet ( http://freenetproject.org ), only scalable (which Freenet is not), plus SQL-like query support.
I think Erik is exactly right on this.
On Wed, Jan 29, 2003 at 03:26:38AM -0800, Jimmy Wales wrote:
Erik Moeller wrote:
Trust me, wiki is *very* hard to decentralize. It's a nice idea, but it will take years until it happens. You need an architecture like Freenet ( http://freenetproject.org ), only scalable (which Freenet is not), plus SQL-like query support.
I think Erik is exactly right on this.
Indeed. Wikipedia is becoming more and more a true complex database-oriented application and as such is very likely to actually *suffer* in performance if we start distributing. Can anybody say two phase locking protocol?
Hopefully people will realize that the database is the bottleneck (and not PHP, for example; programming the stuff in C is a waste of time) and that a good database design (along with good SQL design) is crucial here.
Speaking of which, why on earth was it decided to start using locking? IMO we don't really need it and AFAICT this wreaks havoc on the performance. A very bad decision if you ask me.
-- Jan Hidders
Post-doctoral researcher              e-mail: jan.hidders@ua.ac.be
Dept. Math. & Computer Science        tel: (+32) 3 820 2427
University of Antwerp, room J1.06     fax: (+32) 3 820 2421
(UIA) Universiteitsplein 1, B-2610 Wilrijk - Antwerpen (Belgium)
Speaking of which, why on earth was it decided to start using locking? IMO we don't really need it and AFAICT this wreaks havoc on the performance. A very bad decision if you ask me.
You misunderstood. We are not locking explicitly. MySQL locks implicitly in certain situations, though, and with InnoDB we now use row level locking instead of table locking.
Regards,
Erik