On Friday 12 July 2002 12:01 pm, you wrote:
The developers seem totally clueless about performance issues. There is a constant focus on new functions, and none on performance and response times.
Much is now being done to remedy performance problems -- so I do believe what you said is needlessly rude (even if there is a grain of truth to it). This is an issue that has crept up upon the developers as new features were added -- many of which were asked for by the users.
Software development often seems to work a lot like article development -- many new things are added by different people, but after a point the software or article needs to be heavily edited and reorganized to restore efficiency and flow (although, admittedly, the php software never was /very/ efficient to my recollection -- it still rocks though).
--maveric149
At 02:29 AM 7/13/02 -0700, maveric149 wrote:
Much is now being done to remedy performance problems -- so I do believe what you said is needlessly rude (even if there is a grain of truth to it).
My experience with coders has been in MUDs. Coders like to make new areas, tweak races and guilds, etc. Performance issues are not nearly as attractive nor is success in dealing with them easily attainable or recognizable to others. Like a MUD we are relying on voluntary coders so, sure, courtesy is necessary.
But so is focus. Right now Wikipedia is like some Ionesco play with a dead body in the room that everyone inexplicably ignores.
Fred Bauder
Have a look at the new software which (thanks to Lee) will go online on the new server (thanks to Jimbo) a week from today: http://beta.wikipedia.com
For example, http://beta.wikipedia.com/wiki/Performance_tuning is generated by the script in 0.08 seconds (check the bottom of the HTML source). You were saying? ;)
[Wikipedia-l] To manage your subscription to this list, please go here: http://www.nupedia.com/mailman/listinfo/wikipedia-l
Hi Fred,
I'm not a Wikipedia coder, I mostly contribute articles. Wearing my contributor hat, I've been really irritated by the disastrous performance in the last couple of months. My style of editing is pointillistic, with lots of tiny edits, even breaking down large articles into a series of smaller commits. I can understand that the poor performance will have discouraged many people.
Having said that, I am helping test the new system at the moment, and I can say that the coders are putting a lot of effort into making the system fast and smooth. I think they've taken the right decision, to give up on maintaining the old system, and work instead on getting the new system up and running really well before putting it into service ASAP. That means testing it really hard before going live.
Here's a progress report.
The new server is a dual-Athlon multiprocessor SMP machine with 2G of RAM, and an ext3 journalling filesystem.
At the moment, the new server is running a hugely expanded database containing over 80,000 'articles', making a total of over 100,000 entries in the database. This consists of a snapshot of the entire Wikipedia, together with about 50,000 machine-generated articles. It's serving around 1.2 articles per second under as realistic a traffic load as we can generate using lots of bots running on external servers - the traffic is nice and bursty, and the system is regularly peaking at 12 articles/second or more.
77% of pages are served in less than a second: what I'd consider to be 'instant'
89% of pages are served in less than 2 seconds: what I'd consider to be 'quick'
98% of pages are served in less than 5 seconds, about the start of the threshold of irritation
99.3% of pages are served in less than 10 seconds, a significant delay
Only 0.1% of articles take longer than 20 seconds to serve: what I would regard as failures. About 30% of these are very big special pages. I'm trying to do a finer analysis of the rest, and find the exact conditions that seem to trigger the hiccups.
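A breakdown like the one above can be computed from raw response times with a few lines of code. The following is only a rough sketch: the function name, the thresholds and the sample timings are all made up for illustration and are not the actual test-server data or the bots' real code.

```python
from bisect import bisect_left


def share_served_within(latencies, thresholds):
    """Return {threshold: percent of requests served in less than threshold}."""
    ordered = sorted(latencies)
    total = len(ordered)
    if total == 0:
        return {t: 0.0 for t in thresholds}
    # bisect_left gives the number of samples strictly below the threshold.
    return {t: 100.0 * bisect_left(ordered, t) / total for t in thresholds}


# Hypothetical sample timings in seconds; NOT the real measurements.
sample = [0.2, 0.4, 0.8, 0.9, 1.5, 1.8, 3.0, 4.5, 7.0, 25.0]
report = share_served_within(sample, [1, 2, 5, 10, 20])
for limit, pct in report.items():
    print(f"under {limit:>2}s: {pct:5.1f}%")
```

Anything above the largest threshold (20 seconds here) would fall into the 'failure' bucket and could be logged separately for closer analysis.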
What this means, I hope, is that when the new server comes into service it will perform really well under normal loads, and editing will be smooth and lovely again.
Neil
I don't see how anyone can say that performance issues are being ignored. Jimbo bought a new server specifically for Wikipedia. Mr. Crocker re-designed the PHP script from the ground up, and Neil has been running different kinds of bots on the new server/software to test it under stressful conditions.
Don't you guys read the mailing list? :)
-- Stephen Gilbert
On Sun, 14 Jul 2002, Stephen Gilbert wrote:
I don't see how anyone can say that performance issues are being ignored. Jimbo bought a new server specifically for Wikipedia. Mr. Crocker re-designed the PHP script from the ground up, and Neil has been running different kinds of bots on the new server/software to test it under stressful conditions.
The performance problem is a very serious issue, an emergency. I don't know if it's been done already, but the front page should have a prominent notice about what the problem is, and what's being done about it.
Has this been done already? The performance problem prevents me from actually checking... :)
Anyway, such a notice would put a lot of contributors at ease, I expect. That _something is being done_ is a very good feeling, almost as good as having it work.
-- Daniel
On Sat, 13 Jul 2002, Daniel Mayer wrote:
Much is now being done to remedy performance problems -- so I do believe what you said is needlessly rude (even if there is a grain of truth to it). This is an issue that has crept up upon the developers as new features were added -- many of which were asked for by the users.
If performance issues creep upon you, then you have not designed your system for performance measurement and monitoring. This means you are clueless. It is like driving a car without a speedometer, and being all surprised when you are caught for speeding.
In the last week, my script made 410 attempts at 20-minute intervals to reach the page http://www.wikipedia.com/wiki/Chemistry. Out of these, only 86% were served in less than 5 seconds. Five percent of the calls timed out (my limit is 60 seconds). Now, this is far better than the worst problems that Wikipedia saw in April or May, but it is still pretty miserable. The non-English Wikipedias feature very similar numbers.
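A probe of this kind can be sketched roughly as follows. This is not the actual script used for the numbers above; the function names and the summary format are assumptions, and only the 60-second limit, the 5-second threshold and the 20-minute interval come from the message itself.

```python
import time
import urllib.request


def timed_fetch(url, timeout=60.0):
    """Fetch url once; return elapsed seconds, or None on timeout or error."""
    start = time.monotonic()
    try:
        # Note: urlopen's timeout applies to each blocking socket operation,
        # so it only approximates a limit on the whole request.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return None
    return time.monotonic() - start


def summarize(results, fast_limit=5.0):
    """Percent of attempts served under fast_limit, and percent that failed."""
    total = len(results)
    if total == 0:
        return {"fast_pct": 0.0, "failed_pct": 0.0}
    fast = sum(1 for r in results if r is not None and r < fast_limit)
    failed = sum(1 for r in results if r is None)
    return {"fast_pct": 100.0 * fast / total, "failed_pct": 100.0 * failed / total}


def probe(url, attempts, interval_s, fetch=timed_fetch, sleep=time.sleep):
    """Fetch url `attempts` times, pausing interval_s seconds between tries."""
    results = []
    for i in range(attempts):
        results.append(fetch(url))
        if i + 1 < attempts:
            sleep(interval_s)
    return summarize(results)
```

A week-long run along the lines described would be something like `probe("http://www.wikipedia.com/wiki/Chemistry", 410, 20 * 60)`. The `fetch` and `sleep` parameters exist only so the loop can be tried out without touching the network.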
The Sevilla project (http://enciclopedia.us.es/) serves 96% of all my attempts in under 2 seconds, and 99% in under five seconds. This should probably be attributed to luck rather than skill, but it helps move people from the Spanish Wikipedia over to the breakout project.
Software development seems to often work a lot like article development --
That's OK, but just like the basic Wiki software defines the concept of an article (it can be written, reviewed, its history tracked, modified, removed), the software should define a framework for new functionality that can measure its impact on performance, and turn it on or off. Think modules.
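To make the "think modules" idea a little more concrete, here is a minimal sketch of a registry in which every optional feature is timed and can be switched on or off. It is written in Python purely for illustration; the class, its methods and the "most_wanted" feature body are hypothetical and do not describe either the current script or the rewrite.

```python
import time
from collections import defaultdict


class FeatureRegistry:
    """Hypothetical registry: each optional feature is timed and can be disabled."""

    def __init__(self):
        self._features = {}                      # name -> callable
        self._enabled = {}                       # name -> bool
        self.total_seconds = defaultdict(float)  # accumulated wall-clock cost
        self.calls = defaultdict(int)            # number of timed runs

    def register(self, name, func, enabled=True):
        self._features[name] = func
        self._enabled[name] = enabled

    def set_enabled(self, name, enabled):
        self._enabled[name] = enabled

    def run(self, name, *args, default=None):
        """Run a feature if it is enabled, recording how long it took."""
        if not self._enabled.get(name, False):
            return default
        start = time.perf_counter()
        try:
            return self._features[name](*args)
        finally:
            self.total_seconds[name] += time.perf_counter() - start
            self.calls[name] += 1

    def average_seconds(self, name):
        n = self.calls[name]
        return self.total_seconds[name] / n if n else 0.0


registry = FeatureRegistry()
# Stand-in feature body; a real one would query the database.
registry.register("most_wanted", lambda links: sorted(set(links)))
first = registry.run("most_wanted", ["b", "a", "b"])
registry.set_enabled("most_wanted", False)  # e.g. switched off under heavy load
second = registry.run("most_wanted", ["b"], default=[])
```

With per-feature averages available, an expensive feature shows up in the numbers as soon as it is added, and it can be turned off without touching the rest of the code.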
--- Lars Aronsson lars@aronsson.se wrote:
If performance issues creep upon you, then you have not designed your system for performance measurement and monitoring. This means you are clueless. It is like driving a car without a speedometer, and being all surprised when you are caught for speeding.
Correct. That's why the old script has been abandoned and re-written.
Here's a quick Wikipedia software history for the curious (if there are any errors, please correct them...). When we were still using UseModWiki, there was a general agreement to move to a SQL-based system, but there wasn't a satisfactory option available. Magnus was the only one willing to give it a go, and so he put together the PHP/MySQL combo we're currently using. I believe Magnus is a biologist who does some coding, not a software engineer; his code worked but was not "engineered", so to speak. After the script was deployed, various developers stepped in to help develop the software. However, there were too many performance problems, and so Lee Daniel Crocker took up the challenge and redesigned the whole thing. The new server running the new code will replace our current setup as soon as it is ready.
-- Stephen Gilbert
Besides me being a mere biologist:) IMHO part of the problem of "my" software is this: When I started, the aim was to replace the UseModWiki, and add new functions. But it was unclear which functions, and how to implement them. Some of the UseModWiki functions I implemented were removed (mainly the subpages). Some ideas for new functions were implemented rather early (e.g., namespaces), some later (e.g., orphans). Some became popular (Most Wanted), some were dropped (category functions, AutoWikification). So, the whole script was constantly under development, trying new things, removing others, altering, rewriting etc.
Lee's rewrite does not add many new features on the outside (except the image namespace). The idea was that, now that we have a better idea of what the software should actually do, it can be coded in a more structured and technically sound way (I confess, Lee's code looks *a lot* better than mine;). For example, Lee's software gives each article a special database field that marks redirect pages. That makes sorting out redirect pages very fast. When I started my software, I didn't realize how important this distinction would become, and thus my script always makes MySQL compare the article text against something like "#REDIRECT %", which is quite inefficient when used a lot.
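The difference between the two approaches can be shown with a small sketch. It uses Python's built-in SQLite instead of MySQL, and the table and column names are invented for the example; this is not the actual schema of either script.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    " title TEXT PRIMARY KEY,"
    " body TEXT NOT NULL,"
    " is_redirect INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("CREATE INDEX idx_redirect ON article (is_redirect)")


def save_article(title, body):
    """Store an article, computing the redirect flag once at save time."""
    flag = 1 if body.upper().startswith("#REDIRECT") else 0
    conn.execute(
        "INSERT OR REPLACE INTO article VALUES (?, ?, ?)", (title, body, flag)
    )


save_article("Chemistry", "Chemistry is the science of matter...")
save_article("Chem", "#REDIRECT [[Chemistry]]")
save_article("Physics", "Physics studies energy and motion...")


def real_articles_by_flag():
    """New approach: filter on the small flag column."""
    rows = conn.execute(
        "SELECT title FROM article WHERE is_redirect = 0 ORDER BY title"
    )
    return [r[0] for r in rows]


def real_articles_by_text():
    """Old approach: pattern-match the article text for every row."""
    rows = conn.execute(
        "SELECT title FROM article WHERE body NOT LIKE '#REDIRECT %' ORDER BY title"
    )
    return [r[0] for r in rows]
```

Both functions return the same titles here, but the second one has to look at the text of every article on each query, while the first only consults a field that was filled in once when the article was saved.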
This is one of the reasons why Lee's software ("Phase III") currently handles >130,000 pages (>100,000 articles) on the test server, and still generates the Main Page in less than 0.9 seconds.
Magnus
Darn, I was just going to say that the wikipedia performance seems to have improved again... last night I was able to keep on working on entries until I went to bed, when usually it starts timing out at around 7pm. However, it's just become non-responsive again at midday :(
I can't wait for the new version to be installed... hope that it works a LOT better!
Besides me being a mere biologist:)...
Never let it be said that I called you "mere", good sir.
--Stephen Gilbert
wikipedia-l@lists.wikimedia.org