> This is fine, but you are trying a new car, and I want you to mount a
> speedometer so we can monitor the speed while driving, continuously
> keeping an eye on the problem instead of making believe we can fix it
> once and for all.
I agree that we need clearer data about performance, and that we need to
monitor that data over time. My main point was to say that
1) we've got a faster and more adaptable code base in the wings, and 2) this
will give us time to do performance monitoring right, as well as to think
clearly about which things to optimize. As I said before, I think everybody
is trying to solve this problem, and that the developers are clearly moving
in the right direction. I'm no performance expert, but as a jack of all
trades and master of none, I'd like to toss out a few thoughts.
I'm concerned that we don't gather that data in a way which itself creates
performance problems.
My experience is that it is a bad idea to have every function append
information to a log file. I've never seen anyone log on every function call
without a noticeable effect on performance, at least on well-designed ASP or
JSP sites that had already minimized disk I/O.
I assume the same would be true with PHP. However, having just read
Jimbo's e-mail, it could be that the I/O performance of the underlying
systems we were using -- mostly WinNT on slightly older machines -- was to
blame.
That said, I'm aware of three common ways to get around this performance
bottleneck.
Most commonly, people just turn up the detail on their web server logging
and parse those logs for whatever performance information they can get from
them. The problem with this is that it sometimes fails to provide the
fine-grained information we'd need to see exactly which pages are causing
the slowdown.
Second, I've seen a lot of people storing log information in a database.
This is probably what we should do if we need more precise information than
we can get from the server logs.
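To make that concrete, here is the sort of thing I have in mind -- just a
sketch, with a made-up 'perf_log' table and made-up columns, and assuming a
MySQL connection is already open -- that times the whole request and writes
one row per page view:

  <?php
  // Rough sketch only: write one timing row per request into a log table.
  // mysql_connect() and mysql_select_db() are assumed to have run already;
  // 'perf_log' and its columns are invented for this example.
  $start = microtime();

  // ... normal page generation goes here ...

  list($u1, $s1) = explode(" ", $start);
  list($u2, $s2) = explode(" ", microtime());
  $elapsed = ((float)$s2 + (float)$u2) - ((float)$s1 + (float)$u1);

  $url = mysql_escape_string($_SERVER['REQUEST_URI']);
  mysql_query("INSERT INTO perf_log (logtime, url, seconds) " .
              "VALUES (NOW(), '$url', " . (float)$elapsed . ")");
  ?>

One row per request is cheap, and the table can then be queried for exactly
the slow pages we care about.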
The third and most flexible solution is to set up a thread which
asynchronously processes a queue of log data and occasionally writes the
information to disk. This is clearly the most complex to implement, and I
wouldn't even consider it unless we decide we need real-time performance
monitoring, or unless there's something built into PHP to do this.
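As far as I know PHP doesn't give us a background thread, but the closest
cheap approximation I can think of (again just a sketch, with made-up
function names and a made-up log path) is to queue entries in memory and
write them out in a single append when the request finishes:

  <?php
  // Rough sketch only: collect log lines in memory, flush once per request.
  $log_queue = array();

  function queue_log($msg) {
      global $log_queue;
      $log_queue[] = date("YmdHis") . " " . $msg;
  }

  function flush_log() {
      global $log_queue;
      if (count($log_queue) == 0) return;
      $fp = fopen("/var/log/wiki-perf.log", "a");   // path is made up
      if ($fp) {
          fwrite($fp, implode("\n", $log_queue) . "\n");
          fclose($fp);
      }
  }

  // One disk write per request, after the page has been generated.
  register_shutdown_function("flush_log");
  ?>

That gets us one disk write per request instead of one per function call, and
the queue costs almost nothing when logging is switched off.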
Regardless of how we choose to get this data, we can and should still be
thinking about how to optimize those functions which the server spends a LOT
of time doing, or those things which take a LONG time, as Wikipedia usage
could very well continue to scale up dramatically, and we need to be ready
for this ahead of time. On the other hand, there is a cost to optimizations,
if only in the added complexity of the code. Complex code is more difficult
to update and maintain, and it is less likely to attract new developers over
time, so we need to make considered choices about which functions to 1) leave
as they are, 2) optimize, or 3) remove altogether.
> In the last week, my script made 410 attempts at 20 minute
> intervals to reach the page http://www.wikipedia.com/wiki/Chemistry
> Out of these, only 86% were served in less than 5 seconds.
> Five percent of the calls timed out (my limit is 60 seconds).
> Now, this is far better than the worst problems that Wikipedia
> saw in April or May, but it is still pretty miserable.
I believe the performance in the new code is much improved.
Even under all the load I can put on it with a T1 line, the beta software is
producing average load times of less than 1.5 seconds.
A number of people are now working on stress testing this software before it
is put into production. And I think there is a general commitment to
solving the performance problem, and I see lots of movement in the right
direction.
Eliminating 'unsuccessful search' and 'special' pages from the count, and
analysing 100,000 lines from the raw log with this filtering, gives the
following stats (page accesses are binned by the integer part of their
service time as recorded in the logs):
bin in seconds, total pages, cumulative percentage
0 57360 83.443651%
1 6929 93.523516%
2 2028 96.473720%
3 1034 97.977917%
4 640 98.908948%
5 314 99.365735%
6 157 99.594129%
7 81 99.711962%
8 61 99.800701%
9 46 99.867619%
10 18 99.893804%
11 12 99.911261%
12 16 99.934537%
13 13 99.953448%
14 6 99.962177%
15 6 99.970905%
16 6 99.979634%
17 2 99.982543%
18 0 99.982543%
19 3 99.986907%
20 2 99.989817%
summary 68741 hits in 41366.343 secs, avg = 0.601771039118
Only 9 non-special pages took over 20 seconds; here they are:
20020713011714 28.783 /wiki/Historical_anniversaries
20020713012523 20.301 /wiki/Sport
20020713014205 23.161 /wiki/Federal_Standard_1037C
20020713014723 25.357 /w/wiki.phtml?title=Free_On-line_Dictionary_of_Computing/O_-_Q&redirect=no
20020713015936 21.513 /w/wiki.phtml?title=Wikipedia:Bug_reports&action=history
20020713022203 25.252 /wiki/Free_On-line_Dictionary_of_Computing/L_-_N
20020713025105 29.975 /w/wiki.phtml?title=Free_On-line_Dictionary_of_Computing/E_-_H&redirect=no
20020713033140 20.802 /wiki/Feature_requests
20020713043401 41.392 /w/wiki.phtml?title=Complete_list_of_encyclopedia_topics/R&diff=78830&oldid=71983
It's interesting to note that random spidering hits 'special' pages
about 30% of the time.
This is looking really good.
-------------------------------------------------
SUGGESTION #1:
Looking at the logs suggests that many of the worst results are
generated on the special page options with large counts -- particularly
the versions with count=5000.
Here's my proposal: we should not list the options with count > 500 for
users *who are not logged in*.
So, at the bottom of the orphans page, a logged-in user would see
View (previous 50) (next 50) (20 | 50 | 100 | 250 | 500 | 1000 | 2500 | 5000).
and a casual browser (and any busy bots or spiders) would see
View (previous 50) (next 50) (20 | 50 | 100 | 250 | 500).
Random selection from the first list will search on average
(50+50+20+50+100+250+500+1000+2500+5000) / 10 = 952 pages.
Random selection from the second list will search on average
(50+50+20+50+100+250+500) / 7 = about 146 pages,
a reduction in load of more than a factor of six.
Removing these big outlier loads may well take some of the strain off
ordinary page loads that happen to occur at the same time.
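In code the change would be tiny. Something along these lines would do (a
sketch only; the $userIsLoggedIn flag is a placeholder for however the code
actually detects a login):

  <?php
  // Rough sketch: only logged-in users get the expensive count options.
  $userIsLoggedIn = false;   // placeholder for the real login check

  $limits = array(20, 50, 100, 250, 500, 1000, 2500, 5000);
  if (!$userIsLoggedIn) {
      // Casual browsers, bots and spiders never see count > 500.
      $limits = array(20, 50, 100, 250, 500);
  }

  // Build the "(20 | 50 | ...)" part of the link bar from whatever is left.
  print "View (previous 50) (next 50) (" . implode(" | ", $limits) . ")\n";
  ?>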
------------------------------------------------
SUGGESTION #2:
The 'Unsuccessful search' pages can be enormous, since they accumulate all
the bad searches from a whole month. As Wikipedia has become more popular,
these pages have grown huge, and they now take a long time to load. We should make
these weekly or daily instead of monthly, and perhaps split up the old
ones using a script.
This will also have the effect of improving the 'most wanted' rating of
frequently missed searches, as currently only one instance a month counts.
Or perhaps they should be generated as a special page from the database?
---------------------------------------------------
Neil
I used the nice hammerhead tool (http://hammerhead.sourceforge.net) to
stress test the beta.wikipedia.com server. The tool lets you simulate
several simultaneous users repeatedly requesting pages from (or
posting to) the server.
I have my users request RecentChanges (33% of the time) and issue
searches (66% of the time). Here's the average response time of the
server:
1 user: 2 sec
5 users: 4 sec
10 users: 8 sec
20 users: 11 sec
100 users: 42 sec
These are only the times it took the server to respond; the actual
total time to complete a request is not as useful in my case because
of my limited bandwidth.
Axel
The Apache server allows for customized server log messages
(http://httpd.apache.org/docs/logs.html#accesslog). I think we should
include the directive %T, which reports the time it took to serve a
request. That way, we could process the server logs to pinpoint the
precise conditions which cause requests to take a long time.
Adding the line
LogFormat "%h %l %u %t \"%r\" %>s %b %T" custom
to httpd.conf and changing
CustomLog /usr/local/apache/logs/access_log common
to
CustomLog /usr/local/apache/logs/access_log custom
should do the job.
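Once that's in place, a short script could bin requests by service time,
much like the table Neil posted. This is only a sketch, and it assumes %T
ends up as the last whitespace-separated field, as it does with the
LogFormat line above:

  <?php
  // Rough sketch: histogram of Apache requests by the %T (seconds) field.
  $bins = array();
  $total = 0;
  $fp = fopen("/usr/local/apache/logs/access_log", "r");
  while ($fp && !feof($fp)) {
      $line = trim(fgets($fp, 16384));
      if ($line == "") continue;
      $fields = preg_split('/\s+/', $line);
      $secs = (int) $fields[count($fields) - 1];   // %T is the last field
      $bins[$secs] = isset($bins[$secs]) ? $bins[$secs] + 1 : 1;
      $total++;
  }
  if ($fp) fclose($fp);

  if ($total > 0) {
      ksort($bins);
      $running = 0;
      foreach ($bins as $secs => $count) {
          $running += $count;
          printf("%d\t%d\t%f%%\n", $secs, $count, 100 * $running / $total);
      }
  }
  ?>

Run against the real log it would give us the bin / count / cumulative
percentage columns directly, without any changes to the wiki code itself.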
Axel
On Sat, 13 Jul 2002, Daniel Mayer wrote:
> Much is now being done to remedy performance problems -- so I do believe what
> you said is needlessly rude (even if there is a grain of truth to it). This
> is an issue that has crept up upon the developers as new features were added
> -- many of which were asked for by the users.
If performance issues creep up on you, then you have not designed your
system for performance measurement and monitoring. This means you are
clueless. It is like driving a car without a speedometer, and being
surprised when you are caught speeding.
In the last week, my script made 410 attempts at 20 minute intervals to
reach the page http://www.wikipedia.com/wiki/Chemistry
Out of these, only 86% were served in less than 5 seconds. Five percent
of the calls timed out (my limit is 60 seconds). Now, this is far better
than the worst problems that Wikipedia saw in April or May, but it is
still pretty miserable. The non-English Wikipedias feature very similar
numbers.
The Sevilla project (http://enciclopedia.us.es/) serves 96% of all my
attempts in under 2 seconds, and 99% in under five seconds. This should
probably be attributed to luck rather than skill, but it helps move people
from the Spanish Wikipedia over to the breakout project.
> Software development seems to often work a lot like article development --
That's OK, but just as the basic wiki software defines the concept of an
article (it can be written, reviewed, its history tracked, modified,
removed), the software should define a framework for new functionality so
that its impact on performance can be measured and it can be turned on or
off. Think modules.
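Concretely, I imagine something like the following. Every name here is
invented and it is only meant to show the shape of the idea: each feature
goes through one wrapper that can time it and switch it off.

  <?php
  // Rough sketch: per-feature switch plus timing, so impact can be measured.
  $enabled_features = array("search" => true, "recentchanges" => true);

  function run_feature($name, $func) {
      global $enabled_features;
      if (empty($enabled_features[$name])) {
          return "";                       // feature is switched off
      }
      list($u1, $s1) = explode(" ", microtime());
      $output = $func();                   // call the feature's entry point
      list($u2, $s2) = explode(" ", microtime());
      $elapsed = ((float)$s2 + (float)$u2) - ((float)$s1 + (float)$u1);
      error_log("feature $name took $elapsed seconds");
      return $output;
  }

  // e.g. print run_feature("search", "doSearchPage");
  // where doSearchPage is a made-up name for the search entry point.
  ?>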
--
Lars Aronsson (lars(a)aronsson.se)
tel +46-70-7891609
http://aronsson.se/  http://elektrosmog.nu/  http://susning.nu/
I thought I'd bring up this idea again since it might be easy to
implement with the new codebase.
If you put text such as [$\int_{x=0}^\infty x^2 dx$] in Wiki, upon
saving the article, TeX will be called and translate the formula into
an image, and store the image on the server and its name in a database
indexed with the formula text. When the Wiki page is presented, the
image is inlined (and an alt attribute containing the formula text
added). When the page is later edited and saved again, the system
first checks whether an up-to-date image of the formula already
exists; if not, TeX is called to regenerate it.
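The check-and-render step could be as simple as the sketch below. Everything
in it (the paths, the latex/dvips/convert chain, using the MD5 of the formula
as the file name instead of a database lookup) is an assumption about how it
might be done, not a description of existing code, and a real version would
also have to sanitize the formula text:

  <?php
  // Rough sketch: render a formula to PNG once, keyed by its MD5, and reuse it.
  function formula_image($formula) {
      $hash = md5($formula);
      $png  = "/var/www/math/$hash.png";   // image directory is made up

      if (!file_exists($png)) {
          $dir = "/tmp/tex_$hash";
          if (!is_dir($dir)) mkdir($dir, 0700);
          $fp = fopen("$dir/f.tex", "w");
          fwrite($fp, "\\documentclass{article}\\pagestyle{empty}\n" .
                      "\\begin{document}\$" . $formula . "\$\\end{document}\n");
          fclose($fp);
          // latex -> dvips -> convert; any equivalent image chain would do
          system("cd $dir && latex f.tex && dvips -E f.dvi -o f.ps && convert f.ps $png");
      }
      $alt = htmlspecialchars($formula);
      return "<img src=\"/math/$hash.png\" alt=\"$alt\">";
  }
  ?>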
This would make mathematicians, computer scientists, physicists and
chemists happy. TeX includes a package for typesetting chemical
structure formulas and another one for quite general labeled diagrams
and trees. There's also a TeX package for typesetting musical notes and
another one for chess positions.
The concept could be expanded to other programs which can produce
graphics on-the-fly based on a textual description. This includes
gnuplot (graphs of functions) and maybe packages such as GD,
imagemagick or even GIMP.
Axel
The alphabet-soup filling of the dummy articles has added lots of junk
to the search indices, which is good.
Try doing a search for the three-letter string 'vyf'. It brings up lots
of articles containing 'vyf' in various combinations of upper and
lower-case.
But only one of them has a keyword-in-context display, which seems strange.
Neil
Neil,
Maybe you could pepper your stress testing with some calls to the special
pages as well; I would imagine that searching and RecentChanges are the
most important ones, but the more the better.
Axel