Today, Friday, the front page of the English Wikipedia has been fast
all day.
Another page (I monitor http://www.wikipedia.com/wiki/Sweden) was slow
for one period of 30 minutes (09:30-10:00 GMT) and another period
of just over two hours (11:40-13:50 GMT). Some other URLs on the international
Wikipedias were also affected at the same time. This might be due to
maintenance or work being done on the scripts.
Subtract 7 hours from GMT to get the server's local time
(PDT = GMT -0700).
Apart from these two limited intervals, every URL that I monitor has
been fast all day, including the recent changes pages.
I'm very happy with this, and hope Brion and Jimmy (and who else?)
will soon get the talk namespace links back without hurting
performance. (But hey, never make big fixes five minutes before you
leave for the weekend! Better just leave it as is if you have to go.)
And now for some more relaxed Friday reading, actually related to
performance problems. (The following analysis might be politically
slanted. Don't take it too seriously.) The Swedish parliament
elections are coming up in September, so the political parties are
starting up their campaigns. The problem is there are no big issues
to fight about. The four non-socialist parties have unusually boring
candidates (Dukakis style), and everybody expects the current
social-democratic government to win. The single issue that seems to
be coming up is the national sick leave insurance, which is paid by
tax money, and far over budget. This is linked to the fact that
"burn-out" is now an accepted medical diagnosis for which you are
allowed to take a long sick leave at the taxpayers' expense. You
would expect such welfare excesses to be on the social democrat
agenda, and that non-socialists would push for tax cuts and a balanced
budget. However, the current s-d govt has been doing a great job
balancing the budget, and they will now have to deal with cutting back
this overgenerous sick leave compensation without hurting their
voters' feelings. Tough job. The Christian-democratic party's
candidate has already hurt a lot of feelings by claiming that "some"
of those receiving compensation are "cheating the system". That might
be true, but accusing "some" (who? me?) is obviously not the way to
attract voters. This issue now has media attention and some
interesting example cases are reported.
Like this one: Attorneys in Swedish district courts have been
right-sized in the past years, as part of balancing the budget. This
means that as soon as one gets sick, the rest get too much to do,
leading to stress and burn-out, which leads to more sick leaves.
Think of the court cases as HTTP requests arriving at Wikipedia.
There are some processes/attorneys there to handle the cases, but for
some reason one process gets blocked and cannot work. This leaves
more work for the remaining workers, but they are probably waiting for
the first process to get finished and unlock the resources (database
records?) that it is using. If processes are allowed to go to sleep
waiting for each other, the work will pile up. It will never end.
So, what is the solution? Throwing more attorneys at the problem?
Maybe, but more likely the work processes should be redesigned and
simplified. That allows the available attorneys to finish up a case
and take on the next one. Some of their tasks are more important than
others, but the performance or throughput of the system depends on
cutting away or redesigning the most time-consuming tasks. The high
degree of sick-leave is an indicator of system design flaws (albeit an
one), and thus not altogether bad.
In the same way, a high "load average" (as reported by the "uptime" or
"top" commands) is one indicator that the Wikipedia system is flawed.
The load average in a UNIX system is the number of processes that are
ready to run, waiting for the CPU to become available. Unfortunately,
most of them are just waiting to see if their wanted resource has
become available. If this is not the case (e.g. database record still
locked), they will go back to the end of the line, waiting again. Do
you remember those bread shop waiting lines in Soviet Russia?
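
For anyone who wants to watch this number without running "uptime" by hand, here is a minimal sketch in Python. The function name and the per-CPU figure are my own additions for illustration, and it assumes a Unix-like system where the kernel exposes the load average.

```python
import os


def load_report():
    """Return the 1, 5 and 15 minute load averages as a dict.

    os.getloadavg() reports the same three figures that "uptime" prints.
    The per-CPU value is only a rough hint (an assumption of this sketch):
    a 1-minute load well above the CPU count suggests a queue is forming.
    """
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1
    return {
        "1min": one,
        "5min": five,
        "15min": fifteen,
        "per_cpu_1min": one / cpus,
    }


if __name__ == "__main__":
    for name, value in load_report().items():
        print(f"{name}: {value:.2f}")
```

A high figure here does not say why processes are queuing; it only says that they are.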
Training new attorneys is in itself a time-consuming task, which
should be avoided if possible. Instead of paying sick leave (for how
long?) to the already trained attorneys, a "cure" for "burn-out"
should be found that can bring them back to work, thus relieving
their colleagues of the overload and saving taxpayers' money at the
same time.
I have no idea how a "cure" for burn-out can be found, but I think it
is a necessary political trick, and thus will happen. It will not
hurt voters' feelings, and it is my guess that the people who can
achieve this will work for the winners of the election.
This might be the weakest analogy in history, but I think we should
treat the Wikipedia processes with the same dignity and respect that
the Swedish voters would expect. After all, they're supposed to work
for us. The processes feel self-fulfillment when they can finish
their job on time, and get distressed when they get locked up. Any
uncalled-for delay will only result in more work piling up. That is a
flaw in the system design that has to be fixed, and we cannot go
around claiming that "some" of the workers are trying to cheat the
system. That will only lead to us losing their confidence.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik
Teknikringen 1e, SE-583 30 Linuxköping, Sweden
tel +46-70-7891609
http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/
I thought I'd bring up this idea again since it might be easy to
implement with the new codebase.
If you put text such as [$\int_{x=0}^\infty x^2 dx$] in Wiki, upon
saving the article, TeX will be called and translate the formula into
an image, and store the image on the server and its name in a database
indexed by the formula text. When the Wiki page is presented, the
image is inlined (with an alt attribute containing the formula text).
When the page is later edited and saved again, the system
first checks whether an up-to-date image of the formula already
exists; if not, TeX is called to regenerate it.
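
The save-time lookup described above might be sketched as follows in Python. This is only an illustration under assumptions: the class and function names are made up, a dict stands in for the database table, the image name is derived from an MD5 digest of the formula text (my choice, not part of the proposal), and the actual TeX call is passed in as a hypothetical render callable so the caching logic can be shown on its own.

```python
import hashlib
import re

# Matches the [$ ... $] markup used in the example above.
FORMULA_RE = re.compile(r"\[\$(.+?)\$\]", re.DOTALL)


def extract_formulas(wikitext):
    """Return the formula texts found in a page, in order of appearance."""
    return FORMULA_RE.findall(wikitext)


class FormulaCache:
    """Maps formula text to the file name of its rendered image."""

    def __init__(self, render):
        # render(formula_text, image_name) is a hypothetical callable that
        # would run TeX and write the image file.
        self._render = render
        self._images = {}  # stands in for the database table

    def image_for(self, formula):
        """Return the image name for a formula, rendering only on a miss."""
        if formula not in self._images:
            digest = hashlib.md5(formula.encode("utf-8")).hexdigest()
            name = digest + ".png"
            self._render(formula, name)
            self._images[formula] = name
        return self._images[formula]


if __name__ == "__main__":
    rendered = []
    cache = FormulaCache(lambda formula, name: rendered.append(name))
    page = r"See [$\int_{x=0}^\infty x^2 dx$] and again [$\int_{x=0}^\infty x^2 dx$]."
    for formula in extract_formulas(page):
        print(cache.image_for(formula))
    print("TeX runs:", len(rendered))
```

Because the formula text itself is the key, saving a page whose formulas have not changed costs no TeX runs at all.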
This would make mathematicians, computer scientists, physicists and
chemists happy. TeX includes a package for typesetting chemical
structure formulas and another one for quite general labeled diagrams
and trees. There's also a TeX package for typesetting musical
notation and another one for chess positions.
The concept could be expanded to other programs which can produce
graphics on-the-fly based on a textual description. This includes
gnuplot (graphs of functions) and maybe packages such as GD,
imagemagick or even GIMP.
Axel
> by blindly executing TeX when someone edits a page, we are assuming
> that they haven't included any malicious code in their TeX source.
TeX has two dangerous commands: shell escapes and writing to an
arbitrary file. Both can be globally disabled (and are disabled by
default in most TeX distributions). It is fairly easy however to write
TeX which eats memory like crazy (TeX allows recursion :-), so we
would have to somehow restrict the resources available to the TeX
process. But we are of course right now already wide open to all sorts
of denial-of-service attacks.
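
One way to restrict the resources, sketched in Python for a Unix-like server: run the child under CPU and address-space limits plus a wall-clock timeout. The function name and the default limits are assumptions chosen for illustration, not tested values for any real TeX workload.

```python
import resource
import subprocess


def _cap(kind, value):
    """Lower the soft limit for one resource, never above the hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, hard))


def run_limited(argv, cpu_seconds=5, memory_bytes=512 * 1024 * 1024,
                wall_seconds=10):
    """Run argv with CPU, memory and wall-clock limits; return the result.

    The limits are applied in the child just before exec, so the calling
    process is unaffected. A child that exceeds its CPU limit is killed by
    a signal and shows up with a nonzero return code.
    """
    def apply_limits():
        _cap(resource.RLIMIT_CPU, cpu_seconds)
        _cap(resource.RLIMIT_AS, memory_bytes)

    return subprocess.run(argv, preexec_fn=apply_limits, timeout=wall_seconds,
                          capture_output=True, check=False)
```

The argv would be whatever TeX command line the server ends up using, with shell escapes disabled as described above. A runaway recursive macro then hits the memory or CPU ceiling and the save can report an error instead of taking the machine down.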
Axel
There are two SQL files in upload space, and no record of them being
uploaded. I am deleting them because passwords are exposed. Whoever is
working with them, please put them somewhere else.
phma
Currently, Wikipedia article names are
"case sensitive, but first character must be a capital".
This isn't optimal, because:
* very rarely do two things have names that differ only in capitalization
* there is a significant group of names that should be all-lower-case,
like many Unix programs
A fully case-sensitive system wouldn't be optimal either.
While it would allow all-lower-case names, it would force many links
to be written as [[Foo|foo]], which would be an unnecessary burden.
So I propose a switch to a "case insensitive but case preserving"
system at some point, just like file names on Microsoft Windows.
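
To make "case insensitive but case preserving" concrete, here is a minimal sketch in Python. The class and method names are hypothetical, and the rule that the first spelling registered wins is an assumption of the sketch, not something the proposal settles.

```python
class TitleIndex:
    """Looks titles up ignoring case, but remembers the stored spelling."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def _key(title):
        # casefold() is a more thorough lower() for non-ASCII text.
        return title.casefold()

    def add(self, title):
        """Register a title; return the spelling that is actually stored."""
        return self._by_key.setdefault(self._key(title), title)

    def resolve(self, link):
        """Return the stored spelling for a link, or None if no such page."""
        return self._by_key.get(self._key(link))


if __name__ == "__main__":
    index = TitleIndex()
    index.add("grep")
    print(index.resolve("Grep"))     # the stored spelling, "grep"
    print(index.resolve("Missing"))  # None
```

With a lookup like this, [[Grep]] at the start of a sentence and [[grep]] elsewhere reach the same article, and the article is still displayed under its all-lower-case name.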
I have been trying for hours to find what two users who recently uploaded
files have contributed. Frustrated by timeouts in Konqueror, I tried Lynx and
got the following:
Nupedia logo by milodesign.com
home | search | newest | join the effort | member area | about Nupedia
Sorry! 404 error!
Probably, you should log in!
The page you clicked on or typed in is not (currently) available.
Why? Various possible reasons:
* If you encountered this message while trying to get to or navigate
the member area, then probably, you are not logged in--so please
log in! E.g., you might be an editor and trying to do something
that only editors can do. Then definitely you should log in.
* It is possible that you are logged in, but you do not have
permission to see the URL your browser is pointing to. If you
believe this is in error, we'd like to know about it:
comments(a)nupedia.com.
* It is possible that we simply don't have the page on file. This
is always embarrassing, and we'd like to know about it:
comments(a)nupedia.com.
Whatever the problem, we sure hope it's fixed soon. We aim to please,
but we are overworked!
Please hit the "back" button on your web browser, or choose a link
from those listed above.
The command was:
time lynx
http://www.wikipedia.com/wiki/special:contributions\&theuser=Hellspawn
and it took about 10 minutes. The other user is Malkav.
Can you add an index on the contributor user ID in the table that logs who
contributed what? That should speed up the join in the database lookup.
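
A small sketch of what I mean, written in Python with SQLite as a stand-in so that it runs anywhere. The table, column and index names are made up for illustration and are not those of the real Wikipedia schema, and the production database will phrase its query plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table logging who contributed what.
conn.execute(
    "CREATE TABLE contributions ("
    " article_id INTEGER, user_id INTEGER, edited_at TEXT)"
)
conn.executemany(
    "INSERT INTO contributions VALUES (?, ?, ?)",
    [(n, n % 50, "2002-06-24") for n in range(1000)],
)

# The index being requested: lookups by contributor no longer scan the table.
conn.execute("CREATE INDEX idx_contributions_user ON contributions (user_id)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT article_id FROM contributions WHERE user_id = ?",
    (7,),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

The printed plan should mention the index by name, which is the sign that the lookup is using it rather than reading every row.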
phma
Bug 1 - the import script tries to import articles which start with
lower-case letters, which might lead to a clash.
Fix attached.
Bug 2 - the import script has a single-quote escaping problem:
articles with single quotes in their names are not added to the "linked"
table, and are therefore listed as not linked despite being linked.
I couldn't find where the bug is.
>
>
>----- Forwarded message from Giskart <giskart(a)linux.be> -----
>
>From: Giskart <giskart(a)linux.be>
>Date: Mon, 24 Jun 2002 19:23:41 +0200
>To: jwales(a)bomis.com
>Subject: request admin powers WikiNL
>
>Can you give me administrator powers for the Dutch Wikipedia please ?
>
Never mind. TK has made me an op. I have access now. -- giskart
I'm a little unsure how to do it on wiki nl. Is that one still running the old
Perl script?
----- Forwarded message from Giskart <giskart(a)linux.be> -----
From: Giskart <giskart(a)linux.be>
Date: Mon, 24 Jun 2002 19:23:41 +0200
To: jwales(a)bomis.com
Subject: request admin powers WikiNL
Can you give me administrator powers for the Dutch Wikipedia please ?
Thanks,
Giskart
----- End forwarded message -----
I just found something that we definitely need: RecentChanges in
Mozilla's/Netscape's sidebar. See it at the DebianWiki:
http://wiki.debian.net/DebianWiki/RecentChangesSidebar
It is cool, and I want to have it. Things can be so easy :-)
Kurt
P.S.: You don't need to hurry. I can wait till all the bugs in test-de
have been solved ;-)
P.P.S.: Seriously: I hope that at least one of our busy developers likes
this feature.