Wikitech-l July 2002

wikitech-l@lists.wikimedia.org

20 participants
75 discussions

by Lars Aronsson

Today, Friday, the front page of the English Wikipedia has been fast all day. Another page (I monitor http://www.wikipedia.com/wiki/Sweden) was slow for one period of 30 minutes (09:30-10:00 am GMT) and another period of two hours (11:40-13:50 GMT). Some other URLs on the international Wikipedias were also affected at the same time. This might be due to maintenance or work being done on the scripts. Subtract 7 hours from GMT to get the server's local time zone (PDT = GMT -0700).

Apart from these two limited intervals, every URL that I monitor has been fast all day, including the recent changes pages. I'm very happy with this, and hope Brion and Jimmy (and who else?) will soon get the talk namespace links back without hurting performance. (But hey, never make big fixes five minutes before you leave for the weekend! Better just leave it as is if you have to go.)

And now for some more relaxed Friday reading, actually related to performance problems. (The following analysis might be politically slanted. Don't take it too seriously.)

The Swedish parliament elections are coming up in September, so the political parties are starting up their campaigns. The problem is there are no big issues to fight about. The four non-socialist parties have unusually boring candidates (Dukakis style), and everybody expects the current social-democratic government to win. The single issue that seems to be coming up is the national sick leave insurance, which is paid by tax money, and far over budget. This is linked to the fact that "burn-out" is now an accepted medical diagnosis for which you are allowed to take a long sick leave at the taxpayers' expense.

You would expect such welfare excesses to be on the social democrat agenda, and that non-socialists would push for tax cuts and a balanced budget. However, the current s-d govt has been doing a great job balancing the budget, and they will now have to deal with cutting back this overgenerous sick leave compensation without hurting their voters' feelings. Tough job. The Christian-democratic party's candidate has already hurt a lot of feelings by claiming that "some" of those receiving compensation are "cheating the system". That might be true, but accusing "some" (who? me?) is obviously not the way to attract voters.

This issue now has media attention and some interesting example cases are reported. Like this one: Attorneys in Swedish district courts have been right-sized in the past years, as part of balancing the budget. This means that as soon as one gets sick, the rest get too much to do, leading to stress and burn-out, which leads to more sick leaves.

Think of the court cases as HTTP requests arriving at Wikipedia. There are some processes/attorneys there to handle the cases, but for some reason one process gets blocked and cannot work. This leaves more work for the remaining workers, but they are probably waiting for the first process to finish and unlock the resources (database records?) that it is using. If processes are allowed to go to sleep waiting for each other, the work will pile up. It will never end.

So, what is the solution? Throwing more attorneys at the problem? Maybe, but more likely the work processes should be redesigned and simplified. That allows the available attorneys to finish up a case and take on the next one. Some of their tasks are more important than others, but the performance or throughput of the system depends on cutting away or redesigning the most time-consuming tasks.

The high degree of sick-leave is an indicator of system design flaws (albeit an one), and thus not altogether bad. In the same way, a high "load average" (as reported by the "uptime" or "top" commands) is one indicator that the Wikipedia system is flawed. The load average in a UNIX system is the number of processes that are ready to run, waiting for the CPU to become available. Unfortunately, most of them are just waiting to see if their wanted resource has become available. If this is not the case (e.g. database record still locked), they will go back to the end of the line, waiting again. Do you remember those bread shop waiting lines in Soviet Russia?

Training new attorneys is in itself a time-consuming task, which should be avoided if possible. Instead of paying sick leave (for how long?) to the already trained attorneys, a "cure" for "burn-out" should be found that can bring them back to work, thus relieving the overload from their colleagues and saving taxpayers' money at the same time.

I have no idea how a "cure" for burn-out can be found, but I think it is a necessary political trick, and thus will happen. It will not hurt voters' feelings, and it is my guess that the people who can achieve this will work for the winners of the election.

This might be the weakest analogy in history, but I think we should treat the Wikipedia processes with the same dignity and respect that the Swedish voters would expect. After all, they're supposed to work for us. The processes feel self-fulfillment when they can finish their job on time, and get distressed when they get locked up. Any uncalled-for delay will only result in more work piling up. That is a flaw in the system design that has to be fixed, and we cannot go around claiming that "some" of the workers are trying to cheat the system. That will only lead to us losing their confidence.

-- 
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik
Teknikringen 1e, SE-583 30 Linuxköping, Sweden
tel +46-70-7891609
http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/
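For readers who want to watch the "load average" figure mentioned in the message above without running "uptime" or "top" by hand, here is a minimal Python sketch. It only reads the numbers the operating system already reports; the function name `load_per_cpu` is made up for illustration, and dividing by the CPU count is an assumption about how one might normalise the value, not something proposed in the thread.

```python
import os


def load_per_cpu(cpu_count=None):
    """Return the 1-minute load average divided by the number of CPUs.

    os.getloadavg() returns the same three figures that `uptime` prints
    (1, 5 and 15 minute averages). It is only available on Unix-like systems.
    """
    one_minute, _five_minutes, _fifteen_minutes = os.getloadavg()
    cpus = cpu_count or os.cpu_count() or 1
    return one_minute / cpus


if __name__ == "__main__":
    # A value that stays well above 1.0 suggests work is queuing up.
    print(f"1-minute load per CPU: {load_per_cpu():.2f}")
```

A sustained high value says only that work is queuing, not why; finding the blocked process or locked record still takes a closer look.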

21 years, 8 months

Logging in?

by The Cunctator

How come I don't stay logged in permanently once I log in now? In other words, why have the cookies been set to expire session/daywise? They shouldn't, I don't think.

21 years, 9 months

[taw@users.sourceforge.net: Re: [Wikitech-l] rsync for mirroring]

by Jimmy Wales

Any opinions from wikitech-l about this?

----- Forwarded message from Tomasz Wegrzanowski <taw(a)users.sourceforge.net> -----

From: Tomasz Wegrzanowski <taw(a)users.sourceforge.net>
Date: Tue, 30 Jul 2002 04:08:53 +0200
To: Jimmy Wales <jwales(a)bomis.com>
Subject: Re: [Wikitech-l] rsync for mirroring

On Mon, Jul 29, 2002 at 03:40:12PM -0700, Jimmy Wales wrote:
> What do we need to do on our end?

This is just one of many ways of doing it. You should probably play with logs, connection limits, running with lower permissions and stuff like that later, but it should work without this. Obviously you should add all Wikipedias to the list (I have only 3 here) and use correct paths.

1. install rsync

2. ensure that /etc/services contains this line (if not, either add this line or write the port in /etc/inetd.conf numerically):

    rsync 873/tcp # rsync

3. create /etc/rsyncd.conf containing something like this:

    read only = yes

    [pl]
    path = /home/taw/local/tmp/wiki-pl/
    comment = Polish Wikipedia

    [de]
    path = /home/taw/local/tmp/wiki-de/
    comment = German Wikipedia

    [eo]
    path = /home/taw/local/tmp/wiki-eo/
    comment = Esperanto Wikipedia

4. put the following line in /etc/inetd.conf:

    rsync stream tcp nowait root /usr/bin/rsync rsyncd --daemon

5. restart inetd

Now to check:

    $ rsync localhost::
    pl Polish Wikipedia
    de German Wikipedia
    eo Esperanto Wikipedia
    $ rsync localhost::pl
    drwxrwxr-x 288 2002/01/09 22:22:50 .
    -rw-rw-r-- 19 2001/09/26 16:23:32 .htaccess
    drwxrwxr-x 112 2001/10/02 13:07:29 RCS
    -rw-rw-r-- 302 2000/07/18 20:40:13 hos.png
    -rw-rw-r-- 235 2001/09/26 17:49:14 index.html
    drwxrwxr-x 72 2001/10/18 01:37:30 lib-http
    -rwxrwxr-x 67 2001/09/26 18:05:02 showtr
    drwxrwxr-x 72 2002/01/12 20:26:36 temp
    -rwxrw-r-- 1160 2001/04/08 18:34:10 umtrans.pl
    drwxrwxr-x 592 2001/11/24 09:52:33 wiki
    $

----- End forwarded message -----
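Since the forwarded message notes that all Wikipedias should be added to the module list, the /etc/rsyncd.conf from step 3 could be generated rather than typed by hand. The following is a rough Python sketch of that idea, not part of the original instructions; the function name `render_rsyncd_conf` and the `WIKIS` table are hypothetical, and the paths are simply the example paths from the message.

```python
def render_rsyncd_conf(modules):
    """Render rsyncd.conf text: a global 'read only' line, then one
    [module] section per wiki with its path and comment."""
    lines = ["read only = yes", ""]
    for name, (path, comment) in modules.items():
        lines += [f"[{name}]", f"path = {path}", f"comment = {comment}", ""]
    return "\n".join(lines)


# Hypothetical module table using the three example wikis from the message.
WIKIS = {
    "pl": ("/home/taw/local/tmp/wiki-pl/", "Polish Wikipedia"),
    "de": ("/home/taw/local/tmp/wiki-de/", "German Wikipedia"),
    "eo": ("/home/taw/local/tmp/wiki-eo/", "Esperanto Wikipedia"),
}

if __name__ == "__main__":
    # Print the config; redirect to /etc/rsyncd.conf after reviewing it.
    print(render_rsyncd_conf(WIKIS))
```

Adding another language then means adding one entry to the table and regenerating the file.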

21 years, 9 months

Supporting horrible browsers

by Neil Harris

Supporting old, horrible browsers.

I have some current stats that suggest that IE 4.x and Netscape 4.x now represent 2% and 3% of users, respectively.

Here's what I propose to do with Cologne Blue.

1. Browser-sniffing: detect these, and only these, broken browsers, at the webserver.
2. For these browsers alone, generate an XHTML page using tables for layout, and minimal CSS for typography and colours.
3. For all other browsers, generate an XHTML page using CSS for layout, and typography, and colours. (Oh, and a table for the header, but that's OK).

Then:

1. Modern standards-compliant browsers will show the site as intended.
2. Very old browsers, text-only browsers, web spiders, and accessibility systems will show the site in the best backwards-compatible rendition possible, as the site will use nice old-fashioned HTML codes inside all the fancy layout stuff, ignoring the CSS completely.
3. The brain-dead browsers listed above will show a reasonable rendition of the site, for as long as it takes for their market share to fall near zero.

I think that 90% of the layout code can be re-used in a single skin file, provided that I can get an indication if the browser is one of the broken ones.

Does this seem like a reasonable approach? And does anyone have any GPL'd user-agent parsing PHP code?

Neil
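The browser-sniffing step in the proposal above could look roughly like the following. This is a sketch in Python rather than the PHP asked for, and the function name `is_legacy_browser` is invented here. It assumes the usual User-Agent conventions of the time: IE 4.x announces itself with "MSIE 4.", while Netscape 4.x sends a "Mozilla/4.x" string without the "compatible" token that IE and most other browsers add. Opera is excluded because it could be configured to imitate IE.

```python
import re

# IE 4.x identifies itself as "... (compatible; MSIE 4.01; ...)".
_IE4 = re.compile(r"MSIE 4\.")
# Netscape 4.x sends "Mozilla/4.7 [en] (WinNT; I)" with no "compatible" token.
_MOZILLA4 = re.compile(r"^Mozilla/4\.\d")


def is_legacy_browser(user_agent):
    """Return True only for the two browser families singled out in the proposal."""
    if "Opera" in user_agent:
        return False  # Opera may spoof an MSIE string
    if _IE4.search(user_agent):
        return True
    return bool(_MOZILLA4.match(user_agent)) and "compatible" not in user_agent


if __name__ == "__main__":
    for ua in (
        "Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)",
        "Mozilla/4.7 [en] (WinNT; I)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    ):
        layout = "tables" if is_legacy_browser(ua) else "css"
        print(f"{layout:6}  {ua}")
```

Matching only the known-broken browsers, as the proposal suggests, means an unrecognised client gets the CSS layout by default.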

21 years, 9 months

Re: [Wikitech-l] rsync for mirroring

by Tomasz Wegrzanowski

On Mon, Jul 29, 2002 at 02:50:20PM +0200, Lars Aronsson wrote:
> On Mon, 29 Jul 2002, Tomasz Wegrzanowski wrote:
> > Could you make international wikis available via rsync ?
>
> An exciting idea. Would you even try Unison for this?
> Is there any benefit to using Unison over rsync for this kind of
> uni-directional application?
>
> http://www.cis.upenn.edu/~bcpierce/unison/

It doesn't seem to have any advantage and is less popular, so I'd rather choose rsync.

21 years, 9 months

Proposal: new Cologne Blue final

by Neil Harris

Dear all,

This is my latest, and I hope final, mock-up rendition of Cologne Blue that I intend to offer to replace the existing implementation of Cologne Blue.

I enclose a mock-up page: this represents the final look intended in standards-compliant CSS browsers like Mozilla and IE 6, but will probably not work properly in non-CSS-aware browsers yet.

There's still lots to do, but the only way to find out is not to build a better mock-up, but to write the working code.

I realise that this design is not perfect, but it is probably nicer-looking than the existing implementation, and I'm reading the code for the new software to see how to make the changes in an evolutionary way that will work with cross-browser support.

I'm also considering the idea of doing a version of this with tables alone for old browsers that ignore or munge CSS. If I do the CSS right, the CSS version should work OK for browsers like Lynx.

Can anyone help me with how to go about contributing code, and where and how to test it? The first thing I'll need to do is just to clone an existing style, and call it something like "Cologne Beta", prior to changing it step-wise into real code.

Once I have something up and running, then we can start voting for features. As all code will of course be GPL, if you dislike it enough, you'll be able to change it yourself.

Regards,
Neil

21 years, 9 months

rsync for mirroring

by Tomasz Wegrzanowski

Could you make international wikis available via rsync ? It will be much easier to make local mirrors in that case, and it will be possible to sync more often than once a day.

21 years, 9 months

Re: [Wikitech-l] Parsing

by lcrocker＠nupedia.com

> This made me think: Would it make sense to make a formal BNF
> grammar for the Wikipedia text format, so a LALR(1) parser could
> be made for it? Would that make any sense at all with PHP, or
> just be too hard to code and inflexible?

I'd love to have a formal grammar of some kind (I think regexps would be fine), and I agree with Jan that a totally wiki-specific syntax would be far better than our current mish-mash of HTML and wiki markup. But I'm not sure if it's not already too late to revisit those decisions. But if it isn't, I'll be happy to discuss what a syntax might look like.

21 years, 9 months

Usemod2SQL feature request

by Tomasz Wegrzanowski

Could you make it so that "See also" is added only to non-empty subpages ?

21 years, 9 months

Parsing

by Lars Aronsson

There's a discussion on wikipedia-l about the exact syntax of an URL and whether a punctuation mark at the end of the URL should be considered part of the URL or not. This made me think: Would it make sense to make a formal BNF grammar for the Wikipedia text format, so a LALR(1) parser could be made for it? Would that make any sense at all with PHP, or just be too hard to code and inflexible? Only ten years ago, people would use C programming and YACC to solve problems like this, and reg.exp based parsing was considered just too inefficient. -- Lars Aronsson (lars(a)aronsson.se) tel +46-70-7891609 http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/
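To make the trailing-punctuation question in the message above concrete, here is a small regular-expression sketch in Python. It is only one possible rule, not the behaviour of the Wikipedia software: it assumes that a URL runs up to the next whitespace or angle bracket and that sentence punctuation at the very end is not part of it. The names `URL_RE`, `TRAILING_PUNCTUATION` and `find_urls` are invented for illustration.

```python
import re

# A URL is taken to run until whitespace, an angle bracket or a double quote.
URL_RE = re.compile(r"https?://[^\s<>\"]+")
# Characters treated as sentence punctuation when they end the match.
TRAILING_PUNCTUATION = ".,;:!?"


def find_urls(text):
    """Return the URLs in text with trailing sentence punctuation stripped."""
    return [m.group(0).rstrip(TRAILING_PUNCTUATION) for m in URL_RE.finditer(text)]


if __name__ == "__main__":
    print(find_urls("See http://susning.nu/, or http://aronsson.se/."))
```

The rule is a heuristic: a URL that really does end in a full stop would be truncated, which is the kind of ambiguity a formal grammar would have to settle explicitly.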

21 years, 9 months
