Wikitech-l December 2003

wikitech-l@lists.wikimedia.org

84 participants
155 discussions

by Anthere

From: Jussi-Ville Heiskanen <jheiskan(a)welho.com>
On Wed, 2003-12-03 at 01:29, Anthere wrote:

Julie de Lespinasse
She was a passionate...
But she should be "de Lespinasse". That is a bit of a problem.

>Please be advised that there is a more significant consideration here than the necessary addition of "de".
>In the Finnish language there is a real possibility that "Lespinasse" would be parsed: "lisping alcoholic pedophile".
>And "Nasse"? Well Nasse was a clown fairly popular on Finnish TV (portrayed by a genius actor, Vesa-Matti Loiri) whose shtick was to quiz children on their habits on the potty to a ludicrously explicit degree, while taking mock surreptitious swigs out of his bottle.
>Jussi-Ville Heiskanen (aka Cimon Avaro)

LOL. These two are redhibitory defects ;-)

20 years, 6 months

New MediaWiki stable release coming soon

by Brion Vibber

Now that the dev branch is pretty well broken in on Wikipedia, it's my intention to branch and release a new 'stable' version in the next day. If there are any real show-stopper bugs remaining, please say something!

-- brion vibber (brion @ pobox.com)

20 years, 6 months

Server-side File Caching Using a 404 Handler

by Evan Prodromou

We have a good server-side file caching system already. I'm proposing another one.

It's almost always true with Web servers in general and Apache in particular that serving static files from the file system is going to be much, much faster -- usually an order of magnitude faster -- than doing any dynamic processing. In other words, any PHP page is going to be significantly slower than a plain HTML page.

For this reason, I'm proposing using a directory in a MediaWiki installation as a static file cache. Here's how it would work:

* There's a Web directory where static pages go -- say, "/var/www/wiki".

* URLs for articles point to that directory: the link for "Foo Bar" renders to "/wiki/Foo_Bar".

* Initially, an HTTP hit for "/wiki/Foo_Bar" will fail -- no page there.

* Apache has a directive -- ErrorDocument -- for mapping a script to handle missing files (among other errors). We map a script -- /w/404Handler.phtml or something -- to handle these errors for the "/wiki/" directory.

      <Directory /var/www/wiki/>
        ErrorDocument 404 /w/404Handler.phtml
      </Directory>

* The 404 handler tries to retrieve the article from the database. If the article doesn't exist, it shows an edit form for the article, just like a broken link works now. If the article is Special:, it lets wiki.phtml do the work.

* If the article _does_ exist, the handler renders the page *to a file in the cache directory*, e.g. "/wiki/Foo_Bar.html". It then opens the file and serves it to the current user, too. (There are probably some other ways to optimize this, but the best I can see is that they would require two hits to the server, which is wasteful. It's easier and cheaper just to read the file that was written.) (Of course, it returns an HTTP 200 response on success.)

* For the next hit to come in for "/wiki/Foo_Bar", the Web server will find the file there (MultiViews will find the ".html" version -- a future 404 handler might also write out ".xml" or other document file formats), and thus serve it directly, without running any PHP code.

* On saving an edit of a page, the software simply *deletes* the cached HTML file. This will trigger the 404 handler on the next hit for the page, which will regenerate the cached file. Similar cache invalidation would be necessary for moving or deleting a page.

* If an edit would change the appearance of links to a page (say, making a new article (broken link to good link), or deleting an article (good link to broken link), or crossing the default stub boundary (if it exists)), saving a page would also delete the cached versions of any pages that link to the changed page. These, also, will be regenerated by the 404 handler on the next hit (with the new link appearance).

* A garbage-collection cron job runs every N minutes to keep the size of the cache to a reasonable level (# bytes, # files). It deletes the least recently used pages, by filesystem access time. (There's a possible race condition where a huge influx of activity could overflow the cache between garbage collection runs. If the installation is susceptible to this -- say, it has hard disk space limits -- garbage collection could happen in the 404 handler rather than in a cron job. This would obviously slow display of pages, though.)

* Of course, logged-in users want to get their pages rendered _just_so_, with question marks and [edit] links, etc. They should also have all their "My page" and other links working. It may be possible to do most of this with JavaScript -- checking the UserId cookie, and showing and hiding parts based on that. Then the same .html file can be served to logged-in and not-logged-in users, with the client side doing the customization.

  But probably the easiest way is just to run the dynamic pages every time for logged in users. We can use the Apache rewrite engine to serve different stuff, based on whether you're logged in or not. We use a RewriteCond line to check if the UserId cookie exists:

      RewriteCond %{HTTP_COOKIE} ^.*UserId=.*$
      RewriteRule ^/wiki/(.*)$ /wiki.phtml?title=${ampescape:$1} [L]

  This kinda depends on having a significant number of users not-logged-in. If 100% of users are logged in, we get no benefits, and a slight cost from Rewrite-checking the cookie on every hit.

* A more aggressive version could try to cache some of the other features of MediaWiki, like user contributions, page histories, etc., with the same strategy.

Some advantages with this design:

* Reduce the amount of dynamic page servicing

* Apache handles all the tweaky optimizations like compression, content negotiation, client-side caching, etc. It simplifies the MediaWiki code.

Some disadvantages I see with this design:

* There'd probably be some problems with funky characters -- spaces, punctuation, etc. "/wiki/Foo Bar" and "/wiki/Foo_Bar" should probably (usually) map to the same file. Some work needed here.

* You lose view counting. However, a periodic Web server log checker could provide the same functionality -- reading the Web log, updating the database with new view counts.

* Directory entries. Having a few thousand files in a single directory could get kinda losey for some file systems. There may be some rewrite tricks that could allow writing multiple sub-directories, like we do with images in the /upload dir right now, but having them appear as being in the /wiki directory.

* Redirect articles are tricky.

Anyways, I've been thinking about this for a while, thought I'd throw it out here to get responses. I'll probably rewrite this and put it on meta soon.

~ESP

-- 
Evan Prodromou <evan(a)wikitravel.org>
Wikitravel - http://www.wikitravel.org/
The free, complete, up-to-date and reliable world-wide travel guide
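
For readers who want to see the moving parts of the proposal above in one place, here is a minimal Python sketch of the three cache operations it describes: filling the cache on a miss, deleting cached files on save, and garbage-collecting by access time. It is not MediaWiki code and not the proposed PHP handler. The function names, the `render` callable, the percent-encoding of titles, and the write-then-rename step are all assumptions made for illustration. The sketch also returns the rendered HTML directly rather than re-reading the file that was written.

```python
import os
import tempfile
from pathlib import Path
from urllib.parse import quote


def cache_path(cache_dir: Path, title: str) -> Path:
    """Map an article title to its cached file, e.g. "Foo Bar" -> Foo_Bar.html."""
    # Spaces and underscores map to the same file. Other special characters
    # (including "/") are percent-encoded so a title stays inside cache_dir.
    safe = quote(title.replace(" ", "_"), safe="")
    return cache_dir / f"{safe}.html"


def handle_miss(cache_dir: Path, title: str, render) -> str:
    """What a 404 handler might do: render the page, cache it, serve it."""
    html = render(title)  # hypothetical stand-in for the wiki renderer
    target = cache_path(cache_dir, title)
    # Assumption: write to a temporary file and rename it into place, so a
    # concurrent reader never sees a half-written page.
    fd, tmp = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(html)
    os.replace(tmp, target)
    return html


def invalidate(cache_dir: Path, titles) -> None:
    """On save, move or delete: remove cached files so the next hit regenerates them."""
    for title in titles:
        cache_path(cache_dir, title).unlink(missing_ok=True)


def collect_garbage(cache_dir: Path, max_files: int) -> list:
    """Delete the least recently accessed pages until at most max_files remain."""
    pages = sorted(cache_dir.glob("*.html"), key=lambda p: p.stat().st_atime)
    doomed = pages[: max(0, len(pages) - max_files)]
    for page in doomed:
        page.unlink()
    return [p.name for p in doomed]
```

In this sketch, `invalidate` would be called with the edited page's title plus the titles of any pages linking to it when link appearance changes, and `collect_garbage` would be the body of the cron job. Many systems mount file systems with access-time updates reduced or disabled, in which case eviction by `st_atime` would not reflect real usage.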

20 years, 6 months

Errors in conversion?

by Andre Engels

On [[en:Wikipedia:Protected page]], I find a number of links being shown as broken links, even though the pages they link to exist. I tried (quasi-zero-)edits on both the page itself and on one of the linked pages ([[en:Silesia]] - with apologies for editing a protected page), and neither solved the problem. Does anyone know what the problem is, and how to solve it?

I find this especially bad, because with the move these problems should have been solved - instead they seem to have been added.

Andre Engels

20 years, 6 months

Wikipedia Statistics

by Daniel Mayer

Coolness. I do see a minor bug: The link under 'MediaWiki' is to http://www.wikimedia.org/ while it should either be to http://wikipedia.sourceforge.net/ or better yet http://mediawiki.org/. --mav

20 years, 6 months

RE: Search engine. Was: [Wikitech-l] Fast

by Andrew Alder

G'day Peter and the Group

At 07:39 AM 4/12/03 +0000, Peter Bartlett wrote:
>
> . The not being realtime may be a plus or minus, this is the very
> thing we tossed around a little in the Pump.
>
> Certainly for Wikipedia contributors, realtime is best. But perhaps
> not so for readers, who are after stable content.
>
>As was pointed out on the pump, there is no reason to suppose that the
>pedia was any more "stable" when Google took its snapshot than it is at
>any other time..

True. No argument at all with this.

But the probability that Google indexes a particular version is roughly proportional to the time for which that version is the current version. Therefore, the version presented by Google is on average more stable than the "current" version. I tried to point this out, but I'm afraid I didn't do it very clearly.

These two effects tend to cancel each other out. The version presented by Google is not necessarily better than the current version, but for readers it's probably no worse and may even be better. That was my point.

Andrew A

****
andrewa @ alder . ws
http://www.zeta.org.au/~andrewa
Phone 9441 4476
Mobile 04 2525 4476
****
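
The proportionality argument in the message above can be checked with a small worked example. The Python sketch below assumes a snapshot taken at a uniformly random moment and uses a version's lifetime as a stand-in for its stability. The revision history is made up, and the function names are illustrative only.

```python
from fractions import Fraction


def snapshot_odds(lifetimes):
    """Chance that a snapshot at a uniformly random moment sees each version."""
    total = sum(lifetimes)
    return [Fraction(t, total) for t in lifetimes]


def expected_lifetime_seen(lifetimes):
    """Average lifetime of the version that a random-time snapshot lands on."""
    return sum(p * t for p, t in zip(snapshot_odds(lifetimes), lifetimes))


# Hypothetical history: two versions that each lasted 1 hour, one that lasted 8.
lifetimes = [1, 1, 8]
print(snapshot_odds(lifetimes))           # [Fraction(1, 10), Fraction(1, 10), Fraction(4, 5)]
print(expected_lifetime_seen(lifetimes))  # 33/5, versus a plain mean of 10/3
```

With these numbers the long-lived version is seen four times out of five, so the version a snapshot captures has an expected lifetime of 6.6 hours, against 3.3 hours for a version picked with equal weight.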

20 years, 6 months

Wikipedia Statistics

by Brion Vibber

I've got the wikipedia statistics job successfully running on the new server ("geoffrin") now. This has the distinct advantage that the job can be run right after the backup dump instead of waiting many hours for my little aDSL line to download everything to run it on my home box. It takes about 2 hours 20 minutes to run, though a fair amount of this may be from having to read the dump compressed.

Erik, would you mind if I add the stats scripts as a module in our CVS repository, along with my scripts that run your scripts?

Latest update: http://www.wikipedia.org/wikistats/

-- brion vibber (brion @ pobox.com)

20 years, 6 months

math make failure

by wiki＠tictric.net

ocaml 3.0.14
TeTex 1.0.2+20011202-2
ImageMagick 5.4.4.5-1woody1
gs 6.53.3
media-wiki latest from cvs

When I do a make in ~/math it tells me the following. What is it missing?

    ocamlopt -c util.ml
    ocamlc -c render_info.mli
    ocamlc -c tex.mli
    ocamlyacc parser.mly
    ocamlc -c parser.mli
    ocamlopt -c parser.ml
    ocamlc -c html.mli
    ocamlopt -c html.ml
    ocamlc -c mathml.mli
    ocamlopt -c mathml.ml
    ocamlc -c texutil.mli
    ocamlopt -c texutil.ml
    ocamlopt -c render.ml
    ocamllex lexer.mll
    187 states, 3221 transitions, table size 14006 bytes
    ocamlopt -c lexer.ml
    ocamlopt -c texvc.ml
    File "texvc.ml", line 8, characters 14-27:
    Unbound value Digest.to_hex
    make: *** [texvc.cmx] Fehler 2
    rm parser.ml lexer.ml

-- manfred

20 years, 6 months

MediaWiki: namespace and customizable messages

by Brion Vibber

Tim Starling has fixed a few final bugs in this feature that's been in development and it should now be active on the various wikis.

Most of the user interface messages can now be modified directly by editing a special page in the 'MediaWiki:' namespace. Since this can be somewhat dangerous these pages are protected and editing is restricted to sysops.

So far this feature does not completely replace the LanguageXx.php files, but it should make it a lot easier to get most things adjusted without the delay of contacting the developers to get new versions of language files tested and installed.

Additionally, custom messages may be used to insert common reusable bits of text into articles.

For some documentation, see:
http://meta.wikipedia.org/wiki/Meta-Wikimedia:MediaWiki_namespace

and on the English wikipedia:
http://en.wikipedia.org/wiki/Wikipedia:MediaWiki_namespace

-- brion vibber (brion @ pobox.com)

20 years, 6 months

Apache/php on pliny

by Brion Vibber

I've reinstalled apache (now 1.3.29) and php (now 4.3.4) on pliny. mod_throttle is still disabled. We'll see if anything changes...

Apache config:

    OPTIM='-O2 -mcpu=i686' ./configure --with-layout=Apache \
        --enable-module=rewrite --enable-module=mmap-static \
        --enable-module=headers --enable-module=expires \
        --activate-module=src/modules/php4/libphp4.a

PHP config:

    ./configure --enable-shmop --with-zlib --with-mysql=/usr/local/mysql \
        --with-iconv --with-apache=/home/brion/src/apache_1.3.29 \
        --with-readline --enable-sockets

Pliny has GCC 2.96.

-- brion vibber (brion @ pobox.com)

20 years, 6 months
