Subject: [Wikitech-l] search in Spanish Wikipedia is not working
Fixed.
----- The following addresses had permanent fatal errors -----
wikidown@wikipedia.org
    (reason: 550 5.1.1 wikidown@wikipedia.org... User unknown)
Should be wikidown@bomis.com; this is now set in LocalSettings.php (it used to be hard-coded), and I've corrected it there.
Brion goes on vacation and everything starts to fall apart. First order of business of the Wikimedia Foundation is to set up a fund to clone Brion. :-)
Hey, that could be fun. :)
There must be something like
set-variable = max_connections=somebignumber
in mysql.conf.
At present my.cnf has:
  set-variable = max_connections=560
vs. the Apaches':
  MaxClients 175  (on pliny)
  MaxClients 200  (on larousse)
so we might have at most 375 apache processes attacking us at once. However, they might each take two mysql connections -- if the persistent connection is broken, it can't be closed (at least from PHP) short of killing the process, so it just opens a second non-persistent connection. And, in theory, we might see a handful more from SQL queries, which open another connection using a separate user for restricted permissions.
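(A minimal sketch of that double-connection pattern, for illustration only -- plain PHP mysql_* calls rather than the actual wiki code, with placeholder connection settings:)

  <?php
  // Sketch only: shows how one Apache child can end up holding two
  // MySQL connections at once.  Connection parameters are placeholders,
  // not our real settings.
  $server   = 'localhost';
  $user     = 'wikiuser';
  $password = 'secret';

  $conn = mysql_pconnect( $server, $user, $password );

  if ( !$conn || !mysql_ping( $conn ) ) {
      // A broken persistent link can't really be closed from PHP
      // (mysql_close() ignores persistent links), so we just shadow it
      // with an ordinary non-persistent connection; the dead one hangs
      // around until the child process is killed.
      $conn = mysql_connect( $server, $user, $password );
  }
  ?>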
We could probably do with lowering the max Apaches on pliny a bit and upping the max connections on MySQL a bit, just to keep that particular part from blowing up; however, if they are blowing up, that's going to be a symptom of something else...
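Rough headroom arithmetic with the numbers above (the extra SQL-query connections are a guess at "a handful"):

  175 (pliny) + 200 (larousse)        = 375 Apache children, worst case
  375 children x 2 connections each   = 750 possible MySQL connections
  + a handful for the SQL-query user  = ~760
  my.cnf max_connections              = 560

so on paper the 560 limit can be exceeded if every child is busy and doubled up.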
We do dynamic gzipping of pages on a rather large website (~3,000,000 dynamic hits daily). Our experience so far has been that the gzipping itself is actually rather fast compared to the page generation in PHP/Perl. The main problem with dynamic gzipping is that you have to build up the whole page in memory instead of sending out lines as they are generated (I don't know how the Wikipedia software currently works).
Currently the page is output in several chunks, but usually the majority of it is the wiki page itself, which is processed (over and over and over) and eventually output as one chunk. The other chunks are generally the headers and footers.
If we're generating a newly cachable page, we turn on complete page buffering and capture the buffer to save it to disk (gzipped and not).
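For what it's worth, here's a minimal sketch of that capture step using the stock PHP output-buffering and zlib functions; the cache path and compression level are made up for the example:

  <?php
  // Sketch only: buffer the whole page, then save it to the file cache
  // both plain and gzipped while still sending it to the client.
  $cacheFile = '/tmp/wikicache/Example_page.html';   // made-up path

  ob_start();

  // ... header chunks, the parsed wiki page, footer chunks ...

  $page = ob_get_contents();
  ob_end_flush();                        // deliver the page as usual

  $fh = fopen( $cacheFile, 'w' );
  fwrite( $fh, $page );
  fclose( $fh );

  $fh = fopen( $cacheFile . '.gz', 'w' );
  fwrite( $fh, gzencode( $page, 9 ) );   // compression level 9 is arbitrary
  fclose( $fh );
  ?>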
There are, of course, improvements that can be made to our parser...
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org