Subject: [Wikitech-l] search in spanish wikipedia is not working
Fixed.
----- The following addresses had permanent fatal errors -----
<wikidown(a)wikipedia.org>
    (reason: 550 5.1.1 <wikidown(a)wikipedia.org>... User unknown)
Should be wikidown(a)bomis.com; this is now set in LocalSettings.php (it
used to be hard-coded), and I've corrected it there.
Brion goes on vacation and everything starts to fall apart. First order
of business of the Wikimedia Foundation is to set up a fund to clone
Brion. :-)
Hey, that could be fun. :)
There must be something like
set-variable = max_connections=somebignumber
in mysql.conf.
At present my.cnf has:
set-variable = max_connections=560
vs. the Apaches':
MaxClients 175 (on pliny)
MaxClients 200 (on larousse)
so we might have at most 375 apache processes attacking us at once.
However, they might each take two mysql connections -- if the persistent
connection is broken, it can't be closed (at least from PHP) short of
killing the process, so it just opens a second non-persistent
connection. And, in theory, we might see a handful more from SQL
queries, which open another connection using a separate user for
restricted permissions.
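
In PHP terms the failure mode looks roughly like this (an illustrative
sketch, not our actual connection code; host and credentials are
placeholders):

<?php
// Try the persistent link first.
$db = mysql_pconnect( 'localhost', 'wikiuser', $wgDBpassword );
if ( !$db || !mysql_ping( $db ) ) {
    // The persistent link is dead, but mysql_close() is a no-op on
    // persistent links, so PHP can't release it; the process now holds
    // the stale link *and* a fresh non-persistent one.
    $db = mysql_connect( 'localhost', 'wikiuser', $wgDBpassword );
}
?>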
Worst case that's 375 x 2 = 750 connections against a 560-connection
cap, so we could probably do with lowering the max apaches on pliny a
bit and upping the max connections on mysql a bit, just to keep that
particular part from blowing up; however, if connections are blowing up,
that's going to be a symptom of something else...
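
Something along these lines, say (illustrative numbers only):

# httpd.conf on pliny -- a bit lower:
MaxClients 150
# my.cnf -- a bit higher, leaving headroom for the worst case of
# (150 + 200) * 2 = 700 connections:
set-variable = max_connections=750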
We do dynamic gzipping of pages on a rather large website (~3,000,000
dynamic hits daily). The experience we have gathered so far shows that
the gzipping itself is actually rather fast compared to the page
generation process through PHP/Perl. The main problem with dynamic
gzipping is that you have to build up the whole page in memory instead
of sending out lines as they are generated (I don't know how the
Wikipedia software currently works).
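
The standard PHP idiom shows the constraint (a minimal sketch using
ob_gzhandler; nothing reaches the client until the buffer is flushed):

<?php
// Buffer all output; ob_gzhandler checks the client's Accept-Encoding
// header itself and compresses the buffer when it is flushed.
ob_start( 'ob_gzhandler' );
// ... generate and echo the entire page here ...
ob_end_flush();  // compress and send in one go
?>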
Currently the page is output in several chunks, but usually the majority
of it is the wiki page itself, which is processed (over and over and
over) and eventually output as one chunk. The other chunks are generally
the headers and footers.
If we're generating a newly cacheable page, we turn on complete page
buffering and capture the buffer to save it to disk (gzipped and not).
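
In outline it works something like this (a sketch of the idea, not the
actual file cache code; $cacheFile is a placeholder):

<?php
ob_start();
// ... render and echo the page as usual ...
$text = ob_get_contents();
ob_end_flush();  // still send the page to the client

// Save both a plain and a gzipped copy for later cache hits.
$fh = fopen( $cacheFile, 'wb' );
fwrite( $fh, $text );
fclose( $fh );
$gz = gzopen( "$cacheFile.gz", 'wb' );
gzwrite( $gz, $text );
gzclose( $gz );
?>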
There are, of course, improvements that can be made to our parser...
-- brion vibber (brion @ pobox.com)