Subject: [Wikitech-l] search in spanish wikipedia is not working
Fixed.
----- The following addresses had permanent fatal errors -----
<wikidown(a)wikipedia.org>
    (reason: 550 5.1.1 <wikidown(a)wikipedia.org>... User unknown)
Should be wikidown(a)bomis.com; this is now set in LocalSettings.php (it
used to be hard-coded), and I've corrected it there.
Brion goes on vacation and everything starts to fall apart. First order
of business of the Wikimedia Foundation is to set up a fund to clone
Brion. :-)
Hey, that could be fun. :)
There must be something like
set-variable = max_connections=somebignumber
in mysql.conf.
At present my.cnf has:
set-variable = max_connections=560
vs. the Apaches':
MaxClients 175 (on pliny)
MaxClients 200 (on larousse)
so we might have at most 375 apache processes attacking us at once.
However, they might each take two mysql connections -- if the persistent
connection is broken, it can't be closed (at least from PHP) short of
killing the process, so it just opens a second non-persistent
connection. And, in theory, we might see a handful more from SQL
queries, which open another connection using a separate user for
restricted permissions.
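
In PHP terms the failure mode looks roughly like this (an illustrative
sketch, not our actual connection code; host and credentials are
placeholders):

<?php
// Try the persistent link first.
$db = mysql_pconnect( 'localhost', 'wikiuser', $wgDBpassword );
if ( !$db || !mysql_ping( $db ) ) {
    // The persistent link is dead, but mysql_close() is a no-op on
    // persistent links, so PHP can't release it; the process now holds
    // the stale link *and* a fresh non-persistent one.
    $db = mysql_connect( 'localhost', 'wikiuser', $wgDBpassword );
}
?>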
Worst case that's 375 x 2 = 750 connections against a 560-connection
cap, so we could probably do with lowering the max apaches on pliny a
bit and upping the max connections on mysql a bit, just to keep that
particular part from blowing up; however, if connections are blowing up,
that's going to be a symptom of something else...
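
Something along these lines, say (illustrative numbers only):

# httpd.conf on pliny -- a bit lower:
MaxClients 150
# my.cnf -- a bit higher, leaving headroom for the worst case of
# (150 + 200) * 2 = 700 connections:
set-variable = max_connections=750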
We do dynamic gzipping of pages on a rather large website (~3,000,000
dynamic hits daily). The experience we have gathered so far shows that
the gzipping itself is actually rather fast compared to the page
generation process through PHP/Perl. The main problem with dynamic
gzipping is that you have to build up the whole page in memory instead
of sending out lines as they are generated (I don't know how the
Wikipedia software currently works).
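
The standard PHP idiom shows the constraint (a minimal sketch using
ob_gzhandler; nothing reaches the client until the buffer is flushed):

<?php
// Buffer all output; ob_gzhandler checks the client's Accept-Encoding
// header itself and compresses the buffer when it is flushed.
ob_start( 'ob_gzhandler' );
// ... generate and echo the entire page here ...
ob_end_flush();  // compress and send in one go
?>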
Currently the page is output in several chunks, but usually the majority
of it is the wiki page itself, which is processed (over and over and
over) and eventually output as one chunk. The other chunks are generally
the headers and footers.
If we're generating a newly cacheable page, we turn on complete page
buffering and capture the buffer to save it to disk (gzipped and not).
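
In outline it works something like this (a sketch of the idea, not the
actual file cache code; $cacheFile is a placeholder):

<?php
ob_start();
// ... render and echo the page as usual ...
$text = ob_get_contents();
ob_end_flush();  // still send the page to the client

// Save both a plain and a gzipped copy for later cache hits.
$fh = fopen( $cacheFile, 'wb' );
fwrite( $fh, $text );
fclose( $fh );
$gz = gzopen( "$cacheFile.gz", 'wb' );
gzwrite( $gz, $text );
gzclose( $gz );
?>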
There are, of course, improvements that can be made to our parser...
-- brion vibber (brion @ pobox.com)