New subject: Wikipedia performance and caching of responses -- some questions

10 Apr 2002


Jim accidentally sent this just to me; I'm sending it back to the list:
On Wed, 2002-04-10 at 18:27, Jimmy Wales wrote:
...
Brion L. VIBBER wrote:
...
...
My best guess is that the parsing and lookups on regular pages are
currently the main load, not editing or exotic database queries -- is
this right?
Not a clue. Initially, the database certainly was the main load, but I
haven't heard any newer figures. Jimbo?
I'll reset the slow-query log and make a new version available after a few
hours of data collection.
...
We used to cache rendered articles, but Jimbo disabled this feature some
time ago, claiming he was unable to find a performance advantage. (See
mailing list archives circa February 13.)
But, I'm willing to try it again.
...
Personally, I've always found that idea suspicious; caching is definitely
faster on my test machine, and is going to be a particularly big help
with, say, long pages full of HTML tables! But then, my test machine has
a much much lower load to deal with than the real Wikipedia. :)
Nonetheless, if caching really isn't helping, that's because it's not
doing something right. It should be found, fixed, and reenabled.
I would say that I agree with that.
Here's a question for everyone.
Let's say we have some portion of the page pre-calculated and cached.
Is it faster to keep that cached text *in the database*, or *on the
hard drive*?
I'm very strongly biased towards thinking that keeping it on the hard
drive is faster, and by a significant margin, but only because I've
never tested it and because I know (from long experience at Bomis) that
opening up a text file on disk and spitting it out can be *really* fast,
if the machine has enough ram such that the filesystem can cache lots of
popular files in memory.
But, everything I read about MySQL talks about how screamingly fast it
allegedly is, so...
--Jimbo
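
The file-versus-database question above is easy to measure rather than guess. What follows is only a rough sketch of such a measurement, not anything taken from the Wikipedia code. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the numbers say nothing definite about MySQL itself. The table name, the page title "Sample_page", and all function names are made up for illustration. Because the same file is read repeatedly, the file reads will mostly come from the filesystem cache, which is the situation described above.

```python
import os
import sqlite3
import tempfile
import timeit


def make_file_cache(directory, title, html):
    """Write pre-rendered text to a file and return its path."""
    path = os.path.join(directory, title + ".html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return path


def read_from_file(path):
    """Read the cached text back from disk (or the OS file cache)."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def make_db_cache(db_path, title, html):
    """Store pre-rendered text in a SQLite table; return the open connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (title TEXT PRIMARY KEY, html TEXT)"
    )
    conn.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (title, html))
    conn.commit()
    return conn


def read_from_db(conn, title):
    """Look up the cached text by title; None if it is not cached."""
    row = conn.execute(
        "SELECT html FROM cache WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None


def compare(n=1000):
    """Time n reads of the same cached page from a file and from SQLite."""
    # A long page full of table markup, the case mentioned in the thread.
    html = "<table><tr><td>cell</td></tr></table>\n" * 500
    with tempfile.TemporaryDirectory() as tmp:
        path = make_file_cache(tmp, "Sample_page", html)
        conn = make_db_cache(os.path.join(tmp, "cache.db"), "Sample_page", html)
        try:
            file_s = timeit.timeit(lambda: read_from_file(path), number=n)
            db_s = timeit.timeit(
                lambda: read_from_db(conn, "Sample_page"), number=n
            )
        finally:
            conn.close()
    return file_s, db_s


if __name__ == "__main__":
    file_s, db_s = compare()
    print(f"file: {file_s:.4f}s  database: {db_s:.4f}s")
```

A single-process loop like this leaves out the network round trip to a database server and the effect of concurrent load, so a real test would have to run against the production setup.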

Re: [Wikitech-l] Wikipedia performance and caching of responses -- some questions