On Wednesday, Oct 22, 2003, at 17:20 US/Pacific, Tim Starling wrote:
Brion Vibber wrote:
I've consolidated some of the watchlist access code into
WatchedItem.php and added memcached support for it. This should
reduce db hits on logged-in page views; it had been checking the
watchlist table (twice!) per logged-in page render.
Good stuff. Watchlist queries were common features in the slow query
log.
A clarification: what I've got it going for is the 'is this page
watched?' check on every wiki page render. It's a simple enough query,
but even simple queries put load on the database and can hang if the db
doesn't want to respond. Right now it's storing each user<->page
relationship as a separate key. This probably isn't the most efficient
thing; it only wins you in db terms when you come back to a page you've
already visited. Every single _new_ page you visit still needs to be
checked against the db before the watched/not watched state can be
stored.
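The per-pair scheme described above can be sketched roughly as follows. This is only an illustration in Python, not the WatchedItem.php code: the key format, the `PairDB` class and the function name are hypothetical, and a plain dict stands in for memcached.

```python
class PairDB:
    """Hypothetical stand-in for the watchlist table; counts queries."""

    def __init__(self, rows):
        self.rows = set(rows)  # {(user_id, page_title), ...}
        self.queries = 0

    def is_watched(self, user_id, title):
        self.queries += 1
        return (user_id, title) in self.rows


def is_watched_per_pair(cache, db, user_id, title):
    # One cache key per user<->page relationship (key format is made up).
    key = f"watch:{user_id}:{title}"
    if key in cache:
        return cache[key]
    # Cache miss: every page not seen before still costs a db query.
    result = db.is_watched(user_id, title)
    cache[key] = result
    return result


pair_db = PairDB({(1, "Main_Page")})
pair_cache = {}  # dict standing in for memcached
for page in ["Main_Page", "Foo", "Bar", "Main_Page"]:
    is_watched_per_pair(pair_cache, pair_db, 1, page)

print(pair_db.queries)  # 3: only the repeat visit to Main_Page was served from cache
print(len(pair_cache))  # 3: two of the stored keys are non-matches
```

Only the return visit avoids the database, and the cache fills up with "not watched" entries for every page the user merely passes through.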
An alternate way might be to slurp the entire list of watched pages for
a user into a list and store it as one key. It then only needs to hit
the database the first time you come in and whenever you alter your
watchlist, and won't need to store a hojillion non-matches for all the
pages you visit that you're not watching. The whole watchlist can be in
the tens of kilobytes for a few really active users, but for most
people it'll be quite small, so this would probably be an efficiency
win. However it's 3am so I'll do this tomorrow. :)
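A minimal sketch of the one-key-per-user alternative, under the same assumptions as before: Python rather than the real PHP, a dict standing in for memcached, and hypothetical names (`ListDB`, `is_watched_whole_list`, `add_watch`, the `watchlist:<id>` key). Dropping the key on change is one possible way to handle watchlist edits; the message above does not say how that would be done.

```python
class ListDB:
    """Hypothetical stand-in for the watchlist table; counts queries."""

    def __init__(self, rows):
        self.rows = set(rows)  # {(user_id, page_title), ...}
        self.queries = 0

    def load_watchlist(self, user_id):
        self.queries += 1
        return frozenset(t for (u, t) in self.rows if u == user_id)

    def add(self, user_id, title):
        self.rows.add((user_id, title))


def is_watched_whole_list(cache, db, user_id, title):
    key = f"watchlist:{user_id}"  # one key per user (format is made up)
    titles = cache.get(key)
    if titles is None:
        # Only the first check after login or after a change hits the db.
        titles = db.load_watchlist(user_id)
        cache[key] = titles
    return title in titles


def add_watch(cache, db, user_id, title):
    db.add(user_id, title)
    # Assumed policy: drop the cached list so the next check reloads it.
    cache.pop(f"watchlist:{user_id}", None)


list_db = ListDB({(1, "Main_Page")})
list_cache = {}  # dict standing in for memcached
checks = [is_watched_whole_list(list_cache, list_db, 1, p)
          for p in ["Main_Page", "Foo", "Bar"]]
print(checks, list_db.queries)  # [True, False, False] 1

add_watch(list_cache, list_db, 1, "Foo")
print(is_watched_whole_list(list_cache, list_db, 1, "Foo"), list_db.queries)  # True 2
```

Non-matches cost nothing here: a page that is absent from the cached set is simply not watched, so nothing extra has to be stored for it.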
BTW as I discovered with wfMsg(), querying memcached on a local
machine takes on the order of a millisecond (on my poor old 66 MHz FSB
Celeron). So if you need to access thousands of entries in the course
of a request, it's probably better to consolidate the entries into
larger chunks.
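To put rough numbers on that, here is a small sketch that counts cache round trips for per-entry keys versus one consolidated chunk. The `CountingCache` class, the key names and the message set are made up for illustration; the 1 ms figure is the ballpark quoted above, not a measurement.

```python
class CountingCache:
    """Dict-backed stand-in for a memcached client that counts fetches."""

    def __init__(self):
        self.store = {}
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value


ROUND_TRIP_MS = 1.0  # assumed cost per fetch, per the estimate above
messages = {f"msg{i}": f"text {i}" for i in range(2000)}  # made-up entries

# One key per entry: every lookup is its own round trip.
per_key = CountingCache()
for name, text in messages.items():
    per_key.set(f"message:{name}", text)
for name in messages:
    per_key.get(f"message:{name}")

# One consolidated chunk: fetch once, then look entries up locally.
chunked = CountingCache()
chunked.set("messages:all", messages)
all_messages = chunked.get("messages:all")
looked_up = [all_messages[name] for name in messages]

print(per_key.round_trips * ROUND_TRIP_MS)  # 2000.0 ms spent on cache fetches
print(chunked.round_trips * ROUND_TRIP_MS)  # 1.0 ms spent on cache fetches
```

The sketch ignores the extra transfer and unpacking cost of a bigger value, so the real gap would be smaller than 2000 to 1.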
One note about memcached: it's suboptimal on a stock 2.4.x Linux
kernel. A 2.6 or patched 2.4 kernel with epoll notification support, or
the similar kqueue on BSD, is recommended. Until we either get a new
kernel (successfully) installed or the LiveJournal guys finish tracking
down the bugs with rtsig notification, which stock 2.4 does support, we
won't get the best possible response times on memcached.
Should still be better than a database that doesn't feel like
responding for 90 seconds while it ponders statistics, though. ;)
(Also a note: main development is currently going on in the stable
branch and is focused on speed, security, and bug fixes, not
features. A lot of fixes will need to be forward-ported to the dev
branch at some point.)
I've been working on the dev branch. When do you next want to merge
the two branches? I'm not going to port my parser optimisation to
stable, so if we want that, we'll have to do a proper merge.
Well, I've already ported some of that over. I think we should probably
do some more merging both ways; there are bug fixes in stable that
should go into dev, but not all of the new features in dev are ready
for stable yet.
-- brion vibber (brion @ pobox.com)