I stated:
>No, it shouldn't be a problem in normal circumstances.
>An HTML cache should only be changed if
>the EXISTENCE of a cached page changes. If an article is
>linked to by 1000s+ of pages, it almost certainly already exists,
>so no cache (other than the one edited article) would be invalidated.
Neil Harris said:
>By "hitting" in this case, I meant "deleting or creating".
>Sorry for the ambiguity.
>Take a look at something like [[census]], which is linked by about
>36,000 articles.
It sounds like we're in agreement. [[census]] already
exists, and LOTS of articles link to it. If someone completely
DELETED [[census]] (and didn't just edit it), then
that would cause a lot of caches to be affected. But that should
be a warning signal - perhaps [[census]] should be edited, but
it should probably still exist as an article (and NOT deleted).
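
To make the rule above concrete, here is a minimal sketch in Python. It is not Wikipedia's actual code: the LinkCache class, its method names, and the in-memory dictionaries are all assumptions made for illustration. It shows that editing an existing article invalidates only that article's cached page, while creating or deleting an article also invalidates every page that links to it.

```python
from collections import defaultdict


class LinkCache:
    """Toy model of cached HTML pages keyed by article title (hypothetical)."""

    def __init__(self):
        self.articles = {}                  # title -> wikitext
        self.backlinks = defaultdict(set)   # target title -> titles linking to it

    def add_link(self, source, target):
        """Record that the page `source` contains a link to `target`."""
        self.backlinks[target].add(source)

    def save(self, title, text):
        """Create or edit an article; return the titles whose cache is stale."""
        existed = title in self.articles
        self.articles[title] = text
        invalidated = {title}               # the saved page itself
        if not existed:
            # Existence changed: links to it flip from "missing" to "present".
            invalidated |= self.backlinks[title]
        return invalidated

    def delete(self, title):
        """Delete an article; every page linking to it must be re-rendered."""
        if title not in self.articles:
            return set()
        del self.articles[title]
        return {title} | self.backlinks[title]
```

With 36,000 pages linking to [[census]], save() on the existing article would return a single title, while delete() would return all 36,001.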
>... but there are lots of
>data integrity and race condition / transaction issues to be thought
>about before any of this can be implemented.
Yes, anytime there are multiple front-ends, there is
potential for race conditions. I heartily agree.
>Let's finish splitting the
>system into two machines, DB and WWW, before any re-architecture is
>performed.
Fair enough.
However, if all wikitext is removed from MySQL
and placed in the filesystem, splitting out the MySQL database onto
a separate machine may not buy much. Most of the work would then be
done by the filesystem, with only housekeeping metainformation
being accessed through MySQL, and only when editing or accessing
special pages.
Unfortunately, we can't really separate processor architecture from
software architecture - the goal is to maximize performance while
minimizing hardware & development-time cost. Hopefully, this
kind of free-flowing discussion of alternatives will yield
that perfect combination - or at least a good one.
If performance is still a problem (some time) after the second
computer is installed, there's at least one other approach that
I haven't seen emphasized here: throw RAM at the problem.
If one of the systems shows any significant paging, installing
lots of memory (so the whole DB is in memory)
is likely to make things faster.
Even relatively low-end machines have really impressive database
stats when EVERYTHING runs out of main memory :-).
Hopefully, the second machine will make all this unnecessary.
I just thought I'd post a few ideas as I think of them.
----- Forwarded message from Jason Richey <jasonr(a)bomis.com> -----
From: Jason Richey <jasonr(a)bomis.com>
Date: Mon, 5 May 2003 10:31:27 -0700
To: Jimmy Wales <jwales(a)bomis.com>
Subject: Re: [msochuck(a)yahoo.com: please fix vikipedio.com redirect (estis: Helpon ! la vikipedio malaperis.)]
this doesn't make sense. These domains already point to the new
server, which is already configured to answer for that hostname...
Wait, there is a problem... There must be some freaky rewriteRule or
something that is "Stealing" the opportunity to serve this page. I've
been looking at the httpd.conf for a while, and I'm not sure what the
deal is (and I'm a little nervous about restarting the apache on this
bogged-down server).
Still looking...
Jimmy Wales wrote:
> ----- Forwarded message from Chuck Smith <msochuck(a)yahoo.com> -----
>
> From: Chuck Smith <msochuck(a)yahoo.com>
> Date: Sun, 4 May 2003 21:29:27 +0200 (CEST)
> To: jwales(a)bomis.com
> Subject: please fix vikipedio.com redirect (estis: Helpon ! la vikipedio malaperis.)
>
> Jimbo,
>
> Please make www.vikipedio.com a redirect to
> eo.wikipedia.org. Thanks!
>
> Bonvolu farigi www.vikipedio.com plusendilon al
> eo.wikipedio.org. Dankon!
>
> Chuck
>
> --- Paul Ebermann <Paul-Ebermann(a)gmx.de> wrote:
> > From: "Paul Ebermann" <Paul-Ebermann(a)gmx.de>
> > To: <wikieo-l(a)wikipedia.org>
> > Subject: Re: [WikiEO-l] Helpon ! la vikipedio malaperis.
> > Date: Sun, 4 May 2003 19:37:23 +0200
> >
> > "Arno Lagrange" wrote:
> >
> > > What is happening? The Esperanto wiki has completely disappeared.
> > > When I try to go to eo.vikipedio.org or eo.vikipedio.com, I only
> > > reach either the English-language one or some commercial page
> > > named wikipedia.com. What is this mess?
> > > It seems the other languages are still working, though.
> >
> > http://eo.wikipedia.org/, http://www.vikipedio.org/,
> > http://www.vikipedio.com/, http://eo.vikipedio.com/
> >
> > lead (for me) to the Esperanto Wikipedia.
> >
> > http://www.vikipedio.com/
> >
> > shows the English-language one.
> >
> >
> > Pauxlo
> >
> >
> > _______________________________________________
> > WikiEO-l mailing list
> > WikiEO-l(a)wikipedia.org
> > http://www.wikipedia.org/mailman/listinfo/wikieo-l
>
> =====
> Learn Esperanto! - http://www.lernu.net/
> My homepage - http://www.ikso.net/~chuck
> Enciklopedio - http://eo.wikipedia.org/
>
> ----- End forwarded message -----
--
"Jason C. Richey" <jasonr(a)bomis.com>
----- End forwarded message -----
Brion Vibber <vibber(a)aludra.usc.edu> posted some nice stats,
showing that the CPU usage is primarily in the Apache daemons.
So, a profiler/hotspot/gprof-like tool for PHP would be useful.
I found "APD" (as well as APC, a PHP compilation caching tool) at:
http://www.omniti.com/~george/php/
I didn't find any other PHP profiling tools, though perhaps they're
out there.
Of course, an alternative could be to run gprof itself against
the Apache daemon (including PHP). Perhaps there's a hotspot
in the underlying Apache/PHP implementation; fixing that would
help Wikipedia and probably many others too.
Anybody have a test platform willing to run the profile?
Again, I don't have such a setup :-(.
I love messages like this. I always feel bad for about 10 seconds,
and as a good honest businessman, my first instinct is to refund their
money. An unhappy customer deserves to get their money back, after
all!
And then I remember...
----- Forwarded message from SaRiNnA220(a)aol.com -----
From: SaRiNnA220(a)aol.com
Date: Sun, 4 May 2003 15:45:28 EDT
To: wikidown(a)bomis.com
Subject: !!!!!
Well, yea, I don't mean to be rude but I'm kind of trying to do a project
here and your site just kind of SHUT DOWN on me. This is a little ridiculous,
you can put it back up now, you know, any time would be nice.
~ A Very Perturbed User
----- End forwarded message -----
> Jens Frank wrote:
> On Fri, May 02, 2003 at 01:15:21PM -0700, David A. Wheeler wrote:
> > Brion Vibber <vibber(a)aludra.usc.edu> posted some nice stats,
> > showing that the CPU usage is primarily in the Apache daemons.
>
> Yes, and the vmstat showed that the CPU is even idle while disk
> I/O is rather high. It's not mysql doing nothing. It's mysql
> waiting for the disk. The new server, how many disks will it
> have?
Breakthrough! Improving disk throughput for the database would probably do
wonders for Wikipedia performance.
Is it possible for you, hardware-wise, to leave Apache and PHP on the
current server and move the database to a system with a higher-performance
controller and multiple disks? If the database will remain on the
current server, a controller upgrade may be in order.
Hi listers,
In the last ten minutes I have received the following error message several times:
Warning: mysql_pconnect() [function.mysql-pconnect]: Access denied for user: 'wikiuser(a)localhost.localdomain' (Using password: YES) in /usr/local/apache/htdocs/w/DatabaseFunctions.php on line 28
Could not connect to DB on 127.0.0.1
Access denied for user: 'wikiuser(a)localhost.localdomain' (Using password: YES)
If this error persists after reloading and clearing your browser cache, please notify the Wikipedia developers.
Just to be sure you are aware of this.
--
Luc Van Oostenryck aka User:Looxix
If the PHP function mysql_pconnect cannot connect to the database, it
produces a warning, divulging MySQL username and password to the world.
This should be switched off if possible. Right now, the MySQL password
for wikiuser needs to be changed immediately, because the world knows
:-)
Axel
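
One way to keep connection details out of the page is to catch the failure, write the full error to a server-side log, and show visitors only a generic message. The sketch below illustrates that pattern in Python rather than PHP. The open_database function, the GENERIC_MESSAGE text, and the "wiki.db" logger name are assumptions made for this example and are not part of the Wikipedia code. In PHP itself, the display_errors and log_errors settings in php.ini control whether such warnings are sent to the browser or to a log.

```python
import logging

logger = logging.getLogger("wiki.db")  # hypothetical logger name

GENERIC_MESSAGE = "Sorry, the database is temporarily unavailable."


def open_database(connect, **params):
    """Call connect(**params); on failure, log details and hide them from users.

    `connect` is any callable that returns a connection or raises.
    Returns (connection, None) on success or (None, user_message) on failure.
    """
    try:
        return connect(**params), None
    except Exception as exc:
        # Full detail (user, host, driver text) goes to the server log only.
        logger.error("DB connection failed: %s", exc)
        return None, GENERIC_MESSAGE
```

The caller renders only the second element of the returned pair, so the username and host never reach the browser.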
A long time ago someone suggested that links to empty (but not deleted)
articles should be shown as if the article didn't exist (strictly
speaking: red). If I remember correctly, this would have taken too much time.
But now that the software and database are redesigned I wanted to remind
you about the idea. A real deletion of an article often isn't required,
and the benefits would be:
* less work for sysops
* less cause for discussion about the deletion of pages
* less cause for mistrust against sysops who delete pages
I can't see any disadvantages (besides a little more crap in some
articles' histories); maybe someone else can.
(In case you implement it: a little reminder that an article is empty at
the moment, but has some information in its history, would be nice.)
Kurt
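
The suggestion above could be sketched roughly as follows. This is a Python illustration only, not the wiki software's PHP. The function names, the "new" and "existing" class names, and the plain dictionaries standing in for the database are all assumptions. A link to a blanked article gets the same styling as a link to a missing one, and a blanked page with earlier revisions gets a short reminder.

```python
def link_class(title, articles):
    """Return the CSS class for a wiki link (class names are hypothetical).

    `articles` maps title -> current wikitext; a missing key means the
    article was never created or has been deleted.
    """
    text = articles.get(title)
    if text is None or not text.strip():
        return "new"        # rendered red: missing or blanked
    return "existing"


def empty_notice(title, articles, revision_counts):
    """Return a reminder for a blanked page that still has earlier revisions."""
    text = articles.get(title)
    blanked = text is not None and not text.strip()
    if blanked and revision_counts.get(title, 0) > 1:
        return ("This article is currently empty, "
                "but its history contains earlier versions.")
    return None
```

Because the check reads the current text as well as the existence of the title, it costs one extra lookup per link, which is probably the overhead that ruled the idea out earlier.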
This message comes from the new server... It should now be accessible at 130.94.122.199 to all those people who have accounts on the old server. It is a basic Red Hat 9.0 install; I didn't do anything with the kernel or the like. I didn't install MySQL, PHP, or Apache, as I assume that we have special needs in these programs.
The machine is a P3 866 with 1 gig of RAM (2 512meg chips, 2 empty slots). The board is an SMP board, but only has one processor at the moment. The board has a BIOS option for console redirection to a serial port. I have turned that option on and connected the first serial ports of the two Wikipedia machines together, for anybody that cares.
That's that... I'm going to bed.
Jason (From the new server)