Jimbo,
Hate to bother you when you were up so late last night, but...
I think www.wikipedia.org is down. I haven't gotten a single web page
(since 9:00 A.M. when I first checked it). Nor any e-mail (time-stamped
after 5:00 A.M.).
Ed Poor
Magnus,
Thank you for your work on the offline parser. Whether it's in C or Java
doesn't matter too much to me, since I'd rather get involved in coding.
But I hope I can get a version that will run under Windoze (until I get
Linux running on my personal box).
Hello,
Attached is a patch against the current phase3 branch.
It adds a new column to the cur table, cur_text_length,
that stores the length of the cur_text column.
It is used by Special:Shortpages, Special:Longpages,
Special:Newpages and stub detection.
To add it to an existing wikipedia, you have to
execute these SQL statements.
ALTER TABLE cur
ADD COLUMN (cur_text_length int(8) unsigned NOT NULL default 0);
CREATE INDEX namespace_redirect_length
ON cur(cur_namespace,cur_is_redirect,cur_text_length);
UPDATE cur
SET cur_text_length = length(cur_text);
Beware! This takes some time for a big DB like the English one.
The order of the index columns is important; don't change it.
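For anyone who wants to try the idea without touching a live database, here is a minimal sketch in Python using an in-memory SQLite table. It is not the patch itself: the toy table layout, the cur_title column, and the short_pages query are assumptions made for illustration, and SQLite's ALTER TABLE syntax and length() differ slightly from MySQL's.

```python
import sqlite3


def build_demo_db():
    """Create a toy 'cur' table (hypothetical layout, not the real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cur ("
        " cur_namespace INTEGER NOT NULL,"
        " cur_title TEXT NOT NULL,"
        " cur_is_redirect INTEGER NOT NULL,"
        " cur_text TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO cur VALUES (?, ?, ?, ?)",
        [
            (0, "Long_article", 0, "x" * 500),
            (0, "Stub", 0, "x" * 20),
            (0, "Redirect", 1, "#REDIRECT [[Stub]]"),
            (1, "Talk_page", 0, "x" * 5),
        ],
    )
    return conn


def apply_patch(conn):
    """Same three steps as the SQL above, in SQLite syntax."""
    conn.execute(
        "ALTER TABLE cur ADD COLUMN cur_text_length INTEGER NOT NULL DEFAULT 0"
    )
    conn.execute(
        "CREATE INDEX namespace_redirect_length"
        " ON cur(cur_namespace, cur_is_redirect, cur_text_length)"
    )
    # Backfill the new column; on a big table this is the slow step.
    conn.execute("UPDATE cur SET cur_text_length = length(cur_text)")


def short_pages(conn, limit=10):
    """Assumed shape of a Shortpages-style query: the equality filters match
    the leading index columns, and the sort uses the trailing one."""
    rows = conn.execute(
        "SELECT cur_title, cur_text_length FROM cur"
        " WHERE cur_namespace = 0 AND cur_is_redirect = 0"
        " ORDER BY cur_text_length ASC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    apply_patch(db)
    print(short_pages(db))
```

The query filters on the first two index columns and sorts on the third, which is one reason the column order in the index would matter for a query of this shape.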
Regards,
JeLuF
Cross posted to Intlwiki-l
I haven't seen any objections about the page names, so could somebody export
the output of
http://meta.wikipedia.org/wiki/Mobilisation_de_fonds
and
http://meta.wikipedia.org/wiki/Geldbeschaffung
to, respectively
http://wikimediafoundation.org/Mobilisation_de_fonds
and
http://wikimediafoundation.org/Geldbeschaffung
Very important note: These pages, for now, will NOT be protected, so please do
not link them from anywhere important or seemingly official (such as the
sidebar or current wiki donation pages). I expect that these pages will go
through some more basic development before they are protected and go live
(for example, some explanation needs to be added that at least in France and
Belgium new PayPal users need two credit cards and some type of access
number).
-- Daniel Mayer (aka mav)
The undeletion system stopped working perhaps a
week ago on the French wiki.
I suppose it bothers nobody but me, since I am
apparently the only one to have noticed :-)
Could it be fixed, please? Thanks a lot.
Hello -
I am downloading en.wikipedia and trying to get both the cur and old
tables on my system.
I was able to do an import into a 3.* version of MySQL, but I do not
have administrative control of that machine, so I could not get the table
space set up to be big enough.
I have total control over this machine. It runs Mac OS X 10.3 and
MySQL 4.1.0-alpha-debug. The SQL files that I was able to download give
me a syntax error before any data makes it into the tables. I am
re-downloading to see if that helps.
Does anyone have any suggestions? Are there known issues with importing
into 4.1.* MySQL?
I see that there were, last January, discussions about being able to be
a replication client of the databases, but I do not see any follow-up on
these discussions, so it is not clear that one can become a replication
client at this point.
Also, is there a history of the downloadable files? In other words, if
I look at today's file, I can see that the md5 digest for
http://download.wikipedia.org/archives/en/20031010_old_table.sql.bz2 is
6178dc8bbb25c9788b04cd5cb692a70b, but given a copy of
20030922_old_table.sql.bz2, I cannot seem to tell what its md5 digest
was supposed to be. It would be helpful to be able to look at a list of
the data files that were put up in the past and see what their digest
was.
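For reference, checking a downloaded dump against a published digest could look like the sketch below, which uses Python's standard hashlib module. The function names and the chunk size are my own choices for illustration; nothing here is part of the download site.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks so a
    multi-gigabyte dump does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a local file against a published digest string."""
    return md5_of_file(path) == published_hex.strip().lower()
```

With a list of past digests, one would call matches_published("20030922_old_table.sql.bz2", digest_from_list) for each archived file.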
thanx - ray
Andre Engels wrote:
> On Mon, 22 Sep 2003, Tim Starling wrote:
> > logged-in users, that way most edits go to a web server which is close
> > to the master DB.
>
> Would it not be better to keep the mirrors read-only, and have them redirect
> to the master for write-access? To have writing in several places causes
> significant overhead in avoiding edit conflicts and such.
There is a lot of hypothesis and discussion here. Have you considered
that there are some 40-60 page views for every single edit? What about
using some real statistics instead of guessing? (Just my hypothesis.)
Plus the tech discussion should be on wikitech-l, not wikipedia-l.
http://www.wikipedia.org/wiki/Special:Statistics reports 40 views per
edit, as an average since July 2002. More recently, the English
Wikipedia has received:
Month             Edits per month   Page views   views/edit
---------------   ---------------   ----------   ----------
July, 2003                   212K         9.9M           46
Aug, 2003                    248K        13.0M           52
Sept 1-24, 2003              227K        13.9M           61
As a comparison, the fast-responding susning.nu wiki features 100
page views per edit. A faster Wikipedia would receive more page
views, probably 30M per month. The number of edits per month would
also increase, but perhaps not as much.
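The views/edit column can be rechecked from the rounded figures in the table with a few lines of Python. This is only a sketch: the inputs are the rounded K and M values quoted above, so a result may differ by one from the table, which was presumably computed from exact counts.

```python
# (label, edits per month, page views), taken from the rounded table values.
MONTHS = [
    ("July, 2003", 212_000, 9_900_000),
    ("Aug, 2003", 248_000, 13_000_000),
    ("Sept 1-24, 2003", 227_000, 13_900_000),
]


def views_per_edit(edits, views):
    """Average number of page views for every edit in the period."""
    return views / edits


for label, edits, views in MONTHS:
    print(f"{label:<16} {views_per_edit(edits, views):5.1f}")
```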
The Wikipedia statistics are spread out over too many places, and none
of these pages are wiki-editable, so I cannot add cross-reference
links.
- Webalizer graphs, http://www.wikipedia.org/stats/
- Article count, http://www.wikipedia.org/wiki/Special:Statistics
- Erik Zachte's edit count, http://www.wikipedia.org/wikistats/
older version at http://members.chello.nl/epzachte/Wikipedia/Statistics/EN/Sitemap.htm
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se/
I wrote a little Perl script to monitor the health of Wikipedia by
regularly checking the file size of the main log. The idea was that it
would email me if Apache crashed. I soon discovered something rather
interesting and unexpected.
Once every 15 minutes or so, Wikipedia shuts down for a minute or two.
This is bad, needless to say. During these episodes, it shows few if any
log entries. Load is depressed slightly (from ~20 to ~10), so CPU
starvation didn't seem likely. I suspected database troubles.
Attached is a zip file containing the output from SHOW PROCESSLIST.
There is one file showing "typical" output, between the episodes, and
three showing output in the middle of an episode. One thing stands out
like a sore thumb:
SELECT 1 FROM user_newtalk WHERE user_ip='167.206.112.85'
SELECT 1 FROM user_newtalk WHERE user_ip='217.244.15.107'
SELECT 1 FROM user_newtalk WHERE user_ip='80.55.166.58'
SELECT 1 FROM user_newtalk WHERE user_ip='66.196.90.11'
SELECT 1 FROM user_newtalk WHERE user_ip='172.176.254.188'
SELECT 1 FROM user_newtalk WHERE user_ip='220.54.212.24'
SELECT 1 FROM user_newtalk WHERE user_id=7580
SELECT 1 FROM user_newtalk WHERE user_ip='194.78.48.226'
SELECT 1 FROM user_newtalk WHERE user_ip='80.225.14.29'
SELECT 1 FROM user_newtalk WHERE user_ip='144.32.128.73'
SELECT 1 FROM user_newtalk WHERE user_ip='62.216.15.127'
SELECT 1 FROM user_newtalk WHERE user_ip='134.151.225.179'
SELECT 1 FROM user_newtalk WHERE user_ip='213.81.145.123'
SELECT 1 FROM user_newtalk WHERE user_ip='129.137.208.159'
SELECT 1 FROM user_newtalk WHERE user_ip='195.93.72.17'
SELECT 1 FROM user_newtalk WHERE user_ip='218.19.141.2'
SELECT 1 FROM user_newtalk WHERE user_ip='68.11.187.242'
SELECT 1 FROM user_newtalk WHERE user_ip='63.34.208.93'
SELECT 1 FROM user_newtalk WHERE user_ip='81.196.21.16'
SELECT 1 FROM user_newtalk WHERE user_ip='202.156.2.82'
Millions of user_newtalk requests, conspicuously absent from the typical
dump. Many of them are doing "statistics", whatever that means, and the
rest are locked. One side of me is curious as to what "statistics" means
and why it produces this behaviour. The other side of me says
KILL KILL KILL
I DON'T CARE WHY OR HOW IT'S HAPPENING JUST KILL IT NOW!!!!!!
ARRRRGGHH (axe hitting hard drive noises)
Anyone up for some memcached programming?
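To make the caching idea concrete, here is a minimal sketch in Python. It uses an in-process dictionary with a time-to-live as a stand-in for memcached, so no memcached client API is assumed; the class name, the query_db callback, and the 60-second TTL are all hypothetical.

```python
import time


class NewTalkCache:
    """Cache 'does this user or IP have new talk messages?' lookups so that
    most page views never run SELECT 1 FROM user_newtalk at all."""

    def __init__(self, query_db, ttl_secs=60, clock=time.monotonic):
        self._query_db = query_db  # hypothetical callback that hits the DB
        self._ttl = ttl_secs
        self._clock = clock
        self._entries = {}         # key -> (expires_at, has_new_talk)

    def has_new_talk(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # fresh cache entry, no DB query
        value = bool(self._query_db(key))
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call when a talk page is edited or read, so the flag stays correct."""
        self._entries.pop(key, None)
```

The trade-off is staleness: without the invalidate call, a new-messages notice could be up to one TTL late.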
Also attached is some sample output from the monitor program. Each entry
represents 5 seconds, and is expressed in units of the "danger
threshold" of 100 bytes per second.
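The monitor itself is a Perl script and is not reproduced here; the following is only a rough Python sketch of the same idea, built from the description above (sample the log size every 5 seconds, express growth in units of a 100 bytes-per-second threshold). The function names and the injectable sleep parameter are assumptions.

```python
import os
import time

DANGER_BYTES_PER_SEC = 100  # the "danger threshold" named above
INTERVAL_SECS = 5           # each entry covers 5 seconds


def growth_in_threshold_units(prev_size, cur_size,
                              interval=INTERVAL_SECS,
                              threshold=DANGER_BYTES_PER_SEC):
    """Log growth rate over one interval, in units of the danger threshold.
    Values below 1.0 mean the log is growing slower than the threshold."""
    rate = (cur_size - prev_size) / interval  # bytes per second
    return rate / threshold


def monitor(path, samples, sleep=time.sleep):
    """Take `samples` readings of the log file at `path`."""
    readings = []
    prev = os.path.getsize(path)
    for _ in range(samples):
        sleep(INTERVAL_SECS)
        cur = os.path.getsize(path)
        readings.append(growth_in_threshold_units(prev, cur))
        prev = cur
    return readings
```

A log rotation would show up as a negative reading in this sketch, so a real monitor would need to treat that case separately.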
BTW this appears to be unrelated to whatever the problem is with it as I
type, i.e. a long-term (~1hr) slowdown with load at 0.25 and log file
turnover at ~20% of normal.
-- Tim Starling.
For the moment I've disabled read-write SQL access for developer
accounts on the wiki. Our active developers should already have login
access to the server where they can do read-write SQL when needed, more
securely.
Anyone marked "developer" on the wiki can lock or unlock the wiki into
read-only mode, and look at our PHP configuration. And, that's about it.
-- brion vibber (brion @ pobox.com)