This went off-list for some reason...
---------- Forwarded message ----------
From: Thomas Dalton <thomas.dalton@gmail.com>
Date: 29 June 2010 01:18
Subject: Re: [Xmldatadumps-l] Number of pages on Wikipedia
To: "Chrisil J. Arackaparambil" <chrisil@lanl.gov>
On 29 June 2010 01:06, Chrisil J. Arackaparambil <chrisil@lanl.gov> wrote:
> Hello everybody,
>
> I was doing a bit of analysis of the dump enwiki-20100130-pages-meta-history.xml.7z. What I found to my surprise is that there are (at least) 7 million pages in the main namespace. I got this figure by grepping for page titles that do not contain a ":" character. Is this really the case or am I missing something? I'd seen some Wikimedia stats that said the number of articles currently is about 3.2 million, so I'm not sure why I'm seeing so many pages in the dump.
The 3.2 million figure does not include redirects.
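
To see how much of the gap redirects account for, something along these lines might help. It is only a rough sketch, not a tested tool for the real dump: the function names are made up for illustration, it assumes the last <text> element inside each <page> is the current revision, and it treats any page whose current text starts with "#REDIRECT" as a redirect. Like the grep, it also skips main-namespace titles that happen to contain a ":", so the page count is a lower bound.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_main_namespace(stream):
    """Return (pages, redirects) for pages whose title has no ':'.

    Assumption: the last <text> seen inside a <page> is the current
    revision, and redirects start with '#REDIRECT' (any letter case).
    """
    pages = redirects = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = ""
        last_text = ""
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                last_text = child.text or ""  # keeps the last revision seen
        if ":" not in title:
            pages += 1
            if last_text.lstrip().upper().startswith("#REDIRECT"):
                redirects += 1
        elem.clear()  # drop the page's children to limit memory use
    return pages, redirects


# Tiny made-up sample standing in for the decompressed dump.
SAMPLE = """<mediawiki>
  <page><title>Foo</title><revision><text>Some article text</text></revision></page>
  <page><title>Bar</title><revision><text>#REDIRECT [[Foo]]</text></revision></page>
  <page><title>Talk:Foo</title><revision><text>Discussion</text></revision></page>
</mediawiki>"""

pages, redirects = count_main_namespace(io.BytesIO(SAMPLE.encode("utf-8")))
print(pages, redirects, pages - redirects)
```

For the real dump you would pass the decompressed XML stream instead of SAMPLE; pages minus redirects is then the number to compare against the 3.2 million figure.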