Thank you very much, John!

Yes, that was the issue! The number matches now.
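
For anyone hitting the same mismatch, here is a minimal sketch of such a count. It assumes the standard MediaWiki export layout, where each <page> element has an <ns> child and redirects carry a <redirect> child. The function names and the file name in the usage note are illustrative only, not what was actually run for this thread.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace-uri}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def count_pages(stream):
    """Return (non-redirect pages in any namespace, non-redirect pages in namespace 0)."""
    total = 0
    main_ns = 0
    # iterparse streams the file, so the whole dump is never parsed in one go.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        children = {local_name(child.tag): child for child in elem}
        if "redirect" not in children:
            total += 1
            ns = children.get("ns")
            if ns is not None and (ns.text or "").strip() == "0":
                main_ns += 1
        elem.clear()  # drop this page's children once it has been counted
    return total, main_ns


def count_pages_in_dump(path):
    """Convenience wrapper for a bz2-compressed dump file (path is an assumption)."""
    with bz2.open(path, "rb") as stream:
        return count_pages(stream)
```

Calling count_pages_in_dump("enwiki-latest-pages-articles.xml.bz2") (substitute whichever dump file you downloaded) returns both counts, so the gap between "all pages" and "namespace 0 only" is visible in one pass.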

Thanks!
Yuki


On 20 Aug 2020, at 16:42, John <phoenixoverride@gmail.com> wrote:

Are you limiting your count to namespace 0?

On Thu, Aug 20, 2020 at 10:45 AM Yuki Kumagai <yuki.kumagai@cognitionx.io> wrote:
Hiya

I have a question about the Wikipedia XML database dump. Apologies if this isn't the appropriate place to ask.
A Wikipedia page states that the current number of articles in English is 6,144,248:
https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia

However, when I count the number of page elements in a recent dump (excluding redirects), I get about 10 million.
I was just wondering what the reason for the difference would be?

Thank you in advance 

--
Yuki Kumagai
Senior Engineer
CognitionX 

_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l