Thank you very much, John!
Yes, that was the case! The number matches now.
Thanks!
Yuki
On 20 Aug 2020, at 16:42, John <phoenixoverride(a)gmail.com> wrote:
Are you limiting your count to namespace 0?
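
For reference, a count restricted to namespace 0 and excluding redirects could be sketched roughly as below. This is only a minimal sketch, not the method either poster used: the helper names `local_name` and `count_articles` are made up for illustration, and it assumes the dump follows the usual MediaWiki export layout, where each <page> element has an <ns> child and redirect pages carry a <redirect> child.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace-uri}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def count_articles(stream):
    """Count <page> elements that have <ns>0</ns> and no <redirect> child.

    `stream` is a binary file-like object containing the dump XML.
    """
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    count = 0
    for event, elem in context:
        if event != "end" or local_name(elem.tag) != "page":
            continue
        ns = None
        is_redirect = False
        for child in elem:
            name = local_name(child.tag)
            if name == "ns":
                ns = (child.text or "").strip()
            elif name == "redirect":
                is_redirect = True
        if ns == "0" and not is_redirect:
            count += 1
        # Drop already-processed pages so memory stays bounded on large dumps.
        root.clear()
    return count


# Hypothetical usage on a compressed dump (the file name is a placeholder):
#   import bz2
#   with bz2.open("dump.xml.bz2", "rb") as f:
#       print(count_articles(f))
```

Streaming with `iterparse` avoids loading the whole dump into memory, which matters because the uncompressed file is very large.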
On Thu, Aug 20, 2020 at 10:45 AM Yuki Kumagai <yuki.kumagai(a)cognitionx.io>
wrote:
Hiya
I have a question about the Wikipedia XML database dump. Apologies if this
isn't the appropriate place to ask.
On a Wikipedia page, the current number of articles in English is given
as 6,144,248:
https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
However, when I count the number of page elements in a recent dump
(excluding redirects), I get about 10 million.
I was wondering what the reason for this difference might be.
Thank you in advance
--
*Yuki Kumagai*
Senior Engineer
CognitionX <https://cognitionx.com/>
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l