On Wed, Aug 1, 2018 at 3:07 PM Yuan Gao <gaoyuan@google.com> wrote:
Hi Tilman,
Our team at Google, which works on extracting knowledge from Wikipedia, has just compared our crawled data with https://meta.wikimedia.org/wiki/List_of_Wikipedias/Table. For the following sites, we see quite significant diffs:

The Special:Statistics page for bo.wikipedia provides the following counts as of today:

Content pages 5,818
Pages (All pages in the wiki, including talk pages, redirects, etc.) 16,498

A content page (article), according to the software documentation, is defined as follows: "The automatic definition used by the software at Special:Statistics is: any page that is in the article namespace, is not a redirect page and contains at least one wiki link." (https://en.wikipedia.org/wiki/Wikipedia:What_is_an_article%3F#Lists_of_articles_and_statistics) Could your definition be broader than the MediaWiki one?
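
To make the quoted rule concrete, here is a rough Python sketch of how it could be applied to crawled data. The Page record and its field names are hypothetical stand-ins for whatever your crawler stores, and the "[[" check is only a crude approximation of how the software detects a wiki link.

```python
from dataclasses import dataclass
from typing import Iterable

ARTICLE_NAMESPACE = 0  # the main (article) namespace in MediaWiki


@dataclass
class Page:
    # Hypothetical record for one crawled page; field names are assumptions.
    title: str
    namespace: int
    is_redirect: bool
    wikitext: str


def is_content_page(page: Page) -> bool:
    """Approximate the Special:Statistics rule quoted above."""
    return (
        page.namespace == ARTICLE_NAMESPACE
        and not page.is_redirect
        and "[[" in page.wikitext  # crude test for "at least one wiki link"
    )


def count_content_pages(pages: Iterable[Page]) -> int:
    # Count only the pages that satisfy all three conditions.
    return sum(1 for p in pages if is_content_page(p))
```

If a count computed this way over your crawl comes out much closer to the 5,818 figure than to your current number, the gap is probably one of definition rather than of missing or extra data.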
Another thing I would suggest checking is whether Google may be including duplicate results.
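
A quick way to check for duplicates might look like the Python sketch below. It assumes you have a flat list of crawled page titles, and that titles differing only in underscores versus spaces, or in the case of the first letter, refer to the same page, as is the case on most Wikipedias. The function names are illustrative only.

```python
from collections import Counter
from typing import Dict, Iterable


def normalize_title(title: str) -> str:
    # Assumption: underscores and spaces are equivalent, and the first
    # letter is case-insensitive (true on most Wikipedias).
    cleaned = title.replace("_", " ").strip()
    return cleaned[:1].upper() + cleaned[1:]


def find_duplicates(titles: Iterable[str]) -> Dict[str, int]:
    """Return each normalized title that occurs more than once, with its count."""
    counts = Counter(normalize_title(t) for t in titles)
    return {title: n for title, n in counts.items() if n > 1}
```

For example, find_duplicates(["Lhasa", "lhasa", "Yak"]) would report "Lhasa" as appearing twice.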

There could be some amount of caching in both the statistics calculation and the rendering of those pages, although probably not enough to double the number of articles.

--
Jaime Crespo
<http://wikimedia.org>