On 07.09.2015 at 18:05, Emilio J. Rodríguez-Posada wrote:
> Wow, that is a big difference. Almost 4 million.
> I think that MediaWiki doesn't count pages without any [[link]]. Is that the reason?
No, that only applies to Wikitext.
Here is the relevant code from ItemContent:
    public function isCountable( $hasLinks = null ) {
        return !$this->isRedirect() && !$this->getItem()->isEmpty();
    }
And the relevant code from Item:
    public function isEmpty() {
        return $this->fingerprint->isEmpty()
            && $this->statements->isEmpty()
            && $this->siteLinks->isEmpty();
    }
So all pages that are not empty (have labels or descriptions or aliases or statements or sitelinks), and are not redirects, should be counted.
Is it possible that the difference of 3,694,285 is mainly redirects? Which dump were you referring to, Markus? The XML dump contains redirects, and so does the RDF dump. The JSON dump doesn't... so if you were referring to the JSON dump, that would imply we have 3.7 million empty (useless) items.
Or, of course, the counting mechanism is just broken. Which is quite possible.
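
One way to test the empty-items hypothesis would be to scan the JSON dump and apply the same emptiness rule as Item::isEmpty(). The following is only a rough sketch, not Wikibase code. It assumes the dump has one entity per line inside a JSON array, with the keys "type", "labels", "descriptions", "aliases", "claims" and "sitelinks". The function names is_empty and count_items are made up for illustration.

```python
import json

# Assumed field names, following the Wikibase JSON dump layout.
CONTENT_KEYS = ("labels", "descriptions", "aliases", "claims", "sitelinks")


def is_empty(entity):
    """Approximate Item::isEmpty(): no terms, no statements, no sitelinks."""
    # Empty maps may be serialized as {} or as [], and both are falsy.
    return not any(entity.get(key) for key in CONTENT_KEYS)


def count_items(lines):
    """Return (total_items, empty_items) for an iterable of dump lines."""
    total = empty = 0
    for line in lines:
        # Each entity line ends with a comma, and the array brackets
        # sit on lines of their own (assumption about the dump layout).
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        entity = json.loads(line)
        if entity.get("type") != "item":
            continue  # skip properties
        total += 1
        if is_empty(entity):
            empty += 1
    return total, empty
```

Since count_items accepts any iterable of lines, it could be fed a file object opened with gzip.open(path, "rt", encoding="utf-8") on a compressed dump. If the empty count comes out far below 3.7 million, that would point towards redirects or a broken counter instead.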