Hello, I am trying to figure out a discrepancy between
(i) querying statistics numbers returned by the API (e.g. http://www.grazwiki.at/api.php?action=query&meta=siteinfo&siprop=sta...) and
(ii) querying the page titles and metadata using the API (e.g. http://www.grazwiki.at/api.php?action=query&list=allpages&aplimit=50...)
(i) reports 12816 pages, 6300 articles, 46584 edits, 6551 images, 137 users
(ii) sums up to 17788 pages (sum over all namespace IDs >= 0), 46906 edits (parsed using http://www.grazwiki.at/api.php?action=query&prop=revisions&pageids=1...), 72 users (http://www.grazwiki.at/api.php?action=query&list=allusers&aulimit=50...)
! So I get many more pages when fetching them step by step than the statistics query reports (only namespace IDs >= 0, to make sure that I do not count duplicates)
! The API lists 72 users, but the statistics report 137
! Edits roughly match, but I have other wikis in the pipeline where the difference is much higher
At first I thought that the official statistics leave out some namespaces, but I could not figure out a combination that would explain the results. Could the results be cached? That would be quite a large difference, and it would not really explain the user delta.
I am extracting the wikis for research and want to make sure that I have an accurate extraction/representation of them.
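For reference, the step-by-step page count in (ii) can be sketched roughly as below. This is only a minimal sketch, not the exact extraction code: the names fetch_json and count_pages are made up for illustration, and it assumes the wiki answers with format=json and uses the newer "continue" style of continuation (default since MediaWiki 1.26; older versions return "query-continue" instead). Collecting page IDs in a set guards against counting a page twice.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://www.grazwiki.at/api.php"  # endpoint taken from the thread


def fetch_json(params):
    """Send one GET request to the API and decode the JSON reply."""
    query = urllib.parse.urlencode({**params, "format": "json"})
    with urllib.request.urlopen(f"{API_URL}?{query}") as response:
        return json.load(response)


def count_pages(namespace, fetch=fetch_json):
    """Count the pages of one namespace, following continuation until done.

    `fetch` is injectable so the paging logic can be tried without a network.
    """
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": 50,
    }
    seen = set()  # page IDs, so a page is never counted twice
    extra = {}    # continuation parameters returned by the previous reply
    while True:
        reply = fetch({**params, **extra})
        for page in reply.get("query", {}).get("allpages", []):
            seen.add(page["pageid"])
        if "continue" not in reply:
            return len(seen)
        # Assumption: newer-style continuation; pass its values back verbatim.
        extra = reply["continue"]
```

Calling count_pages(ns) for every namespace ID >= 0 and summing the results gives the total to compare against the "pages" value of the statistics query.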
=> I would appreciate any information and ideas that can explain the differences.
Regards,
Rüdiger
On Tue, Jan 30, 2018 at 5:07 AM, Rüdiger Gleim <ruediger.gleim@gmx.de> wrote:
=> I would appreciate any information and ideas that can explain the differences.
If a bug at some point caused the site_stats table not to be updated in some situation, that would result in such discrepancies.
Try running maintenance/initSiteStats.php with the --update option on your wiki.
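To check whether the numbers line up after the update, the reported and the counted values can be compared side by side. The following is just a small sketch with a hypothetical helper, count_deltas; the values are the ones from the original message, not fresh results.

```python
def count_deltas(reported, counted):
    """Return counted minus reported for every key present in both dicts."""
    return {key: counted[key] - reported[key] for key in reported if key in counted}


# (i) values from the statistics query, as given in the original message
reported = {"pages": 12816, "articles": 6300, "edits": 46584, "images": 6551, "users": 137}
# (ii) values summed up from the step-by-step API queries
counted = {"pages": 17788, "edits": 46906, "users": 72}

for key, delta in count_deltas(reported, counted).items():
    print(f"{key}: {delta:+d}")
# pages: +4972
# edits: +322
# users: -65
```

If the deltas shrink to (near) zero once site_stats has been rebuilt, stale statistics were the cause; if they do not, the difference has to come from somewhere else.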
wikitech-l@lists.wikimedia.org