As an aside, this may be a case where generators in the API are useful, e.g. https://en.wikipedia.org/w/api.php?action=query&generator=redirects&... (Note: this does not include the actual non-redirect article in the results, and you have to pay close attention to the continue parameters.)
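A minimal sketch of that continuation loop, in Python, in case it helps. The link above is truncated, so the exact parameters here are assumptions: it combines generator=redirects with prop=pageviews, sets grlimit=max, and relies on the API returning a "continue" object whose values are merged into the next request. The function names redirect_pageviews and fetch_json, and the User-Agent string, are made up for illustration. As noted, the target article itself is not in the results, so its views would need a separate query.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Perform one GET request against the API and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent; replace with one that identifies your project.
    req = urllib.request.Request(
        url, headers={"User-Agent": "pageview-redirect-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def redirect_pageviews(title, fetch=fetch_json):
    """Return {redirect title: total views} for redirects pointing at `title`.

    Assumes prop=pageviews yields a {date: count-or-null} mapping per page.
    """
    base = {
        "action": "query",
        "format": "json",
        "generator": "redirects",  # iterate over redirects to `title`
        "titles": title,
        "grlimit": "max",          # assumption: ask for the largest batch
        "prop": "pageviews",
    }
    per_page = {}  # title -> {date: count}
    cont = {}      # continuation parameters from the previous reply
    while True:
        data = fetch({**base, **cont})
        pages = data.get("query", {}).get("pages", {})
        for page in pages.values():
            days = per_page.setdefault(page["title"], {})
            for day, count in (page.get("pageviews") or {}).items():
                if count is not None:
                    # Keyed by date, so a page repeated across batches
                    # is not counted twice.
                    days[day] = count
        if "continue" not in data:
            break
        # Pass back every key of "continue" unchanged on the next request.
        cont = data["continue"]
    return {t: sum(days.values()) for t, days in per_page.items()}
```

The fetch argument is only there so the loop can be exercised without network access; calling redirect_pageviews("Some title") uses the live API.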
On Mon, Feb 24, 2020 at 4:28 AM bawolff bawolff+wn@gmail.com wrote:
Hi,
When I tested the API, it seemed to work with redirects (e.g. https://mediawiki.org/w/api.php?action=query&format=json&prop=pagevi... where Main_Page redirects to the page MediaWiki).
Then we attempted to use the redirects of a page and using the old page
ids to grab the pageview data
Just to be clear, when a page is moved, it keeps its page_id. So redirects may have historically had the page_id that the target page has now.
If all else fails, you can look at the big dataset files at https://dumps.wikimedia.org/other/analytics/ . They should be available (in some form or another) going back to 2007, and I believe they are the source of the data that the api and all other tools return.
-- Brian
On Mon, Feb 24, 2020 at 12:17 AM James Gardner via Wikitech-l < wikitech-l@lists.wikimedia.org> wrote:
Hi all,
We are a group of undergraduates working on a project using the MediaWiki API. While working on this project, we ran into a unique issue involving pageviews. When trying to pull pageview data for a particular page, the redirects of a page would not be counted along with the original pageviews. For example, the Hong Kong protests page only has direct views, and not views from previous titles.
We attempted to use the wmflabs.org tool, but it only shows data from a certain date. (Example link: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2019-07-01&end=2020-01-25&pages=2019%E2%80%9320_Hong_Kong_protests%7CChina )
Then we attempted to use the redirects of a page, using the old page ids to grab the pageview data, but no data was returned. We also attempted to grab data for a page that we knew would have a long history, but the "pvipcontinue" parameter did not appear ( https://www.mediawiki.org/w/api.php?action=help&modules=query%2Bpageview... ). (Example:
https://www.mediawiki.org/wiki/Special:ApiSandbox#action=query&format=js... )
In the end, we are trying to get an accurate count of views for a certain page, no matter the source.
Any guidance or assistance is greatly appreciated.
Thanks,
Jackie, James, Junyi, Kirby
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l