Hello there,
I'm a beginner programmer working on a project that uses Wikipedia page views to reproduce a paper. I was pulling JSON from http://stats.grok.se, but the last month hasn't been updated. I need the last 30 days of Wikipedia views for several pages, but the hourly files are 70 MB each compressed. Is it possible to query just the data I need directly from some kind of Wikipedia database?
Otherwise, regarding the files (located at http://dumps.wikimedia.org/other/pagecounts-raw/2016/2016-01/), how should they be opened or accessed? I'm using a high-level language, so I'm worried that working with files this large will break my computer.
Like I said, I'm fairly new to all this, so I apologize in advance if these questions seem silly.
Thank you for any help,
Dominic Della Sera
Hi Dominic,
You might be interested in Pageviews.js: https://github.com/tomayac/pageviews.js.
Cheers, Tom
There are also R (https://github.com/Ironholds/pageviews) and Python (https://github.com/mediawiki-utilities/python-mwviews) clients depending on your language of preference :)
On 29 February 2016 at 08:40, Thomas Steiner tomac@google.com wrote:
Hi Dominic,
You might be interested in Pageviews.js: https://github.com/tomayac/pageviews.js.
Cheers, Tom
-- Dr. Thomas Steiner, Employee (blog.tomayac.com, twitter.com/tomayac)
Google Germany GmbH, ABC-Str. 19, 20354 Hamburg Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle Registergericht und -nummer: Hamburg, HRB 86891
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Hi Dominic,
There's an API you can use to get pageviews [1]. If you are using Python, JS or R, there are convenient libraries that make it even easier [2].
The Wikimedia Foundation does not maintain stats.grok.se; it was built years ago, and the server has not been very reliable lately. The new API, maintained by the Foundation, is the best way to go.
[1] https://wikitech.wikimedia.org/wiki/Analytics/PageviewAPI
[2] https://wikitech.wikimedia.org/wiki/Analytics/PageviewAPI#Clients
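To give a rough idea of what calling the API in [1] directly might look like, here is a minimal Python sketch using only the standard library. The base URL, the `all-access` / `all-agents` path segments, and the shape of the JSON response (an `items` list whose entries carry `timestamp` and `views`) are assumptions based on the per-article endpoint as I understand it, so please check them against [1] before relying on this. The function names and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import date, timedelta

# Assumed base URL of the per-article endpoint; verify against the PageviewAPI page [1].
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_url(article, start, end, project="en.wikipedia"):
    """Build a daily per-article request URL for the date range start..end."""
    # Titles use underscores instead of spaces and must be percent-encoded.
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return "/".join([
        BASE, project, "all-access", "all-agents", title, "daily",
        start.strftime("%Y%m%d"), end.strftime("%Y%m%d"),
    ])


def parse_views(payload):
    """Map each day (YYYYMMDD) to its view count from a decoded JSON response."""
    # Assumes timestamps look like "2016020100" (YYYYMMDDHH).
    return {item["timestamp"][:8]: item["views"] for item in payload.get("items", [])}


def last_30_days(article, today=None):
    """Fetch daily views for the 30 days ending yesterday (needs network access)."""
    end = (today or date.today()) - timedelta(days=1)
    start = end - timedelta(days=29)
    request = urllib.request.Request(
        build_url(article, start, end),
        # Placeholder User-Agent; replace it with something that identifies you.
        headers={"User-Agent": "pageview-example/0.1 (replace with your contact)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_views(json.load(response))
```

Calling `last_30_days("Albert Einstein")` for each page of interest should return a small dictionary of daily counts, so there is no need to download or unpack the hourly dump files. The client libraries in [2] wrap the same API and are probably the more convenient choice for anything beyond a quick experiment.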
On Mon, Feb 29, 2016 at 7:33 AM, Oliver Keyes okeyes@wikimedia.org wrote:
There are also R (https://github.com/Ironholds/pageviews) and Python (https://github.com/mediawiki-utilities/python-mwviews) clients depending on your language of preference :)
-- Oliver Keyes Count Logula Wikimedia Foundation