Hey,
I'm trying to create a report of pageviews for all the articles that use a file from a specific Commons category ("Wikimedia Israel - Channel 2 videos").
I pulled up the list of articles using Glamorous: https://tools.wmflabs.org/glamtools/glamorous.php?doit=1&category=Wikime...
Now I'm looking for a way to use Massviews to get the pageviews of these pages (or, even better, pageviews per file).
One last thing: although I asked about this half a year ago, I'll try again in case something has changed since then. Is there an easy way to also get video view statistics for these files?
Thanks :)
Regards,
Itzik Edri
Chairperson, Wikimedia Israel
+972-54-5878078 | http://www.wikimedia.org.il
Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment!
Itzik - Wikimedia Israel, 26/02/2017 16:13:
file from a specific commons category ("Wikimedia Israel - Channel 2 videos").
https://github.com/hay/wiki-tools/blob/master/etc/mediacounts-stats.py
Nemo
Hmm... maybe an easier way, for someone who doesn't want to play with code and download dumps? :)
On Sun, Feb 26, 2017 at 5:23 PM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Hi Itzik,
It looks like GLAMorous has an XML endpoint https://tools.wmflabs.org/glamtools/glamorous.php?doit=1&category=Wikimedia+Israel+-+Channel+2+videos&use_globalusage=1&ns0=1&show_details=1&projects[wikipedia]=1&projects[wikimedia]=1&projects[wikisource]=1&projects[wikibooks]=1&projects[wikiquote]=1&projects[wiktionary]=1&projects[wikinews]=1&projects[wikivoyage]=1&projects[wikispecies]=1&projects[mediawiki]=1&projects[wikidata]=1&projects[wikiversity]=1&format=xml for the data, so it's possible for me to add a new "Source" into Massviews for this specific purpose. I will look into it :)
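(For anyone who wants to script this directly: a minimal Python sketch of building that GLAMorous XML query and mapping each file to the pages that use it. The parameter names mirror the URL above, but the response layout in SAMPLE is an assumption for illustration; the real GLAMorous XML may be shaped differently.)

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

def glamorous_url(category):
    """Build a GLAMorous XML query URL for a Commons category.
    Parameter set copied from the link above; treat it as a sketch."""
    params = {
        "doit": 1,
        "category": category,
        "use_globalusage": 1,
        "ns0": 1,
        "show_details": 1,
        "format": "xml",
    }
    return "https://tools.wmflabs.org/glamtools/glamorous.php?" + urlencode(params)

# Hypothetical response shape, for illustration only:
SAMPLE = """<results>
  <details>
    <image name="File:Example.webm">
      <project name="en.wikipedia">
        <page title="Some article"/>
      </project>
    </image>
  </details>
</results>"""

def pages_per_file(xml_text):
    """Map each file name to the set of page titles that embed it."""
    root = ET.fromstring(xml_text)
    usage = {}
    for image in root.iter("image"):
        usage[image.get("name")] = {p.get("title") for p in image.iter("page")}
    return usage
```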
However, it sounds like what you really want is the number of times the media was actually played, which is what the code Nemo linked to is for. I'm working on a client for this (T149642 https://phabricator.wikimedia.org/T149642) that I hope to release soon. It will allow you to plug in the Commons category and get all the plays for each file within it. The API that makes this possible is up and running, if you are able to make use of it. For instance, here https://tools.wmflabs.org/mediaplaycounts/api/1/CategoryPlaycount/date/Wikimedia_Israel_-_Channel_2_videos/20170101 are the play counts for each file on January 1, 2017. See also the API documentation https://phabricator.wikimedia.org/P4339.
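(A tiny sketch of constructing that CategoryPlaycount URL from a category name and a YYYYMMDD date, following the pattern of the example link above. The URL scheme is taken directly from that example; anything beyond it is an assumption, so check the linked API documentation.)

```python
from urllib.parse import quote

API_BASE = "https://tools.wmflabs.org/mediaplaycounts/api/1"

def category_playcount_url(category, date):
    """Per-file play counts for a Commons category on one day.
    `date` is YYYYMMDD; spaces in the category become underscores,
    matching the example URL in the thread."""
    return "%s/CategoryPlaycount/date/%s/%s" % (
        API_BASE, quote(category.replace(" ", "_")), date)
```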
Hope this helps, or at least lets you know a solution is soon to come!
~ MusikAnimal
On Sun, Feb 26, 2017 at 11:56 AM, Itzik - Wikimedia Israel <itzik@wikimedia.org.il> wrote:
Itzik - Wikimedia Israel, 26/02/2017 17:56:
Hmm... maybe an easier way, for someone who doesn't want to play with code and download dumps? :)
Does a standard command like grep qualify as easier? :-) On a computer with some bandwidth I did something like:
wget -r -np -nH -nd -A bz2 http://ftp.acc.umu.se/mirror/wikimedia.org/other/mediacounts/daily/2016/
find . -name "mediacounts*bz2" -print0 | xargs -0 -P8 -I§ -n1 bzgrep webm § | grep -E '/Channel_?2.+webm' > 2016-12-channel2.csv
That gives me about 650,000 accesses during December 2016, of which about 10,500 were downloads of the complete file and 8,000 were streamed plays.
Nemo
Alrighty, I've added GLAMorous as a new source to Massviews. To use it, just enter the Commons category name. So for your category: http://tools.wmflabs.org/massviews-test/?platform=all-access&agent=user&...
Hopefully this is what you were looking for, I know some other GLAM folks requested this (T150507 https://phabricator.wikimedia.org/T150507).
This is only available on the test version of Massviews because I would still very much consider this a quick demo. It seems to work fine for your example but it is by no means production-ready, and you may encounter random errors, especially with larger datasets. Some translations are also missing.
The source is called "GLAMorous" but that's actually a misnomer. I ended up implementing everything myself, so it does not use the GLAMorous tool at all. It also lacks features like selectively choosing projects, and it assumes you only want mainspace pages. I imagine this accounts for most use cases, though.
Let me know if you run into any problems or have any feedback. I'm sure you'd like to see pageview totals grouped by the source file, which I will try to work on soon, among other features, before officially releasing this.
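(The grouping idea mentioned above is just an aggregation once you have both mappings: files to the articles that embed them, and articles to their pageview totals. A sketch, with both inputs as placeholders for whatever GLAMorous and Massviews actually return:)

```python
def views_per_file(file_usage, article_views):
    """Aggregate per-article pageview totals up to the Commons file level.
    `file_usage` maps file -> iterable of articles embedding it;
    `article_views` maps article -> pageview total. Articles missing
    from `article_views` count as zero."""
    return {
        f: sum(article_views.get(a, 0) for a in articles)
        for f, articles in file_usage.items()
    }
```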
Best,
~MA
Amazing!
Thank you, Leon and Federico, for your quick help. As always, I'm happy to see more and more features added to the analytics tools :)
On Sun, Feb 26, 2017 at 11:09 PM, Leon Ziemba musikanimal@wikimedia.org wrote:
Hello Itzik,
For the pageviews part, maybe you can make use of my Google Spreadsheets add-on ( https://github.com/tomayac/wikipedia-tools-for-google-spreadsheets/blob/mast...), specifically its WIKIPAGEVIEWS() function.
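(Under the hood, per-article pageviews come from the Wikimedia Pageviews REST API, which the spreadsheet function presumably wraps. A sketch of building the per-article endpoint URL; the path layout follows the public REST API at wikimedia.org/api/rest_v1, and dates are YYYYMMDD. How the add-on itself calls it is not shown here.)

```python
from urllib.parse import quote

def per_article_url(project, article, start, end,
                    access="all-access", agent="user", granularity="daily"):
    """URL for the Wikimedia Pageviews REST API per-article endpoint.
    Spaces in the article title become underscores, per wiki convention."""
    return ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
            "%s/%s/%s/%s/%s/%s/%s" % (project, access, agent,
                                      quote(article.replace(" ", "_"), safe=""),
                                      granularity, start, end))
```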
Cheers, Tom