Erik, thanks so much. I wondered why TreeViews was grinding to such a halt
:)
Looking at the category trees, do you think there is a reasonable
subcategory depth I could go to that would give a fairly complete overview
without you having to prune? If not, then yes please, it would be great if
you could prune :)
Thanks again
John
On 21 April 2016 at 23:34, Erik Zachte <ezachte(a)wikimedia.org> wrote:
Here are all 96610 subcategories of Education, with
2.6 million articles.
The problem is that a single unexpected subcategory can sometimes draw in
lots of unrelated content, so the most viewed article can be totally
off-topic.
I could do some iterations and prune the tree into something more
manageable, by blacklisting weird subbranches.
https://stats.wikimedia.org/wikimedia/pageviews/categorized/wp-en/2016-02/c…
Erik Zachte
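The pruning idea in the message above (cap the subcategory depth and
blacklist odd subbranches) can be sketched roughly as follows. This is only
an illustration, not the script used to produce the list: the function name,
the in-memory `tree` mapping and the toy category names are all assumptions
made up for the example.

```python
from collections import deque


def collect_subcategories(tree, root, max_depth, blacklist=frozenset()):
    """Breadth-first walk of `tree` from `root`, at most `max_depth` levels deep.

    `tree` maps a category name to a list of its direct subcategories
    (a hypothetical in-memory structure, e.g. built from a dump).
    A category in `blacklist` is never added and never descended into.
    Returns a dict of category name -> depth at which it was first reached.
    """
    depths = {root: 0}
    queue = deque([root])
    while queue:
        category = queue.popleft()
        depth = depths[category]
        if depth == max_depth:
            continue  # depth cap reached, do not descend further
        for child in tree.get(category, []):
            if child in blacklist or child in depths:
                continue  # pruned branch, or already seen (category graphs can loop)
            depths[child] = depth + 1
            queue.append(child)
    return depths


# Toy data, invented purely for illustration.
toy_tree = {
    "Education": ["Schools", "Teaching", "Alumni by university"],
    "Schools": ["Schools by country"],
    "Schools by country": ["Schools in France"],
    "Alumni by university": ["People"],
    "Teaching": ["Education"],  # a cycle back to the root
}

kept = collect_subcategories(
    toy_tree, "Education", max_depth=2, blacklist={"Alumni by university"}
)
```

Recording the depth of each category makes it easy to compare how many
subcategories survive at depth 2, 3, 4 and so on before settling on a cutoff.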
*From:* Wiki-research-l [mailto:
wiki-research-l-bounces(a)lists.wikimedia.org] *On Behalf Of* Leila Zia
*Sent:* Thursday, April 21, 2016 23:13
*To:* Research into Wikimedia content and communities
*Subject:* Re: [Wiki-research-l] Finding the most viewed Wikipedia
articles on education
John, I played with Wikipedia Tools for Google and I'm sure it will do
what you're looking for. Check out this
<https://docs.google.com/spreadsheets/d/1HeFluqXXcSXw14pk_hceKbuxykNaTjOJMLrNxs81Ifk/edit#gid=0>
Google spreadsheet. You just have to repeat a slightly modified formula in
columns B and C to get what you have in column D for all subcategories of
Education listed in A. You can automate that part, too.
L
On Thu, Apr 21, 2016 at 12:39 PM, john cummings <mrjohncummings(a)gmail.com>
wrote:
Hi Leila
Thanks very much. What I need to be able to do is get all the articles
within Category:Education and its subcategories and then get page views
for all of them. It's a lot of pages. My friend Ed Saperia created a
spreadsheet to do this, but unfortunately the query API is limited to a
few hundred articles, so it's not possible to run the query through that.
Any other suggestions would be very much appreciated.
Thanks
John
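If the limit described above is the per-request cap of the MediaWiki API
(an assumption; the thread does not say which API the spreadsheet calls),
one common way around it is to follow the API's continuation tokens and
fetch the category members in batches. A minimal sketch, with hypothetical
function names and an example User-Agent string:

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Perform one GET request against the MediaWiki API and decode the JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "category-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def category_members(category, fetch=fetch_json):
    """Yield the titles of all articles directly in `category`.

    Each request returns at most `cmlimit` members; the loop keeps merging
    the returned "continue" block into the parameters until none is left.
    """
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": "0",  # main namespace only, i.e. articles
        "cmlimit": "500",
        "format": "json",
    }
    while True:
        data = fetch(params)
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        if "continue" not in data:
            break
        params = {**params, **data["continue"]}
```

Calling `list(category_members("Category:Education"))` would return only the
articles filed directly in that category; subcategories would need the same
call with `cmnamespace` set to `"14"` and a walk over the results.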
On 21 April 2016 at 18:54, Leila Zia <leila(a)wikimedia.org> wrote:
Hi John,
Two comments:
* Have you tried Wikipedia Tools for Google
<https://chrome.google.com/webstore/detail/wikipedia-tools/aiilcelhmpllcgkhhpifagfehbddkdfp?hl=en>?
It's a very neat add-on for Chrome, and in your case, the two functions
WIKICATEGORYMEMBERS and WIKIPAGEVIEWS may help you get what you want.
* If you are looking for a list of articles related to Education that are
available in English and are missing in another language, you can use the
article recommendation API. For example:
http://recommend.wmflabs.org/api?s=en&t=fr&n=10&article=Educati… gives
you the top 10 recommendations for articles related to Education that are
available in English but missing in French. Note that "related" is not the
same as being in the category "Education", though I hope we can
accommodate categories in the future. The documentation for the API is
here <https://github.com/ewulczyn/translation-recs-app/tree/master/api>.
Hope this helps.
Best,
Leila
Leila Zia
Research Scientist
Wikimedia Foundation
On Thu, Apr 21, 2016 at 5:04 AM, john cummings <mrjohncummings(a)gmail.com>
wrote:
Hi all
I'm doing some work with colleagues from the education sector at UNESCO to
look at improving some of the most viewed education articles on English
language Wikipedia.
I'm trying to use TreeViews to find the most viewed articles in
Category:Education. Unfortunately, such large categories just crash my
browser, which means I will have to split the query up into at least
50-100 smaller queries.
Does anyone know of a less manual way around this? Ideally the output
would be a spreadsheet of the article title and the number of page views
of the article for a 30, 60 or 90 day period in the recent past. I will
use TreeViews if it is the only way, but I'd really love to save myself
from half a day of data entry. I imagine this would also be useful for
people working with other organisations on other subjects.
Thanks
John
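The output asked for above (one row per article with its page views over a
30, 60 or 90 day window) could be produced along these lines. This is a
sketch under assumptions: it targets the Wikimedia Pageviews REST API
per-article endpoint, the helper names are hypothetical, and the network
request itself is left out so that only the URL building, summing and CSV
writing are shown.

```python
import csv
import urllib.parse
from datetime import date, timedelta

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, days, end, project="en.wikipedia"):
    """Build the daily per-article URL for the `days` days ending on `end`."""
    start = end - timedelta(days=days - 1)  # window is inclusive of both ends
    article = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return (
        f"{BASE}/{project}/all-access/user/{article}"
        f"/daily/{start:%Y%m%d}/{end:%Y%m%d}"
    )


def total_views(response):
    """Sum the daily counts in one decoded JSON response from that endpoint."""
    return sum(item["views"] for item in response.get("items", []))


def write_report(rows, out):
    """Write (title, views) pairs to `out` as CSV, most viewed first."""
    writer = csv.writer(out)
    writer.writerow(["title", "views"])
    for title, views in sorted(rows, key=lambda row: row[1], reverse=True):
        writer.writerow([title, views])
```

With the responses fetched and decoded elsewhere, something like
`write_report([(t, total_views(r)) for t, r in results], open("views.csv", "w", newline=""))`
would give a file that opens directly in a spreadsheet program.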
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l