Thanks for bringing this up! I don't have any answers, but there's a
feature I'd like to build on this dataset. I wonder if bringing this stuff
into a more readily available database could be part of that project in some
way.
Basically, I'd like to publish per-editor pageview stats. That is,
MediaWiki would keep track of the number of times an article had been viewed
since the first day you edited it, and let you know how many times your
edits had been seen (approximately, depending on the resolution of the
data). I think such personalized stats could really help to drive editor
retention. The information is available now through Henrik's tool, but
even if you know about stats.grok.se, it's hard to keep track and make the
connection between the graphs there and one's own contributions.
Clearly, pageview data of at least daily resolution would be required to
make such a thing work.
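
To make the idea a bit more concrete, here is a rough Python sketch of the
calculation, assuming daily per-article view counts are already loaded
somewhere convenient. The function names and the dict-based data layout are
hypothetical; nothing like this exists in MediaWiki today.

```python
from datetime import date


def views_since_first_edit(daily_views, first_edit_day):
    """Sum the daily view counts on or after the editor's first edit day.

    daily_views: dict mapping date -> view count for one article
                 (hypothetical layout, one entry per day of data).
    """
    return sum(count for day, count in daily_views.items()
               if day >= first_edit_day)


def editor_view_totals(first_edits, pageviews):
    """Per-article totals for one editor.

    first_edits: dict mapping article title -> date of the editor's first edit
    pageviews:   dict mapping article title -> {date: view count}
    Articles with no view data simply count as zero.
    """
    return {
        title: views_since_first_edit(pageviews.get(title, {}), first_day)
        for title, first_day in first_edits.items()
    }


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    pageviews = {
        "Example": {
            date(2011, 8, 1): 100,
            date(2011, 8, 2): 150,
            date(2011, 8, 3): 120,
        },
    }
    first_edits = {"Example": date(2011, 8, 2)}
    print(editor_view_totals(first_edits, pageviews))  # {'Example': 270}
```

With daily data the total is only approximate: views from earlier on the day
of the first edit get counted too, which is the resolution caveat above.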
Are there other specific projects that require this data? It will be much
easier to make a case for accelerating development of the dataset if there
are some clear examples of where it's needed, and especially if it can help
to meet the current editor retention goals.
-Ian
On Thu, Aug 11, 2011 at 3:12 PM, MZMcBride <z(a)mzmcbride.com> wrote:
Hi.
I've been asked a few times recently about doing reports of the most-viewed
pages per month/per day/per year/etc. A few years after Domas first started
publishing this information in raw form, the current situation seems rather
bleak. Henrik has a visualization tool with a very simple JSON API behind it
(<http://stats.grok.se>), but other than that, I don't know of any efforts
to put this data into a database.
Currently, if you want data on, for example, every article on the English
Wikipedia, you'd have to make 3.7 million individual HTTP requests to
Henrik's tool. At one per second, you're looking at over a month's worth of
continuous fetching. This is obviously not practical.
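
As a quick check of that estimate, the arithmetic in a few lines of Python.
The function name is made up for illustration; the 3.7 million requests and
the one-request-per-second rate are the figures from the paragraph above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def days_to_fetch(n_requests, requests_per_second=1.0):
    """Days of continuous fetching needed at a fixed request rate."""
    return n_requests / requests_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # 3.7 million articles, one HTTP request each, one request per second.
    print(round(days_to_fetch(3_700_000), 1))  # 42.8
```

That is roughly six weeks for a single pass, before allowing for retries or
downtime.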
A lot of people were waiting on Wikimedia's Open Web Analytics work to come
to fruition, but it seems that has been indefinitely put on hold. (Is that
right?)
Is it worth a Toolserver user's time to try to create a database of
per-project, per-page page view statistics? Is it worth a grant from the
Wikimedia Foundation to have someone work on this? Is it worth trying to
convince Wikimedia Deutschland to assign resources? And, of course, it
wouldn't be a bad idea if Domas' first-pass implementation was improved on
Wikimedia's side, regardless.
Thoughts and comments welcome on this. There's a lot of desire to have a
usable system.
MZMcBride
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l