Hi Sumana,

This is great information on gadget "enablement", but you're right: an enabled preference does not equate to usage per se. There must be some way to get actual usage out of the logfiles. I'm not familiar enough with how gadget usage can be quantified to say how, but it should be possible.

I've been getting more deeply involved with the analytics team as a volunteer ever since I found out about Kraken at the Amsterdam Hackathon. To this end I've come up with some questions that I'd like to be able to ask of the data, and I've sent them to Diederik and Erik as a basis for providing some insights that I think will be valuable. When I am further along in exploring the data and learning the tools, I'll be happy to help look for the gadget and bot "signs of life" that can be found in the logfiles.
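
To check my own understanding of the two "reworks" Oliver describes in the message quoted below, here is a minimal sketch in Python (not Oliver's R script, which I have not run). The column names `preference`, `wiki`, `wiki_type` and `users` are my assumptions about the layout of gadget_data.tsv, and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows in the ASSUMED layout of gadget_data.tsv.
# The real file's header names and values may differ.
SAMPLE_TSV = (
    "preference\twiki\twiki_type\tusers\n"
    "gadget-alpha\tenwiki\twiki\t120\n"
    "gadget-alpha\tdewiki\twiki\t45\n"
    "gadget-alpha\tenwikisource\tsource\t7\n"
    "gadget-beta\tenwiki\twiki\t300\n"
)


def gadgets_by_wikis(tsv_file):
    """Per gadget: (number of wikis it is enabled on, total users)."""
    wikis = defaultdict(set)
    users = defaultdict(int)
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        wikis[row["preference"]].add(row["wiki"])
        users[row["preference"]] += int(row["users"])
    return {pref: (len(wikis[pref]), users[pref]) for pref in wikis}


def wikis_by_gadgets(tsv_file):
    """Per wiki: number of distinct gadgets with at least one row."""
    gadgets = defaultdict(set)
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        gadgets[row["wiki"]].add(row["preference"])
    return {wiki: len(prefs) for wiki, prefs in gadgets.items()}
```

Calling `gadgets_by_wikis(io.StringIO(SAMPLE_TSV))` on the invented rows gives one entry per gadget. Note that both functions count enabled preferences only, so they share the limitation discussed above: they say nothing about whether a gadget is actually used.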



On Thu, Jul 4, 2013 at 12:22 AM, Sumana Harihareswara <sumanah@wikimedia.org> wrote:
Summary: we have some new stats regarding gadget usage across WMF sites,
but I'd like more analysis of gadget & bot usage.

Oliver Keyes has some code and results up at
https://github.com/Ironholds/MetaAnalysis/tree/master/GadgetUsage to
analyze "data around gadgets being used on various wikimedia projects":

"GadgetUsage.r is the generation script. It is dependent on (a) access
to the analytics slaves and (b) the list of databases

"gadget_data.tsv is the raw data, consisting of an aggregate number of
users for each preference on each wiki, with preference, wiki and wiki
type (source, wiki, versity, etc) defined.

"gadgets_by_wikis.tsv is a rework of the data to look at what gadgets
are used on multiple wikis, and how many wikis that is. It also includes
an aggregate of the number of users across those wikis using the gadget.

"wikis_by_gadgets.tsv is a rework that looks at the number of distinct
gadgets on each individual wiki. Unsurprisingly there's a power law."

This helps a lot with addressing one of the analytics "dreams" from
https://www.mediawiki.org/wiki/Analytics/Dreams - "What proportion of
logged-in editors have activated any gadgets at all? What are the most
popular gadgets?"  However, Oliver's data "is based on preference data -
it may or may not include data for those gadgets set as defaults."  So
if someone could improve this to ensure that we appropriately count
gadget usage for gadgets that default to on, that would be very helpful.

My team would also like to know:
* who maintains the most popular gadgets? (so we can invite them to
hackathons, help get them training, get those gadgets localised and
ported to other wikis, and so on)
* when were the gadgets last updated? (so we can identify stale ones
that enthusiastic volunteers could take over maintaining)
* similar stats regarding bot usage -- what bots are making the most
edits, or edits that in aggregate change the most bytes? who owns those
bots? what wikis are they active on? (so we can help maintainers better,
ensure they hear about API breaking changes, etc., and develop a bot
inventory/directory to make it easier for other wikis' users to start
using useful bots)

If there's anyone interested in taking this on, either inside or outside
WMF's Analytics team, that would be great. Otherwise I anticipate that
Engineering Community Team will take it on sometime in the
October-December 2013 period.

Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation

Analytics mailing list


Michael Wilkes
mobile +31 6 39629706
skype handle eclectiqus