On 7/2/10, James Howison james@howison.name wrote:
Hi all,
I'm working on a study for which I'd like to know more about editors' watchlisting practices. Of course what I'd really like is to know who had what page on their watchlist when, but I understand the obvious privacy issues there. I assume those issues explain why that information is not (AFAIK) available in dumps etc.
I have read some great qualitative pieces which discuss watchlisting [e.g. 1], which are very helpful (please don't hesitate to suggest others), but haven't seen quantitative data, which our study calls for.
Failing exact data, what do we know about the distribution of practices of watchlisting?
Currently my plan is to assume that anyone who has edited an article in the past 6 months has it on their watchlist. Obviously a very coarse assumption.
A better assumption is that a page is on user A's watchlist if they edit the page within 10 mins of another user editing the page.
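A rough sketch of that heuristic in Python, in case it helps. It assumes you have already extracted (page, user, timestamp) tuples from the revision history; the function name infer_watchers and the input format are made up for illustration and are not part of any MediaWiki tool. Note that it only flags the user who made the quick follow-up edit, and that a quick follow-up may have other causes (e.g. recent changes patrolling), so treat the result as a proxy.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def infer_watchers(revisions, window=timedelta(minutes=10)):
    """Guess watchers from edit timing.

    revisions: iterable of (page, user, timestamp) tuples (hypothetical format).
    Returns {page: set of users who edited within `window` after a
    different user's edit to the same page}.
    """
    by_page = defaultdict(list)
    for page, user, ts in revisions:
        by_page[page].append((ts, user))

    watchers = defaultdict(set)
    for page, edits in by_page.items():
        edits.sort()  # chronological order
        for i, (ts, user) in enumerate(edits):
            # Walk back through earlier edits that fall inside the window.
            j = i - 1
            while j >= 0 and ts - edits[j][0] <= window:
                if edits[j][1] != user:
                    watchers[page].add(user)
                    break
                j -= 1
    return dict(watchers)


if __name__ == "__main__":
    t0 = datetime(2010, 7, 2, 12, 0)
    sample = [
        ("Page A", "alice", t0),
        ("Page A", "bob", t0 + timedelta(minutes=5)),
        ("Page A", "carol", t0 + timedelta(hours=1)),
    ]
    print(infer_watchers(sample))
```

The 10 minute window is just the figure suggested above; it is a parameter so you can test how sensitive your results are to it.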
Also worth considering are the public watchlists which are created using the "related changes" feature, e.g. I have a separate watchlist for pages I create, as this is public information anyway:
https://secure.wikimedia.org/wikipedia/en/wiki/Special:RecentChangesLinked/U...
wrt the watchlist, it is only possible to know which pages are on a watchlist as of _now_, so the data would need to be snapshotted periodically in order to analyse how an individual manages their watchlist, etc. I would love to know when I added a page to my watchlist, but the schema doesn't record this information.
http://www.mediawiki.org/wiki/Manual:Watchlist_table
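If someone did take periodic snapshots, comparing consecutive ones would give at least a time range for each addition or removal. A minimal sketch in Python, assuming each snapshot is simply a (timestamp, set of page titles) pair for one user; the function name watchlist_events and this format are hypothetical, and how the snapshots are obtained is left out entirely.

```python
def watchlist_events(snapshots):
    """Derive add/remove events from periodic watchlist snapshots.

    snapshots: list of (timestamp, set_of_titles), sorted by timestamp
    (hypothetical format). Returns a list of
    (earliest, latest, action, title) tuples: the change happened at some
    point between the two snapshot times.
    """
    events = []
    for (t0, before), (t1, after) in zip(snapshots, snapshots[1:]):
        for title in sorted(after - before):
            events.append((t0, t1, "added", title))
        for title in sorted(before - after):
            events.append((t0, t1, "removed", title))
    return events


if __name__ == "__main__":
    snaps = [
        ("2010-07-01", {"Page A", "Page B"}),
        ("2010-07-02", {"Page B", "Page C"}),
    ]
    for event in watchlist_events(snaps):
        print(event)
```

A page that is added and removed again between two snapshots is invisible to this approach, so the snapshot interval limits what can be analysed.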
There are quite a few watchlist-related bugs, which may also give you some useful information about how users want to use their watchlist, and hints about how they are currently using it. ;-)
https://bugzilla.wikimedia.org/buglist.cgi?quicksearch=watchlist
-- John Vandenberg