On 7/2/10, James Howison <james(a)howison.name> wrote:
I'm working on a study for which I'd like to know more about editors'
watchlisting practices. Of course what I'd really like is to know who had
what page on their watchlist when, but I understand the obvious privacy
issues there. I assume those issues explain why that information is not
(AFAIK) available in dumps etc.
I have read some great qualitative pieces which discuss watchlisting [e.g.
1], which are very helpful (please don't hesitate to suggest others), but
haven't seen quantitative data, which our study calls for.
Failing exact data, what do we know about the distribution of practices of
Currently my plan is to assume that anyone who has edited an article in the
past 6 months has it on their watchlist. Obviously a very coarse assumption.
A better assumption is that a page is on user A's watchlist if A
edits the page within 10 minutes of another user editing it.
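As a rough illustration of that heuristic, here is a minimal Python sketch. It assumes you have already extracted (page, user, timestamp) tuples from the revision history in a dump; the function name infer_watchers and the tuple layout are made up for this example, and the 10-minute window is just the figure suggested above. Only consecutive edits need comparing, since the first edit in a run by one user is always the closest to the preceding edit by someone else.

```python
from datetime import datetime, timedelta


def infer_watchers(revisions, window=timedelta(minutes=10)):
    """Guess (user, page) watchlist pairs from revision history.

    revisions: iterable of (page, user, timestamp) tuples (assumed layout).
    A user is inferred to watch a page if they edited it within `window`
    after an edit by a different user.
    """
    by_page = {}
    for page, user, ts in revisions:
        by_page.setdefault(page, []).append((ts, user))

    inferred = set()
    for page, edits in by_page.items():
        edits.sort(key=lambda e: e[0])  # chronological order
        # Compare each edit with the one immediately before it.
        for (prev_ts, prev_user), (ts, user) in zip(edits, edits[1:]):
            if user != prev_user and ts - prev_ts <= window:
                inferred.add((user, page))
    return inferred


if __name__ == "__main__":
    t0 = datetime(2010, 7, 2, 12, 0)
    sample = [
        ("Page P", "A", t0),
        ("Page P", "B", t0 + timedelta(minutes=5)),   # quick follow-up
        ("Page P", "C", t0 + timedelta(minutes=30)),  # too late to count
    ]
    print(infer_watchers(sample))
```

This will still miss watchers who never react quickly, and it will pick up some coincidental edits, so treat the result as a lower-noise guess rather than ground truth.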
Also worth considering are the public watchlists that are created
using the "related changes" feature, e.g. I have a separate watchlist
for pages I create, as this is public information anyway:
wrt the watchlist, it is only possible to know which pages are on a
watchlist as of _now_, so the data would need to be snapshotted
periodically in order to analyse how an individual manages their
watchlist, etc. I would love to know when I added a page to my
watchlist, but the schema doesn't record this information.
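If periodic snapshots were available, the time a page was added could at least be bracketed between two snapshots. A small Python sketch of that idea follows; it assumes each snapshot is a (timestamp, set of page titles) pair for one user's watchlist, however it was obtained, and the name first_seen is hypothetical.

```python
def first_seen(snapshots):
    """Bracket when each page first appeared on a watchlist.

    snapshots: list of (timestamp, set_of_titles) pairs, in any order
    (assumed layout). Returns {title: (last_snapshot_without,
    first_snapshot_with)}. The first element is None for pages already
    present in the earliest snapshot, i.e. added at some unknown earlier
    time. Only the first appearance is recorded, so a page that is
    removed and re-added keeps its original bracket.
    """
    bounds = {}
    prev_ts = None
    prev_pages = set()
    for ts, pages in sorted(snapshots, key=lambda s: s[0]):
        for title in pages - prev_pages:
            bounds.setdefault(title, (prev_ts, ts))
        prev_ts, prev_pages = ts, pages
    return bounds
```

The resolution is only as good as the snapshot interval: weekly snapshots can say a page was added "some time that week" and nothing finer.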
There are quite a few watchlist-related bugs, which may also give you
some useful information about how users want to use their watchlist,
and hints into how they are currently using it. ;-)