[Foundation-l] Data retention

Tim Starling tstarling at wikimedia.org
Sat Sep 20 02:23:03 UTC 2008


Joe Szilagyi wrote:
> On Thu, Sep 18, 2008 at 5:33 PM, Tim Starling <tstarling at wikimedia.org> wrote:
> 
>> Joe Szilagyi wrote:
>>> That is what has been said around the chatter lines. Was this documented in
>>> the SVN somewhere if so, and approved? For all Wikis? Just some?
>> Are you implying that this change could somehow be controversial? If so,
>> can you explain how that might be?
>>
> 
> Not inherently controversial, but I'm not clear on if the CU data retention
> is the same on each project, or different on each--does Chinese Wikipedia
> save as long as English Wikipedia? Does Commons save as long as Wikinews,
> etc.? If there is any change to the actual length in data retention, who
> makes that decision? The WMF board? Sue? The checkuser mail list? Shouldn't
> that sort of matter be decided with community input?

It's the same everywhere: three months. Neither the Board nor the
executive has expressed any desire to make that decision, but they are
free to weigh in if they want to. We chose the three month figure as a
compromise between privacy advocates and troll hunters. The checkuser-l
mailing list only represents one of those two groups, which is why I don't
think it's appropriate that they should make that decision. I think
community input should be encouraged, which is why I think the figure
should be public.

> I've just been thinking aloud and wondering if the debatable value of any
> obfuscation of the retention length of Checkuser data, rather than clearly
> articulating it in public, outweighs the risk and harm to some users given
> that in the wake of the Poetlister incident we've seen that Checkuser data
> is not compromise-proof.

Some people have said "trolls just leave the site for 3 months and come
back when they know the CheckUser data has expired". The same argument
would apply for any finite retention period; it's obvious that some sort
of trade-off has to be made. If it's secret and short, then the trolls
will work it out eventually anyway. If it's secret and long, then that's
the worst possible situation for privacy.

Troll hunters can and should retain CheckUser results for particularly
troublesome users, beyond the database retention period. Often, their IP
address becomes public anyway, when it is found in email headers and
anonymous edits.

If a troll stays away from the site for 3 months to avoid detection by
CheckUser, then you should consider yourself lucky. That's one of the best
possible outcomes. There are lots of ways a troll can disrupt the site
continuously, regardless of the data retention time.

-- Tim Starling
