[Foundation-l] Data retention

Tim Starling tstarling at wikimedia.org
Thu Sep 11 04:26:16 UTC 2008


Gregory Maxwell wrote:
> On Wed, Sep 10, 2008 at 11:11 PM, Tim Starling <tstarling at wikimedia.org> wrote:
>> Jon wrote:
>>> I could not find this in the privacy policy... however, what is
>>> Wikimedia's current data retention policy?  That is to ask, how long do
>>> projects keep data for use in tools such as checkuser?
>> CheckUser data used to be kept for 3 months, but Aaron recently increased
>> it to 5 months. I'm not sure why or on whose authority.
>>
>> <http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/CheckUser/CheckUser.php?r1=39734&r2=40620>
> 
> I think Jon was inquiring about more than just checkuser (notice the
> "such as").  I would assume that anyone asking about data retention in
> general is not overly concerned with the specific modes of retention,
> but is more concerned with the maximum retention time (across all
> modes) of any particular type of private data.

The other logs are not rotated automatically and have to be purged
manually, so their retention time is not consistent. Typically we have kept
around 6 months of data. These logs include error logs and logs for various
kinds of special requests. They are not used for sockpuppet investigation.

I've said in the past that I think 6 months would be a reasonable horizon
for all private data -- it would give us plenty of data for operations,
and would be a far shorter period than that used by the large commercial
websites.

-- Tim Starling




