[Foundation-l] New draft of privacy policy

Brion Vibber brion at wikimedia.org
Sat Jun 21 19:22:08 UTC 2008


Anthony wrote:
> Something else I think is worth pointing out: "the raw log data is not
> made public, and is normally discarded after about two weeks." has
> changed to "The raw log data is kept indefinitely, but is not made
> public."
> 
> I get the impression that this isn't a change in policy, so much as a
> change in wording.

It's an update to reflect actual practice.

> But then, it does seem to contradict the Data
> Retention Policy
> (http://wikimediafoundation.org/wiki/Resolution:Data_Retention_Policy).

The data retention policy is, shall we say, super vague. It makes no
specific provisions, but reiterates our general preference for not keeping
lots of private data around for a long time.

CheckUser data -- the scariest for most people as it records IP, proxy 
forward records, and user-agent for ALL EDITS -- is kept for 90 days in 
the database.

We currently keep 1:1000-sampled *HTTP proxy logs* indefinitely; every
once in a while the whole bunch gets scanned to produce some ad-hoc
statistics. We expect to formalize that process eventually, and then
we'll have no need to keep the old log samples around.

(These would be basically worthless for any kind of track-down-a-user 
lookup since there's a 99.9% chance that whatever you're looking for 
isn't in it, and even if it is you don't have enough information in the 
log to tell.)
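
To put rough numbers on that, here is a minimal sketch of the arithmetic.
It assumes each request is kept independently with probability 1/1000;
the real sampler may well work differently (for example, by keeping every
1000th request), and the names SAMPLE_RATE, is_sampled and
chance_all_missed are made up for illustration, not taken from any
Wikimedia code.

```python
import random

# Assumption: 1-in-1000 sampling, matching the ratio quoted above.
SAMPLE_RATE = 1000


def is_sampled(rng: random.Random) -> bool:
    """Keep a request with probability 1/SAMPLE_RATE (hypothetical sampler)."""
    return rng.randrange(SAMPLE_RATE) == 0


def chance_all_missed(num_requests: int) -> float:
    """Probability that none of num_requests independent requests is logged."""
    return (1 - 1 / SAMPLE_RATE) ** num_requests
```

Under that assumption, chance_all_missed(1) is 0.999, which is the 99.9%
figure above, and even someone who made 100 requests would be missing
from the sample entirely about 90% of the time.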

Various debug logs are also kept indefinitely, and from time to time old 
ones get thrown out.

-- brion


