[Foundation-l] New draft of privacy policy
Brion Vibber
brion at wikimedia.org
Sat Jun 21 19:22:08 UTC 2008
Anthony wrote:
> Something else I think is worth pointing out: "the raw log data is not
> made public, and is normally discarded after about two weeks." has
> changed to "The raw log data is kept indefinitely, but is not made
> public."
>
> I get the impression that this isn't a change in policy, so much as a
> change in wording.
It's an update to reflect actual practice.
> But then, it does seem to contradict the Data
> Retention Policy
> (http://wikimediafoundation.org/wiki/Resolution:Data_Retention_Policy).
The data retention policy is, shall we say, super vague. It makes no
specific provisions, but reiterates our general preference for not keeping
lots of private data around for a long time.
CheckUser data -- the scariest for most people as it records IP, proxy
forward records, and user-agent for ALL EDITS -- is kept for 90 days in
the database.
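For illustration only, a 90-day retention window like the one described for CheckUser data could be sketched as below. This is not the actual MediaWiki or CheckUser code: the record layout, the `timestamp` field name, and the `prune` helper are all hypothetical, and the real data lives in a database table rather than a Python list.

```python
from datetime import datetime, timedelta, timezone

# 90 days, matching the retention period mentioned above.
RETENTION = timedelta(days=90)

def prune(records, now):
    """Return only the records newer than the retention cutoff.

    `records` is a list of dicts with a hypothetical "timestamp" key.
    """
    cutoff = now - RETENTION
    return [r for r in records if r["timestamp"] >= cutoff]

now = datetime(2008, 6, 21, tzinfo=timezone.utc)
records = [
    {"id": 1, "timestamp": now - timedelta(days=10)},   # inside the window
    {"id": 2, "timestamp": now - timedelta(days=100)},  # past the window
]
kept = prune(records, now)
```

In a real deployment this would more likely be a periodic delete against the table, keyed on the row timestamp.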
We currently keep 1:1000-sampled *HTTP proxy logs* indefinitely; every
once in a while the whole bunch gets scanned over to batch out some
ad-hoc statistics. At some point we expect to normalize that process, at
which point we'll have no need to keep around the old log samples.
(These would be basically worthless for any kind of track-down-a-user
lookup since there's a 99.9% chance that whatever you're looking for
isn't in it, and even if it is you don't have enough information in the
log to tell.)
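To make the 1:1000 arithmetic concrete, here is a minimal sketch. It assumes each request is kept independently at random with probability 1/1000; the real sampler may well work differently (for example, by keeping every 1000th request), and the function names are made up for this example.

```python
import random

# Keep roughly one request in every 1000 (the 1:1000 figure above).
SAMPLE_RATE = 1000

def keep_request(rng):
    """Return True for about 1 in SAMPLE_RATE calls."""
    return rng.randrange(SAMPLE_RATE) == 0

def miss_probability(n_requests):
    """Chance that none of n_requests independent requests is in the sample."""
    return (1 - 1 / SAMPLE_RATE) ** n_requests

# Simulate a million requests and count how many would be logged.
rng = random.Random(0)
kept_count = sum(keep_request(rng) for _ in range(1_000_000))

single_miss = miss_probability(1)      # one request: the 99.9% case
hundred_miss = miss_probability(100)   # a user who made 100 requests
```

Under that assumption a single request is missing from the sample 99.9% of the time, and even someone who made 100 requests has roughly a 90% chance of not appearing in it at all.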
Various debug logs are also kept indefinitely, and from time to time old
ones get thrown out.
-- brion