I'm a big fan of the GDPR and why it had to be created. (I'm doing a lot of the bureaucratic work on the tech side at the day job and am getting very used to thinking of ways something could constitute Personally Identifying Information.)
But I'm wondering how we'll approach it for the Wikimedia sites. Not just the log data - but the content.
We already have problems with Right To Be Forgotten, and well-cited content being removed from the search engines.
What do we have in place to deal with this when - not if - we get GDPR requests to remove information about a person from the site?
I don't mean just the letter of the law, in the EU or the US - I mean also, how we can handle this *right*. Because there are multiple competing legitimate interests here, and the editing communities tend to take a lot more care than they're strictly required to by law, because we are here to get things right. (This is why our DMCA numbers are ridiculously low for a top 10 site, for example.)
Is anyone keeping track of what the communities are doing, as well as WMF itself?
- d.
28.05.2018 00:32 David Gerard wrote:
But I'm wondering how we'll approach it for the Wikimedia sites. Not just the log data - but the content.
We already have problems with Right To Be Forgotten, and well-cited content being removed from the search engines.
What do we have in place to deal with this when - not if - we get GDPR requests to remove information about a person from the site?
I don't mean just the letter of the law, in the EU or the US - I mean also, how we can handle this *right*.
/---/
Is anyone keeping track of what the communities are doing, as well as WMF itself?
Just a note based on the discussion we had in Wikimedia Estonia: a lot of smaller Wikipedia editions have quite a low threshold for notability, and some of the articles end up almost doxing some hardly noteworthy person. I am not sure how common this is, but even before the GDPR I encouraged some people to write on discussion pages that the article contains a lot of rather unimportant information, and that it should be edited for encyclopedic clarity and the unnecessary parts removed. However, as we know, everything still remains in the article history and can technically be found, so from the GDPR point of view this may not be enough to improve data protection to the degree the new regulation expects.
Also, the GDPR itself contains exemptions for a) journalism and b) historical research, or "freedom of expression in every democratic society" and "archiving purposes in the public interest, scientific or historical research purposes". These should generally apply to Wikipedia and most Wikimedia projects; however, it is not clear what the right policy is for data in revision history and discussion pages, or what notability criteria should be applied. But this was already the case before the GDPR, and most complaints come from people who actually are noteworthy and probably fit under either the journalism or the historical research category, or even both. So there is nothing more to do than note that the notability criteria may need more clarity and a higher threshold, and start a discussion to improve them.
Any more ideas?
Märt Põder // board member at ee.wikimedia.org | ee.okfn.org -- twitter.com/trtram | facebook.com/boamaod
On 27 May 2018 at 22:32, David Gerard dgerard@gmail.com wrote:
I'm a big fan of the GDPR and why it had to be created. (I'm doing a lot of the bureaucratic work on the tech side at the day job and am getting very used to thinking of ways something could constitute Personally Identifying Information.)
But I'm wondering how we'll approach it for the Wikimedia sites. Not just the log data - but the content.
We already have problems with Right To Be Forgotten, and well-cited content being removed from the search engines.
What do we have in place to deal with this when - not if - we get GDPR requests to remove information about a person from the site?
Wave around article 85 a lot. The content is a fairly minor problem. Trying to clean up after users who've inserted their personal information into talk pages presents more of an issue.
I don't mean just the letter of the law, in the EU or the US - I mean also, how we can handle this *right*. Because there are multiple competing legitimate interests here, and the editing communities tend to take a lot more care than they're strictly required to by law, because we are here to get things right. (This is why our DMCA numbers are ridiculously low for a top 10 site, for example.)
At the moment all we can really do is wait and see how it develops. This is why you have sites trying to block the EU even if they are not aware of any issues. Fines of up to 4% of turnover and no case law?
I'm not seeing a rush at OTRS yet but that is probably going to be ground zero on working out what to do with this stuff.
I'm not even aware that we'd be subject to GDPR.
We already allow removal of personal information in some cases (outing by others, accidentally revealing one's IP address, etc.). If we were going to allow it in any case that doesn't happen today, that would need to be agreed to by the community, in which case the best thing to do would be an on-wiki RfC.
Todd
On 28 May 2018 at 18:06, Todd Allen toddmallen@gmail.com wrote:
I'm not even aware that we'd be subject to GDPR.
We are. It's really broad. We are dealing with the personal information of people in the European Union while engaging in economic activity.
If we were going to allow it in any case that doesn't happen today, that would need to be agreed to by the community, in which case the best thing to do would be an on-wiki RfC.
RfCs don't pay €20 million fines. It's hard to get into specifics without hitting [[WP:BEANS]] territory, so for now wait-and-see is probably the only viable option.
On Sun, May 27, 2018 at 11:32 PM, David Gerard dgerard@gmail.com wrote:
I'm a big fan of the GDPR and why it had to be created. (I'm doing a lot of the bureaucratic work on the tech side at the day job and am getting very used to thinking of ways something could constitute Personally Identifying Information.)
But I'm wondering how we'll approach it for the Wikimedia sites. Not just the log data - but the content.
We already have problems with Right To Be Forgotten, and well-cited content being removed from the search engines.
What do we have in place to deal with this when - not if - we get GDPR requests to remove information about a person from the site?
I don't mean just the letter of the law, in the EU or the US - I mean also, how we can handle this *right*. Because there are multiple competing legitimate interests here, and the editing communities tend to take a lot more care than they're strictly required to by law, because we are here to get things right. (This is why our DMCA numbers are ridiculously low for a top 10 site, for example.)
In general Wikipedia falls under the journalistic exemption ("publication of ideas, information or opinions"), which means many rules from the GDPR are dropped. Mostly what remains is just that a weighing has to be done between the subject's privacy interest and Wikipedia's own reporting interest. Even the possibility to object to that decision is dropped in this case, so if, as I assume will happen, such a request is taken as a reason to re-evaluate that decision, we are already going beyond the minimum of what the law requires.