This month, our research showcase <https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#March_2016> hosts Andrei Rizoiu (Australian National University) to talk about his work <http://cm.cecs.anu.edu.au/post/wikiprivacy/> on *how private traits of Wikipedia editors can be exposed from public data* (such as edit histories) using off-the-shelf machine learning techniques (abstract below).
If you're interested in learning what the combination of machine learning and public data means for privacy and surveillance, come and join us this *Wednesday, March 16* at *1pm Pacific Time*.
The event will be recorded and publicly streamed <https://www.youtube.com/watch?v=Xle0oOFCNnk>. As usual, we will be hosting the conversation with the speaker and Q&A on the #wikimedia-research channel on IRC.
Looking forward to seeing you there,
Dario
Evolution of Privacy Loss in Wikipedia
The cumulative effect of collective online participation has an important and adverse impact on individual privacy. As an online system evolves over time, new digital traces of individual behavior may uncover previously hidden statistical links between an individual’s past actions and her private traits. To quantify this effect, we analyze the evolution of individual privacy loss by studying the edit history of Wikipedia over 13 years, including more than 117,523 different users performing 188,805,088 edits. We trace each Wikipedia’s contributor using apparently harmless features, such as the number of edits performed on predefined broad categories in a given time period (e.g. Mathematics, Culture or Nature). We show that even at this unspecific level of behavior description, it is possible to use off-the-shelf machine learning algorithms to uncover usually undisclosed personal traits, such as gender, religion or education. We provide empirical evidence that the prediction accuracy for almost all private traits consistently improves over time. Surprisingly, the prediction performance for users who stopped editing after a given time still improves. The activities performed by new users seem to have contributed more to this effect than additional activities from existing (but still active) users. Insights from this work should help users, system designers, and policy makers understand and make long-term design choices in online content creation systems.
*Dario Taraborelli*
Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter <http://twitter.com/readermeter>
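For readers who want a concrete picture of the feature construction the abstract describes (per-user edit counts over broad categories within a time period), here is a minimal sketch. It is not the authors' code: the classifier is a simple nearest-centroid rule swapped in for illustration, since the thread does not say which algorithms were used, and every name (`build_features`, `fit_centroids`, `predict`), the record layout, the toy edit log, and the labels `"x"` and `"y"` are hypothetical.

```python
from collections import Counter, defaultdict

# Broad categories named as examples in the abstract.
CATEGORIES = ["Mathematics", "Culture", "Nature"]


def build_features(edits, until_year, categories=CATEGORIES):
    """Count each user's edits per category, using only edits up to until_year.

    `edits` is an iterable of (user, category, year) records (assumed layout).
    """
    counts = defaultdict(Counter)
    for user, category, year in edits:
        if year <= until_year:
            counts[user][category] += 1
    return {user: [c[cat] for cat in categories] for user, c in counts.items()}


def normalize(vector):
    """Scale counts to proportions so heavy and light editors are comparable."""
    total = sum(vector)
    return [v / total for v in vector] if total else list(vector)


def fit_centroids(features, labels):
    """Average the normalized vectors of all users that share a known label."""
    grouped = defaultdict(list)
    for user, label in labels.items():
        grouped[label].append(normalize(features[user]))
    return {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    v = normalize(vector)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(v, centroids[label])),
    )


# Synthetic toy edit log, invented purely for illustration.
edits = (
    [("u1", "Mathematics", 2005)] * 3 + [("u1", "Nature", 2006)]
    + [("u2", "Mathematics", 2006)] * 2
    + [("u3", "Culture", 2005)] * 3
    + [("u4", "Culture", 2006)] * 2 + [("u4", "Nature", 2006)]
    + [("u5", "Mathematics", 2006)] * 4 + [("u5", "Culture", 2006)]
)
# Hypothetical trait values disclosed by some users; u5 has disclosed nothing.
labels = {"u1": "x", "u2": "x", "u3": "y", "u4": "y"}

features = build_features(edits, until_year=2006)
centroids = fit_centroids(features, labels)
guess_for_u5 = predict(centroids, features["u5"])
```

Rebuilding `features` with a later `until_year` and refitting is one way to mimic the "over time" analysis: the labeled pool grows as new users arrive, which can change the guess for a user who has long stopped editing.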
Reminder, this showcase is starting in 5 minutes. See the stream here: https://www.youtube.com/watch?v=Xle0oOFCNnk
Join us on Freenode at #wikimedia-research <http://webchat.freenode.net/?channels=wikimedia-research> to ask Andrei questions.
-Aaron
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Dario and Aaron, thanks for letting us know about this. Is the research available in writing for people who don't want to sit through the video?
Sarah
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines New messages to: Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
Sarah – yes, see http://cm.cecs.anu.edu.au/post/wikiprivacy/