In a nutshell: We are asking for your input to help us learn how to release the historical edit data of Wikimedia projects in a more efficient way. Please provide your feedback via https://docs.google.com/forms/d/e/1FAIpQLScc15eSeFrVvAh_ydpX_1p0v6-WSx2qe3Ec... by 2019-09-03.
Dear researchers,
The Analytics team at the Wikimedia Foundation [1] has been building a data lake [2] for Wikimedia edits [3] to make research and analysis of Wikimedia's edit data more efficient. This data is a history of activity on Wikimedia projects, as complete and research-friendly as possible. Each edit carries its context, such as whether it was reverted, in the same line as the edit itself, so you can focus on what you want to find out instead of writing code to wrestle the data. Each line of the released data will include the following and more (see the full specification [3a], [3b], [3c]):
* editor edit count, groups, blocks, bot status, and name, both current and historical (at the time of the edit)
* seconds since this editor's last edit
* page context, current and historical (namespace, seconds since last revision, etc.)
* seconds to identity revert or deletion, if applicable
* revision tags (mobile edit, ve edit, etc.)
The first instance of this data will be released in the coming months. To make this release as useful as possible for you, the users of the data, the team needs to hear your thoughts on how to slice and dice the data at publishing time. You can provide your input at https://docs.google.com/forms/d/e/1FAIpQLScc15eSeFrVvAh_ydpX_1p0v6-WSx2qe3Ec...
Please provide your input to this survey no later than 2019-09-03.
Best, Leila
[1] https://wikitech.wikimedia.org/wiki/Analytics
[2] https://en.wikipedia.org/wiki/Data_lake
[3] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits
[3a] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_hist...
[3b] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_user...
[3c] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_page...
--
Leila Zia
Principal Research Scientist, Head of Research
Wikimedia Foundation
Hello,
I've just tried to use the form and got resource unavailable.
RhinosF1 Volunteer Miraheze
On Tue, 20 Aug 2019 at 22:07, Leila Zia leila@wikimedia.org wrote:
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
I'm sorry. This is fixed now. Try again: https://docs.google.com/forms/d/e/1FAIpQLScc15eSeFrVvAh_ydpX_1p0v6-WSx2qe3Ec...
On Tue, Aug 20, 2019 at 2:22 PM RhinosF1 rhinosf1@gmail.com wrote:
A friendly reminder that you have until 2019-09-03 to let us know your wishes and constraints for the new data release discussed in this thread. (Thanks to those of you who have already responded.)
On Tue, Aug 20, 2019 at 2:06 PM Leila Zia leila@wikimedia.org wrote: