Today we announce a new snapshot (named *2017-06*) of the mediawiki history data [1]. It includes these awesome new fields:
*event_user_revision_count*: 'Cumulative revision count per user for the current event_user_id (only available in revision-create events so far)'
*page_revision_count*: 'In revision/page events: Cumulative revision count per page for the current page_id (only available in revision-create events so far)'
The *event_user_revision_count* field is useful as a close estimate of user_editcount, but it does not include Flow talk page edits. We've also added event_user_seconds_to_previous_revision and page_seconds_to_previous_revision, but those are not computed yet.
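For anyone wondering what "cumulative revision count" means in practice, here is a minimal sketch on toy data. It is not the actual job that builds the dataset. It assumes events arrive sorted by time, that the count includes the current revision, and that revision-create events can be recognised through `event_entity` and `event_type` columns; those two names, the helper `add_cumulative_counts` and the sample rows are illustrative assumptions rather than anything confirmed in this announcement.

```python
from collections import defaultdict


def add_cumulative_counts(events):
    """Annotate time-ordered events with running per-user and per-page counts.

    Only revision-create events get a value; every other event gets None,
    mirroring "only available in revision-create events so far".
    """
    user_counts = defaultdict(int)  # event_user_id -> revisions seen so far
    page_counts = defaultdict(int)  # page_id -> revisions seen so far
    annotated = []
    for event in events:
        row = dict(event)  # copy so the input is left untouched
        is_revision_create = (
            event["event_entity"] == "revision" and event["event_type"] == "create"
        )
        if is_revision_create:
            user_counts[event["event_user_id"]] += 1
            page_counts[event["page_id"]] += 1
            row["event_user_revision_count"] = user_counts[event["event_user_id"]]
            row["page_revision_count"] = page_counts[event["page_id"]]
        else:
            row["event_user_revision_count"] = None
            row["page_revision_count"] = None
        annotated.append(row)
    return annotated


# Hypothetical sample: two users editing two pages, plus one page move.
toy_events = [
    {"event_entity": "revision", "event_type": "create", "event_user_id": 1, "page_id": 10},
    {"event_entity": "revision", "event_type": "create", "event_user_id": 2, "page_id": 10},
    {"event_entity": "page", "event_type": "move", "event_user_id": 1, "page_id": 10},
    {"event_entity": "revision", "event_type": "create", "event_user_id": 1, "page_id": 11},
]

for row in add_cumulative_counts(toy_events):
    print(row["event_user_id"], row["page_id"],
          row["event_user_revision_count"], row["page_revision_count"])
```

In this toy run the last event is user 1's second revision overall but the first revision of page 11, so the two counters diverge, which is the difference between the two new fields.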
The mediawiki_history dataset is updated every month, but we thought we'd let you know about this one since it has new goodies. It's all thanks to Joseph who did everything but announce this wonderful work and then had to rush away to welcome his daughter into the world. Hi Joseph! Stop reading work email! :D
[1] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_hist...
A further clarification: this snapshot of data is not yet public (meaning available to the outside world, not just WMF/NDA holders). Our team is working towards making this data available next year in labs, in the same fashion that data is now available on the labs replicas.
Thanks,
Nuria
On Wed, Jul 12, 2017 at 9:34 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
Today we announce a new snapshot (named *2017-06*) of the mediawiki history data [1]. It includes these awesome new fields:
*event_user_revision_count*: 'Cumulative revision count per user for the current event_user_id (only available in revision-create events so far)'
*page_revision_count*: 'In revision/page events: Cumulative revision count per page for the current page_id (only available in revision-create events so far)'
The *event_user_revision_count* field is useful as a close estimate to user_editcount, but it does not include Flow talk page edits. We've also added event_user_seconds_to_previous_revision and page_seconds_to_previous_revision, but those are not being computed right now.
The mediawiki_history dataset is updated every month, but we thought we'd let you know about this one since it has new goodies. It's all thanks to Joseph who did everything but announce this wonderful work and then had to rush away to welcome his daughter into the world. Hi Joseph! Stop reading work email! :D
[1] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_history
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
On Wed, Jul 12, 2017 at 12:16 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Further clarification that this snapshot of data is not yet public (meaning available to the outside world, not just WMF/NDA holders).
Thanks for clarifying this, and for the work you and your team have put into this.
Our team is working towards making this data available next year in labs in the same fashion that data is now available on the labs replicas.
Can you specify what you mean by "next year"? I can think of fiscal, calendar, etc. :)
A big thumbs up for making data public. The wiki-research-l list and its audience will be happy.
Best, Leila
Can you specify what you mean by "next year"? I can think of fiscal, calendar, etc. :)
We are aiming for this data to be public in its current analytics-friendly form by the end of 2017 or the beginning of 2018.
On Wed, Jul 12, 2017 at 12:22 PM, Leila Zia leila@wikimedia.org wrote:
On Wed, Jul 12, 2017 at 12:25 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Can you specify what you mean by "next year"? I can think of fiscal, calendar, etc. :)
We are aiming for this data to be public in its current analytics-friendly form by the end of 2017 or the beginning of 2018.
Thank you!