Abbey asked a question during today’s research group that I wanted to relay to the wikimetrics devs.
Would it be possible to allow people to use Wikimetrics’ project-level reports to generate cohorts, in other words, obtain lists of user_ids or user_names matching specific criteria, for example:
• registered users on a given date or period
• newly active editors on a given date
• unique editors on a given date or period
UX research as well as LCA would die to have such a functionality (the fallback is to do this via Quarry or post a request to Research & Data or someone in Grantmaking).
Dario
Dario:
There are no technical blockers to generating that data. Product-wise, however, it does not seem like a fit, as Wikimetrics' purpose is to produce data and run metrics. All Wikimetrics computations are pre-canned.
It seems to me the use cases you passed along are better served by a tool that can freely query the db, like Quarry.
Thanks,
Nuria
On Thu, Oct 2, 2014 at 4:00 PM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
> [...]
Wikimetrics mailing list Wikimetrics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikimetrics
Nuria,
thanks for the feedback – for context, the reason why I am asking is that:
• I was under the impression that this data was already being stored somewhere in temporary tables by Wikimetrics when generating project-level reports
• this is quite similar to one of the earliest feature requests that we had for UserMetrics (the predecessor of Wikimetrics) under the notion of “generated cohorts”:
1) take the non-aggregate output of a report (say all registered users or new active editors from foowiki in a given time period)
2) save the output as a cohort
3) re-run that cohort through a different metric
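A rough sketch of those three steps, for illustration only. This is not Wikimetrics code: `Edit`, `Cohort`, `active_editors` and `edits_per_user` are hypothetical names, and the five-edit threshold is an arbitrary assumption.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Edit:
    """One edit event; a hypothetical stand-in for a revision row."""
    user_id: int
    day: date


@dataclass(frozen=True)
class Cohort:
    """A named, saved list of user ids (hypothetical structure)."""
    name: str
    user_ids: tuple


def active_editors(edits, start, end, min_edits=5):
    """Step 1: non-aggregate report output, i.e. the matching user ids."""
    counts = Counter(e.user_id for e in edits if start <= e.day <= end)
    return sorted(uid for uid, n in counts.items() if n >= min_edits)


def edits_per_user(cohort, edits):
    """Step 3: a different metric, computed only for the saved cohort."""
    members = set(cohort.user_ids)
    counts = Counter(e.user_id for e in edits if e.user_id in members)
    return {uid: counts.get(uid, 0) for uid in cohort.user_ids}


# Toy data: user 1 edits five times on Oct 1 and once on Oct 3; user 2 twice.
edits = (
    [Edit(1, date(2014, 10, 1))] * 5
    + [Edit(2, date(2014, 10, 1))] * 2
    + [Edit(1, date(2014, 10, 3))]
)

ids = active_editors(edits, date(2014, 10, 1), date(2014, 10, 2))
cohort = Cohort("foowiki-active-2014-10", tuple(ids))  # Step 2: save as cohort
result = edits_per_user(cohort, edits)
```

The point of the sketch is only that step 1 has to return user ids rather than an aggregate count, so that the result can be stored and fed to another metric.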
Using Quarry still relies on the end user’s ability to understand how to turn a research question into a query. Having a curated query library is a good step in that direction, but that still requires some basic knowledge of SQL.
D
On Oct 2, 2014, at 4:16 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
> [...]
> I was under the impression that this data was already being stored somewhere in temporary tables by Wikimetrics when generating project-level reports
The data is not being stored anywhere at this time; we just query a table for users that match a given criterion.
> this is quite similar to one of the earliest feature requests that we had for UserMetrics (the predecessor of Wikimetrics) under the notion of “generated cohorts”:
> - take the non-aggregate output of a report (say all registered users or new active editors from foowiki in a given time period)
> - save the output as a cohort
> - re-run that cohort through a different metric
I see. At this time in Wikimetrics we have no way to store and reuse intermediate results of metrics in other metrics, which, if you notice, is a performance concern as we recompute stuff. We have talked about doing this in the past (we called it chaining metrics) and we have some backlog items to this effect. You can talk to Kevin about this use case (which is slightly different from the original one you described) and he can add it to the backlog.
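To make the recomputation concern concrete, here is a minimal sketch of storing an intermediate result once and reusing it in a second metric. It is an assumption-laden illustration, not how Wikimetrics is implemented: `ResultCache`, `registered_users` and the in-memory `fake_table` are all hypothetical.

```python
class ResultCache:
    """In-memory store for intermediate metric results (hypothetical)."""

    def __init__(self):
        self._store = {}
        self.computations = 0  # how many times a query was actually run

    def get_or_compute(self, key, compute):
        # Run the expensive computation only the first time a key is seen.
        if key not in self._store:
            self._store[key] = compute()
            self.computations += 1
        return self._store[key]


def registered_users(project, start, end):
    """Stand-in for the 'users matching a criterion' query."""
    fake_table = {
        "foowiki": [(101, "20141001"), (102, "20141002"), (103, "20141105")]
    }
    return [uid for uid, day in fake_table[project] if start <= day <= end]


cache = ResultCache()
key = ("registered_users", "foowiki", "20141001", "20141031")


def upstream():
    """Return the shared intermediate result, computing it at most once."""
    return cache.get_or_compute(key, lambda: registered_users(*key[1:]))


cohort_size = len(upstream())  # first metric: triggers the query
highest_id = max(upstream())   # second metric: reuses the stored result
```

Keying the cache on the metric name plus its parameters is one simple choice; a persistent store would be needed for results to survive between report runs.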
On Thu, Oct 2, 2014 at 5:43 PM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
> [...]
I added the story to the backlog so we don't lose it: https://bugzilla.wikimedia.org/show_bug.cgi?id=71614
On Thu, Oct 2, 2014 at 5:56 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
> [...]
On Thu, Oct 2, 2014 at 5:43 PM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
> Using Quarry still relies on the end user’s ability to understand how to turn a research question into a query. Having a curated query library is a good step in that direction, but that still requires some basic knowledge of SQL.
I don't think it does, actually. Can't we just create a canned Quarry query that generates a list of all accounts created in the past 24 hours in an output format that is Wikimetrics-ready, and then direct people to the persistent URL? The user pastes that verbatim into their own "New query" box and downloads the resulting data. Bada-bing!
- J
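A hedged sketch of what such a canned query plus export could look like, run here against a toy in-memory SQLite table so it is self-contained. The `user_name` and `user_registration` columns follow the MediaWiki `user` table; the "Wikimetrics-ready" layout (one `user_name,project` pair per line) and the helper `new_accounts_csv` are assumptions for illustration, and on Quarry the cutoff would be computed in SQL rather than passed in.

```python
import csv
import io
import sqlite3

# Canned query: accounts registered at or after a cutoff timestamp.
# MediaWiki stores timestamps as YYYYMMDDHHMMSS strings, so string
# comparison orders them correctly.
CANNED_QUERY = """
SELECT user_name
FROM user
WHERE user_registration >= ?
ORDER BY user_registration
"""


def new_accounts_csv(conn, project, cutoff):
    """Run the canned query and format rows as 'user_name,project' lines."""
    rows = conn.execute(CANNED_QUERY, (cutoff,)).fetchall()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for (user_name,) in rows:
        writer.writerow([user_name, project])
    return out.getvalue()


# Toy stand-in for the wiki database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user ("
    "user_id INTEGER PRIMARY KEY, user_name TEXT, user_registration TEXT)"
)
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, "OldTimer", "20140101000000"),
        (2, "Newbie A", "20141006120000"),
        (3, "Newbie B", "20141006180000"),
    ],
)

cutoff = "20141006000000"  # "past 24 hours", fixed here for reproducibility
output = new_accounts_csv(conn, "foowiki", cutoff)
```

The end user would never need to edit the SQL; only the project name and the time window change between runs.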
Research-Internal mailing list Research-Internal@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/research-internal
+1, I'm a fan of bada-bing, but I'm not sure how it fits into the bigger picture.
On Mon, Oct 6, 2014 at 11:37 AM, Jonathan Morgan <jmorgan@wikimedia.org> wrote:
> [...]
-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org
Yeah, relative to the other stuff we need for Wikimetrics, I think this is lower priority. Jonathan, if you hear more and more requests for it, you can prioritize this sooner... This would be something for Marcel to implement once he's up to speed on coding Wikimetrics.
On Tue, Oct 7, 2014 at 11:25 AM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
> [...]