On Fri, Oct 11, 2013 at 10:42 AM, Dario Taraborelli <
dtaraborelli(a)wikimedia.org> wrote:
I think we should set the right expectations about working with large
cohorts.
I would not call 700 editors a large cohort :) This should work just fine.
We don't know yet whether this was related to Labs maintenance, and I
asked Steven to share the cohort with us to verify that it works.
Quoting Dan's response from September 17:
I just wanted to correct one small misunderstanding.
Running large cohorts does *not* work in wikimetrics at this time for two
reasons:
1. You'll have a problem uploading them as Dario mentioned (because it
validates each user individually against the database, as Dario guessed).
The best solution for this is to create a temp table of all the users we
are trying to upload and verify them in one query. This would be very fast
and not too hard to implement.
Uploading a cohort like this should work, but it's a blocking operation,
which is not very user friendly. Mingle card 818 addresses this issue.
2. A large cohort will not fit in the "IN" clause of a SQL query. This is
a known limitation and we have to fix it by creating a temporary table from
the cohort. We can then join to the temp table for any metrics. The
reason I've delayed this is because the same mechanism could be used to
implement dynamic cohorts, boolean cohort combinations, and project
level cohorts. We should prioritize these technically related features and
then I can come up with a plan to do the minimally viable thing without
shooting ourselves in the foot.
I did some calculations and it seems that this is only an issue with
cohorts larger than 200k editors.
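For reference, the temp-table idea from points 1 and 2 above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the wikimetrics implementation: it uses Python's built-in sqlite3 module so it can run on its own, and the table names (wiki_user, revision, cohort_upload, cohort_ids) and both function names are hypothetical stand-ins.

```python
import sqlite3


def validate_cohort(conn, names):
    """Check all uploaded user names with one join instead of one query per user.

    Assumes a hypothetical wiki_user(user_id, user_name) table.
    Returns (valid, invalid): a dict of name -> user_id, and a sorted list
    of names that were not found.
    """
    cur = conn.cursor()
    # Load the uploaded names into a temporary table (hypothetical name).
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS cohort_upload (user_name TEXT PRIMARY KEY)"
    )
    cur.execute("DELETE FROM cohort_upload")
    cur.executemany(
        "INSERT OR IGNORE INTO cohort_upload (user_name) VALUES (?)",
        [(name,) for name in names],
    )
    # A single LEFT JOIN verifies every name; a NULL user_id means "not found".
    cur.execute(
        """
        SELECT c.user_name, u.user_id
        FROM cohort_upload AS c
        LEFT JOIN wiki_user AS u ON u.user_name = c.user_name
        """
    )
    rows = cur.fetchall()
    valid = {name: uid for name, uid in rows if uid is not None}
    invalid = sorted(name for name, uid in rows if uid is None)
    return valid, invalid


def edits_per_user(conn, user_ids):
    """Compute a simple metric by joining to a temp table, not a huge IN (...) list.

    Assumes a hypothetical revision(rev_id, rev_user) table.
    Returns a dict of user_id -> number of revisions.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS cohort_ids (user_id INTEGER PRIMARY KEY)"
    )
    cur.execute("DELETE FROM cohort_ids")
    cur.executemany(
        "INSERT OR IGNORE INTO cohort_ids (user_id) VALUES (?)",
        [(uid,) for uid in user_ids],
    )
    # The cohort is now a table, so its size no longer affects the query text.
    cur.execute(
        """
        SELECT c.user_id, COUNT(r.rev_id)
        FROM cohort_ids AS c
        LEFT JOIN revision AS r ON r.rev_user = c.user_id
        GROUP BY c.user_id
        """
    )
    return dict(cur.fetchall())
```

The same temp table could in principle back other cohort definitions too, since any metric only needs something to join against.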
Hope that makes sense. As for the rest, I leave prioritization up to you
guys except where it touches on technical issues, as above.
Dario
On Oct 11, 2013, at 9:50 AM, Toby Negrin <tnegrin(a)wikimedia.org> wrote:
Hi Steven -- is this working now?
Cheers,
-Toby
On Thu, Oct 10, 2013 at 6:36 PM, Steven Walling <swalling(a)wikimedia.org> wrote:
On Thu, Oct 10, 2013 at 6:32 PM, Diederik van Liere <
dvanliere(a)wikimedia.org> wrote:
Hi Steven,
It could be related to general Labs maintenance that is happening right
now but to be sure can you email me your cohort so we can test it ourselves
tomorrow?
D
I'll try it again when Labs is not undergoing maintenance. It's not
critical.
--
Steven Walling,
Product Manager
https://wikimediafoundation.org/
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics