Given that information, do you have any idea if we are in danger of
overloading EventLogging?
Logging broad events (such as a page load) 1 to 1 might run into problems, as
our traffic is high enough that even events sampled at 1/1000 still arrive in
very large amounts.
Some numbers (oversimplifying and rounding):
We have about 200 million visits per day for the enwiki mobile site. This
means about 2300 pageviews per sec. If we send 1 load event per
pageview, EL will (sadly) die, most likely.
If we assume EL handles up to 350 events per second (and we are now at 270
events per sec), I would think that sending 10 events per sec in your case
would be pretty safe. That would mean sampling about 1/200 of pageviews for
the load event. This seems like a good upper bound.
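For anyone who wants to redo the arithmetic with different inputs, here is a rough sketch in Python. It only uses the rounded figures quoted in this message; the function and variable names are made up for illustration and are not part of EventLogging.

```python
# Back-of-the-envelope check of the rounded numbers in this message.
SECONDS_PER_DAY = 24 * 60 * 60

daily_pageviews = 200_000_000  # rounded enwiki mobile figure quoted above
el_capacity = 350              # events/sec EL is assumed to handle
el_current = 270               # events/sec EL is receiving now


def sampled_rate(pageviews_per_day, sample_one_in):
    """Events per second when 1 in `sample_one_in` pageviews logs an event."""
    return pageviews_per_day / SECONDS_PER_DAY / sample_one_in


raw_rate = sampled_rate(daily_pageviews, 1)       # unsampled load events
headroom = el_capacity - el_current               # spare events/sec
rate_200 = sampled_rate(daily_pageviews, 200)     # 1/200 sampling
rate_1000 = sampled_rate(daily_pageviews, 1000)   # 1/1000 sampling

print(f"unsampled: {raw_rate:.0f} events/sec")
print(f"headroom:  {headroom} events/sec")
print(f"1/200:     {rate_200:.1f} events/sec")
print(f"1/1000:    {rate_1000:.1f} events/sec")
```

The unsampled rate is far above the assumed headroom, while 1/200 lands close to the 10 events per sec mentioned above. Keep in mind the 200 million figure covers the whole mobile site rather than only the WikiGrok articles, so this is an upper estimate.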
Now, since there are no constraints on how long you keep your experiment
running, you can try a lower sampling ratio, say 1/1000, and keep the
experiment running for longer.
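To make the trade-off between sampling ratio and experiment length concrete, here is a small sketch in Python. The target of one million events is purely hypothetical (pick whatever your analysis needs), and the 200 million pageviews per day is the same rounded, oversimplified figure as above.

```python
def days_to_collect(target_events, pageviews_per_day, sample_one_in):
    """Days needed to gather `target_events` when 1 in `sample_one_in`
    pageviews logs an event (assumes traffic is constant day to day)."""
    events_per_day = pageviews_per_day / sample_one_in
    return target_events / events_per_day


daily_pageviews = 200_000_000  # rounded figure from this message
target_events = 1_000_000      # hypothetical sample size, not from the thread

for ratio in (200, 1000):
    days = days_to_collect(target_events, daily_pageviews, ratio)
    print(f"1/{ratio}: about {days:.1f} days to reach {target_events} events")
```

Under these assumptions a 1/1000 ratio needs five times as long as 1/200 to collect the same number of events, while sending a fifth of the load to EL.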
On Tue, Jan 6, 2015 at 5:50 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:
The highest volume events we are going to log will be:
1. For each of the 166,000 articles, one event when the page loads
2. For each of the 166,000 articles, one event when the WikiGrok widget
enters the viewport (about half as often as #1)
These will be active for all mobile users, logged in and logged out,
including many high pageview articles.
Given that information, do you have any idea if we are in danger of
overloading EventLogging? If so, do you have recommendations on sampling?
So far, everyone has said not to worry about it, but it would be good to
get a sanity check for this test specifically.
Kaldari
On Tue, Jan 6, 2015 at 4:57 PM, Nuria Ruiz <nuria(a)wikimedia.org> wrote:
(cc-ing mobile-tech)
Since we do not know the details of how WikiGrok is used or its request
throughput, we cannot estimate sampling ourselves. I imagine WikiGrok has
already been deployed to a number of users; from that usage the mobile
team could estimate the total throughput expected, and with that throughput
we can recommend sampling ratios.
Thanks for asking about this before deploying!
On Tue, Jan 6, 2015 at 4:55 PM, Ryan Kaldari <rkaldari(a)wikimedia.org>
wrote:
I can elaborate on this after I finish the SWAT deployment.... Gimme
30 minutes or so.
On Tue, Jan 6, 2015 at 4:51 PM, Leila Zia <leila(a)wikimedia.org> wrote:
Hi,
The mobile team is planning to switch WikiGrok on for non-logged in
users next week (2015-01-12). The widget will be on 166,029 article
pages in enwiki. There are two EventLogging schemas that may collect data
heavily and we want to make sure EL can handle the influx of data.
The two schemas collecting data are:
https://meta.wikimedia.org/wiki/Schema:MobileWebWikiGrok
https://meta.wikimedia.org/wiki/Schema:MobileWebWikiGrokError
and the list of pages affected is in:
wgq_page in enwiki.wikigrok_questions.
It would be great if someone from the dev side could let us know whether
we will need sampling.
Thanks,
Leila
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics