Hi,
The mobile team is planning to switch WikiGrok on for non-logged-in users next week (2015-01-12). The widget will be on 166,029 article pages in enwiki. There are two EventLogging schemas that may collect data heavily, and we want to make sure EL can handle the influx of data.
The two schemas collecting data are:
https://meta.wikimedia.org/wiki/Schema:MobileWebWikiGrok
https://meta.wikimedia.org/wiki/Schema:MobileWebWikiGrokError
and the list of pages affected is in:
wgq_page in enwiki.wikigrok_questions.
It would be great if someone from the dev side could let us know whether we will need sampling.
Thanks, Leila
On Tue, Jan 6, 2015 at 4:55 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
I can elaborate on this after I finish the SWAT deployment... Gimme 30 minutes or so.
On Tue, Jan 6, 2015 at 4:57 PM, Nuria Ruiz nuria@wikimedia.org wrote:
(cc-ing mobile-tech)
Since we do not know the details of how WikiGrok is used or its request throughput, we cannot estimate sampling ourselves. I imagine WikiGrok has already been deployed to a number of users; from that usage the mobile team could estimate the total expected throughput, and with that throughput we can recommend sampling ratios.
Thanks for asking about this before deploying!
The highest volume events we are going to log will be:
1. For each of the 166,000 articles, one event when the page loads
2. For each of the 166,000 articles, one event when the WikiGrok widget enters the viewport (about half as often as #1)
These will be active for all mobile users, logged in and logged out, including many high pageview articles.
Given that information, do you have any idea if we are in danger of overloading EventLogging? If so, do you have recommendations on sampling? So far, everyone has said not to worry about it, but it would be good to get a sanity check for this test specifically.
Kaldari
Leila,
It might be worthwhile to merge that article set with the webrequest data we have in order to get a sense for how many pageloads/second to expect.
-Aaron
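A minimal sketch of the kind of estimate suggested here, assuming the request data can be reduced to (timestamp, page id) pairs and the article set to a list of page ids. The function name, the record layout, and the sample values are hypothetical and are not taken from the actual webrequest tables.

```python
from collections import Counter


def requests_per_second(requests, page_ids):
    """Return (mean, peak) requests per second for the given pages.

    requests: iterable of (unix_timestamp_seconds, page_id) pairs
              (a hypothetical, simplified view of a request log).
    page_ids: ids of the pages the widget will be shown on.
    """
    pages = set(page_ids)
    # Count matching requests in each one-second bucket.
    per_second = Counter(int(ts) for ts, pid in requests if pid in pages)
    if not per_second:
        return 0.0, 0
    # Seconds with no matching request still count toward the observed span.
    span = max(per_second) - min(per_second) + 1
    mean = sum(per_second.values()) / span
    peak = max(per_second.values())
    return mean, peak


# Made-up sample: page 3 is not in the article set and is ignored.
sample_requests = [(100, 1), (100, 2), (101, 1), (101, 3), (103, 2)]
mean_rate, peak_rate = requests_per_second(sample_requests, page_ids=[1, 2])
```

Looking at the peak as well as the mean matters here because some of the affected articles may see traffic spikes.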
Agreed. Many of these articles will see spikes in traffic during the test (as the sample includes many celebrity articles), but the historical volume of traffic for the whole sample should give us a decent estimate of the throughput.
I also wouldn’t worry about any events other than MobileWebWikiGrok.page-impression and the events in the error log: all other events require user interaction.
Dario
On Wed, Jan 7, 2015 at 10:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Given that information, do you have any idea if we are in danger of overloading EventLogging?
Logging broad events (such as a page load) at a 1:1 rate might cause problems, as our traffic is high enough that even events logged at 1/1000 still occur in very large numbers.
Some numbers (oversimplifying and rounding):
We have about 200 million visits per day for the enwiki mobile site. This means about 2300 pageviews per sec; if we send 1 load event per pageview, EL will (sadly) die, most likely.
If we assume EL handles up to 350 events per second (and we are now at 270 events per sec), I would think that sending 10 events per sec in your case would be pretty safe. That would mean sampling about 1/200, i.e. one load event for every 200 pageviews. This seems like a good upper bound.
Now, since there are no constraints on how long you keep your experiment running, you can try a lower sampling ratio, say 1/1000, and keep the experiment running for longer.
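The arithmetic above can be reproduced with a short sketch. The figures (200 million visits per day, a target of 10 events per sec) come from this message; the function names are illustrative only, and treating visits as pageviews follows the same simplification the message makes.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def per_second(daily_count):
    """Average per-second rate for a daily total."""
    return daily_count / SECONDS_PER_DAY


def sampling_denominator(raw_rate, target_rate):
    """Smallest N such that sampling 1 in N keeps raw_rate / N <= target_rate."""
    return math.ceil(raw_rate / target_rate)


pageviews_per_sec = per_second(200_000_000)       # about 2315 per second
n = sampling_denominator(pageviews_per_sec, 10)   # 232, i.e. roughly 1/200
```

The exact result is 1 in 232, which is in the same ballpark as the rounded 1/200 quoted above; sampling at 1/200 would yield roughly 11.6 events per sec rather than exactly 10.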
Thanks everyone for chiming in. Your comments were very helpful. :-)
Nuria, I checked the per-second pageview count for the pages WikiGrok will be live on, over 3 hours on 2015-01-07 (as a sample). We're talking about a total of ~170 events per sec for these pages. Of course, major events can affect this number. Added to the current 270 events per sec you mentioned, this would put us over the 350 events per sec limit (if it's a hard limit). What do you think?
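A small sketch of the headroom check described here, using the three figures from this thread (270 current, ~170 new, 350 limit). It assumes the 350 events per sec figure is a hard limit, which this message itself flags as uncertain; the function names are hypothetical.

```python
import math


def total_rate(current_rate, new_rate, sample_every_n=1):
    """Total events per second once the new events are sampled 1 in N."""
    return current_rate + new_rate / sample_every_n


def min_sampling_denominator(current_rate, new_rate, limit):
    """Smallest N for which the sampled total stays at or under the limit."""
    headroom = limit - current_rate
    if headroom <= 0:
        raise ValueError("no headroom left under the limit")
    return max(1, math.ceil(new_rate / headroom))


unsampled = total_rate(270, 170)               # 440.0, above the 350 limit
n = min_sampling_denominator(270, 170, 350)    # 3
sampled = total_rate(270, 170, n)              # about 326.7
```

This only gives the bare minimum under average load. Since major events can push the ~170 figure up, a larger N would leave more margin.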
Leila
On Wed, Jan 7, 2015 at 5:12 PM, Nuria Ruiz nuria@wikimedia.org wrote:
> We're talking about a total of ~170 events per sec for these pages.
This is too high to log at a 1:1 rate; we would need to do 1:10.
On Wed, Jan 7, 2015 at 5:20 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Sorry, I sent it too soon, trying again:
> We're talking about a total of ~170 events per sec for these pages.
This is too high to log at a 1:1 rate; we would need to do 1:10. At this time most events in EL log at a much lower rate. The events logging more than 1 per sec are listed below; as you can see, mobile and Media Viewer make up the majority of the throughput.
My preference would be to stay below 400 events per sec until we have done some perf testing to make sure we can handle it (we might be able to, as we have made many improvements since we set these thresholds).
Schema                               Share    Rate
MobileWebClickTracking               41.35%   114.15/sec
MediaViewer                          21.66%   59.78/sec
MobileWikiAppToCInteraction          12.44%   34.35/sec
PageContentSaveComplete              3.39%    9.35/sec
EchoInteraction                      2.69%    7.42/sec
NavigationTiming                     2.51%    6.93/sec
MultimediaViewerNetworkPerformance   1.84%    5.07/sec
SaveTiming                           1.58%    4.37/sec
Edit                                 1.39%    3.83/sec
PersonalBar                          1.24%    3.43/sec
TimingData                           0.83%    2.28/sec
MobileWebUIClickTracking             0.73%    2.02/sec
Popups                               0.68%    1.87/sec
MobileWikiAppOnboarding              0.62%    1.70/sec
MultimediaViewerDimensions           0.61%    1.68/sec
UniversalLanguageSelector            0.50%    1.37/sec
PageCreation                         0.50%    1.37/sec
MultimediaViewerDuration             0.47%    1.30/sec
MobileWebEditing                     0.45%    1.25/sec
MobileWikiAppSearch                  0.41%    1.13/sec
CentralAuth                          0.40%    1.12/sec
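As a rough illustration of what 1:10 sampling could look like, here is a sketch of a deterministic 1-in-N sampler. It is not the EventLogging or WikiGrok implementation (the real client code is not shown in this thread); the token format and function name are hypothetical, and per-pageview random sampling would be another reasonable option.

```python
import hashlib


def in_sample(token, sample_every_n):
    """Deterministically keep about 1 in N tokens.

    token: a hypothetical per-session or per-pageview identifier.
    The same token always gets the same decision, so all events
    from one session are either logged or dropped together.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % sample_every_n
    return bucket == 0


# Roughly one tenth of 10,000 made-up tokens should be kept at 1:10.
kept = sum(in_sample(f"session-{i}", 10) for i in range(10_000))
```

Hashing a stable identifier rather than drawing a fresh random number keeps related events (page load, widget entering the viewport) in or out of the sample together, which can make later analysis simpler.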
Thanks everyone for the research on this! I'll go ahead and create a card for implementing sampling on the high-throughput WikiGrok events.
Kaldari
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
On Wed, Jan 7, 2015 at 5:33 PM, Leila Zia leila@wikimedia.org wrote:
Thanks, Nuria!
On Thu, Jan 8, 2015 at 12:06 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
After talking with Dario and Leila, we decided that we will sample the page-impression event at 1:1000. We would, however, like to keep the widget-impression event unsampled if possible. That event happens approximately 50% as often as page-impression, so we're probably talking about somewhere around 60 events per second in that case. Would that be acceptable, or should we sample the widget-impression event as well?
Kaldari
On Thu, Jan 8, 2015 at 2:06 PM, Nuria Ruiz nuria@wikimedia.org wrote:
We cannot guarantee that things will still work well with 60 more events per second. As I said, we should schedule some performance testing in this regard, so I filed a task for that purpose: https://phabricator.wikimedia.org/T86244
Note that we already go beyond 300 events per second here and there: http://ibin.co/1nTsNYc1bekd
I recommend sampling those events at 1:10.
Thanks,
Nuria
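A rough sketch of the arithmetic behind this recommendation, assuming the figures quoted earlier in this thread: about 270 events per second today, a working limit of 350, and roughly 60 unsampled widget-impression events per second. The function names and the random 1-in-N sampler are illustrative assumptions, not part of EventLogging's API.

```python
import random

# Figures quoted in this thread; rough estimates, not measurements.
CURRENT_RATE = 270.0     # events/sec EventLogging receives today
WORKING_LIMIT = 350.0    # events/sec ceiling assumed in the thread
WIDGET_RAW_RATE = 60.0   # estimated unsampled widget-impression events/sec


def sampled_rate(raw_rate: float, one_in: int) -> float:
    """Expected events/sec after sampling at 1:one_in."""
    if one_in < 1:
        raise ValueError("one_in must be >= 1")
    return raw_rate / one_in


def should_log(one_in: int, rng: random.Random) -> bool:
    """Hypothetical client-side sampler: True for about 1 in one_in calls."""
    return rng.randrange(one_in) == 0


# Room left under the assumed limit before adding WikiGrok events.
headroom = WORKING_LIMIT - CURRENT_RATE
# Total if widget-impression events are logged unsampled.
unsampled_total = CURRENT_RATE + WIDGET_RAW_RATE
# Total if widget-impression events are sampled at 1:10.
sampled_total = CURRENT_RATE + sampled_rate(WIDGET_RAW_RATE, 10)

print(headroom, unsampled_total, sampled_total)
```

Under these assumptions, logging unsampled would leave only about 20 events per second of margin below the assumed limit, while 1:10 sampling adds roughly 6 events per second.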
On Thu, Jan 8, 2015 at 2:52 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
After further discussion, we've decided to just show WikiGrok to a fraction of users during the test. I currently have it set to show WikiGrok to 10 out of every 62 users, or ~16% (the userToken is a base-62 number). That should give us an estimated 27 hits per second. Does that work for everyone?
Kaldari
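The bucketing described above could look roughly like this. It is only a sketch: it assumes the test group is chosen from the last character of the userToken and that the base-62 alphabet is ordered 0-9, A-Z, a-z. Neither detail is stated in this thread, and the helper names are hypothetical. The ~170 pageviews per second figure is Leila's earlier estimate for the affected pages.

```python
import string

# Assumed base-62 alphabet ordering; the real token encoding may differ.
BASE62_ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
ENABLED_VALUES = 10          # 10 of the 62 possible digit values are in the test
PAGEVIEWS_PER_SEC = 170.0    # estimate quoted earlier in the thread


def in_test_group(user_token: str, enabled_values: int = ENABLED_VALUES) -> bool:
    """Return True if the token's last base-62 digit falls in the enabled range."""
    if not user_token:
        raise ValueError("user_token must be non-empty")
    # str.index raises ValueError for characters outside the alphabet.
    return BASE62_ALPHABET.index(user_token[-1]) < enabled_values


# Share of users who see WikiGrok, and the resulting expected event rate.
TEST_FRACTION = ENABLED_VALUES / len(BASE62_ALPHABET)
EXPECTED_RATE = PAGEVIEWS_PER_SEC * TEST_FRACTION

print(f"{TEST_FRACTION:.1%} of users, about {EXPECTED_RATE:.0f} events/sec")
```

With these inputs the expected rate lands close to the 27 hits per second quoted above, which suggests the estimate was derived from the ~170 per second pageview figure.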
That should be fine. Please give us a heads-up when you deploy the instrumentation.
On Thu, Jan 8, 2015 at 2:52 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
After further discussion, we've decided to just show WikiGrok to a fraction of users during the test. I currently have it set to show WikiGrok to 10 out of every 62 users or ~16% (the userToken is a base 62 number). That should give us an estimated 27 hits per second. Does that work for everyone?
Kaldari
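A minimal sketch of how a 10-in-62 bucket and the resulting event rate could be computed. This is not the actual WikiGrok code: the alphabet ordering, the choice of the token's last character, and the names `in_test_bucket` and `expected_rate` are assumptions for illustration. The base rate of ~170 events per second is the estimate given elsewhere in this thread.

```python
import string

# Base-62 alphabet: digits, lowercase, uppercase.
# The ordering is an assumption; the real userToken encoding may differ.
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

# Hypothetical rule: show the widget when the token's last symbol is one
# of the first 10 symbols, i.e. 10 out of 62 equally likely values.
ENABLED_SYMBOLS = frozenset(BASE62[:10])


def in_test_bucket(user_token: str) -> bool:
    """Return True if this token falls into the 10/62 test bucket."""
    return bool(user_token) and user_token[-1] in ENABLED_SYMBOLS


def expected_rate(base_rate_per_sec: float) -> float:
    """Scale an unsampled event rate by the bucket fraction (10/62)."""
    return base_rate_per_sec * len(ENABLED_SYMBOLS) / len(BASE62)


if __name__ == "__main__":
    print(f"bucket fraction: {len(ENABLED_SYMBOLS) / len(BASE62):.1%}")
    print(f"expected rate:   {expected_rate(170):.1f} events/sec")
```

Bucketing on the token rather than on each event keeps a given user consistently in or out of the test, which is what showing the widget to a fraction of users requires.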
On Thu, Jan 8, 2015 at 2:06 PM, Nuria Ruiz nuria@wikimedia.org wrote:
We cannot guarantee that things will still work well with 60 events per sec. As I said, we should schedule some perf testing in this regard, so I filed an item for that purpose: https://phabricator.wikimedia.org/T86244
Note that we already go beyond 300 events per sec here and there: http://ibin.co/1nTsNYc1bekd
I recommend sampling those events 1:10.
Thanks,
Nuria
On Thu, Jan 8, 2015 at 12:06 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
After talking with Dario and Leila we decided that we will sample the page-impression event at 1:1000. We would, however, like to retain the widget-impression event unsampled if possible. That event happens approximately 50% as often as page-impression. So we're probably talking about somewhere around 60 events per second in that case. Would that be acceptable or should we sample the widget-impression event as well?
Kaldari
On Wed, Jan 7, 2015 at 5:33 PM, Leila Zia leila@wikimedia.org wrote:
Thanks, Nuria!
On Wed, Jan 7, 2015 at 5:30 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
Thanks everyone for the research on this! I'll go ahead and create a card for implementing sampling on the high-throughput WikiGrok events.
Kaldari
On Wed, Jan 7, 2015 at 5:20 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Sorry, I sent it too soon, trying again:
>We're talking about a total of ~170 events per sec for these pages.
This is too high to log at a 1:1 rate; we would need to sample at 1:10. At this time most events in EventLogging are logged at a much lower rate. The schemas logging more than 1 event per sec are the following; as you can see, mobile and Media Viewer account for the majority of the throughput.
My preference would be to stay below 400 events per sec until we have done some perf testing to make sure we can handle it (we might be able to, as we have made many improvements since we set these thresholds).
MobileWebClickTracking 41.35% (114.15/sec)
MediaViewer 21.66% (59.78/sec)
MobileWikiAppToCInteraction 12.44% (34.35/sec)
PageContentSaveComplete 3.39% (9.35/sec)
EchoInteraction 2.69% (7.42/sec)
NavigationTiming 2.51% (6.93/sec)
MultimediaViewerNetworkPerformance 1.84% (5.07/sec)
SaveTiming 1.58% (4.37/sec)
Edit 1.39% (3.83/sec)
PersonalBar 1.24% (3.43/sec)
TimingData 0.83% (2.28/sec)
MobileWebUIClickTracking 0.73% (2.02/sec)
Popups 0.68% (1.87/sec)
MobileWikiAppOnboarding 0.62% (1.70/sec)
MultimediaViewerDimensions 0.61% (1.68/sec)
UniversalLanguageSelector 0.50% (1.37/sec)
PageCreation 0.50% (1.37/sec)
MultimediaViewerDuration 0.47% (1.30/sec)
MobileWebEditing 0.45% (1.25/sec)
MobileWikiAppSearch 0.41% (1.13/sec)
CentralAuth 0.40% (1.12/sec)
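To make the reasoning behind the 1:10 recommendation concrete, here is a rough sketch of the headroom check. The current total is inferred from the first table row (rate divided by share), which is an assumption about how the percentages were computed. The ~170 events per sec and the 400 events per sec preference are taken from the messages above. The function name `projected_total` is hypothetical.

```python
# Infer the current overall throughput from one table row:
# MobileWebClickTracking is 41.35% of the total at 114.15 events/sec.
CURRENT_RATE = 114.15 / 0.4135  # roughly 276 events/sec

NEW_UNSAMPLED_RATE = 170.0  # estimated WikiGrok page-load events/sec
PREFERRED_CEILING = 400.0   # stated preference until perf testing is done


def projected_total(current_rate: float, new_rate: float, sample_one_in: int) -> float:
    """Total events/sec after adding a new stream sampled at 1:sample_one_in."""
    return current_rate + new_rate / sample_one_in


if __name__ == "__main__":
    for one_in in (1, 10, 1000):
        total = projected_total(CURRENT_RATE, NEW_UNSAMPLED_RATE, one_in)
        verdict = "under" if total < PREFERRED_CEILING else "over"
        print(f"1:{one_in:<5} -> {total:6.1f} events/sec ({verdict} the ceiling)")
```

Under these assumptions, logging at 1:1 would push the total past the preferred ceiling, while 1:10 leaves a comfortable margin.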
On Wed, Jan 7, 2015 at 5:12 PM, Nuria Ruiz nuria@wikimedia.org wrote:
>We're talking about a total of ~170 events per sec for these pages.
This is too high to log at a 1:1 rate; we would need to sample at 1:10.
On Wed, Jan 7, 2015 at 4:10 PM, Leila Zia leila@wikimedia.org wrote:
Thanks everyone for chiming in. Your comments were very helpful. :-)
Nuria, I checked the per-second pageview count for the pages WikiGrok will be live on, over 3 hours on 2015-01-07 (as a sample). We're talking about a total of ~170 events per sec for these pages. Of course major events can affect this number. This number added to the current 270 events per sec you mentioned will send us over the 350 events per sec limit (if it's a hard limit). What do you think?
Leila
On Wed, Jan 7, 2015 at 10:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
>Given that information, do you have any idea if we are in danger of overloading EventLogging?
Logging broad events (such as a page load) at 1:1 might cause problems, as our traffic is high enough that even events logged at 1/1000 still occur in very large numbers.
Some numbers (oversimplifying and rounding):
We have about 200 million visits per day for the enwiki mobile site. This means about 2300 pageviews per sec; if we send 1 load event per pageview, EL will (sadly) die, most likely.
If we assume EL handles up to 350 events per second (and we are now at 270 events per sec), I would think that sending 10 events per sec in your case would be pretty safe. That would mean sampling load events at about 1/200 of pageviews. This seems like a good upper bound.
Now, since there are no constraints on how long you keep your experiment running, you can try a lower sampling ratio, say 1/1000, and keep the experiment running for longer.
On Tue, Jan 6, 2015 at 5:50 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
The highest volume events we are going to log will be:
1. For each of the 166,000 articles, one event when the page loads
2. For each of the 166,000 articles, one event when the WikiGrok widget enters the viewport (about half as often as #1)
These will be active for all mobile users, logged in and logged out, including many high pageview articles.
Given that information, do you have any idea if we are in danger of overloading EventLogging? If so, do you have recommendations on sampling? So far, everyone has said not to worry about it, but it would be good to get a sanity check for this test specifically.
Kaldari
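The back-of-the-envelope estimate quoted in this thread (about 200 million daily visits, a target of roughly 10 events per sec) can be reproduced with a short sketch. The function names are hypothetical, and the inputs are the rounded figures from the discussion, not measured values.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def pageviews_per_sec(daily_pageviews: float) -> float:
    """Average pageviews per second, assuming traffic is spread evenly over the day."""
    return daily_pageviews / SECONDS_PER_DAY


def sample_one_in(unsampled_rate: float, target_rate: float) -> int:
    """Smallest N such that sampling 1:N keeps the rate at or below target_rate."""
    return max(1, math.ceil(unsampled_rate / target_rate))


if __name__ == "__main__":
    rate = pageviews_per_sec(200_000_000)
    print(f"unsampled: {rate:.0f} pageviews/sec")
    print(f"1:N needed for 10 events/sec: 1:{sample_one_in(rate, 10)}")
    print(f"rate at 1:1000: {rate / 1000:.2f} events/sec")
```

The even-traffic assumption understates peaks, so a lower ratio such as 1:1000, run for longer, leaves more margin than the ratio that just meets the average.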
On Thu, Jan 8, 2015 at 2:52 PM, Ryan Kaldari rkaldari@wikimedia.org wrote:
After further discussion, we've decided to just show WikiGrok to a fraction of users during the test. I currently have it set to show WikiGrok to 10 out of every 62 users or ~16% (the userToken is a base 62 number). That should give us an estimated 27 hits per second. Does that work for everyone?
Keep in mind that EventLogging is still running on the single, dinky node that was assigned to it for initial prototyping in 2012. If there is a credible need to process events at a volume far beyond what we are currently processing, we could simply expand capacity by procuring additional nodes; EventLogging is horizontally scalable, though we may have to be more selective about what we write to the database (vs. Hadoop).