Thanks Marcel! Indeed I saw https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#Access_data_in_Ha... a while ago and asked on #wikimedia-analytics whether this approach might speed up queries for (the previous version of) this schema; the response was a bit ambiguous. Nevertheless, I'm really interested in trying this out for speed purposes alone. If you have a moment at the summit this week to answer a question or two about the Hive setup, that would be great.
I think we should reduce the sample rate in any case; will check with the mobile web team before filing a task.
On Mon, Jan 4, 2016 at 6:41 AM, Marcel Ruiz Forns mforns@wikimedia.org wrote:
Thanks Tilman,
It makes sense to reduce the sampling rate of the schema for "Datensparsamkeit and faster queries". However, if you don't specifically need MySQL, and are fine querying through Hive, we could continue storing all events at the current 1% rate in Hadoop.
On Mon, Jan 4, 2016 at 11:28 AM, Tilman Bayer tbayer@wikimedia.org wrote:
Hi Marcel,
yes, this is to be expected, because the schema is now logging more kinds of events than before. However, we could reduce the sampling rate considerably, as JonR and I had already envisaged (https://phabricator.wikimedia.org/T120292#1854136 ; this got lost a bit among the other schema changes, cf. https://phabricator.wikimedia.org/T120292#1864549 ).
On Sun, Jan 3, 2016 at 12:30 PM, Marcel Ruiz Forns mforns@wikimedia.org wrote:
BTW, MobileWebSectionUsage schema is sending a lot of events since Dec 18, 2015. It normally would send around 40 events per second, and it's sending around 120 events per second now. It's now the highest throughput schema in EL by far. Is that expected?
Sorry for using this same thread. If this needs to be taken care of, I will create a new task. Thanks!
On Tue, Dec 29, 2015 at 8:41 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Sorry, I missed this, but it has always sent events at a really high volume.
On Tue, Dec 22, 2015 at 10:25 AM, Jon Katz jkatz@wikimedia.org wrote:
- Dmitry
Hi Nuria, I will ask Dmitry to confirm, but I think a pause is fine for the next couple of days, as long as we are given the timestamps for the outage and can note it on the schema wiki page. Is this a sudden increase, or has it always been sending too high a volume? Regardless, I imagine a higher sampling rate can probably be applied. -J
On Tue, Dec 22, 2015 at 9:58 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Team:
This schema, MobileWikiAppShareAFact, is sending a lot of events; maybe it is worth thinking about whether we need that many. It is again a case where tables are becoming huge and hard to query fast.
cc-ing Jon as schema owner.
Can this data be sampled at a higher sampling rate? I have filed a ticket to this effect: https://phabricator.wikimedia.org/T122224
Thanks,
Nuria
On Tue, Dec 22, 2015 at 8:35 AM, Adam Baso abaso@wikimedia.org wrote:
> Replacing mobile-tech with mobile-l (internal mobile-tech list discontinued).
>
> On Tuesday, December 22, 2015, Nuria Ruiz nuria@wikimedia.org wrote:
>> Team:
>>
>> As part of our effort of converting eventlogging mysql database to the tokudb engine we need to stop eventlogging events from flowing into the MobileWikiAppShareAFact table, we are using this one table to see how long the conversion will take in order to plan for a larger outage window.
>>
>> Let us know if data should be backfilled as it can be, we anticipate events will not flow into table for the better part of one day.
>>
>> Thanks,
>>
>> Nuria
>
> _______________________________________________
> Mobile-l mailing list
> Mobile-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/mobile-l
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
-- Marcel Ruiz Forns Analytics Developer Wikimedia Foundation
-- Tilman Bayer Senior Analyst Wikimedia Foundation IRC (Freenode): HaeB