Also, of course, Aaron, you can ask me any questions about this and I'll try
to help!
On Tue, Oct 6, 2015 at 8:44 PM, Marcel Ruiz Forns <mforns(a)wikimedia.org>
wrote:
Dan, thanks for the careful explanation.
I wanted to add that there is some brief documentation on Wikitech for the
reportupdater tool:
https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater
Cheers!
On Tue, Oct 6, 2015 at 6:38 PM, Dan Andreescu <dandreescu(a)wikimedia.org>
wrote:
Hi Aaron,
I like the tool Marcel built in the spring. It's called reportupdater
and it's been pretty stable and useful but it's not documented because we
haven't publicized it yet. What it does is allow you to configure
templates for SQL or shell scripts that take parameters and generate
separated value files as output. You can specify the time granularity that
you want results for and it will re-run jobs for time periods that don't
exist in the output (because of failures, etc.). It also does other useful
things, like reporting errors like a champ and ensuring that only one instance
is running at any given time. You can even change your scripts to output new
columns or re-arrange the column order and it will morph the output files
to match the new header (you just can't remove columns - because that's
crazy!).
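To make the "re-run jobs for time periods that don't exist in the output" idea concrete, here is a minimal Python sketch. It is not reportupdater's actual code or configuration format. The function names, the tab-separated layout with an ISO date in the first column, and the granularity expressed in days are all assumptions made for illustration.

```python
from datetime import date, timedelta


def expected_periods(first_day, today, granularity_days):
    """Yield the start date of every complete period between first_day and today."""
    step = timedelta(days=granularity_days)
    current = first_day
    # Only yield periods that have fully elapsed, so partial data is never reported.
    while current + step <= today:
        yield current
        current += step


def missing_periods(first_day, today, granularity_days, existing_rows):
    """Return the period start dates that have no row in the output yet.

    existing_rows is an iterable of output lines. The first tab-separated
    field is assumed (hypothetically) to be the period start as YYYY-MM-DD.
    A header line is harmless because it never matches a date.
    """
    done = {row.split("\t", 1)[0] for row in existing_rows if row.strip()}
    return [
        day
        for day in expected_periods(first_day, today, granularity_days)
        if day.isoformat() not in done
    ]


# Example: a daily report where the runs for two days failed and one is new.
rows = ["date\tedits", "2015-10-01\t42", "2015-10-03\t17"]
todo = missing_periods(date(2015, 10, 1), date(2015, 10, 6), 1, rows)
print([day.isoformat() for day in todo])
```

Because the list of periods to run is derived from what is missing in the output rather than from the current date alone, a failed run is simply picked up again on the next invocation.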
If you wanna talk more about it, I'd like to give you the details
privately, because I want to start documenting this tool properly as I do.
On Tue, Oct 6, 2015 at 12:16 PM, Aaron Halfaker <ahalfaker(a)wikimedia.org>
wrote:
Hey folks,
I know there was some work in the past on systems to support keeping
database reports up to date. I'm looking into this type of work with Jeph
Paul now and I realized I don't have any good pointers to this past work.
Right now, we're looking at running database reports based on cron jobs and
checking the recentchanges table to make sure that replication isn't too
lagged. Is there a better way?
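For reference, the replication check described above could look roughly like the Python sketch below. It assumes the cron job has already fetched the newest rc_timestamp (for example with SELECT MAX(rc_timestamp) FROM recentchanges), which MediaWiki stores as a 14-digit YYYYMMDDHHMMSS string in UTC. The function names and the one-hour threshold are hypothetical, and on a quiet wiki the newest edit can be old even when replication is healthy, so this only gives an upper bound on the lag.

```python
from datetime import datetime, timedelta

# MediaWiki stores rc_timestamp as a 14-character string such as 20151006114500.
MW_TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def replication_lag(latest_rc_timestamp, now):
    """Return how far the newest recentchanges row is behind `now`.

    Database drivers often return this column as bytes, so decode if needed.
    """
    if isinstance(latest_rc_timestamp, bytes):
        latest_rc_timestamp = latest_rc_timestamp.decode("ascii")
    latest = datetime.strptime(latest_rc_timestamp, MW_TIMESTAMP_FORMAT)
    return now - latest


def safe_to_run(latest_rc_timestamp, now, max_lag=timedelta(hours=1)):
    """Decide whether the replica looks fresh enough to run the report.

    max_lag is an assumed threshold, not a recommended value.
    """
    return replication_lag(latest_rc_timestamp, now) <= max_lag


# Example with a fixed "now" so the result does not depend on the clock.
now = datetime(2015, 10, 6, 12, 0, 0)
fresh = safe_to_run("20151006114500", now)   # newest edit is 15 minutes old
stale = safe_to_run(b"20151006090000", now)  # newest edit is 3 hours old
print(fresh, stale)
```

A cron wrapper would skip the report (and retry later) whenever safe_to_run returns False.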
FWIW, I expect these queries to run daily and have a runtime of up to an
hour.
-Aaron
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
*Marcel Ruiz Forns*
Analytics Developer
Wikimedia Foundation