Also, of course, Aaron, you can ask me any questions about this and I'll try to help!

On Tue, Oct 6, 2015 at 8:44 PM, Marcel Ruiz Forns <mforns@wikimedia.org> wrote:
Dan, thanks for the careful explanation.

I wanted to add that there is some brief documentation on Wikitech for the reportupdater tool:
https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater

Cheers!


On Tue, Oct 6, 2015 at 6:38 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
Hi Aaron,

I like the tool Marcel built in the spring. It's called reportupdater and it's been pretty stable and useful, but it's not documented because we haven't publicized it yet. It lets you configure templates for SQL or shell scripts that take parameters and generate separated-value files as output. You can specify the time granularity that you want results for, and it will re-run jobs for time periods that are missing from the output (because of failures, etc.). It also does other useful things, like reporting errors like a champ and ensuring that only one instance is running at any given time. You can even change your scripts to output new columns or re-arrange the column order, and it will morph the output files to match the new header (you just can't remove columns - because that's crazy!).
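
To make the "re-run jobs for missing time periods" idea concrete, here is a minimal sketch of how such gap detection could work. It is not reportupdater's actual code: the function name missing_days, the daily granularity, and the assumption that the first column of the output file is an ISO date are all hypothetical, chosen only for illustration.

```python
import csv
import io
from datetime import date, timedelta


def missing_days(tsv_text, start, end):
    """Return the days in [start, end) that have no row in the TSV report.

    Assumption (hypothetical): the first line is a header and the first
    column of each data row is an ISO date such as 2015-10-06.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row, if there is one
    present = {row[0] for row in reader if row}

    missing = []
    day = start
    while day < end:
        if day.isoformat() not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Example report with gaps on 2015-10-02 and 2015-10-04.
report = "date\tedits\n2015-10-01\t10\n2015-10-03\t7\n"
gaps = missing_days(report, date(2015, 10, 1), date(2015, 10, 5))
print([d.isoformat() for d in gaps])
```

A scheduler built on this idea would run the query once per returned day and append the results, so a failed run is simply picked up the next time.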

If you wanna talk more about it, I'd like to give you the details privately, because I want to start documenting this tool properly as I do.

On Tue, Oct 6, 2015 at 12:16 PM, Aaron Halfaker <ahalfaker@wikimedia.org> wrote:
Hey folks,

I know there was some work in the past on systems to support keeping database reports up to date.  I'm looking into this type of work with Jeph Paul now and I realized I don't have any good pointers to this past work.  Right now, we're looking at running database reports based on cron jobs and checking the recentchanges table to make sure that replication isn't too lagged.  Is there a better way?
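
For reference, the replication check described above could look roughly like the sketch below. It assumes the cron job has already fetched the newest rc_timestamp (for example with SELECT MAX(rc_timestamp) FROM recentchanges) as a 14-digit YYYYMMDDHHMMSS string in UTC; the function names and the one-hour threshold are hypothetical choices, not an existing tool.

```python
from datetime import datetime, timedelta


def replication_lag(latest_rc_timestamp, now):
    """Return how far the newest recentchanges row is behind `now`.

    `latest_rc_timestamp` is expected in the 14-digit YYYYMMDDHHMMSS
    format (UTC), and `now` is a naive UTC datetime.
    """
    latest = datetime.strptime(latest_rc_timestamp, "%Y%m%d%H%M%S")
    return now - latest


def safe_to_run(latest_rc_timestamp, now, max_lag=timedelta(hours=1)):
    """True if the replica looks fresh enough (max_lag is an arbitrary choice)."""
    return replication_lag(latest_rc_timestamp, now) <= max_lag


now = datetime(2015, 10, 6, 12, 16, 0)
print(safe_to_run("20151006120000", now))  # newest change is 16 minutes old
print(safe_to_run("20151006090000", now))  # newest change is over 3 hours old
```

One caveat with this approach: on a quiet wiki, a long pause between edits is indistinguishable from replication lag, so the threshold needs to be chosen per wiki.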

FWIW, I expect these queries to run daily and have a runtime of up to an hour.

-Aaron

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics



--
Marcel Ruiz Forns
Analytics Developer
Wikimedia Foundation


