Who is still using s1-analytics-slave, and for what sorts of things?
The analytics-store is just an alias for dbstore1002, which is about to be duplicated as dbstore2002 in CODFW with all wikis and eventlogging.
We could potentially dub dbstore2002 as "analytics-slave" or similar, and reclaim the old s1-analytics-slave in EQIAD -- which is really db1047, out of warranty, and likely one of a batch of database nodes that will be decommissioned this year.
dbstore2002 has nearly 3x more memory than db1047, and twice as many cores, so you guys would be gaining from this transaction ;-)
Sean
I've been slow to move some datasets off of s1-analytics-slave because it remained available. If I were given ~ a week's notice, it would be no problem to move all datasets and work to analytics-store.
Am I reading correctly that you are suggesting that we might have *both* dbstore1002 and dbstore2002 available? Having two machines for querying would be valuable for separating the load of regular jobs (e.g. dashboard scripts) from ad-hoc queries.
-Aaron
On Thu, Feb 5, 2015 at 5:42 AM, Sean Pringle springle@wikimedia.org wrote:
Who is still using s1-analytics-slave, and for what sorts of things?
The analytics-store is just an alias for dbstore1002, which is about to be duplicated as dbstore2002 in CODFW with all wikis and eventlogging.
We could potentially dub dbstore2002 as "analytics-slave" or similar, and reclaim the old s1-analytics-slave in EQIAD -- which is really db1047, out of warranty, and likely one of a batch of database nodes that will be decommissioned this year.
dbstore2002 has nearly 3x more memory than db1047, and twice as many cores, so you guys would be gaining from this transaction ;-)
Sean
DBA @ WMF
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
On Thu, Feb 5, 2015 at 6:45 AM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
I've been slow to move some datasets off of s1-analytics-slave because it remained available. If I were given ~ a week's notice, it would be no problem to move all datasets and work to analytics-store.
same here.
On Fri, Feb 6, 2015 at 12:45 AM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
I've been slow to move some datasets off of s1-analytics-slave because it remained available. If I were given ~ a week's notice, it would be no problem to move all datasets and work to analytics-store.
Am I reading correctly that you are suggesting that we might have *both* dbstore1002 and dbstore2002 available? Having two machines for querying would be valuable for separating the load of regular jobs (e.g. dashboard scripts) from ad-hoc queries.
Correct, analytics would have access to both boxes, ideally with some logical production/ad-hoc traffic split.
Important to note that dbstore2002 would not replicate dbstore1002 staging or datasets databases, just the upstream wikis and eventlogging. If you want staging on both you'd have to maintain it twice.
Sean
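A minimal sketch of the production/ad-hoc split described above, assuming a client-side helper that picks a host per query. The workload labels, the mapping of workloads to hosts, and the reading of "datasets" as a database name are assumptions for illustration; the thread only states that a split would be desirable and that staging and datasets are not replicated to dbstore2002.

```python
# Assumption: which host serves which kind of traffic is illustrative only.
HOST_BY_WORKLOAD = {
    "scheduled": "dbstore1002",  # regular jobs, e.g. dashboard scripts
    "adhoc": "dbstore2002",      # interactive, exploratory queries
}

# Per the thread, dbstore2002 would not replicate these from dbstore1002.
# Treating "datasets" as a database name is an assumption.
UNREPLICATED_DATABASES = {"staging", "datasets"}
UNREPLICATED_HOST = "dbstore1002"


def pick_host(workload, database):
    """Return the host a query should go to, given its workload and database."""
    if database in UNREPLICATED_DATABASES:
        # These databases exist on one host only, whatever the workload.
        return UNREPLICATED_HOST
    try:
        return HOST_BY_WORKLOAD[workload]
    except KeyError:
        raise ValueError(f"unknown workload: {workload!r}") from None


if __name__ == "__main__":
    print(pick_host("adhoc", "enwiki"))
    print(pick_host("adhoc", "staging"))
```

If staging were maintained on both hosts, the special case for unreplicated databases could simply be dropped.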
I know that Erik Moller still uses geowiki a lot to look at the state of the wikis. I'm not ready to prioritize a transition of the geowiki code to use wikimetrics (that's a whole project in itself).
Christian, can we just point the geowiki code to a different database?
On Thu, Feb 5, 2015 at 6:28 PM, Sean Pringle springle@wikimedia.org wrote:
On Fri, Feb 6, 2015 at 12:45 AM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
I've been slow to move some datasets off of s1-analytics-slave because it remained available. If I were given ~ a week's notice, it would be no problem to move all datasets and work to analytics-store.
Am I reading correctly that you are suggesting that we might have *both* dbstore1002 and dbstore2002 available? Having two machines for querying would be valuable for separating the load of regular jobs (e.g. dashboard scripts) from ad-hoc queries.
Correct, analytics would have access to both boxes, ideally with some logical production/ad-hoc traffic split.
Important to note that dbstore2002 would not replicate dbstore1002 staging or datasets databases, just the upstream wikis and eventlogging. If you want staging on both you'd have to maintain it twice.
Sean
Hi Kevin,
[ To keep archives happy, it seems this thread got broken off of https://lists.wikimedia.org/pipermail/analytics/2015-February/003307.html ]
On Mon, Feb 09, 2015 at 09:55:46AM -0800, Kevin Leduc wrote:
Christian, can we just point the geowiki code to a different database?
Simple repointing does not work. But migration should be trivial enough [1] nonetheless.
However, IIRC geowiki has been kept in an “at some point we're scheduling to do it right” state for more than a year. It has always been postponed. This database change might be a nice incentive for finally doing it.
Have fun, Christian
[1] As depicted in the diagram [2] from my previous email, geowiki uses s1-analytics-slave in two different ways.
1. As a source for wiki data. (Top in the batch of the databases on the very left of the diagram)
For this use, repointing is sufficient.
2. As a database to store daily aggregates. (Below the left-most grey rectangle)
The “erosen_*” tables in the “staging” database hold the daily aggregates geowiki computes. Geowiki also generates the csvs using those tables. As the “staging” database is not replicated between the hosts, a plain repointing would at least cause data loss.
But migration should be as simple as:
* Copying the tables over from s1-analytics-slave to the new database server.
* Making sure the “research” user has access from stat1003.
* Rerunning geowiki for the lost-data (due to copying).
[2] https://upload.wikimedia.org/wikipedia/commons/b/b0/Geowiki_workflow.png
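The first migration step above (copying the tables) could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names and the example table names are hypothetical, the host aliases are used as they appear in this thread, and credentials and any extra `mysqldump` options are left out.

```python
import shlex

# Taken from the thread: host aliases, the "staging" database, the "erosen_" prefix.
# Assumption: the aliases resolve as-is and credentials come from a client config.
SOURCE_HOST = "s1-analytics-slave"
TARGET_HOST = "analytics-store"
DATABASE = "staging"
TABLE_PREFIX = "erosen_"


def select_geowiki_tables(all_tables):
    """Keep only the daily-aggregate tables that carry the geowiki prefix."""
    return sorted(t for t in all_tables if t.startswith(TABLE_PREFIX))


def build_copy_command(tables):
    """Build a shell pipeline that dumps the tables and loads them on the target."""
    if not tables:
        raise ValueError("no tables to copy")
    dump = ["mysqldump", "-h", SOURCE_HOST, DATABASE, *tables]
    load = ["mysql", "-h", TARGET_HOST, DATABASE]
    return " | ".join(shlex.join(cmd) for cmd in (dump, load))


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    example = ["erosen_example_b", "some_other_table", "erosen_example_a"]
    print(build_copy_command(select_geowiki_tables(example)))
```

The sketch only prints the command so that it can be reviewed before anything is run. Checking the “research” user's access from stat1003 and rerunning geowiki for the gap are not covered here.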
Hi Kevin, Hi Sean,
On Thu, Feb 05, 2015 at 11:42:26PM +1000, Sean Pringle wrote:
Who is still using s1-analytics-slave, and for what sorts of things?
Geowiki is still using s1-analytics-slave for the erosen_* tables [1] in the staging database.
Kevin, it was mentioned at some point that geowiki might switch to wikimetrics. So I'll leave it to you to decide whether geowiki needs migration, or can be switched to wikimetrics soon enough.
Have fun, Christian
[1] The relevant db instance is below the left grey rectangle on https://upload.wikimedia.org/wikipedia/commons/b/b0/Geowiki_workflow.png
(The second s1-analytics-slave reference on the top left does not care whether it's db1047 or dbstore[12]002 and can be switched without issues)
Hi,
just to keep archives happy, it seems the below message saw an out-of-thread reply at
https://lists.wikimedia.org/pipermail/analytics/2015-February/003328.html
and is continued there.
Have fun, Christian
On Thu, Feb 05, 2015 at 11:19:20PM +0100, quelltextlich e.U. - Christian Aistleitner wrote:
Hi Kevin, Hi Sean,
On Thu, Feb 05, 2015 at 11:42:26PM +1000, Sean Pringle wrote:
Who is still using s1-analytics-slave, and for what sorts of things?
Geowiki is still using s1-analytics-slave for the erosen_* tables [1] in the staging database.
Kevin, it was mentioned at some point that geowiki might switch to wikimetrics. So I'll leave it to you to decide whether geowiki needs migration, or can be switched to wikimetrics soon enough.
Have fun, Christian
[1] The relevant db instance is below the left grey rectangle on https://upload.wikimedia.org/wikipedia/commons/b/b0/Geowiki_workflow.png
(The second s1-analytics-slave reference on the top left does not care whether it's db1047 or dbstore[12]002 and can be switched without issues)
-- 
---- quelltextlich e.U. ---- \ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3     Email:    christian@quelltextlich.at
4293 Gutau, Austria          Phone:    +43 7946 / 20 5 81
                             Fax:      +43 7946 / 20 5 81
                             Homepage: http://quelltextlich.at/
So, bump :-)
- A week's notice would be needed for Halfak to vacate s1-analytics-slave, and presumably others could meet the same target. Or make it a month since there is no desperate rush.
- Geowiki needs some coordination for the switchover to analytics-store staging db.
Anything else?
Hi Sean, no objection on my end either. I’ll have to update a bunch of scripts that populate the EE dashboards [1] but it’s no big deal as long as we clearly communicate the ETA.
[1] http://ee-dashboard.wmflabs.org/dashboards/enwiki-metrics
On Feb 15, 2015, at 7:43 PM, Sean Pringle springle@wikimedia.org wrote:
So, bump :-)
A week's notice would be needed for Halfak to vacate s1-analytics-slave, and presumably others could meet the same target. Or make it a month since there is no desperate rush.
Geowiki needs some coordination for the switchover to analytics-store staging db.
Anything else?