hi,
as Wikimedia deleted the MySQL binlogs required for replicating s3 on the Toolserver, replication is now halted. a full dump/import is required to fix replication. because disk space on yarrow is limited at the moment, and we are about to receive new servers, i might wait until this before reimporting. (however, if that's likely to take a long time, i might do it sooner.)
- river.
Hello River, could you please clarify "about to receive" and "long time" by choosing a suitable unit of time: days, weeks, or months? Our services depend on the replicated data, so the delay may affect us and our users.
Mashiah
2009/1/5 River Tarnell river@loreley.flyingparchment.org.uk
Toolserver-l mailing list Toolserver-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/toolserver-l
On 5 Jan 2009, at 12:23, Daniel Kinzler wrote:
Mashiah Davidson wrote:
Hello River, could you please clarify "about to receive" and "long time" by choosing a suitable unit of time: days, weeks, or months?
It will be about a month until the new boxes are online.
-- daniel
Hmm... that's a long time. Can you restart the replication before please?
Pietrodn powerpdn@gmail.com
Pietrodn wrote:
Hmm... that's a long time. Can you restart the replication before please?
It's not just a matter of restarting. The process is: take a slave server out of rotation and make a dump (I don't have access on the main cluster, so I can't do that), then copy the dump over, import it, and *then* start replication. This takes quite some time (a week, I'd guess) and is a lot of hassle. Also, during the import, s3 is going to be completely unavailable for several days, instead of just serving old data.
Doing this twice is no fun. So it's not an easy choice. I leave it to river to make it :)
-- daniel
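For readers unfamiliar with the procedure Daniel outlines, the steps roughly translate into the following commands. This is only a sketch: the host names (db1, hemlock) and database name are made up, not the actual cluster layout.

```shell
# 1. on a Wikimedia slave taken out of rotation: dump with the binlog
#    position recorded, so replication can later resume from a known point
mysqldump --single-transaction --master-data=2 \
    --databases s3wiki > s3.sql

# 2. copy the dump over to the toolserver database host
scp s3.sql hemlock:/srv/import/

# 3. import it (this is the multi-day step during which s3 is unavailable)
mysql < /srv/import/s3.sql

# 4. point the slave at the recorded binlog coordinates and start
#    replication (file/position come from the commented "CHANGE MASTER TO"
#    line that --master-data=2 wrote into the dump)
mysql -e "CHANGE MASTER TO MASTER_HOST='db1',
          MASTER_LOG_FILE='db1-bin.000123', MASTER_LOG_POS=4;
          START SLAVE;"
```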
On 5 Jan 2009, at 12:34, Daniel Kinzler wrote:
Doing this twice is no fun. So it's not an easy choice. I leave it to river to make it :)
Oh, I understand. I didn't think it was such a long and difficult process.
Pietrodn powerpdn@gmail.com
How long are the binlogs kept on the Wikimedia servers?
Surely it would be possible to take a dump now, import it to s3, start replication, then import the same dump onto the new server, and let it catch up from a month of replag?
Of course, this wouldn't be possible if the binlogs are not kept for that long.
2009/1/5 Pietrodn powerpdn@gmail.com:
Simon Walker:
How long are the binlogs kept for on Wikimedia servers?
less than a week, which is why our 6 days of replication lag after the maintenance was too long.
- river.
2009/1/5 River Tarnell river@loreley.flyingparchment.org.uk:
less than a week, which is why our 6 days of replication lag after the maintenance was too long.
And if WM-US knows that WM-EU is expecting problems, can't they just change the setting to keep the data you need?
AJF/WarX
Artur Fijałkowski:
And if WM-US knows that WM-EU is expecting problems, can't they just change the setting to keep the data you need?
if by 'WM-EU' you mean the toolserver, then no, there isn't enough disk space on the master to store the logs longer than they are now.
- river.
Simon Walker wrote:
How long are the binlogs kept for on Wikimedia servers?
Surely it would be possible to take a dump now, import it to s3, start replication, then import the same dump onto the new server, and let it catch up from a month of replag?
Of course, this wouldn't be possible if the binlogs are not kept for that long.
No need to reimport the same dump:
1. Dump
2. Import
3. Start replication
4. Normal work with the toolserver
*New boxes arrive here*
5. Configure new servers
6. Stop s3
7. Move mysql data files to the new box
8. Restart s3
Which would also be faster. Importing a dump is slow with this amount of data. The raw files are bigger, but you avoid the index rebuild. And with the servers in the same location, transfer speed would be really high.
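The copy-instead-of-reimport plan above might look like this in practice; host names (yarrow, newbox) and paths are hypothetical, and the server must be shut down cleanly so the InnoDB files are consistent on disk.

```shell
# stop mysqld on yarrow so the InnoDB data files are consistent
ssh yarrow 'mysqladmin shutdown'

# copy the raw data directory to the new server; both boxes are in the
# same datacenter, so this runs at local network speed and skips the
# slow index rebuild an SQL-dump import would require
rsync -a yarrow:/var/lib/mysql/ newbox:/var/lib/mysql/

# start mysqld on the new box and resume replication where it left off
ssh newbox 'mysqld_safe &'
ssh newbox 'mysql -e "START SLAVE"'
```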
River Tarnell wrote:
there isn't enough disk space on the master to store the logs longer than they are now.
And on other servers? Or even copying the binlogs to knams/toolserver. After all, binlog deletion is a manual process.
The same problem arises every few months. The procedures aren't good enough yet. :-(
Platonides:
Move mysql data files to the new box
i usually try to avoid that when splitting a database, since it means the innodb data file is much larger than it needs to be (yarrow has both s1 and s3, but the new server will only have s1). it's also a good opportunity to defragment/reindex the tables, which helps performance a bit.
there isn't enough disk space on the master to store the logs longer than they are now.
And on other servers? Or even copying the binlogs to knams/toolserver. After all, binlog deletion is a manual process.
yes, this would be nice, if Wikimedia would warn us before deleting the binlogs.
The same problem arises every few months. The procedures aren't good enough yet. :-(
yes, i agree.
- river.
River,
Good news. Does this mean we might have s1 and s3 separated within around a month, and is this a good time to adapt scripts and tools for the change? Mashiah
2009/1/5 River Tarnell river@loreley.flyingparchment.org.uk
Hello, on Monday 05 January 2009 21:54:32, Mashiah Davidson wrote:
to adapt scripts and tools for this change?
shouldn't be much change. sql-s3 (or sql-s1) will then simply point to another server. If you programmed correctly in the past, no change is needed (only for JOINs, but I guess we will sync commons on the new server too).
Sincerely, DaB.
Mashiah Davidson:
Good news, does it mean we might have s1 and s3 separated within around a month
yes.
and is it a good time to adapt scripts and tools for this change?
i cannot think of anything you would need to change.
- river.
On Tue, Jan 6, 2009 at 08:57, River Tarnell river@loreley.flyingparchment.org.uk wrote:
Good news, does it mean we might have s1 and s3 separated within around a month
yes.
Would federating all of them instead still be terribly slow?
— Kalan
Kalan:
Would federating all of them instead be still terribly slow?
yes. MySQL federation is not useful for what people want to use it for (joins). for anything beyond simple single-row queries (like the toolserver database), it's far too slow.
a better alternative would be MySQL Cluster. i've looked at this in the past, but it had some restrictions that made it unsuitable for us. once these are resolved, it would be easier for both users (as all databases would appear to be on the same server) and for us (as it makes it much easier to add additional capacity to the cluster).
- river.
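river's point about joins is easy to see from how MySQL federation works: a FEDERATED table is just a proxy that fetches rows from a remote server on demand. The table and connection details below are invented for illustration only.

```sql
-- hypothetical FEDERATED table pointing at another server's page table
CREATE TABLE page_remote (
  page_id INT UNSIGNED NOT NULL,
  page_namespace INT NOT NULL,
  page_title VARBINARY(255) NOT NULL,
  PRIMARY KEY (page_id)
) ENGINE=FEDERATED
CONNECTION='mysql://tooluser:secret@sql-s1.example.org:3306/enwiki_p/page';

-- a single-row lookup like this is tolerable: one network round trip
SELECT page_title FROM page_remote WHERE page_id = 12345;

-- but a join drives many per-row fetches across the network instead of
-- using local indexes, which is why federation is far too slow for the
-- join-heavy queries toolserver users actually run
```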
On Tue, Jan 6, 2009 at 12:43, River Tarnell river@loreley.flyingparchment.org.uk wrote:
a better alternative would be MySQL Cluster. i've looked at this in the past, but it had some restrictions that made it unsuitable for us.
Could you tell us a little more? Do these limitations come from the TS, the Wikimedia DB servers, or the users?
— Kalan
Kalan:
Could you tell a little more?
binlog replication is not supported. rows can't be larger than 8KB. all indices must be stored in memory at all times. only 20,000 objects (databases, tables, indices) are supported.
there's probably more, but those are a few i see from looking at the cluster manual.
- river.
River Tarnell wrote:
i cannot think of anything you would need to change.
We should keep in mind that we will then have to replicate commonswiki to three servers. And we should probably do the same for the global_users stuff.
I also suggest that we "really" support (that is: provide backups for) user databases on all servers.
-- daniel
Daniel Kinzler:
We should keep in mind that we will then have to replicate commonswiki to three servers.
yes. that shouldn't be hard, since we already do it for yarrow.
And we should probably do the same for the global_users stuff.
i don't know what that is, but if it's just another database, it would be trivial to add to trainwreck's regex.
I also suggest to "really" support (that is: provide backups for) user databases on all servers.
it's possible, but it increases the size of the backups and the time it takes to make them. probably possible to do it in parallel rather than one at a time...
- river.
River Tarnell wrote:
Platonides:
Move mysql data files to the new box
i usually try to avoid that when splitting a database, since it means the innodb data file is much larger than it needs to be (yarrow has both s1 and s3, but the new server will only have s1). it's also a good opportunity to defragment/reindex the tables, which helps performance a bit.
You can have each table in a different file: http://dev.mysql.com/doc/refman/5.0/en/multiple-tablespaces.html
In particular: "If you start the server with this option, new tables are created using .ibd files, but you can still access tables that exist in the shared tablespace", so there's no worry about s1 already being there.
When moving to a new server, tablespace ID mismatches are likely, but it seems people have been able to get past them. http://mysqlhints.blogspot.com/2008/10/fixing-innodb-import-tablespace-error...
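For reference, the option Platonides is quoting is a server-side setting; a minimal my.cnf fragment would be:

```ini
# give each newly created InnoDB table its own .ibd file instead of
# growing the shared ibdata tablespace
[mysqld]
innodb_file_per_table
```

An existing table can then be moved out of the shared tablespace by rebuilding it (e.g. `ALTER TABLE page ENGINE=InnoDB;`), though note the shared ibdata file itself never shrinks, which is part of river's objection.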
Platonides:
You can have each table in a different file
yes, i know. but that isn't the only reason i listed for doing a separate import.
- river.