As most of us already know, replag on enwiki has been going up and up since about 30 June. As it says on status.toolserver.org, "Hight replag because of inserting of many SHA1-hashes." (Note to DaB.: the first word should be spelled "High".)
I asked DaB. on IRC how long this might go on, and he replied one to two weeks. However, I've since done some independent investigation that suggests that his estimate might be a little low.
It turns out that there are three large blocks of consecutive entries in the revision database that need to be populated with SHA1 hashes. Apparently three processes are running in parallel on the WMF servers, each filling in one of these blocks from the bottom, in numerical order of rev_id. Knowing this, we can estimate how many revisions still need to be populated at any given point; and, by taking such estimates at several points in time, we can estimate how long the process will take. (Needless to say, this is only an estimate, since the rate at which database changes are processed on the toolserver side is variable. Also, the blocks of rev_ids are not actually consecutive, because of deletions, but we can assume for our purposes that the deleted revisions are distributed uniformly throughout the database.)
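To make the bookkeeping concrete, here is a minimal sketch of how the remaining count could be tallied from the block boundaries. Everything in it is an assumption for illustration: the names `Block` and `remaining_revisions`, the `live_fraction` parameter (a stand-in for the "deletions are uniformly distributed" assumption), and the toy boundary values, which are not real enwiki numbers.

```python
from dataclasses import dataclass


@dataclass
class Block:
    frontier: int  # lowest rev_id in the block whose rev_sha1 is still empty
    upper: int     # highest rev_id in the block (inclusive)


def remaining_revisions(blocks, live_fraction=1.0):
    """Approximate number of revisions still to be populated.

    live_fraction is the assumed share of rev_ids that still exist
    (1.0 means no deletions); it is applied uniformly to every block.
    """
    id_span = sum(max(0, b.upper - b.frontier + 1) for b in blocks)
    return round(id_span * live_fraction)


# Toy boundaries for three blocks; real values would come from the replica.
blocks = [
    Block(frontier=1_500, upper=1_999),
    Block(frontier=3_200, upper=3_999),
    Block(frontier=5_000, upper=5_999),
]
print(remaining_revisions(blocks))
```

Repeating this tally at several points in time gives the series of "remaining" values that a completion estimate can be fitted to.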
It further turns out that it is only possible to compute this estimate for sql-s1-user (thyme), because the enwiki_p view on sql-s1-rr (rosemary) does not have the rev_sha1 field at all (!). It appears that the server on rosemary is receiving millions of database updates each day from WMF and throwing them in the bit bucket.
Anyway, based on four observations spaced at 6-hour intervals, it appears that thyme is populating about 353,000 revisions per hour, or 8.5 million per day. A simple trendline analysis shows that, at this rate, populating the 230,000,000 remaining revisions will take about 27 more days (estimated completion Aug 6 at 17:48 UTC).
Anyone who relies on use of the enwiki_p database should expect a prolonged continuation of degraded service and steadily increasing replag.
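For anyone who wants to repeat the projection, a trendline estimate of this kind can be sketched roughly as below. This is not the actual spreadsheet or script behind the figures above; `fit_line` and `estimate_completion` are hypothetical names, and the observations are invented round numbers chosen only so the result is easy to check by hand.

```python
from datetime import datetime, timedelta, timezone


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_completion(observations):
    """observations: list of (datetime, remaining_count) pairs.

    Returns (revisions populated per hour, projected completion time),
    assuming the backlog keeps shrinking linearly.
    """
    t0 = observations[0][0]
    hours = [(t - t0).total_seconds() / 3600 for t, _ in observations]
    remaining = [r for _, r in observations]
    slope, intercept = fit_line(hours, remaining)
    if slope >= 0:
        raise ValueError("backlog is not shrinking; no completion estimate")
    hours_to_zero = -intercept / slope
    return -slope, t0 + timedelta(hours=hours_to_zero)


# Invented observations: four samples, six hours apart.
start = datetime(2012, 7, 10, 0, 0, tzinfo=timezone.utc)
observations = [
    (start + timedelta(hours=6 * i), 1_000_000 - 60_000 * i) for i in range(4)
]
rate, eta = estimate_completion(observations)
print(round(rate), eta.isoformat())
```

As a sanity check on the figures in the message itself: 230,000,000 / 353,000 is about 652 hours, which is a little over 27 days.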
Isn't there anything we (or TS admins) can do about this? Like asking WMF to populate the SHA1s at a slower rate?
Petr Onderka [[en:User:Svick]]
On Tue, Jul 10, 2012 at 4:19 PM, Russell Blau russblau@imapmail.org wrote:
Anyone who relies on use of the enwiki_p database should expect a prolonged continuation of degraded service and steadily increasing replag. -- Russell Blau russblau@imapmail.org
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
Hello, At Tuesday 10 July 2012 20:34:41 DaB. wrote:
Like asking WMF to populate the SHA1s at a slower rate?
Yes, that would be helpful. I already asked in #wikimedia-tech during a talk but got no response (though it may simply have been overlooked).
Sincerely, DaB.
Hello, At Tuesday 10 July 2012 16:36:29 russblau wrote:
It appears that the server on rosemary is receiving millions of database updates each day from WMF and throwing them in the bit bucket.
I'm not sure what "throwing them in the bit bucket" means, but I guess it is something like "throwing away". That is not the case: the field was just not visible to users, and I am changing that now. I will also fix the typo in the status message soon.
Sincerely, DaB.
On Tue, Jul 10, 2012, at 04:39 PM, DaB. wrote:
Hello, At Tuesday 10 July 2012 16:36:29 russblau wrote:
It appears that the server on rosemary is receiving millions of database updates each day from WMF and throwing them in the bit bucket.
I'm not sure what "throwing them in the bit bucket" means, but I guess it is something like "throwing away".
"bit bucket" == /dev/null :-)
That is not the case: the field was just not visible to users, and I am changing that now.
I'm glad to know that the appearance was deceiving. Having data from rosemary now, I come up with an estimated completion date there of July 30, assuming that the rate at which updates are received from WMF does not change.
On 11/07/12 16:14, Russell Blau wrote:
I'm glad to know that the appearance was deceiving. Having data from rosemary now, I come up with an estimated completion date there of July 30, assuming that the rate at which updates are received from WMF does not change.
I thought it would be worse. How are you measuring the sha1-populated boundaries?
On Thu, Jul 12, 2012, at 01:29 AM, Platonides wrote:
I thought it would be worse. How are you measuring the sha1-populated boundaries?
I do a series of queries in the form "SELECT rev_sha1 FROM revision WHERE rev_id = NNN", with the NNN's selected by a binary search algorithm, to find the lowest rev_id for which rev_sha1 is "".
Of course, this was preceded by some other queries to establish that there are, in fact, three consecutive blocks of unpopulated revisions, and that the upper boundaries of these blocks have not changed.
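A rough sketch of that search, for anyone who wants to try it: the code below is not the script actually used, just a standard binary search for the lowest rev_id in one block with an empty rev_sha1. `find_frontier` is a hypothetical name, and `fake_is_unpopulated` is a stand-in for running the SELECT above against the database; the frontier value is invented.

```python
def find_frontier(lo, hi, is_unpopulated):
    """Return the smallest rev_id in [lo, hi] for which is_unpopulated is True.

    Assumes the block is filled from the bottom, so the predicate is False
    below the frontier and True from the frontier upward. Returns hi + 1 if
    nothing in the range is unpopulated.
    """
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_unpopulated(mid):
            hi = mid - 1  # frontier is at mid or below
        else:
            lo = mid + 1  # frontier is above mid
    return lo


# Stand-in for the database: pretend everything from this id up is unpopulated.
TRUE_FRONTIER = 1_234_567
probes = []


def fake_is_unpopulated(rev_id):
    probes.append(rev_id)  # each probe corresponds to one SELECT
    return rev_id >= TRUE_FRONTIER


frontier = find_frontier(1_000_000, 2_000_000, fake_is_unpopulated)
print(frontier, len(probes))
```

A block of a million rev_ids needs only about twenty probes. Against the real table, a probed rev_id may have no row at all because the revision was deleted; how to treat that case (for example, by looking at the next rev_id that does exist) is left open here.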