https://bugzilla.wikimedia.org/show_bug.cgi?id=71121
Bug ID: 71121
Summary: RepeatingGenerator intermittent failure on test.wikidata
Product: Pywikibot
Version: core (2.0)
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: pagegenerators
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: jayvdb@gmail.com
Web browser: ---
Mobile Platform: ---
We have seen a few intermittent failures of RepeatingGenerator. The most recent is on test.wikidata:
https://travis-ci.org/wikimedia/pywikibot-core/jobs/35913559
IIRC, the previous failures have also been on test.wikidata.
My guess is that this wiki has too little recent-changes data, and the generator loops forever waiting for edits that never arrive.
John Mark Vandenberg jayvdb@gmail.com changed:
What     | Removed       | Added
---------|---------------|------------------------
Priority | Unprioritized | High
CC       |               | nullzero.free@gmail.com
--- Comment #1 from Sorawee Porncharoenwase nullzero.free@gmail.com --- The easiest workaround is to remove the test, and when we come up with a better test that guarantees that it won't cause a failure in any case, we can add it later.
Another workaround is to simulate a stream of recentchanges / newpages somehow to prevent insufficient recent change data.
--- Comment #2 from Sorawee Porncharoenwase nullzero.free@gmail.com --- If it's very urgent, you can remove the test right now. I can't code in the next few days.
--- Comment #3 from John Mark Vandenberg jayvdb@gmail.com --- It isn't urgent. It only happens occasionally, and only on one site. Another one today: https://travis-ci.org/wikimedia/pywikibot-core/jobs/37035200
(In reply to Sorawee Porncharoenwase from comment #1)
The best 'quick' way to do this is move the test into a new class, and then in setUpClass skip the tests if there are not sufficient suitable recentchanges / newpages for the test to run against the live wiki.
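A minimal sketch of that approach, using the standard `unittest` class-level skip. The data-check helper here is hypothetical; a real version would query the live wiki's recentchanges:

```python
import unittest


class RepeatingGeneratorLiveTests(unittest.TestCase):
    """Skip the whole class when the wiki lacks enough recent data."""

    MIN_EDITS = 4  # edits the test needs in order to terminate

    @classmethod
    def count_recent_mainspace_edits(cls):
        # Hypothetical stand-in for a real recentchanges query.
        return 0

    @classmethod
    def setUpClass(cls):
        edits = cls.count_recent_mainspace_edits()
        if edits < cls.MIN_EDITS:
            # Raising SkipTest in setUpClass skips every test in the class.
            raise unittest.SkipTest(
                'only %d recent namespace-0 edits; need %d'
                % (edits, cls.MIN_EDITS))

    def test_repeating_generator(self):
        pass  # would exercise RepeatingGenerator here
```

Because `SkipTest` is raised in `setUpClass`, the runner records a skip instead of a hang or a failure on quiet wikis.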
--- Comment #4 from Sorawee Porncharoenwase nullzero.free@gmail.com --- @John Mark Vandenberg: class TestPageGenerators runs with family = 'wikipedia', code = 'en', doesn't it? Why should it run on test.wikidata?
Anyway, suppose that this bug really needs to be fixed:
(In reply to John Mark Vandenberg from comment #3)
This is impossible because we don't know the future!
One way I can think of is to use `multiprocessing` to run the test while the main process waits for, say, fifteen seconds. If the process has not finished by then, we terminate it and assume that it works correctly.
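That idea can be sketched as a small harness (illustrative only; the hanging function below stands in for the real test):

```python
import multiprocessing
import time


def hanging_test():
    # Stand-in for the RepeatingGenerator test; here it blocks forever.
    while True:
        time.sleep(0.1)


def run_with_timeout(target, seconds):
    """Run `target` in a child process and give up after `seconds`.

    Per the suggestion above, a process that is still running when time
    is up is terminated and assumed to be working (just waiting for new
    edits); a process that exited is judged by its exit code.
    """
    proc = multiprocessing.Process(target=target)
    proc.start()
    proc.join(seconds)
    if proc.is_alive():
        proc.terminate()
        proc.join()
        return True  # still waiting -- assume OK
    return proc.exitcode == 0
```

The obvious weakness, as noted, is that a genuinely stuck generator is indistinguishable from one that is merely waiting for a slow wiki.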
--- Comment #5 from John Mark Vandenberg jayvdb@gmail.com --- (In reply to Sorawee Porncharoenwase from comment #4)
@John Mark Vandenberg: class TestPageGenerators runs with family = 'wikipedia', code = 'en', doesn't it? Why should it run on test.wikidata?
You're right; the test is supposed to run against en.wikipedia.org, but a bug somewhere in pywikibot could mean that doesn't happen.
I find it hard to believe that en.wp doesn't have four namespace 0 edits in recentchanges over 10 minutes. In fact, 10 minutes shouldn't even be required. If I understand correctly, this test is essentially asking the RC feed for four namespace 0 edits, any time in the past. That should always return instantly.
--- Comment #6 from Sorawee Porncharoenwase nullzero.free@gmail.com --- (In reply to John Mark Vandenberg from comment #5)
This is wrong. RepeatingGenerator will ask the RC feed for the latest edit in the past (to be an indicator of "present") and three more edits in the future. Thus, it might not return an instant result for some sites. For English Wikipedia, however, it should return an instant result.
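The behaviour described here can be sketched as a small polling loop (illustrative only, not pywikibot's actual implementation): the first change yielded is the newest one already in the feed, and every later change must be strictly newer, so a quiet wiki leaves the loop spinning forever.

```python
import time


def repeating(fetch_newest, total, poll_delay=0.0):
    """Yield `total` changes from a feed.

    `fetch_newest` is a hypothetical callable returning the newest
    change as a (timestamp, title) pair, or None.  The first yield is
    the latest existing change (the marker of "present"); after that,
    only strictly newer changes are yielded.  With no new edits this
    polls forever -- the hang observed in the tests.
    """
    last_seen = None
    produced = 0
    while produced < total:
        change = fetch_newest()
        if change is not None and (last_seen is None
                                   or change[0] > last_seen):
            last_seen = change[0]
            yield change
            produced += 1
        else:
            time.sleep(poll_delay)  # wait for a new edit to appear
```

On English Wikipedia new edits arrive constantly, so the three "future" changes appear almost immediately; on test.wikidata they may never come.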
--- Comment #7 from John Mark Vandenberg jayvdb@gmail.com --- OK, thanks for clarifying; it makes more sense now, but it still doesn't explain how it could take 10 minutes to fetch 3 namespace 0 edits on enwp.
One way to avoid the problem is to add a timeout to RepeatingGenerator, so the caller can prevent it from locking up forever if new data doesn't arrive.
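One sketch of such a timeout wraps any generator so iteration stops once a deadline passes (an assumption of how it might look, not the eventual implementation; note it only checks the clock between items, so a fetch that itself blocks would still need a lower-level timeout):

```python
import time


def with_timeout(generator, seconds):
    """Yield items from `generator` until `seconds` have elapsed,
    then stop instead of blocking the caller forever."""
    deadline = time.monotonic() + seconds
    for item in generator:
        yield item
        if time.monotonic() >= deadline:
            return
```

A caller could then write `for page in with_timeout(gen, 600): ...` and be guaranteed the loop ends, at the cost of possibly receiving fewer pages than requested.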
--- Comment #8 from Sorawee Porncharoenwase nullzero.free@gmail.com --- The most recent one now is on ar.wikipedia: https://travis-ci.org/wikimedia/pywikibot-core/builds/39342240
--- Comment #9 from John Mark Vandenberg jayvdb@gmail.com --- Another test.wd hang: https://travis-ci.org/wikimedia/pywikibot-core/jobs/39560103
--- Comment #10 from John Mark Vandenberg jayvdb@gmail.com --- three recent fr.wikt hangs https://travis-ci.org/wikimedia/pywikibot-core/jobs/40017256 https://travis-ci.org/wikimedia/pywikibot-core/jobs/40011201 https://travis-ci.org/wikimedia/pywikibot-core/jobs/39926632
--- Comment #11 from John Mark Vandenberg jayvdb@gmail.com --- ar.wp https://travis-ci.org/wikimedia/pywikibot-core/jobs/40288852
--- Comment #12 from Sorawee Porncharoenwase nullzero.free@gmail.com --- This weekend I will add a "timeout" parameter. It's not an elegant solution, though; it's just a workaround that hides the real problem without fixing it. Some people might even disagree with this workaround.
Fabian CommodoreFabianus@gmx.de changed:
What | Removed | Added
-----|---------|-------------------------
CC   |         | CommodoreFabianus@gmx.de
--- Comment #13 from Fabian CommodoreFabianus@gmx.de --- Couldn't we print additional information instead? I'd prefer to do that first, to determine whose fault it is (i.e. whether there really are so few edits). For example, the start time would be interesting, and whether it had fetched any pages.
--- Comment #14 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 171830 had a related patch set uploaded by John Vandenberg: Disable cache for RepeatingGenerator tests
https://gerrit.wikimedia.org/r/171830
Gerrit Notification Bot gerritadmin@wikimedia.org changed:
What   | Removed | Added
-------|---------|----------------
Status | NEW     | PATCH_TO_REVIEW
--- Comment #15 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 171830 merged by jenkins-bot: Disable cache for RepeatingGenerator tests
https://gerrit.wikimedia.org/r/171830
John Mark Vandenberg jayvdb@gmail.com changed:
What       | Removed         | Added
-----------|-----------------|---------
Status     | PATCH_TO_REVIEW | RESOLVED
Resolution | ---             | FIXED
--- Comment #16 from John Mark Vandenberg jayvdb@gmail.com --- I'm pretty sure this is fixed now. Sorry I didn't notice this earlier.
--- Comment #17 from Sorawee Porncharoenwase nullzero.free@gmail.com --- So what's the problem? Cache?
--- Comment #18 from John Mark Vandenberg jayvdb@gmail.com --- Yes. TestRequest was forcing all subsequent queries to return the same result, consisting of the same pages, so it would never find new pages to yield.
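The failure mode can be illustrated with a toy cached client (hypothetical names; not the actual TestRequest code): once a query's first response is cached, a polling generator replays stale data forever, which is why the merged fix disables the cache for these tests.

```python
class CachedClient:
    """Toy illustration of the TestRequest failure mode: identical
    queries are served from a cache, so the first response is replayed
    forever and a poller never sees new pages."""

    def __init__(self, live_fetch):
        self._live_fetch = live_fetch
        self._cache = {}

    def fetch(self, query, use_cache=True):
        if use_cache and query in self._cache:
            return self._cache[query]  # stale replay of first answer
        result = self._live_fetch(query)
        self._cache[query] = result
        return result
```

With `use_cache=True` every poll of the same recentchanges query returns the original result, so a generator waiting for newer edits spins forever; with the cache bypassed, each poll reaches the live source.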