valhallasw added a subscriber: valhallasw. valhallasw added a comment.
In https://phabricator.wikimedia.org/T84944#941417, @Mpaa wrote:
I guess each request in the QueryGenerator __iter__ loop is independent of the previous one. If one is unlucky enough, the random start of a batch might fall in an already seen interval?
From https://en.wikipedia.org/w/api.php?action=help&modules=query%2Brandom :
Pages are listed in a fixed sequence, only the starting point is random. This means that if, for example, "Main Page" is the first random page on your list, "List of fictional monkeys" will *always* be second, "List of people on stamps of Vanuatu" third, etc.
This means batches can overlap when multiple requests to query=random are made: the same page /can/ be returned twice. The assumption that there can be no duplicates is therefore false.
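As a rough illustration of the overlap and one way a caller could guard against it, here is a minimal sketch. It does not use pywikibot or the live API: `random_batch` is a hypothetical stand-in that mimics the documented behaviour (fixed order, random starting point), and the wrap-around at the end of the sequence is a simplifying assumption. `unique_random_pages` is likewise a hypothetical helper, not existing pywikibot code.

```python
import random
from typing import Callable, Iterable, Iterator, List


def random_batch(sequence: List[str], size: int, rng: random.Random) -> List[str]:
    """Hypothetical stand-in for one random request.

    Pages come back in a fixed order; only the starting point is random.
    Wrapping around at the end of the sequence is an assumption made here
    to keep every batch the same length.
    """
    start = rng.randrange(len(sequence))
    return [sequence[(start + i) % len(sequence)] for i in range(size)]


def unique_random_pages(
    fetch_batch: Callable[[], Iterable[str]],
    total: int,
    max_requests: int = 100,
) -> Iterator[str]:
    """Yield up to `total` distinct titles, skipping repeats across batches."""
    seen = set()
    for _ in range(max_requests):  # bound the loop in case batches keep overlapping
        for title in fetch_batch():
            if title in seen:
                continue  # this batch started inside an interval we already saw
            seen.add(title)
            yield title
            if len(seen) >= total:
                return


if __name__ == "__main__":
    pages = [f"Page {i}" for i in range(20)]
    rng = random.Random()
    for title in unique_random_pages(lambda: random_batch(pages, 5, rng), total=12):
        print(title)
```

Filtering on the caller's side costs extra requests whenever batches overlap, which is why the sketch caps the number of requests instead of looping until `total` is reached.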
TASK DETAIL https://phabricator.wikimedia.org/T84944
To: valhallasw Cc: valhallasw, jayvdb, Aklapper, Mpaa, pywikipedia-bugs
pywikipedia-bugs@lists.wikimedia.org