valhallasw added a comment.
The underlying randomness algorithm is as follows:
- Each page is stored with a random number, `page_random`, between 0 and 1.
- `generator=random` runs `SELECT * FROM page WHERE page_random > {value} LIMIT {limit}`, where `{value}` is a random number between 0 and 1 and `{limit}` is the number of pages to retrieve.
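As a rough illustration of the two steps above, here is a minimal in-memory sketch in Python. It is not the MediaWiki implementation: `build_index` and `random_pages` are hypothetical names, the sorted list stands in for the `page` table, and it assumes rows come back in ascending `page_random` order, which the query as quoted does not state.

```python
import bisect
import random


def build_index(titles, rng):
    """Give each title a page_random in [0, 1) and sort by it (stand-in for the page table)."""
    return sorted((rng.random(), title) for title in titles)


def random_pages(index, limit, rng):
    """Return up to `limit` titles whose page_random is greater than a fresh random value."""
    value = rng.random()  # the {value} in the query
    keys = [page_random for page_random, _ in index]
    start = bisect.bisect_right(keys, value)  # first row with page_random > value
    return [title for _, title in index[start:start + limit]]


rng = random.Random()
index = build_index([f"Page {i}" for i in range(100)], rng)
print(random_pages(index, 5, rng))
```

Each call draws its own `{value}`, so two separate calls can return overlapping pages; a call whose `{value}` lands near 1 can also return fewer than `limit` pages.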
I suppose the API could actually expose `page_random` as an opaque 'continue' parameter, which would then allow actual continuation, and hence provide full random-without-replacement?
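A minimal sketch of how such a continuation could behave, again in memory and under assumptions: `random_batch` and `iter_random_pages` are hypothetical names, not existing API or pywikibot functions, and the token is simply the `page_random` of the last row returned.

```python
import bisect
import random


def random_batch(index, limit, start_after):
    """Return (titles, next_token) for rows with page_random > start_after.

    `index` is a list of (page_random, title) tuples sorted by page_random.
    next_token is None once fewer than `limit` rows are left.
    """
    keys = [page_random for page_random, _ in index]
    start = bisect.bisect_right(keys, start_after)
    batch = index[start:start + limit]
    titles = [title for _, title in batch]
    # Hypothetical opaque token: the page_random of the last row in this batch.
    next_token = batch[-1][0] if batch and len(batch) == limit else None
    return titles, next_token


def iter_random_pages(index, limit, rng):
    """Yield titles batch by batch, resuming each request from the previous token."""
    token = rng.random()  # the initial {value}
    while token is not None:
        titles, token = random_batch(index, limit, token)
        yield from titles


rng = random.Random()
index = sorted((rng.random(), f"Page {i}") for i in range(100))
pages = list(iter_random_pages(index, 10, rng))
print(len(pages), len(set(pages)))  # the two counts are equal: no page repeats
```

Because every batch starts strictly after the previous token, one run never repeats a page. The sketch stops at the end of the `page_random` range, so it only covers pages above the initial value; covering every page would also need a wrap-around back to 0, which is left out here.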
As for //our// users: they would typically use -random from the command line, and iirc generators from the command line are always filtered for uniqueness.
TASK DETAIL https://phabricator.wikimedia.org/T84944