On Mon, 2003-03-31 at 02:07, Andre Engels wrote:
> If I remember correctly, there are 1000 pages being selected once a day or so, which are then cycled through. So once you have seen a significant portion of these 1000 pages, you would indeed often be getting the same pages again.
No, that's not correct. Random selection is made from the set of all articles. Each page is assigned a random number. The set of pages is sorted by that number, and a random index into the sorted set is selected. The selected page is then assigned a new random number, so it should not be reselected even if the same random index comes up on a subsequent random load.
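For illustration only, here is a rough in-memory sketch of the scheme described above. It is not the actual wiki code: the class name RandomPagePicker, the random_key mapping and the pick() method are made up for this example, and the real thing works against the database rather than a Python dict.

```python
import random


class RandomPagePicker:
    """Hypothetical in-memory model of the selection scheme described above."""

    def __init__(self, titles, rng=None):
        self.rng = rng or random.Random()
        # Each page is assigned a random number.
        self.random_key = {title: self.rng.random() for title in titles}

    def pick(self):
        if not self.random_key:
            raise LookupError("no pages to choose from")
        # Sort the pages by their random number.
        ordered = sorted(self.random_key, key=self.random_key.get)
        # Select a random index into the sorted set.
        index = self.rng.randrange(len(ordered))
        title = ordered[index]
        # Give the selected page a new random number so that the same
        # index is unlikely to return the same page next time.
        self.random_key[title] = self.rng.random()
        return title
```

Calling pick() repeatedly on a RandomPagePicker built from a list of titles returns one title per call; re-sorting on every call is only for clarity and would be far too slow for a real wiki.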
If that's not random enough, it may be due to an allegedly defective RAND() function in MySQL 3.x.
At one time, random selection was made by picking 1000 articles at random, then picking random indexes into that queue for the next X random selections, with a 1 in 800 chance of resetting the queue on each random load. This was abandoned because the queue system was problematic: high probability of duplicates, too-slow queue refill operation, deleted pages would sometimes come up, on smaller wikis it didn't update, etc.
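To put a rough number on the duplicate problem with the old queue: assuming each random load drew uniformly, with replacement, from a fixed queue of 1000 entries (my simplification; it ignores the occasional queue reset), the chance of seeing at least one repeat within a given number of loads is the usual birthday-problem calculation. The function name duplicate_probability is just for this example.

```python
def duplicate_probability(loads, queue_size=1000):
    """Chance of at least one repeated page in `loads` uniform draws
    (with replacement) from a queue of `queue_size` entries."""
    p_all_distinct = 1.0
    for i in range(loads):
        # The (i+1)-th draw must avoid the i pages already seen.
        p_all_distinct *= max(0.0, 1.0 - i / queue_size)
    return 1.0 - p_all_distinct
```

Under that assumption, about 40 random loads are already enough to make a repeat more likely than not.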
-- brion vibber (brion @ pobox.com)