From: Brion Vibber brion@pobox.com
On Thu, 2003-05-01 at 08:53, Nick Reinking wrote:
Off the top of my head, I can't think of any simple mathematical way to do what you want to do (that being making articles selected randomly less likely to be selected randomly over time).
Not sure who wants to do that... For me it's quite sufficient to make the probability of selection random, and let the size of the selection field make multiple selections _reasonably_ rare.
With over 100,000 articles, it's unlikely the user will ever see the same article come up twice. A previous flawed algorithm had the property of making a few articles much more likely to be selected.
I thought the reason we were having this discussion was because articles were repeating too often, indicating some fault in the randomizer? I have commonly seen an article twice in a sequence of 3-5 random page requests, about a dozen times in the past month. On one occasion I saw an article 3 times in a sequence of 6, which is definitely not statistically random.
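For a sense of scale, here is a rough sketch of the birthday-problem arithmetic behind that complaint. It assumes a perfectly uniform selector over exactly 100,000 articles (the round figure quoted above); the function name `repeat_probability` is made up for illustration and is not part of any wiki code.

```python
def repeat_probability(n_articles: int, draws: int) -> float:
    """Chance that at least two of `draws` uniform, independent picks
    out of `n_articles` land on the same article."""
    p_all_distinct = 1.0
    for i in range(draws):
        # The i-th pick must avoid the i articles already seen.
        p_all_distinct *= (n_articles - i) / n_articles
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    # Assumed article count: the round 100,000 from the discussion.
    for k in (3, 5, 6):
        print(k, repeat_probability(100_000, k))
```

Under those assumptions a repeat within five requests comes out at roughly 1 in 10,000, so seeing it a dozen times in a month points at the seeding rather than at bad luck.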
A calculator is only as smart as the fool pressing the buttons. Same with pseudorandom number generators. The PRNG appears to have been working perfectly, allowing us to misuse it with amazing precision. For an explanation of what went wrong, see my first post on this topic.
As for Brion's comment about MySQL updating RAND() seeding -- I assumed they would be using a persistent seed, but that comment suggests they're making the same mistake our own code makes: initialising the seed on every connection, from the system time. Just because you ask for microseconds doesn't mean you get microseconds. The PHP manual doesn't say how microtime() is implemented. What if it's using the old 18.2Hz clock?
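To illustrate the failure mode, here is a small sketch of reseeding on every connection from a coarse clock. It is not the wiki's code or MySQL's: Python's `random.Random` stands in for the real generator, the request times are invented, and the 18.2 Hz tick is simply the worst case speculated about above. The helper names `coarse_seed` and `first_pick` are hypothetical.

```python
import random

TICK_HZ = 18.2  # assumed clock rate, taken from the worst case in the text


def coarse_seed(t_seconds: float, tick_hz: float = TICK_HZ) -> int:
    """Seed a connection would get if the clock only advances
    `tick_hz` times per second, whatever precision was asked for."""
    return int(t_seconds * tick_hz)


def first_pick(seed: int, n_articles: int = 100_000) -> int:
    """First 'random article' index from a generator freshly seeded
    with `seed`, i.e. what a reseed-per-connection scheme returns."""
    return random.Random(seed).randrange(n_articles)


# Invented request times in seconds: three inside one tick, one later.
times = [10.01, 10.02, 10.03, 10.50]
seeds = [coarse_seed(t) for t in times]
picks = [first_pick(s) for s in seeds]

if __name__ == "__main__":
    for t, s, p in zip(times, seeds, picks):
        print(f"t={t:.2f}s seed={s} article={p}")
```

The three requests that fall inside the same tick get the same seed and therefore the same article, however good the generator is. A single generator seeded once and kept alive across requests would not have that problem.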
-- Tim Starling.
wikitech-l@lists.wikimedia.org