Random page bias - Wikitech-l

9 Jan 2005


Hello,
If I'm not mistaken, the random page scheme makes some pages
much more likely to be retrieved than others. If I understand
correctly, each page is assigned a random number (cur_random)
when it's created. To retrieve a random page, a random number x
is generated and the page with the smallest cur_random greater
than x is returned.
The probability of retrieving a page is proportional to the
gap between its cur_random and the next lower cur_random.
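
As a minimal sketch of the scheme as described above (not the
actual MediaWiki code), the lookup and the resulting selection
probabilities could be written as follows in Python. The names
pick_page and selection_probabilities are hypothetical, and the
wraparound to the first page when x is larger than every
cur_random is an assumption; the description above does not say
what happens in that case.

```python
import bisect


def pick_page(sorted_keys, x):
    """Index of the page with the smallest cur_random greater than x.

    sorted_keys holds the cur_random values in ascending order.
    ASSUMPTION: if x is larger than every key, wrap to the first page.
    """
    i = bisect.bisect_right(sorted_keys, x)  # first key strictly > x
    return i % len(sorted_keys)


def selection_probabilities(sorted_keys):
    """Chance of each page being picked when x is uniform on [0, 1).

    Page i is picked when x falls between the previous key and its own,
    so its probability equals the gap below it.
    """
    gaps = [b - a for a, b in zip(sorted_keys, sorted_keys[1:])]
    # ASSUMPTION: the first page also receives the wraparound interval
    # above the largest key.
    first = sorted_keys[0] + (1.0 - sorted_keys[-1])
    return [first] + gaps
```

With keys 0.1, 0.2 and 0.7, the page at 0.7 is picked for any x
between 0.2 and 0.7, about five times as often as the page at
0.2.
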
If cur_random is generated uniformly, the gaps have an
approximately exponential distribution when there are many
pages. This implies there will be some
gaps that are much larger than others. I guess that we could
directly test this by dumping out cur_random and computing
the gaps.
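
A rough sketch of that test in Python, run here on simulated
uniform values because no real dump is at hand. The function
names gaps and gap_summary are made up for illustration; with
real data, the cur_random column would replace the simulated
list. If the gaps are roughly exponential, about exp(-1), or
37%, of them should exceed the mean gap, and the largest gap
should be many times the mean.

```python
import math
import random


def gaps(keys):
    """Differences between consecutive values after sorting."""
    s = sorted(keys)
    return [b - a for a, b in zip(s, s[1:])]


def gap_summary(keys):
    """Simple statistics for comparing the gaps with an exponential law."""
    g = gaps(keys)
    mean = sum(g) / len(g)
    return {
        "mean": mean,
        "max_over_mean": max(g) / mean,
        "frac_above_mean": sum(1 for v in g if v > mean) / len(g),
    }


if __name__ == "__main__":
    # ASSUMPTION: simulated stand-in for a dump of cur_random values.
    rng = random.Random(0)
    keys = [rng.random() for _ in range(100_000)]
    print(gap_summary(keys))
    print("exponential predicts frac_above_mean near", math.exp(-1))
```
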
Since the bias (if it exists) is independent of everything
else (category, age, size, authors, etc.), I think it
would be no more than a curiosity, except for people who
are doing some kind of comprehensive statistical analysis.
Sorry if this is just a rerun of some earlier discussion.
I did look but couldn't find anything.
best,
Robert Dodier