[Foundation-l] Is random article truly random

Domas Mituzas midom.lists at gmail.com
Tue Oct 18 15:15:48 UTC 2011


Short answer: no

Long answer:

We have uneven chances for different pages to show up.
It is based on the idea that every page gets inserted at a discretely random position in a certain linear space, so you end up with a [[Poisson distribution]], which from a distance seems to return stuff randomly enough, but one page can have a 1000x higher chance of being returned than another.
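
To make the "linear space" idea concrete, here is a minimal sketch. It assumes (this is not spelled out above) that a lookup draws a uniform number r in [0, 1) and returns the first page whose stored position is at or above r, wrapping around to the first page if there is none. Under that assumption a page's chance is simply the length of the gap in front of it. The names pick_page and selection_probabilities are made up for illustration.

```python
import bisect

def pick_page(positions, r):
    """Index of the first page whose stored position is >= r.

    `positions` must be sorted. Wrapping to index 0 when r lies above
    every position is an assumption of this sketch.
    """
    i = bisect.bisect_left(positions, r)
    return i if i < len(positions) else 0

def selection_probabilities(positions):
    """Chance of each page = length of the gap that precedes it."""
    probs = []
    prev = 0.0
    for p in positions:
        probs.append(p - prev)
        prev = p
    # Draws above the last position wrap around to the first page.
    probs[0] += 1.0 - positions[-1]
    return probs

# Three pages: the second one sits right behind the first,
# so it owns only a tiny slice of the space.
positions = [0.10, 0.11, 0.60]
for page, prob in enumerate(selection_probabilities(positions)):
    print(page, round(prob, 4))
```

With two pages stored at exactly the same position, the later one in sort order gets a gap of zero in this sketch, which is the collision case: it can never be picked.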

Well, frankly, we have some pages that have an infinitely larger chance of being returned than others (there are over 1000 pages with random collisions, yay [[Birthday paradox]]), as the others don't have any chance at all. Some of the values we save with a precision of 12 decimals, others with 18 ;-)
So, the largest gap is 0.0001 whereas the smallest (collisions aside) is 0.0000000000001, so even among non-collision articles, the 'chance gap' can be a factor of a billion ;-)
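
The "billion" is just the ratio of the two gaps quoted above. A quick check, using Decimal so the long run of zeros is not subject to float rounding (the variable names are only for illustration):

```python
from decimal import Decimal

largest_gap = Decimal("0.0001")            # 1e-4
smallest_gap = Decimal("0.0000000000001")  # 1e-13, collisions aside

# How many times more likely the luckiest page is than the unluckiest one.
ratio = largest_gap / smallest_gap
print(ratio)
```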

So, if we put these numbers into buckets, we see that there are 1259 articles that have a 10x higher chance than 3.6M, which have a 10x higher chance than another 4.49M, which have a 10x higher chance than a poor set of 700k pages, which still have a 10x higher chance than 71k pages, which still have a 10x higher chance than 7k, which still have a higher chance than 700, which still have an infinitely larger chance than the remaining 1000, which will never show up on Special:Random.
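
For anyone who wants to try the bucketing on their own data, a rough sketch follows. It is not the query used for the numbers above; it assumes you have the stored positions as a list of floats, groups pages by the order of magnitude of the gap in front of them, and counts zero gaps (collisions) separately. The name bucket_gaps and the "never" key are invented here, and the wrap-around at the end of the space is ignored for simplicity.

```python
import math
from collections import Counter

def bucket_gaps(positions):
    """Count pages per power-of-ten bucket of the gap preceding them.

    Key -3 means a gap in [0.001, 0.01), and so on. Pages with a zero
    gap (an exact collision with the previous page) go under "never".
    """
    buckets = Counter()
    prev = 0.0
    for p in sorted(positions):
        gap = p - prev
        if gap > 0:
            buckets[math.floor(math.log10(gap))] += 1
        else:
            buckets["never"] += 1
        prev = p
    return buckets

# Tiny hand-made example: two pages collide at 0.5.
example = bucket_gaps([0.5, 0.5, 0.503, 0.9])
print(dict(example))
```

Each step from one bucket to the next is roughly a 10x difference in the chance of being returned, which is what the chain of counts above describes.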

I won't even go into how all this gets distorted by feature requests like 'give me a random page from category X'.

;-)

Cheers,
Domas


