On Sat, Aug 24, 2013 at 6:38 PM, Maarten Dammers <maarten(a)mdammers.nl> wrote:
> If you compare our current implementation to wheel of fortune [1], all
> our articles are evenly spread around. Weighted would be putting bot
> articles closer to each other so you would hit them less often. You
> just need a good algorithm to calculate this distribution.
You could implement this algorithm as a MediaWiki extension that writes
page_random with this skewed distribution. That way you don't need to
change the database schema, just the logic that runs at page save.
As a simple implementation: normally articles are saved with a random sort
key between 0 and 1. If bot articles were instead saved with a random sort
key between 0 and 0.1, they would be seen by Special:RandomPage roughly a
tenth as often as before. (I.e., if there was previously a b% chance of
getting a bot page from Special:RandomPage, there would now be roughly a
(b/10)% chance of getting a bot page.)
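The scheme above can be sketched as a toy simulation. This is not MediaWiki
code: assign_page_random stands in for a hypothetical save-time hook,
bot_scale is an assumed tuning knob, and special_random models
Special:Random's usual behavior of returning the first page whose
page_random is at or above a uniform threshold, wrapping around if none
qualifies. The exact reduction factor depends on the bot/non-bot mix, since
crowded bot keys mostly shrink each other's gaps.

```python
import random

def assign_page_random(is_bot, bot_scale=0.1):
    """Hypothetical save-time hook: bot pages get a sort key in
    [0, bot_scale) instead of [0, 1). bot_scale is an assumed
    parameter, not an existing MediaWiki setting."""
    return random.random() * (bot_scale if is_bot else 1.0)

def special_random(pages):
    """Simplified model of Special:Random: draw a uniform threshold
    and return the page with the smallest page_random at or above
    it, wrapping around to the lowest key if none qualifies."""
    r = random.random()
    above = [p for p in pages if p["page_random"] >= r]
    pool = above if above else pages  # wrap-around case
    return min(pool, key=lambda p: p["page_random"])

random.seed(42)
# Half the wiki is bot-created: previously a 50% chance per draw.
pages = [{"bot": b, "page_random": assign_page_random(b)}
         for b in [True] * 500 + [False] * 500]
hits = sum(special_random(pages)["bot"] for _ in range(20000))
print(f"bot fraction: {hits / 20000:.3f}")  # well below 0.5
```

Running this shows the bot hit rate dropping far below the unweighted 50%,
without any schema change: only the stored sort keys differ.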
--scott
ps. If you want to be numerically precise, you need to be more careful with
the edge conditions. For example, if there are no non-bot articles, then
90% of the time the algorithm will wrap around and choose the lowest-sorted
bot article, which would warp the expected distribution. It would be more
correct to "re-roll the dice" in that case, which introduces an extra term
into the probability and ends up resolving the apparent contradiction when
b = 100%.
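The re-roll fix can be sketched the same way. This is a toy model, not
MediaWiki code: instead of wrapping around when the threshold lands above
every key, we draw again, so the lowest-keyed page never inherits the whole
empty [0.1, 1) gap.

```python
import random

def special_random_reroll(keys):
    """Variant of the Special:Random model that re-rolls instead of
    wrapping when the uniform threshold lands above every sort key.
    keys is a list of page_random values."""
    while True:
        r = random.random()
        above = [k for k in keys if k >= r]
        if above:
            return min(above)

random.seed(1)
# All-bot wiki: every sort key is compressed into [0, 0.1).
bot_only = [random.random() * 0.1 for _ in range(100)]
pick = special_random_reroll(bot_only)
print(0.0 <= pick < 0.1)  # a page from the populated region
```

With re-rolling, the all-bot case trivially returns a bot page on every
draw, matching the obvious 100% answer rather than the naive b/10 estimate.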
--
(http://cscott.net)