On 23/08/13 10:48, Lars Aronsson wrote:
> But it is not obvious how a bug report or feature
> request should be written. A naive approach would be
> to ask for a random article that wasn't created by a
> bot, but this is not to the point.

That was my solution when this issue came up on the English Wikipedia:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/4256
The configured SQL excluded pages most recently edited by Rambot.
Derek Ramsey was opposed to it, since he thought his US census stubs
deserved eyeballs just as much as any hand-written article, but IIRC I
managed to get this solution deployed, at least for a year or two.

> Users want bot generated articles to come up, only
> not so often. And some manually written article stubs
> are also less wanted.
> Perhaps the random function should be weighted by
> article length or by the number of page views? But is
> it practical to implement such a weighted random
> function? Are the necessary data in the database?

It would not be especially simple. The existing database schema does
not allow weighted random selection. A special data structure could be
used, or it could be implemented (inefficiently) in Lucene.
An approximation would be to select, say, 100 articles from the
database using page_random, then calculate a weight for each of those
100 articles using complex criteria, then do a weighted random
selection from those 100 articles.
Article length is in the database, but page view count is not.
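
A rough sketch of the approximation described above, in Python rather than MediaWiki's PHP, purely for illustration. It assumes an in-memory list of dicts standing in for rows of the page table, with page_random and page_len fields. The helper names (pick_candidates, weight, weighted_random_page) and the length-based weight with its cap of 10000 are hypothetical; real criteria could be as complex as needed.

```python
import random


def pick_candidates(pages, sample_size, rng):
    """Stand-in for a page_random lookup: draw a random start value and
    take the next sample_size pages in page_random order, wrapping
    around to the beginning if the end is reached."""
    ordered = sorted(pages, key=lambda p: p["page_random"])
    start = rng.random()
    after = [p for p in ordered if p["page_random"] >= start]
    before = [p for p in ordered if p["page_random"] < start]
    return (after + before)[:sample_size]


def weight(page):
    """Hypothetical weighting: proportional to article length, clamped
    so that empty pages keep a small chance and huge pages do not
    dominate."""
    return max(1, min(page["page_len"], 10000))


def weighted_random_page(pages, sample_size=100, rng=None):
    """Select sample_size candidates, weight each one, then do a
    weighted random selection among those candidates only."""
    if rng is None:
        rng = random.Random()
    candidates = pick_candidates(pages, sample_size, rng)
    weights = [weight(p) for p in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Only the candidate set is weighted, so the cost per request stays bounded by sample_size no matter how many pages exist. The result is an approximation, not an exact weighted draw over all pages.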
-- Tim Starling