The probability of displaying a "bad" page would be:
B q ((p B)^N - 1) / (p B - 1) + B (p B)^N
(modulo errors), where B is the fraction of bad pages, p is the
probability of repeating, q is the probability of displaying (so p+q =
1), and N is the allowed number of repetitions.
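As a sanity check on the expression above, here is a minimal Python sketch that evaluates the closed form and compares it with a term-by-term sum over the possible outcomes. The function names are made up for illustration. It assumes N counts redraws (so at most N + 1 draws in total) and that p B is not equal to 1.

```python
def p_bad_closed_form(B, p, N):
    """Closed form from the message above; assumes p * B != 1."""
    q = 1.0 - p
    r = p * B  # probability that a draw is bad AND we decide to redraw
    return B * q * (r**N - 1.0) / (r - 1.0) + B * r**N


def p_bad_direct(B, p, N):
    """Sum the individual cases in which a bad page ends up displayed."""
    q = 1.0 - p
    total = 0.0
    for k in range(N):
        # k bad-and-redraw rounds, then a bad draw that is displayed
        total += (p * B) ** k * B * q
    # all N redraws are used up: the final draw is shown even if bad
    total += (p * B) ** N * B
    return total
```

With N = 0 both functions reduce to B, and with p = 1 they reduce to B^(N+1), which is what one would expect if every bad draw is always retried.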
--
LF
On 23 August 2013 14:37, C. Scott Ananian <cananian(a)wikimedia.org> wrote:
This "make a second draw" approach would also let you tune how often you
saw the "bad" articles. That is, if it's a bad article, then flip a coin
to see if you should make a second draw. Repeat if the new article is bad,
but never make more than N draws. Someone with time on their hands and a
statistical bent could compute how often "good" and "bad" articles come up
as a function of the ratio of good and bad articles, the coin flip
probability, and the limit N.
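The procedure described here could be sketched in Python roughly as follows. This is only an illustration under assumptions: the articles live in an in-memory list, is_bad is a hypothetical predicate supplied by the caller, and max_draws is the limit N on the total number of draws.

```python
import random


def draw_with_redraws(articles, is_bad, p_redraw, max_draws, rng=random):
    """Pick an article, redrawing bad ones with probability p_redraw.

    Makes at most max_draws draws in total (the limit N).
    """
    article = rng.choice(articles)
    for _ in range(max_draws - 1):
        # Stop on a good article, or on a bad one when the coin says "keep".
        if not is_bad(article) or rng.random() >= p_redraw:
            break
        article = rng.choice(articles)
    return article
```

Setting p_redraw to 0 gives the plain uniform draw, and raising it toward 1 makes bad articles progressively rarer, up to the cap imposed by max_draws.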
--scott
On Aug 22, 2013 10:47 PM, "Lars Aronsson" <lars(a)aronsson.se> wrote:
On 08/23/2013 03:57 AM, Tim Starling wrote:
An approximation would be to select, say, 100 articles from the
database using page_random, then calculate a weight for each of those
100 articles using complex criteria, then do a weighted random
selection from those 100 articles.
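A minimal Python sketch of that two-stage idea, under stated assumptions: rng.sample on an in-memory, non-empty list stands in for the page_random database query, and weight is a hypothetical caller-supplied function returning a non-negative number per article.

```python
import random


def weighted_random_article(articles, weight, sample_size=100, rng=random):
    """Sample candidates uniformly, then pick one in proportion to weight."""
    k = min(sample_size, len(articles))
    candidates = rng.sample(articles, k)  # stand-in for the page_random query
    weights = [weight(a) for a in candidates]
    if sum(weights) <= 0:
        # No candidate has positive weight: fall back to a uniform pick.
        return rng.choice(candidates)
    # random.choices draws with probability proportional to the weights.
    return rng.choices(candidates, weights=weights, k=1)[0]
```

The expensive weight calculation runs only on the sampled candidates, not on the whole table, which is the point of the approximation.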
Interesting. An even easier/coarser approximation would be to make a
second draw only when the first draw doesn't meet some criteria (e.g.
bot-created, shorter than L bytes, lacks illustration).
On an average day, Special:Random (and its translation
Special:Slumpsida) seems to be called some 9000 times on sv.wikipedia.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l