[Foundation-l] Another look at bot creation of articles

Andrew Su asu at gnf.org
Wed Jul 23 18:35:06 UTC 2008


Lars,

Perhaps you and I are the only ones on this list who are interested, but
since I'm enjoying the discussion...

[snip]
> Now, in a growing wiki the country (the whole) is the knowledge
> that readers have, and which they could potentially write about.
> The strategic targets are the actual edits they will contribute,
> which is a lot smaller than the whole country. Planting a lot of
> stubs is carpet bombing, dropping stubs on various topics, hoping
> to find the topics of those future manual edits.  The 50% number
> in your report means that 50% of those future edits (targets) were
> hit by the stub carpet bombing.  But that number doesn't say
> anything of the precision of the carpet bombing.  How many of the
> planted stubs failed to attract any manual edits?

You're absolutely right that, as of this moment, the vast majority of
gene stubs have gone unedited.  But I'm sure everyone here recognizes
that there's a "critical mass" aspect to growth, and the recent
publication is only the first step in this process.  It's also worth
noting that there are tens (hundreds?) of thousands of graduate students
worldwide whose mission is to discover the function of a human gene or
genes.  I'm quite confident that, over time, >95% of those gene pages
will be edited.  Unfortunately, there's no good way to predict precisely
which gene pages will take off first, so we limited ourselves to the top
~9000 genes (ranked by number of citations).  Ultimately, this was a
threshold that EN WP was comfortable with, and I think it is a good
tradeoff between articles that stagnate in the short term and long-term
growth.

> That would be an interesting study, especially if you could repeat
> it with different size and quality of the stubs.

While the scientist in me says that this type of controlled experiment
would be interesting, practically speaking I don't think this is a
realistic project.  Why would any wiki community support purposely
creating suboptimal stubs?  The goal of this Gene Wiki effort (and I
have to assume all bot-creation efforts) is to create the best stubs
possible.  And then it's up to the community to decide whether the
benefit of creating those stubs (factoring in the potential for
downstream manual edits) is worthwhile.  

Cheers,
-andrew


