[Wikipedia-l] Page deletion policy

Michael R. Irwin mri_icboise at surfbest.net
Mon Aug 26 08:32:11 UTC 2002

Toby Bartels wrote:
> Fredbauder wrote regarding:

<snip preliminaries and background>

>(or whatever the title is, I can't access Wikipedia right now).
>   I argue that deletion is better than a lousy stub like
>   "A large city in southern [[Arizona]].".
>   A ''good'' stub is better still than deletion,
>   but that takes some work, stub though it may be,
>   and people not willing to do work for their stubs
>   shouldn't write them, IMO.

This may be a primary point for discussion.

IMO We should be grateful for the smallest quantum
of information a contributor provides and create
easy opportunities to provide it.

Why?  Because small quantums lead to repeat quantums
which lead to steady streams of quantums.  If a 
stream of quantums exists then there exists a finite
time in which a significant packet by any measure
will accumulate.  Further, if it can be shown that
steady accumulation of quantums and packets leads
to exponential growth in the number of quantum 
providers, the rate at which they contribute, or
the summed total product, then there exists a growth rate
at which our goal of deep, broad, and reliable content can be achieved
within human lifetimes, absent any limiting scaling
factors which lead to loss of useful participation as
fast as or faster than it grows.

A leading online free Encyclopedia which has depth,
breadth, and reliable content will require massive
effort to create.

Consider the 200 million people currently connected to the internet,
most of them fairly well educated and over half of them
literate in English.  If one million wander by in
2004 four or five times to look at Wikipedia material
and one percent of them leave a link, correct an error,
or write a sentence on 20 percent of their visits then we
will have collected approximately:

1 percent of 1 million = 10K

(10K)*(.33)*(1 applicable link)

- (10K)*(.33)*(1 error)

+ (10K)*(.33)*(100 byte sentence)

= + 3,000 links - 3,000 errors + 300K content
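
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the 2004 estimate. It assumes, as the figures above imply, that each contributing visitor ends up making roughly one contribution over the year (four or five visits at 20 percent each) and that contributions split evenly three ways. The function and parameter names are invented for illustration. An even split gives about 3,300 per kind, which the totals above round to 3,000 and 300K.

```python
def estimate_2004(visitors=1_000_000, contributor_percent=1,
                  kinds_of_edit=3, bytes_per_sentence=100):
    """Rough yearly totals from casual drop-in visitors.

    Assumption: each contributor makes about one contribution in the year,
    spread evenly over the three kinds of edit named in the post.
    """
    contributors = visitors * contributor_percent // 100
    per_kind = contributors / kinds_of_edit
    return {
        "contributors": contributors,
        "links_added": per_kind,
        "errors_fixed": per_kind,
        "content_bytes": per_kind * bytes_per_sentence,
    }


if __name__ == "__main__":
    for name, value in estimate_2004().items():
        print(f"{name:>14}: {value:,.0f}")
```

Changing `contributor_percent` or `visitors` shows how quickly the totals move with the participation rate.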

This twikification is apparently a minor influence
compared to regular contributors, unless we find
in 2005 that of 1.5 million visitors, 500K are now repeat
visitors and 2 percent of these are likely to create
a paragraph, insert the 5 best links from a Google search,
edit an article for readability, insert crosslinks between
specialized areas of personal interest, or research
and write one stub to fill in a blank area of intense
personal interest.

Now we have in addition to the above for the first
million, something like this for the 500K:

(2 percent) * (1/5 probability of contribution) * 500K = 2K/category

+ (2K) * (500 bytes/paragraph)         or 1 MB
+ (2K) * (5 best links)                or 10,000 links
+ (2K) * (1 readability improvement)   or 2,000 articles improved
+ (2K) * (5 crosslinks)                or 10,000 new crosslinks
+ (2K) * (1 researched stub)           or 2,000 stubs
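
The same tally as a short Python sketch, assuming the 2 percent of repeat visitors who contribute spread evenly over the five kinds of contribution listed. Names are hypothetical; the numbers are the ones in the post.

```python
def estimate_2005(repeat_visitors=500_000, contributor_percent=2, categories=5):
    """Rough yearly totals from repeat visitors, one contribution each.

    Assumption: contributors divide evenly among the five categories.
    """
    per_category = repeat_visitors * contributor_percent // 100 // categories
    return {
        "per_category": per_category,
        "paragraph_bytes": per_category * 500,   # 500 bytes per paragraph
        "external_links": per_category * 5,      # 5 best links each
        "articles_improved": per_category,       # 1 readability edit each
        "crosslinks": per_category * 5,          # 5 crosslinks each
        "researched_stubs": per_category,        # 1 stub each
    }


if __name__ == "__main__":
    for name, value in estimate_2005().items():
        print(f"{name:>18}: {value:,}")
```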

Still fairly minor compared to the 100K plus articles
we should easily have by 2005.  But consider 2006.

Assume approximately the same 500K visitor growth, to
2 million.
One million of whom are new casual drop-ins, same
as above.
500K are repeat visitors of whom 2 percent become 
contributors same as above.

250K of whom 1 percent are repeat significant contributors 
who do at least one of the following:

1.) adopt a couple of articles of interest
     (weekly error correction, augmentation and fast response to
      questions on talk page)
2.) write a couple of draft stubs 
     (5 plus paragraphs, 2 external ref., 2 links)
3.) research and improve a stub significantly 
     (5 plus more paragraphs, minus 2 errors, plus 2 ext. ref, 2 open
      and 1 paragraph of brilliant prose)
4.) edit several articles for readability
     (minus 5 comma splices, minus 2 errors, minus 2 typos, plus some
5.) contest or question several facts for checking
     (figure 25% are quickly answered, 25% stimulate discussion, 50%
6.) present a significant (say over 1%) minority view for inclusion
     (Priceless! New content by definition! .... under 1% are run off as
7.) read a cluster of specialized articles of interest and crosslink
     (100 cross links, 50 percent inappropriate or errors, 50 useful)
8.) insert an index list of detailed subtopics within a subject
     (??? attractive to students or specialists???)
9.) add some highly specialized detail to a topic of professional
     (say ten pages of highly technical gobbledygook ... ala Alex 8) )
10.) refactor a lengthy article into a summary article with links to new

1 percent (250K) times .10 equals 250 contributors in each of ten

1.  (250) * 2 adopted articles   or 500 newly adopted articles
2.   + (250) * 2 stubs           or 500 draft stubs  or 250K content
3.   + (250) * (5para)(500bytes) - 250(2 errors) + 250(2 ext.ref.) + (2
open queries)

laziness sets in.   

The point is that contributing volunteers grow into 
contributing in their own way from the initial starting point
of a single edit or contribution through whatever path they
choose of a myriad of potential ways to contribute.

A few poor stubs lying around to help new readers become
contributors should pay off handsomely in the long run, if
poor stubs help entice an initial contribution.  What could
be less threatening than an almost empty page with one
sentence describing your home town or a city you have driven
through?  Later when we have thousands of regular contributors
the poor stub contribution will fall off because poor stubs
are hard to find.   Right now with fewer than 200 regular
contributors and fewer than a thousand repeat contributors,
any growth in the number of contributors is very significant.

People put off by the incomplete work in progress can be
impressed later, after the contributing community has grown
enough to bring the content easily reached from the main page
up to the quality needed to impress non-contributing
consumers.

> # What should we do with a page that has a useless stub?
>   I mean a page that gives only a definition and a WWW link, or less.
>   This may give some information to some people, to be sure,
>   but I think that it does more harm than good not to be linked with "?".

Your argument here seems to be that the incomplete material
is hard to find.   This may be true for a dedicated writer looking
for new empty titles to write under.

I assure you that many attempting to use the reference material
specific to the topic in question will encounter the incomplete
material.  Murphy's law applies here.  Since you wish to augment the
material, it is hard to find; since they wish to use it, the incomplete
material manifests quickly via a search or an excellent branch index link.
Hopefully someone will choose to edit, to at least
add a quantum of information.

>   An example was [[Pluto (god)]], which I voted for deletion
>   about a week ago.  (The page is now perfectly fine.)
>   This was purely for the sake of being provocative,
>   and I don't intend to start deleting such things now.
>   I'm undecided about what we should do in these cases;
>   perhapse it's sufficient to make sure that they're
>   listed on [[Wikipedia:Find and fix a stub]]?

Perhaps the policy (recommendations for voluntary participation
methods) could/should be to add at least one of
the following:

1. one sentence
2. one external link
3. one internal link
4. other easy or tiny quantities of information

It seems to me that this is within an order of magnitude
of the effort required to request deletion.

With several hundred Wikipedians browsing around following
this policy the stubs should start growing even if no one
chooses to do detailed and extensive research or real work
in augmenting them.

>   I'd listen to people that use that page; I don't.
>   The only thing that I'm certain about is that they shouldn't be ''started''.
>   Every stub writer should read [[Wikipedia:How to write the perfect stub]]
>   (or whatever it is); writing good stubs is not the ''easy'' way out.
>   And "A large city in southern [[Arizona]]." is not "work".

So why should Wikipedia contribution be work?  Contributors
will either enjoy contributing or leave.  What are you going
to do in a few years when new stubs to initiate with a high
quality new draft are scarce and take serious work to find?  
If Wikipedia becomes successful as a deep, broad, reliable 
resource then contributions beyond wikification are going to 
become increasingly difficult to make.

For example:  I recently became interested in neural nets and
started reading up a bit on neuron organization in the human
brain.   Somebody has done a pretty good job in this area
already on the Wikipedia, at least to a layman's summary and
fundamentals level.   The best contribution I could easily and
reliably make to the "neuron" article after a few hours of
technical reading was a question on the talk page regarding
the precise nature of synapse connections between adjacent
neurons.  I may answer this in the near future or a neurologist,
physiologist, or biologist may wander by and answer with an 
appropriate edit. Alternatively another nonspecialist may
research it extensively and decide that they can answer the
question definitively or at least clarify it reliably.

Does the fact that a specialist could spot and rephrase this 
sentence (that I find ambiguous) more accurately with trivial effort 
make it an insignificant contribution?   I do not think so.  It 
affects how I model my "neuron" class/object in a potential
future neural network I may write to toy around with.  This
makes it highly significant to me.  It is also of significance
to every future reader, whether it is in error or merely unclear, 
until it is corrected. A single quantum of information, yet a 
highly significant contribution waiting to happen.

Coquille is a small town in Oregon.   Is this a trivial stub
which should be deleted?   Not if a fellow Coquille H.S.
graduate shows up, adds a description of the annual "Gay 90s"
parade, and sticks around to help fill details on the Coquille
River valley and watershed.  Particularly if a third Coquille 
H.S. graduate shows up, adds the years Coquille took the OR state 
football and track championships (to the new stub on OR state
championships and Coquille's article), notes that a graduate 
competed in the Olympic trials but did not make the cut, adds 
some non classified detail on the stealth bombers he flies and
drops some chit chat on my talk page.  If I let him know the
other person has an account and that he should drop a message,
suddenly we have an asynchronous 3-way social event.  It is more 
likely that all participants will be repeat users and contributors
and very possibly could attract further attention from other
Coquille residents or graduates or current friends and 

Coquille is a town of 5,000 and I know many people that
could add useful detail to the Wikipedia.  Tucson is a
city of how many people with how many H.S. graduates?
I think we can live with a stub until somebody from Tucson
shows up to expand it and correct errors.  Likewise a 
network of detailed technical stubs related to neurons .... 
over 50 neurotransmitters! .... this impending list could act
as an attractor for neurology students who could answer
my question in a timely manner.  I can hear it now:
"Wikipedia is not a technical almanac, leave this to the
human genome or proteome project."  Perhaps if it
were, at some appropriate level and appropriately accessed,
a dictionary or almanac, it would have a correct, unambiguous
explanation of how neurons intercommunicate.  Surely a
neurology student looking up neurotransmitters for some
obscure reason would have fixed the layman's overview
by now?  Perhaps if each neurotransmitter had a few links
to online technical papers it would attract some attention
of neurology students stranded away from technical libraries
over Christmas break?

In conclusion/confusion, let us quickly reexamine that "lousy" stub:

I presume a good title such as:
Tucson, Arizona or whatever the current style is

"A large city in southern [[Arizona]]."

What is the precise population?  Census data is available
online and via public signs at the city boundary. A
resident or former resident could likely add a lower
bound accurate to within a few thousand.

large - qualitative subjective description.  There is room 
for improvement but it has significant meaning.
Large for Arizona or for New York?  For the Western US or
the Eastern Seaboard?  Largest in Arizona or merely fifth largest
in the state?

city - not a town, county, or a military base.
Are there any military bases nearby?  Is it in the
fallout patterns for nuclear tests conducted in the
50s and 60s?  Any historically significant cattle 
trails, military confrontations, events?  Where did
the name Tucson come from?  Was the city built on
top of ancient ruins or located near a river or
oasis also used by ancient peoples?

southern - bottom half
Is it on the Mexican border?
Is it affected by weather patterns off the Gulf of Mexico?
Any famous flash floods or other disasters?
Does it still have an artificial lake with a wave machine?

Arizona -  one of fifty states, a link to further information
What other links are appropriate?  Sister cities, trade
partners, famous monuments or nearby national parks, etc.?

It (the "lousy" stub) seems like a fine starting point to me for
anyone wishing to add a quantum or few of information in whatever
applicable units, directions, or subjects they prefer.  Each quantum 
they add can presumably be linked in some way to multiple other
quantums.  Leading to grand unified quantum entanglement
theory or something similarly silly sounding (related to
probability of spontaneous source apparition and retention 
or repeat emission capture) yet critical to nurturing exponential 
growth in the size of the contributing community.

Well, I have enjoyed chattering with you immensely about
this qualitatively significant issue but I think I shall go
fool around elsewhere a while.  Ta ta.

Seriously!  Small changes in exponential growth curves
are tremendously significant.  Sign changes are even
more dramatic.  If the stubs are not significantly
damaging your productivity, I think we should leave them;
they should expand fairly quickly if they have good 
high quality article titles for linking from elsewhere.
If they prompt a new user to contribute something while
waiting for expansion to serious draft or high quality
stub/article status, then they will have served a pivotal purpose
in leveraging our community participation growth curve up a bit.
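
To put rough numbers on that sensitivity, here is a minimal Python sketch assuming simple yearly compounding. The starting figure of 200 regular contributors is the one mentioned earlier in this message; the growth rates and the function name are made up purely for illustration and are not measurements of anything.

```python
def contributors_after(years, start=200, yearly_growth=0.5):
    """Community size after compounding at a fixed yearly rate.

    Assumption: the rate stays constant, which real communities rarely manage.
    """
    return start * (1 + yearly_growth) ** years


if __name__ == "__main__":
    # Compare a baseline rate, a slightly higher rate, and a negative rate.
    for rate in (0.5, 0.6, -0.1):
        size = contributors_after(10, yearly_growth=rate)
        print(f"{rate:+.0%} per year -> {size:,.0f} contributors after 10 years")
```

Under these assumed rates, a modest bump in the yearly rate compounds into a much larger community after a decade, while a negative rate shrinks it below where it started.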

A second contribution is much less scary than the first one.

Mike Irwin
