[WikiEN-l] A quick references survey

Steve Bennett stevage at gmail.com
Mon Jan 30 21:42:12 UTC 2006


[also originally sent yesterday but didn't seem to get through the
moderator, perhaps due to my mail server?]

Hi all,
  I'd like to undertake a more thorough survey of Wikipedia
referencing standards, but I've started with a quick "pilot" study.

Methodology: Click "random article". Discard results which are not
articles. For each remaining page, count the number of "external
links", "references" and "paragraphs".

Terms: A bit fuzzy. I'm treating a web page which gives more
information as an "external link", and a page or book or whatever
which is claimed to be the source of the information (or is clearly
the source) as a "reference". Paragraphs are, well, paragraphs, but it
must be said that longer articles generally have longer paragraphs
than shorter ones do, so lines would probably be a better measure of
length...

Preliminary results:
Sample size: 30 pages, of which 17 were stubs.

Number with no links: 21
Number with no references: 24
Average number of links: 0.67
Average number of references: 0.54
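
For anyone who wants to repeat the tally, here is a minimal sketch of
how figures like the ones above can be computed from per-page counts.
The PageCount record, the summarize() helper and the four sample rows
are all made up for illustration; they are not the 30 pages I actually
looked at.

```python
from dataclasses import dataclass


@dataclass
class PageCount:
    """Counts recorded by hand for one randomly chosen article (hypothetical record)."""
    title: str
    external_links: int
    references: int
    paragraphs: int


def summarize(pages):
    """Return the same summary figures as the preliminary results."""
    n = len(pages)
    if n == 0:
        raise ValueError("no pages sampled")
    return {
        "sample_size": n,
        "no_links": sum(1 for p in pages if p.external_links == 0),
        "no_references": sum(1 for p in pages if p.references == 0),
        "avg_links": sum(p.external_links for p in pages) / n,
        "avg_references": sum(p.references for p in pages) / n,
    }


# Made-up rows, for illustration only (not the survey data).
sample = [
    PageCount("Hypothetical page A", 0, 0, 2),
    PageCount("Hypothetical page B", 2, 0, 5),
    PageCount("Hypothetical page C", 0, 1, 1),
    PageCount("Hypothetical page D", 1, 3, 8),
]

summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Recording the paragraph count alongside the other two would also make
it possible to report links or references per paragraph later on.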


I found very few book references, one of which was patently false
("James Maxwell's book of James Maxwells not as cool as me, by James
Maxwell"). Similarly, a list of newspaper articles turned out to have
all been written by the subject (a journalist). Only one page (out of
30) actually gave ISBN references (Chepstow Bridge).

Conclusions:
None yet, really, since the methodology isn't very solid and the
sample set is small. But notably: more than half the articles were
stubs. Hardly any articles had any real "references". Most of the
external links were band websites, company websites, etc. Of the few
references, one was blatantly false and a few were "bad". So it's
probably a little early to be claiming that all material added to
Wikipedia MUST be sourced or it will be removed, because based on
this, only around 15% of Wikipedia would survive (which is more than
I would have predicted).
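
To give a feel for how little a sample of 30 pins down, here is a rough
sketch that puts a 95% Wilson score interval around the share of pages
with at least one reference (30 sampled minus 24 with none). It assumes
the 30 pages behave like a simple random sample, which "random article"
may or may not deliver, and it uses the raw share of pages with any
reference rather than my rougher 15% figure. The wilson_interval()
helper is my own illustration, not a library function.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


sampled = 30                 # pages in the pilot sample
with_no_references = 24      # pages with no references at all
with_references = sampled - with_no_references

low, high = wilson_interval(with_references, sampled)
print(f"{with_references}/{sampled} pages had a reference; "
      f"95% interval {low:.0%} to {high:.0%}")
```

On these numbers the interval runs from roughly 10% to 37%, so a much
bigger sample is needed before quoting any percentage with confidence.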


Any suggestions for improved methodology? It might be nice to harness
the Wikipedia population to collect some more general article quality
metrics...

Steve


