On Tue, Jan 22, 2008 at 05:52:52PM -0700, Todd Allen wrote:
On Jan 22, 2008 5:14 PM, Anthony wikimail@inbox.org wrote:
On Jan 21, 2008 10:33 PM, Ian Woollard ian.woollard@gmail.com wrote:
On 21/01/2008, David Goodman dgoodmanny@gmail.com wrote:
this seems a little circular.
It does *seem* so, but it's not circular, since nobody is made notable by noting themselves.
how do we tell who is notable in the first place?
Because other people note them in turn. ;-)
and how do we get out of this trap?
If you think about it, this is the same kind of problem faced by search engines. When you search for web pages, they give you what we can call, for the sake of this argument, the 'most notable' web pages that contain the words you're looking for, where notability is determined by how many web pages link to a page, how many link to the pages that link to it, and so on.
Which is the same thing, and it's a solved problem.
So in principle the same formal algorithms (e.g. PageRank) can be applied to the Wikipedia concept of notability (though in this case computed not over web pages but over all the books, films, magazines, people's comments, etc.). And we would get an unambiguous number that corresponds to notability.
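A minimal sketch of the kind of calculation described above, assuming a toy "who mentions whom" graph. The node names, the damping factor of 0.85, and the fixed iteration count are illustrative assumptions, not anything Google or Wikipedia actually uses. It is a plain power-iteration form of PageRank and only yields relative scores.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over a 'source mentions target' graph."""
    # Collect every node that appears as a source or as a target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)

    # Start with rank spread evenly over all nodes.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes that mention nobody is redistributed evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        # Each source splits its rank equally among the things it mentions.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical example: two sources mention subject_a, one mentions subject_b.
mentions = {
    "magazine": ["subject_a", "book"],
    "book": ["subject_a"],
    "blog": ["subject_b"],
    "subject_a": [],
    "subject_b": [],
}
ranks = pagerank(mentions)
for name, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

In this toy graph subject_a comes out ahead of subject_b because more sources, and better-ranked ones, mention it. The scores are only meaningful relative to each other.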
Seems to me that would correspond more to popularity than to "notability". These two concepts are different, right?
And what's the cutoff which qualifies as "notable enough"? Google's PageRank works because there's no cutoff. If I type in "Wikipedia", I get the page with the highest PageRank for "Wikipedia". But if I type in "Capriccio" I get the page with the highest PageRank for "Capriccio". I don't get a message saying "Sorry, no links for 'capriccio' are notable enough".
Of course, in the real world we aren't running the algorithm; we expect editors to more or less know who and what is notable and who and what isn't, and it may look very different at first. But I think if you look at what people are doing, it amounts to essentially the same idea as what Google does with web pages, only run in people's heads in a distributed way: they keep track of the most notable things in the subjects they are interested in, in much the same way.
Right?
I think some people are treating notability that way, and I think this comparison to Google is a good example of why it's such a bad idea.
WikiEN-l mailing list WikiEN-l@lists.wikimedia.org To unsubscribe from this mailing list, visit: http://lists.wikimedia.org/mailman/listinfo/wikien-l
Google is meant to be indiscriminate and to index everything (or at least everything which a robots.txt or the like doesn't explicitly ask it not to). Wikipedia is explicitly not intended to be so.
So, what do we ask? "How much independent reliable source material is available on this subject?" If the answer is "a large amount", we write a full article. If the answer is "a little bit", we might have a suitable list or parent article in which to make a mention. If the answer is "none", we write nothing. All of these are in keeping with our core policies. "No original research" indicates that we don't second-guess reliable sources, just follow their lead. If their lead is "this isn't important enough to write about", we don't say "Well, they're wrong"; we follow their lead and don't write. Undue weight indicates much the same thing: sources indicate how much weight we give something. If they decide "very little", we write very little. If they decide "none", we write nothing. We don't decide; reliable and independent sources do. Period. That's our metric. We don't need any other. Nice, simple, and no need for editors to decide at all.
As to determining reliability, this is a solved question. Nature and Science are reliable sources. The Weekly World News is not. The New York Times is a reliable source. A tabloid sensationalist rag is not. Occasionally there might be an edge case, but in most cases we can use simple metrics. Is the source widely regarded as reliable? Is it written, peer-reviewed, editorially controlled, and/or fact-checked by professionals? Is it cited by other sources known to be reliable? Again, we let others decide, we don't need to do so ourselves, and we -shouldn't- be doing so ourselves.
The above is, in my opinion, very over-simplified. It is also subjective. Yes, we can recognise really good, reliable sources and really bad, unreliable ones, but most of the time we are in the middle.
The real problem is thinking that sources determine whether we write about something, rather than that we need sources in order to write. A current AfD is interesting:
http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Deletionist_ver...
The keeps are saying there are sources. The deletes are saying yes, but it is not encyclopedic. I !voted keep, as the notability guideline agrees with you: if there are good sources, we have an article. However, I still do not think this article is appropriate. We need to improve the inclusion criteria so that we include what is worthy of being in an encyclopedia and reject what is not. Just going by sources leads to lots of crap on Wikipedia, and perhaps to rejecting material that does not have conventional sources.
We need to think hard about this and not just accept over-simplified solutions.
Brian.
-- Freedom is the right to say that 2+2=4. From this all else follows.