On 2013-05-06 20:08, Paul A. Houle wrote:
From my viewpoint, biases are an issue of statistical sampling.
It's not just the sampling that matters; how you process the samples to
draw conclusions is just as important. Except in the case of a
hypothetical clone army where each clone lived exactly the same
experiences in the same order, I doubt humans will ever come up with the
very same biases. Also, even a large statistical sample can't protect
you from "black swan" occurrences.
Wikipedia is an encyclopedia by humans for humans, so of course it
has an anthropocentric background, in which the mass of all the
concepts swirling around the Earth like an atmosphere curves the
graph, keeping the Sun in orbit around our world.
I don't think that humans can be "objective", in the sense of inferring
assumptions from non-subjective, unbiased experiences. That said, it's
clear to me that we are able to build conceptual models that can more or
less successfully make predictions (provided that we trust our own
memory and our ability to compare memory with sensory perceptions). So
as I understand it, while we tend to be anthropocentric, this cognitive
bias is the first epistemological barrier that, to my mind, must be
overcome. Copernicus and Darwin, for example, represent two important
steps away from anthropocentric suppositions.
I find Wikipedia categories useful today, warts and all. They've
got two things going for them:
(1) Class and out-of-class dichotomies are the atom of ontology.
Well-designed categories have an operational definition that allows
class members to be determined with practically perfect precision
To my mind, there's no such thing as perfection or any absolute
phenomenon, or at least there is no way to establish the actual reality
of such a thing through our subjective experiences. In my humble
opinion, well-designed categories help you find information that is
useful for your current situation.
(2) They are densely populated.
Look at the categories on this guy's web page
http://en.wikipedia.org/wiki/Arnold_Schwarzenegger
each one of those categories states a useful and correct fact, even
if the organization of those facts is entirely haphazard.
For instance, it would be better if he was coded as an "American"
and an "Austrian", "Californian", "Los Angelino" and he is also a
"Bodybuilder" and an "Actor" and a zillion other things and then infer
that he was an "American Bodybuilder", "Austrian Actor" and such. But
it's not that easy because he was an "Austrian soldier" but not an
"American soldier" and I'd feel uncomfortable calling him an "Austrian
Politician". A lot of nuance is encoded in that sticky mess.
Why would it be better to code it another way if that leads to a
situation where you raise the chances of inferring invalid assertions?
Just because an expression is composed of several words doesn't mean
that you can get the same meaning from the separate words, just as you
can't get the meaning of a word from each of its letters. Now, you will
probably want the "Austrian Politician" to be categorized in the
"Politician" category. What's wrong with that?
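
To make the point above concrete, here is a minimal Python sketch. The
labels come from the example in this thread; the sets and the
head_category helper are purely illustrative assumptions, not something
Wikipedia or any library provides. It suggests that recombining atomic
tags produces assertions nobody ever stated, while going from a compound
category to its head category stays safe.

```python
from itertools import product

# Compound categories taken from the example in this thread.
stated = {"American Bodybuilder", "Austrian Actor", "Austrian Soldier"}

# Hypothetical "atomic" re-coding of the same person (an assumption).
nationalities = {"American", "Austrian"}
occupations = {"Bodybuilder", "Actor", "Soldier", "Politician"}

# Naive recombination: every nationality paired with every occupation.
recombined = {f"{n} {o}" for n, o in product(nationalities, occupations)}

# Assertions produced by recombination that no stated category backs up.
unsupported = sorted(recombined - stated)

def head_category(label):
    """Return the last word of a compound label (illustrative rule only)."""
    return label.split()[-1]

# The other direction is safe: each compound entails its head category.
heads = sorted({head_category(c) for c in stated})

print(unsupported)  # includes "American Soldier" and "Austrian Politician"
print(heads)
```

The unsupported list is where the trouble lies: some of its entries may
well be true, but nothing in the stated categories lets you tell those
apart from the false ones.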
It's very easy to analyze those categories and produce desired
concepts like "Car" and "Bodybuilder" from junky categories like
"Front-wheel drive vehicle," "General Motors Concept Cars",
"Bodybuilder Actor" and "Actor Bodybuilder", in fact, that's exactly
what the semantic web is for.
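
As a rough illustration of that kind of analysis, here is a minimal
Python sketch. It is only a crude keyword-matching stand-in, not a
semantic web technique; the concept_keywords table and the concepts_for
function are hypothetical, and treating every "vehicle" as a "Car" is a
simplifying assumption made for this example.

```python
import re

# Hypothetical table: target concept -> keywords that signal it.
concept_keywords = {
    "Car": {"car", "cars", "vehicle", "vehicles"},
    "Bodybuilder": {"bodybuilder", "bodybuilders"},
    "Actor": {"actor", "actors"},
}

def concepts_for(category):
    """Return the target concepts whose keywords appear in a category label."""
    words = set(re.findall(r"[a-z]+", category.lower()))
    return {c for c, kws in concept_keywords.items() if words & kws}

junky = [
    "Front-wheel drive vehicle",
    "General Motors Concept Cars",
    "Bodybuilder Actor",
    "Actor Bodybuilder",
]

for label in junky:
    print(label, "->", sorted(concepts_for(label)))
```

Word order does not matter here, so "Bodybuilder Actor" and "Actor
Bodybuilder" map to the same pair of concepts.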
There is so much rich and precise information in the categories that
you get great results despite sampling error caused by low recall in
the categories.
I'd love to see better structure, but not at the cost of fact
density or precision.
If we can take advantage of the knowledge in the graph to exert
gentle pressure that improves categorization in Wikipedia that would
be great. It's definitely time for the social industry to move beyond
"tags".
Do you have examples of what you would like to be able to do that you
can't do in the current situation? I mean, you could even create a
"categories that are valid for my special purpose" category.
--
Association Culture-Libre
http://www.culture-libre.org/