Re: [Wikidata-l] Question about wikipedia categories.

7 May 2013


      On 2013-05-07 15:50, Paul A. Houle wrote:
...
Statistical methods can deal with black swans,  but you've got to get
away from normal distributions and also model the risk that your model
is wrong.
Do you have recommendations on things I could read on this topic? To me 
it seems hard to evaluate a probability like "I exist", when probability 
is something which comes well after one's own existence in an "existential 
chain". Of course, let's suppose that I do not exist, but some 
"existentialism demon" fools me with the illusion that I do. Then I don't 
exist, so "I" can't worry about existence, since "I" don't exist in the 
first place. Now how would you evaluate the chance that "I" exist? 1/2, 
1?
...
Since training sets come from the same place sausage comes from,
training sets in machine learning rarely teach the algorithm the
correct prior distribution of the class.  Punch a new prior into the
system and it will perform much better.
Once again, do you have recommendations on things I could read on this 
topic?
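
For readers following the quoted point about priors, here is a minimal sketch of one common way to "punch a new prior" into a classifier that already outputs class probabilities: rescale each probability by the ratio of the new prior to the training prior, then renormalize. This is an illustration under stated assumptions, not necessarily the method meant in the quoted message; the function name `reweight_posterior`, the class labels, and all the numbers are hypothetical.

```python
def reweight_posterior(posterior, train_prior, new_prior):
    """Rescale class posteriors from the training prior to a new prior.

    Assumes the class-conditional distribution p(x | class) is unchanged
    and only the class frequencies differ between training and deployment.
    """
    unnormalized = {
        label: posterior[label] * new_prior[label] / train_prior[label]
        for label in posterior
    }
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}


# Hypothetical example: a training set balanced 50/50, while the class
# "spam" is assumed to make up only 10% of the real population.
posterior = {"spam": 0.6, "ham": 0.4}      # model output for one item
train_prior = {"spam": 0.5, "ham": 0.5}    # class frequencies in training data
new_prior = {"spam": 0.1, "ham": 0.9}      # assumed real-world frequencies

adjusted = reweight_posterior(posterior, train_prior, new_prior)
print(adjusted)
```

With these made-up numbers, an item the model leaned towards calling "spam" ends up more likely "ham" once the rarer real-world frequency of "spam" is taken into account.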
...
Some kinds of sampling biases can be somewhat overcome.
Involvement of multiple people smoothes out individual bias.
(Kurzweil's project of stealing a human soul with a neural network is
already being scooped by projects that are stealing statistical models
of many souls.)
I'm afraid that I would need more references here too.
...
Language zone Wikipedias are obviously biased towards the
viewpoint of people in that language zone.  Mostly that's a good
thing,  because a Chinese knowledge base that reflected an Anglophone
bias would seem unnatural to Chinese speakers.
   And that's the point.  Useful systems don't "eliminate bias" but
they are given the bias that they need in order to do their job.
While I agree with you on this point, I must admit that I don't feel at 
ease saying it. I mean, being satisfied with "it does the job" may 
itself already be a cultural bias.
...
Most of these are connected to the larger mass through just a few
categories that would be hard to express as restriction types.
Wikipedia is reasonable to require concepts to have a category because
really,  if you want to assert something exists and can't find some
category that this thing is a member of,  I wouldn't be so sure that
this thing exists.
The concept exists independently of the ontological status of the object 
to which it refers. Take "nothingness"[1]: in fact the English Wikipedia 
article does not give a great definition: "Nothingness is the state of 
being nothing". Well no, nothingness actually refers to nothing, and any 
statement which gives an existential attribute to nothingness is wrong. 
The only correct statements on nothingness are tautologies of 
"nothingness doesn't exist". But the concept of nothingness does exist, 
through the thought which sustains it. The thought exists, but it doesn't 
mean that what the thought is referring to also exists. And as you can 
see with the "Nothing" article, you can find categories for the concept, 
even if no attribute would apply to what the concept denotes.
[1] https://en.wikipedia.org/wiki/Nothing
...
I'm not sure if there is anything I can't do with the current
situation, but bear in mind that I'm going to look at DBpedia,
Wikidata and Freebase facts too and be willing to do data cleaning
processing and hand cleaning of results that I cannot accept.  It's a
tricky and somewhat expensive process (though it's cheaper than
conventional ontology construction),  so cleaner data makes this
process cheaper, quicker, and available to more end users, who can
tailor it to define the categories they need.

Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
-- 
Association Culture-Libre
http://www.culture-libre.org/
