My (very belated) thoughts on this issue:
Wiki content grows in a messy way, and it stays messy until the messiness causes
problems. Once it causes problems, people are motivated to clean it up.
I propose to implement hierarchical search based on very simple, predictable
rules, e.g. by having a configurable list of transitive relationships that get
evaluated to a certain depth. I'd go for subclasses, geographical inclusion, and
subspecies at first.
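
To make the proposal concrete, here is a minimal sketch of what "a configurable list of transitive relationships evaluated to a certain depth" could look like. The configuration names, the depth value and the toy statements are assumptions for illustration only; the property IDs are P279 (subclass of), P131 (located in the administrative territorial entity) and P171 (parent taxon).

```python
from collections import deque

# Hypothetical configuration: properties followed transitively, and how far.
TRANSITIVE_PROPERTIES = ["P279", "P131", "P171"]
MAX_DEPTH = 3

# Toy statements keyed by (subject, property); not real Wikidata content.
STATEMENTS = {
    ("poodle", "P279"): ["dog"],
    ("dog", "P279"): ["mammal"],
    ("mammal", "P279"): ["animal"],
    ("animal", "P279"): ["organism"],
}


def ancestors(item, statements, properties, max_depth):
    """Breadth-first expansion of the configured properties, at most max_depth hops."""
    found = {}  # ancestor -> number of hops from item
    queue = deque([(item, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # predictable cut-off: never look further than max_depth
        for prop in properties:
            for parent in statements.get((current, prop), []):
                if parent != item and parent not in found:
                    found[parent] = depth + 1
                    queue.append((parent, depth + 1))
    return found


def matches(item, query, statements=STATEMENTS):
    """An item matches a hierarchical search if the query is the item or an ancestor."""
    return item == query or query in ancestors(
        item, statements, TRANSITIVE_PROPERTIES, MAX_DEPTH
    )
```

With this toy data, a search for "animal" finds "poodle" (three hops), while a search for "organism" does not, because it lies beyond the configured depth. The rule is simple enough that anyone can predict it and adjust the content accordingly.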
Doing this will NOT produce good results. You would have to implement a lot of
special cases and heuristics to work around dirty data. I say: let it produce
bad results, tell people why the results are bad, and what they can do about it!
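
One way to "tell people why the results are bad" would be to show the chain of statements that produced each hit. The following is only a sketch under assumed names (`explain`, the toy statements); it is not an existing Wikibase feature.

```python
# Toy statements keyed by (subject, property); deliberately contains a dubious link.
STATEMENTS = {
    ("poodle", "P279"): ["dog"],
    ("dog", "P279"): ["pet"],
    ("pet", "P279"): ["property"],
}


def explain(item, target, statements, properties, max_depth):
    """Return the list of (subject, property, object) hops linking item to target.

    Returns None if target is not reachable within max_depth hops.
    """
    paths = {item: []}  # node -> hops taken to reach it
    frontier = [item]
    for _ in range(max_depth):
        next_frontier = []
        for current in frontier:
            for prop in properties:
                for parent in statements.get((current, prop), []):
                    if parent in paths:
                        continue  # already reached by a path that is no longer
                    paths[parent] = paths[current] + [(current, prop, parent)]
                    if parent == target:
                        return paths[parent]
                    next_frontier.append(parent)
        frontier = next_frontier
    return None


for hop in explain("poodle", "property", STATEMENTS, ["P279"], 3) or []:
    print(" -> ".join(hop))
```

If a search for "property" surprisingly returns "poodle", the printed chain points directly at the statements an editor would need to look at.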
The Wikimedia community is AMAZING at making good use of whatever capabilities
the software provides, and at adapting content to make the software produce the
results they want. By providing limited but clearly defined software support for
hierarchical search, we allow the community to optimize the content to work with
that search.
Keeping the rules simple means that other consumers can then follow the same
rules, and the content will work for them as well.
On 29.09.2018 at 19:25, Gerard Meijssen wrote:
There is also the age-old conundrum where some want to enforce their rules for
the good of all because (argument of the day follows).
First of all, Wikidata is very much a child of Wikipedia. It has its own
structures and people have endeavoured to build those same structures in
Wikidata never mind that it is a very different medium and never mind that there
are 280+ Wikipedias that might consider things to be different. The start of
Wikidata was also an auspicious occasion where it was thought to be OK to adopt
an external German authority. That proved to be a disaster and there are still
residues of this awful decision. It did not take long to show the shortcomings
of this scheme and it was replaced by something more sensible.
However, we got something really Wiki and it was all too wild. It did not take
long for me to ask for someone to explain the current structures, and nobody
volunteered. So I did what I do best: I largely ignored the results of the
classes and subclasses. It does not work for me. It works against me, so my
current strategy is to ignore this nonsense and concentrate on including data.
The reason is simple: once data is included, it is easy to slice and dice it
and structure it as we see fit at a later date.
So when our priority becomes to make our data reusable and more open, we should
agree on it. So far we have not, because we choose to fight each other. Some have
ideas, some have invested too much in what we have at this time. When we are to
make our data reusable, we should agree on what exactly we aim to achieve.
Is it to support Commons, or is it to support some external standard that is
academically sound? I would always favour what is practical and easily measured.
I would support Commons first. It has the benefit that it will bring our
communities together around a clear objective. It has the benefit that changes
in the operations of Wikidata support the whole of the Wikimedia universe and,
consequently, financial, technical and operational needs and investments are
easily understood. It also means that all the bureaucracy that has materialised
will be shown to be in the way when it is.
So my question is not whether we are a Wiki; my question is whether we are Wiki
enough and willing to change our ways for our own good.
On Sat, 29 Sep 2018 at 16:38, Thad Guidry <thadguidry(a)gmail.com> wrote:
Wikidata has the ability of crowdsourcing...unfortunately, it is not
It's because Wikidata does not yet provide a voting feature on
statements, where the higher the vote count gets, the more resistance there is
to changing the statement.
But that breaks the notion of a "wiki" for some folks.
And there we circle back to Gerard's age-old question: should Wikidata
really be considered a wiki at all for the benefit of society? Or should
it apply voting/resistance to keep it tidy, factual and less messy?
We have the technology to implement voting/resistance on statements. I
personally would utilize that feature and many others probably would as
well. Crowdsourcing the low-voted facts back to applications like
OpenRefine, or applying the recently sent out Survey vote mechanism for spam
analysis to the low-voted statements, could highlight where things are untidy
and use vote casting to clean them up.
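
As a rough illustration of what voting/resistance on statements might mean, here is a minimal sketch. Everything in it (the `Statement` class, the rule that a change needs more supporters than the current value has votes, the `low_voted` helper) is an assumption made up for this example; Wikidata has no such feature today.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    subject: str
    prop: str
    value: str
    votes: int = 0  # number of editors who confirmed the current value

    def upvote(self):
        self.votes += 1

    def propose_change(self, new_value, supporters):
        """Apply the change only if its supporters outnumber the current votes.

        Assumed rule: the more votes a value has, the more support a change needs.
        """
        if supporters > self.votes:
            self.value = new_value
            self.votes = supporters  # supporters count as votes for the new value
            return True
        return False


def low_voted(statements, threshold):
    """Statements below the threshold: candidates to send out for review."""
    return [s for s in statements if s.votes < threshold]


s = Statement("Q1", "P31", "Q5", votes=3)
print(s.propose_change("Q2", supporters=2))  # rejected: 2 supporters vs 3 votes
print(s.propose_change("Q2", supporters=4))  # accepted: 4 supporters vs 3 votes
```

The `low_voted` list is the part that could be handed to a tool like OpenRefine or to a survey: it shows where nobody has confirmed anything yet.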
"...the burden of proof has to be placed on authority, and it should be
dismantled if that burden cannot be met..."
On Sat, Sep 29, 2018 at 2:49 AM Ettore RIZZA <ettorerizza(a)gmail.com> wrote:
Wikidata's ontology is a mess, and I do not see how it could be
otherwise. While the creation of new properties is controlled, any fool
can decide that a woman <https://www.wikidata.org/wiki/Q467> is no longer
a human or is part of a family. Maybe I'm a fool too? I wanted to remove
the claim that a ship <https://www.wikidata.org/wiki/Q11446> is an
instance of "ship type" because it produces weird circular inferences in
my application; but maybe that makes sense to someone else.
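
For what it is worth, circular inferences of this kind can be spotted mechanically. Below is a minimal sketch of a depth-first cycle check; the edge data is invented for illustration and does not reflect the actual statements on any Wikidata item.

```python
# Invented toy edges (item -> classes it is an instance or subclass of).
EDGES = {
    "ship": ["ship type"],
    "ship type": ["watercraft class"],
    "watercraft class": ["ship"],
    "poodle": ["dog"],
    "dog": [],
}


def find_cycle(start, edges):
    """Depth-first search from start; return the nodes of the first cycle found, or None."""
    path = []        # nodes on the current DFS path, in order
    on_path = set()  # same nodes, for fast membership tests
    visited = set()  # nodes already fully or partly explored

    def visit(node):
        if node in on_path:
            # We came back to a node on the current path: that is the cycle.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for nxt in edges.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        return None

    return visit(start)


print(find_cycle("ship", EDGES))
print(find_cycle("poodle", EDGES))
```

A consuming application could run such a check before inferring anything, and report the offending chain instead of looping.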
There will never be a universal ontology on which everyone agrees. I
wonder (sorry to think aloud) if Wikidata should not rather facilitate
the use of external classifications. Many external ids are knowledge
organization systems (ontologies, thesauri, classifications ...). I dream
of a simple query that could search, in Wikidata, "all elements of the
same class as 'poodle' according to the classification of ImageNet".
Wikidata mailing list