Hi Roger / Alphos,
On Sat, Jul 7, 2018 at 10:01 PM, Alphos OGame <alphos.ogame(a)gmail.com>
wrote:
Traceability of information has nothing to do with who added the
information to Wikidata. Could be an unregistered user, could be a bot,
could be a Wikimedian in residence or even the Pope himself, it doesn't
make a difference in the world.
What matters in traceability of a piece of data is *where* it comes from,
and that piece of metadata is provided through referencing.
That does not belong in licencing.
CC-BY would mean reusers would have to mention individual users (including
vandals, sigh) who took part in compiling datasets on Wikidata, which you
seem to be oblivious to.
When Bing and Google display Wikipedia snippets, they add a Wikipedia
hyperlink, not a list of all contributors.
That satisfies the CC-BY licence, which states explicitly:
1. You may satisfy the conditions in Section 3(a)(1)
<https://creativecommons.org/licenses/by-sa/4.0/legalcode#s3a1> in any
reasonable manner based on the medium, means, and context in which You
Share the Licensed Material. For example, it may be reasonable to satisfy
the conditions by providing a URI or hyperlink to a resource that includes
the required information.
Few people would say that listing every contributor and vandal who has ever
had a hand in a dataset would be "reasonable". This applies even more so in
the case of voice-based digital assistants (imagine ...). So I don't see
this as a real concern.
CC-BY does not help track where a work comes from,
only who took part in making it.
As you say, what matters in traceability is where a piece of data comes
from. Referencing is indeed important there (Wikidata has historically had
a poor track record in that respect).
But you seem to overlook the much simpler fact that end users have to be
able to tell that data they encounter come from Wikidata and not some other
unnamed database. How can they look up a Wikidata reference if the re-user
doesn't even tell them that Wikidata is where the data is from?
Telling end users that data come from Wikidata is important for several
reasons.
It brings new contributors to Wikidata. It provides visibility for
Wikimedia and its volunteer community. And when data is wrong, it enables
people to find out why the data is wrong, and where they have to go to fix
the error. Many eyes make all bugs shallow, but this is only true if people
know where to go to fix a bug.
Telling end users that data come from Wikidata is surely the most
fundamental level of traceability. And a prime reason why Google and Bing
tell users that their snippets come from Wikipedia is that Wikipedia's
licence demands attribution. That's a good thing.
Dataset reusers don't need that kind of
information when all they want is a list of countries whose heads of
state have spouses whose given name starts with a D.
When it comes to reliability, there again, who imported it is not
particularly important, keeping in mind that Wikidata is not the most
user-friendly wiki in the Wikimedia ecosystem, as it is not the simplest
for human minds to grasp, and as such not the most frequently vandalized
(three huzzahs for small favors!).
The reliability of a piece of data stems from the *source* of that piece
of information, which, once again, is indicated by a reference, and from
the credit you give to said source.
That again does not belong in licencing.
CC-BY would mean reusers would know that a crapton of people they have no
idea even exist took part in compiling a dataset they require (instead of
just the dataset and references pertaining to it), which again you seem to
be oblivious to. It doesn't protect reusers against vandalism, it does
however make their dataset a whole lot larger by adding the names of a
whole lot of people they don't know or care about.
It really doesn't matter how you put it, the arguments you've put forward
so far simply don't make any kind of sense against CC0 or for CC-BY.
If however you do insist on knowing which user added a particular piece of
data (or reference/metadata, for that matter) to an item, Wikidata keeps an
edit history just in case. It is not necessary for the licence currently in
effect on Wikidata (which, need I remind you, is still CC0), but it is
there nonetheless should you need it.
Now, do we need to keep this needlessly long and tedious thread alive under
another name, or could we please drop it and carry on with our lives?
Roger / Alphos
Maybe for a prophecy or something? Well, if anyone needs it, it's
hopelessly simple, so here goes: https://tinyurl.com/yb6dh3r6
2018-07-07 17:59 GMT+02:00 mathieu lovato stumpf guntz <
I agree it is a misconception that a copyright license makes any direct
change to data reliability. But an attribution requirement does somewhat
indirectly have an impact on it, as it legally enforces traceability. That
is why I strongly disagree with the following assertion: "a license that
requires BY sucks so hard for data [because] attribution requirements
grow very quickly". To my mind it is equivalent to saying that we will
throw away traceability because it is subjectively judged too large a
burden, without providing any start of evidence that it indeed can't be
managed, at least with Wikimedia's current resources.
Now, I'm not saying traceability is the sole factor one should take into
account in data reliability, but certainly it is one of them. Maybe we
should first come up with clear criteria to put into an equation that
enables us to calculate the reliability of information. Since it's among
the core goals of the Wikimedia strategy, it would certainly be worth the
effort to establish metrics about the reliability of the information the
movement is spreading.
On 04/07/2018 at 13:00, Andra Waagmeester wrote:
> I agree with Maarten, and to add to that: it is a huge misconception that
> CC0 makes data unreliable. It is only a legal statement about
> nothing more, nothing less. Statements without proper references and
> qualifiers make data unreliable, but Wikidata has a decent mechanism to
> capture that needed provenance.
> On Wed, Jul 4, 2018 at 12:50 PM, Maarten Dammers <maarten(a)mdammers.nl>
> wrote:
> Hi Mathieu,
> On 04-07-18 11:07, mathieu stumpf guntz wrote:
> On 19/05/2018 at 03:35, Denny Vrandečić wrote:
> Regarding attribution, commonly it is assumed that you
> have to respect it transitively. That is one of the
> reasons a license that requires BY sucks so hard for data:
> unlike with text, the attribution requirements grow very
> quickly. It is the same as with modified images and
> collages: it is not sufficient to attribute the last
> author, but all contributors have to be attributed.
> If we want our data to be trustworthy, then we need
> traceability. That means reporting this chain of sources as
> extensively as possible, whether or not the license requires it
> as attribution. CC-0 allows this traceability to be broken, which
> makes it an awful license for whoever is concerned with obtaining
> reliable data.
> A license is not the way to achieve this. We have references for
> that.
> This is why I think that whoever wants to be part of a
> large federation of data on the web should publish under
> CC0.
> As long as one aims at making a federation of untrustworthy data
> banks, that's perfect. ;)
> So I see you started forum shopping (trying to get the Wikimedia-l
> people in) and making contentious, trying-to-be-funny remarks.
> That's usually a good indication a thread is going nowhere.
> No, Wikidata is not going to change the CC0. You seem to be the
> only person wanting that, and trying to discredit Wikidata will not
> help you in your crusade. I suggest that people who are still
> interested in this go to
> <https://phabricator.wikimedia.org/T193728> and make useful
> comments over there.
Wikidata mailing list
Wikimedia-l mailing list, guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/
New messages to: Wikimedia-l(a)lists.wikimedia.org