The only infoboxes I have touched on Wikipedia in relation to Wikidata are
the ones I created with data from Wikidata with PrepBio and not the other
way around. As far as I know there is no tool available to import Wikidata
statements from Wikipedia infoboxes. This is why it took so long to get rid
of the persondata infoboxes, because the data was not formatted in a way
that was easily importable into Wikidata. Eventually the persondata was
deleted because the birth/death data was updated in Wikidata, albeit in a
different way. Unfortunately we lost all of the alternate spellings that
could have been added to the aliases on Wikidata, but I was delighted that
Maarten Dammers was able to upload aliases for artists into Wikidata last
week from ULAN, which means we now have way more aliases per artist
available for searching than we ever had on Wikipedia.
On Fri, Dec 18, 2015 at 9:24 AM, Peter Southwood <
peter.southwood(a)telkomsa.net> wrote:
Wikipedia is not about infoboxes, they are (and are intended to be) a
small to very small part of the article in most cases. Similarly,
Wikipedias are not databases, so also without being a lawyer, I think your
interpretation is wrong.
Cheers,
Peter
-----Original Message-----
From: Wikimedia-l [mailto:wikimedia-l-bounces@lists.wikimedia.org] On
Behalf Of Andreas Kolbe
Sent: Friday, 18 December 2015 10:06 AM
To: Wikimedia Mailing List
Subject: Re: [Wikimedia-l] Quality issues
Gerard,
Of course you can't license or copyright facts, but as the WMF legal
team's page on this topic[1] outlines, there are database and compilation
rights that exist independently of copyright. IANAL, but as I read that
page, if you simply go ahead and copy all the infobox, template etc.
content from a Wikipedia, this "would likely be a violation" even under US
law (not to mention EU law).
I don't know why Wikipedia was set up with a CC BY-SA licence rather than a
CC0 licence, and the attribution required under CC BY-SA is unduly
cumbersome, but attribution has always seemed to me like a useful concept.
The fact that people like VDM Publishing who sell Wikipedia articles as
books are required to say that their material comes from Wikipedia is
useful, for example.
Naturally it fosters re-use if you make Wikidata CC0, but that's precisely
the point: you end up with a level of "market dominance" that just ain't
healthy.
[1]
https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights
On Thu, Dec 17, 2015 at 2:33 PM, Gerard Meijssen <
gerard.meijssen(a)gmail.com>
wrote:
Hoi,
Andreas, the law is an arse. However the law has it that you cannot
license facts. When in distributed processes data is retrieved from
Wikipedia, it is the authors who may contest their rights. There is no
such thing as collective rights for Wikipedia, all Wikipedias.
You may not like this and that is fine.
DBpedia has its license in the current way NOT because they care about
the license but because they are not interested in a row with
Wikipedians on the subject. They are quite happy to share their data
with Wikidata and to release data retrieved in their processes under CC-0.
Thanks,
GerardM
On 17 December 2015 at 15:17, Andreas Kolbe <jayen466(a)gmail.com> wrote:
On Wed, Dec 16, 2015 at 11:12 AM, Andrea Zanni <zanni.andrea84(a)gmail.com>
wrote:
> On Sun, Dec 13, 2015 at 9:35 PM, Jane Darnell <jane023(a)gmail.com> wrote:
>
> > Andrea,
> > I totally agree on the mission/vision thing, but am not sure what you
> > mean exactly by scale - do you mean that Wikidata shouldn't try to be
> > so granular that it has a statement to cover each factoid in any
> > Wikipedia article, or do you mean we need to talk about what
> > constitutes notability in order not to grow Wikidata exponentially to
> > the point the servers crash?
> > Jane
> >
>
> Hi Jane, I explained myself poorly (sometimes English is too difficult :-)
>
> What I mean is that the scale of the error *could* be of another
> scale, another order of magnitude.
> The propagation of the error is multiplied, it's not just a single error
> on a wikipage: it's an error propagated in many wikipages, and then
> Google, etc.
> A single point of failure.
Exactly: a single point of failure. A system where a single point of
failure can have such consequences, potentially corrupting knowledge
forever, is a bad system. It's not robust.
In the op-ed, I mentioned the Brazilian aardvark hoax[1] as an
example of error propagation (which happened entirely without
Wikidata's and the Knowledge Graph's help). It took the New Yorker
quite a bit of research to piece together and confirm what happened,
research which I understand would not have happened if the originator
of the hoax had not been willing to talk about his prank.
It was the same with the fake Maurice Jarre quotes in Wikipedia[2]
that made their way into mainstream press obituaries a few years
ago. If the hoaxer had not come forward, no one would have been the
wiser. The fake quotes would have remained a permanent part of the
historical record.
More recent cases include the widely repeated (including by
Associated Press, for God's sake, to this day) claim that Joe
Streater was involved in the Boston College basketball point shaving
scandal[3] and the Amelia Bedelia hoax.[4]
If even things people insert as a joke propagate around the globe as
a result of this vulnerability, then there is a clear and present
potential for purposeful manipulation. We've seen enough cases of
that, too.[5]
This is not the sort of system the Wikimedia community should be
helping to build. The very values at the heart of the Wikimedia
movement are about transparency, accountability, multiple points of
view, pluralism, democracy, opposing dominance and control by vested
interests, and so forth.
What is the way forward?
Wikidata should, as a matter of urgency, rescind its decision to
make its content available under the CC0 licence. Global propagation
without attribution is a terrible idea.
Quite apart from that, in my opinion Wikidata's CC0 licensing also
infringes Wikipedia contributors' rights as enshrined in Wikipedia's
CC BY-SA licence, a point Lydia Pintscher did not even contest on
the Signpost talk page. As I understand her response,[6] she
restricts herself to asserting that the responsibility for any
potential licence infringement lies with Wikidata contributors rather
than with her and Wikimedia Deutschland. That's passing the buck.
If Wikidata is not prepared to follow CC BY-SA, the way DBpedia
does[7], the next step should be a DMCA takedown notice for material
mass-imported from Wikipedia.
And of course, Wikidata needs to step up its efforts to cite
verifiable sources.
[1]
http://www.newyorker.com/tech/elements/how-a-raccoon-became-an-aardvark
[2]
http://www.theguardian.com/commentisfree/2009/may/04/journalism-obituaries-shane-fitzgerald
[3]
http://awfulannouncing.com/2014/guilt-wikipedia-joe-streater-became-falsely-attached-boston-college-point-shaving-scandal.html
[5]
http://www.newsweek.com/2015/04/03/manipulating-wikipedia-promote-bogus-business-school-316133.html
and
http://www.dailydot.com/lifestyle/wikipedia-plastic-surgery-otto-placik-labiaplasty/
and many others
[6]
https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Wikipedia_Signpost/2015-12-09/Op-ed&diff=695228403&oldid=695228022
[7]
http://wiki.dbpedia.org/terms-imprint
> Of course, the opposite is also true: it's a single point of
> openness, correction, information.
> I was just wondering if this different scale is a factor in making
> Wikipedia and Wikidata different enough to accept/reject Andreas'
> arguments.
>
> Andrea
> > On Sun, Dec 13, 2015 at 7:10 PM, Andrea Zanni <zanni.andrea84(a)gmail.com>
> > wrote:
> >
> > > I really feel we are drowning in a glass of water.
> > > The issue of "data quality" or "reliability" that Andreas raises is
> > > well known:
> > > what I don't understand is if the "scale" of it is much bigger on
> > > Wikidata than Wikipedia,
> > > and if this different scale makes it much more important. The scale
> > > of the issue is maybe something worth discussing, and not the issue
> > > itself? Is the fact that Wikidata is centralised different from
> > > statements on Wikipedia? I don't know, but to me this is a more
> > > neutral and interesting question.
> > >
> > > I often say that the Wikimedia world made quality a "Heisenbergian"
> > > feature: you always have to check if it's there.
> > > The point is: it has always been like this.
> > > We always had to check for quality, even when we used Britannica or
> > > authority controls or whatever "reliable" sources we wanted.
> > > Wikipedia, and now Wikidata, is made for everyone to contribute, it's
> > > open and honest in being open, vulnerable, prone to errors. But we are
> > > transparent, we say that in advance, we can claim any statement to the
> > > smallest detail. Of course it's difficult, but we can do it. Wikidata,
> > > as Lydia said, can actually have conflicting statements in every item:
> > > we "just" have to put them there, as we did to Wikipedia.
> > >
> > > If Google uses our data and they are wrong, that's bad for them. If
> > > they correct the errors and do not give us the corrections, that's bad
> > > for us and unethical of them. The point is: there is no license (as
> > > far as I know) that can force them to contribute to Wikidata. That is,
> > > IMHO, the problem with "over-the-top" actors: they can harness
> > > collective intelligence and "not give back." Even with CC-BY-SA, they
> > > could store (as they are probably already doing) all the data in their
> > > knowledge vault, which is secret as it is an incredible asset for them.
> > >
> > > I'd be happy to insert a new clause of "forced transparency" in
> > > CC-BY-SA or CC0, but it's not there.
> > >
> > > So, as we are working via GLAMs with Wikipedia for getting reliable
> > > sources and content, we are working with them also for good statements
> > > and data. Putting good data in Wikidata makes it better, and I don't
> > > understand what is the problem here (I understand, again, the issue of
> > > putting too much data and still having a small community).
> > > For example: if we are importing different reliable databases, and the
> > > institutions behind them find it useful and helpful to have an
> > > aggregator of identifiers and authority controls, what is the issue?
> > > There is value in aggregating data, because you can spot errors and
> > > inconsistencies. It's not easy, of course, to find a good workflow,
> > > but, again, that is *another* problem.
> > >
> > > So, in conclusion: I find many issues in Wikidata, but not on the
> > > mission/vision, just in the complexity of the project, the size of
> > > the dataset, the size of the community.
> > >
> > > Can we talk about those?
> > >
> > > Aubrey
> > >
> > > On Sun, Dec 13, 2015 at 6:40 PM, Andreas Kolbe <jayen466(a)gmail.com>
> > > wrote:
> > >
> > > > On Sun, Dec 13, 2015 at 5:32 PM, geni <geniice(a)gmail.com> wrote:
> > > >
> > > > > On 13 December 2015 at 15:57, Andreas Kolbe <jayen466(a)gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Jane,
> > > > > >
> > > > > > The issue is that you can't cite one Wikipedia article as a
> > > > > > source in another.
> > > > > >
> > > > >
> > > > > However you can within the same article per [[WP:LEAD]].
> > > > >
> > > >
> > > > Well, of course, if there are reliable sources cited in the body of
> > > > the article that back up the statements made in the lead. You still
> > > > need to cite a reliable source though; that's Wikipedia 101.
_______________________________________________
Wikimedia-l mailing list, guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l(a)lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
<mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>