Automatically copying over infoboxes is something I don't advise. Unlike current
infoboxes, which are rarely sourced, every point of data on Wikidata should be DIRECTLY
and INDIVIDUALLY sourced. We can use the same source 37 times, but each bit of information
that would ordinarily have a field on an infobox needs to have its own source; we
can't just say "everything on this page is from XXXX". If we do automatic
importing, it's going to be an uphill battle from day one to source things.
On Nov 15, 2012, at 4:50 PM, Gregor Hagedorn <g.m.hagedorn(a)gmail.com> wrote:
If the data is actually copyrightable, then yes. Facts as such are not
copyrightable. But if there was a bot transferring stuff from infoboxes, it
should at least check for any actual text (e.g. long values with spaces), and
not transfer it, because of license reasons.
I agree. Just to clarify what "actual text" should mean: Although a
short sentence with several words may occasionally be a copyrightable
text (e.g. a poem), it is very rarely so. In Wikipedia infoboxes, due
to scope, purpose and style, this can almost be ruled out.
It is not desirable to exclude brief scope notes or source notes,
which occasionally occur in Wikipedia infoboxes, just because they
contain several words. I personally would recommend doing an extraction
dry run and manually checking the parameters that have more than perhaps
12-15 words, to see whether they are creative (= copyrightable) or plain
expressions of fact or sources (= not copyrightable).
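
As a rough illustration of such a dry run, here is a minimal Python sketch.
It assumes the infobox parameters have already been extracted into a plain
dict of name -> value strings; the function name, the sample values and the
default threshold of 12 words are all hypothetical and only there to show the
idea. Nothing is transferred, the long values are just listed for a human to
look at.

```python
def flag_long_values(params, max_words=12):
    """Return {parameter name: word count} for values longer than max_words.

    This only flags candidates for manual review; it does not decide
    whether a value is creative (copyrightable) or a plain fact.
    """
    flagged = {}
    for name, value in params.items():
        word_count = len(value.split())  # split on any whitespace
        if word_count > max_words:
            flagged[name] = word_count
    return flagged


# Hypothetical, already-extracted infobox parameters (not real data).
sample = {
    "population": "3,500,000",
    "area_note": "includes inland water",
    "description": (
        "A quiet town whose winding streets have long inspired painters "
        "and poets to linger well past the end of summer"
    ),
}

for name, count in flag_long_values(sample).items():
    print(f"review manually: {name} ({count} words)")
```

Short scope notes such as "includes inland water" pass through untouched,
and only the long value is held back for a manual check.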
Gregor
_______________________________________________
Wikidata-l mailing list
Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l