I see no benefit in removing the old IDs. In fact, it would make things more
difficult for me and for others who maintain older databases. I would like to
keep the old IDs in Wikidata for posterity and provenance. Some of us still
have really old databases with cruft and old IDs from years and years ago,
some from the start of the Internet :) If you remove the old IDs, it will be
that much harder for me to reconcile some of them.
+1. Being able to query the Wikidata API with an older ID and have it tell me
that the ID is old and that there is now a preferred one would be fantastic.
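To make the wish concrete, here is a minimal sketch of such a lookup. It runs against an in-memory list, not the real Wikidata API. The `Statement` class, the `resolve` function, and the sample data are all hypothetical; only the rank names (preferred, normal, deprecated) follow Wikidata's statement ranks. The two GeoNames ids are the ones quoted for Q46843 further down, and marking one of them deprecated is purely an assumption for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    item: str   # Wikidata item, e.g. "Q46843"
    value: str  # external identifier value
    rank: str   # "preferred", "normal" or "deprecated"


# Hypothetical data: assumes the external site merged 2954602 into 7602447.
STATEMENTS = [
    Statement("Q46843", "7602447", "preferred"),
    Statement("Q46843", "2954602", "deprecated"),
]


def resolve(external_id: str,
            statements: List[Statement] = STATEMENTS) -> Optional[dict]:
    """Find the item carrying external_id and report which id to use now."""
    match = next((s for s in statements if s.value == external_id), None)
    if match is None:
        return None  # id is unknown, old or new
    siblings = [s for s in statements if s.item == match.item]
    # Fall back to the queried id if no sibling is marked preferred.
    preferred = next(
        (s.value for s in siblings if s.rank == "preferred"), match.value
    )
    return {
        "item": match.item,
        "is_old": match.rank == "deprecated",
        "preferred_id": preferred,
    }
```

With the sample data, `resolve("2954602")` still finds Q46843, flags the id as old, and points to 7602447. That only works because the old id was kept.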
Thad
+ThadGuidry <https://www.google.com/+ThadGuidry>
On Thu, Oct 1, 2015 at 3:19 AM, Markus Krötzsch <markus(a)semantic-mediawiki.org> wrote:
On 01.10.2015 00:58, Ricordisamoa wrote:
I think Tom is referring to external identifiers such as MusicBrainz artist ID
<https://www.wikidata.org/wiki/Property:P434> etc., and whether Wikidata items
should show all of them or only the 'preferred' ones, as we did for VIAF redirects
<https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/SamoaBo…>.
There are also other cases where external sites have duplicates that are
not reconciled (yet). For example, Q46843 has multiple GeoNames Ids:
http://sws.geonames.org/7602447
http://sws.geonames.org/2954602
The second was suggested by Freebase; the first is what Wikipedia had. I
think the first is better (polygon rather than bounding box), so I marked
it as preferred. This is a situation where we should keep multiple
identifiers, since the external database really has two ids that are not
integrated yet.
Now if the external site reconciles the ids, we have these options:
(1) Keep everything as is (one main id marked as "preferred")
(2) Make the redirect ids deprecated on Wikidata (show people that we are
aware of the ids but they should not be used)
(3) Delete the redirect ids
I think (2) would be cleanest, since it keeps unaware users from re-adding
the old ids. (3) would also be ok once the old id is no longer in
circulation.
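Options (2) and (3) can be sketched as two small functions over a plain list of statements. This is only an illustration under stated assumptions: the function names, the dict layout, and the `merged` redirect map are hypothetical, and the example pretends that GeoNames has redirected 2954602 to 7602447, which has not happened in the case above.

```python
def deprecate_redirected(statements, redirects):
    """Option (2): keep redirected ids, but rank them 'deprecated'."""
    targets = set(redirects.values())
    updated = []
    for st in statements:
        if st["value"] in redirects:
            rank = "deprecated"   # old id: known, but should not be used
        elif st["value"] in targets:
            rank = "preferred"    # id the external site now redirects to
        else:
            rank = st["rank"]     # untouched by the merge
        updated.append({**st, "rank": rank})
    return updated


def delete_redirected(statements, redirects):
    """Option (3): drop redirected ids entirely."""
    return [st for st in statements if st["value"] not in redirects]


# Hypothetical state before the external site reconciles its ids.
before = [
    {"value": "7602447", "rank": "preferred"},
    {"value": "2954602", "rank": "normal"},
]
merged = {"2954602": "7602447"}  # assumed redirect: old id -> surviving id
```

After option (2) the old id is still on the item, so a later import can see it was handled on purpose. After option (3) nothing records that 2954602 was ever known, which is what makes re-adding it likely.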
Is there any benefit in removing old ids completely? I guess constraint
reports will work better (but maybe constraint reports should not count
deprecated statements in single-value constraints ...). Other than this, I
don't see a big reason to spend time on removing some ids. It's not wrong
to claim that these are ids, just slightly redundant, and the old ids might
still be useful for integrating with web sources that were not updated when
the redirect happened.
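The constraint-report idea in parentheses above could look roughly like this. It is a sketch, not how the actual constraint reports are implemented: the function name, the `count_deprecated` flag, and the sample statements are hypothetical.

```python
def single_value_violation(statements, count_deprecated=False):
    """Return True if more than one distinct value counts for the property.

    With count_deprecated=False, statements ranked 'deprecated' are skipped,
    so an item that keeps an old redirected id is not reported.
    """
    counted = {
        st["value"]
        for st in statements
        if count_deprecated or st["rank"] != "deprecated"
    }
    return len(counted) > 1


# Hypothetical item: one current id plus one old id kept as deprecated.
reconciled = [
    {"value": "7602447", "rank": "preferred"},
    {"value": "2954602", "rank": "deprecated"},
]
# Hypothetical item: two ids the external site has not merged yet.
unreconciled = [
    {"value": "7602447", "rank": "preferred"},
    {"value": "2954602", "rank": "normal"},
]
```

Under these assumptions the reconciled item passes the check while the unreconciled one is still reported, so the report keeps pointing at real duplicates in the external database.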
Markus