Re: [Wikidata] Multiple properties/identifiers for the same resource

29 Apr 2016

Could I be so bold as to suggest that in Wikidata we should strive
to use external URIs for identifiers, not strings?

For example, Wikidata holds a lot of UniProt accessions,
e.g. behind the property https://www.wikidata.org/wiki/P352,
and there is a formatter that turns each accession into a URL.

I think this is the wrong way round: there should be a URL/URI there
and a formatter to generate a local string for display purposes.
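
To make the two directions concrete, here is a minimal Python sketch.
The UniProt URI prefix and both function names are assumptions chosen
for illustration; they are not anything Wikidata itself defines.

```python
# Assumed URI prefix for UniProt entries; illustrative only.
UNIPROT_PREFIX = "http://purl.uniprot.org/uniprot/"


def format_url(accession: str) -> str:
    """Current direction: store the string, format a URL from it."""
    return UNIPROT_PREFIX + accession


def display_string(uri: str) -> str:
    """Suggested direction: store the URI, derive a local display string."""
    if not uri.startswith(UNIPROT_PREFIX):
        raise ValueError(f"not a UniProt URI: {uri}")
    return uri[len(UNIPROT_PREFIX):]


stored_uri = "http://purl.uniprot.org/uniprot/P12345"
label = display_string(stored_uri)  # local string shown to the user
```

Either direction can recover the other here; the point is only which
of the two forms is the stored value and which is derived.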

And of course for ChEMBL the URL/URI to use would be

    <http://rdf.ebi.ac.uk/resource/chembl/molecule/CHEMBL101690>

There are two advantages to this. The first is that it allows easier
federated queries from the source databases into Wikidata (no URI
conversions etc.). The second is that these URIs are clearly not ambiguous.
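
A rough sketch of the first point, written as two SPARQL query strings
held in Python. Both are meant to run at a source database's endpoint
and reach into Wikidata via SERVICE. The property ex:uniprotUri is
hypothetical (no such URI-valued property is claimed to exist), and the
UniProt prefix and class URI are assumptions for illustration.

```python
# Public Wikidata SPARQL endpoint, used as the federated SERVICE target.
WIKIDATA = "https://query.wikidata.org/sparql"

# String identifiers: the source URI has to be converted to a string
# before it can be joined against the value of P352.
QUERY_WITH_STRINGS = f"""
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?protein ?item WHERE {{
  ?protein a <http://purl.uniprot.org/core/Protein> .
  BIND(STRAFTER(STR(?protein), "http://purl.uniprot.org/uniprot/") AS ?accession)
  SERVICE <{WIKIDATA}> {{ ?item wdt:P352 ?accession . }}
}} LIMIT 10
"""

# URI identifiers: the join is direct, with no conversion step.
# ex:uniprotUri is a hypothetical property, used only to show the shape.
QUERY_WITH_URIS = f"""
PREFIX ex: <http://example.org/>
SELECT ?protein ?item WHERE {{
  ?protein a <http://purl.uniprot.org/core/Protein> .
  SERVICE <{WIKIDATA}> {{ ?item ex:uniprotUri ?protein . }}
}} LIMIT 10
"""
```

The only difference between the two is the BIND/STRAFTER line, which is
the "URI conversion" that storing URIs would make unnecessary.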

Regards,
Jerven

On 28/04/16 23:49, Julie McMurry wrote:
...
 "One should also point out to the authorities maintaining these IDs
 that they should spend some effort on producing a workable solution for
 this. It seems they should be the first to provide a resolver service
 (or maybe it would be an "ID search engine" if it is so complicated).

 With the qualifiers in place, Wikidata can also be used to achieve this,
 of course, but it seems we are just manually reverse engineering
 something that should be done at the site of whoever is controlling the
 ID registration."

 Well said, Markus. I heartily agree, and it is a point that colleagues
 and I have been trying to raise awareness of for a long time now
 (http://bit.ly/id-guidance). One of the challenges is that databases
 are already being asked to do more with less. They can see the utility
 of such a service to others, but when I've asked DBs before (not naming
 names), traction has been limp (I've yet to ask Chembl). Sometimes it
 works out though. For instance, KEGG used to have 12 different
 type-specific URLs, corresponding to:

 kegg.compound
 kegg.disease
 kegg.drug
 kegg.environ
 kegg.genes
 kegg.genome
 kegg.glycan
 kegg.metagenome
 kegg.module
 kegg.orthology
 kegg.pathway
 kegg.reaction

 Thankfully, they've collapsed those to a single URL pattern.

 The databases that find it the toughest are not those who simply don't
 embed typing, but rather those that don't embed typing AND ALSO have
 local identifiers that would otherwise collide. For instance, a
 prominent bio database is in this boat (not naming names) and would like
 to make things better but it is hard and messy due to the collisions.

 FYI, 345 of the 560+ records in the identifiers.org
 <http://identifiers.org> corpus are type-specific at the level of
 identifiers.org's namespace; these roll up to ~300 providers.

 The question though is what Wikidata is trying to accomplish. Say you
 encounter the ChEMBL ID CHEMBL308052
 <http://linkedchemistry.info/chembl/chemblid/CHEMBL308052>: do you need
 to retrieve the type of the entity for reasons other than determining
 what URL to use?

 How are you representing entity labels / IDs to users?

 Best,
 Julie

 _______________________________________________
 Wikidata mailing list
 Wikidata(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata

-- 
-------------------------------------------------------------------
Jerven Bolleman                        Jerven.Bolleman(a)sib.swiss
SIB Swiss Institute of Bioinformatics  Tel: +41 (0)22 379 58 85
CMU, rue Michel Servet 1               Fax: +41 (0)22 379 58 58
1211 Geneve 4,
Switzerland     www.sib.swiss - www.uniprot.org
Follow us at https://twitter.com/#!/uniprot
-------------------------------------------------------------------
