The Gene Wiki team is experiencing a problem that may suggest some areas
for improvement in the general Wikidata experience.
When our project was getting started, we had some fairly long public
debates about how we should structure the data we wanted to load [1].
These resulted in a data model that, we think, remains pretty much true to
the semantics of the data, at the cost of distributing information about
closely related things (genes, proteins, orthologs) across multiple,
interlinked items. Now, as long as these semantic links between the
different item classes are maintained, this is working out great. However,
we are consistently seeing people merging items that our model needs to be
distinct. Most commonly, we see people merging items about genes with
items about the protein product of the gene (e.g. [2]). This happens
nearly every day - especially on items related to the more popular
Wikipedia articles. (More examples [3])
Merges like this, as well as other semantics-breaking edits, make it very
challenging to build downstream apps (like the Wikipedia infobox) that
depend on having certain structures in place. My question to the list is:
how can we best protect semantic models that span multiple entity types in
Wikidata? Relatedly, is there an opportunity for some consistent way
of explaining these structures to the community when they exist?
I guess the immediate solutions are (1) to write another bot that watches
for model-breaking edits and reverts them and (2) to create an article on
Wikidata somewhere that succinctly explains the model and links back to the
discussions that went into its creation.
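
To make option (1) a bit more concrete, here is a rough sketch of the check
such a bot might run. It is only an illustration, not our actual bot: the
function names are made up, it assumes each item's classes have already been
fetched into a plain set of QIDs, and it assumes Q7187 (gene) and Q8054
(protein) are the class items the model keeps apart. Watching recent changes
and doing the revert are left out entirely.

```python
from typing import FrozenSet, List, Set

# Assumption: these QIDs stand for "gene" and "protein" in the model.
GENE = "Q7187"
PROTEIN = "Q8054"

# Hypothetical rule table: each entry is a pair of classes that the
# model requires to live on separate, interlinked items.
MUST_STAY_DISTINCT: List[FrozenSet[str]] = [frozenset({GENE, PROTEIN})]


def conflated_pairs(item_classes: Set[str]) -> List[FrozenSet[str]]:
    """Return every rule pair whose classes all appear on one item."""
    return [pair for pair in MUST_STAY_DISTINCT if pair <= item_classes]


def merge_breaks_model(classes_a: Set[str], classes_b: Set[str]) -> bool:
    """Return True if merging two items would put classes that the
    model keeps apart onto a single item."""
    return bool(conflated_pairs(classes_a | classes_b))


if __name__ == "__main__":
    # A gene item merged with its protein product: should be flagged.
    print(merge_breaks_model({GENE}, {PROTEIN}))
    # Two duplicate gene items: a legitimate merge.
    print(merge_breaks_model({GENE}, {GENE}))
```

Keeping the rules in a table like MUST_STAY_DISTINCT means other projects
whose models span several entity types could reuse the same check with their
own class pairs.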
It seems that anyone who works beyond a single entity type is going to
face the same kind of problems, so I'm posting this here in hopes that
generalizable patterns (and perhaps even supporting code) can be realized
by this community.
[1]
https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Molecular_biology#D…
[2]
https://www.wikidata.org/w/index.php?title=Q417782&oldid=262745370
[3]
https://s3.amazonaws.com/uploads.hipchat.com/25885/699742/rTrv5VgLm5yQg6z/m…