Dear Thad,
The second part of your email has good points in it, too. As you say, one must allow for adjustments in the intended meaning of a property in real life, and adjusting too much could be dangerous. The method you suggest (creating a new property and deprecating the old one, rather than modifying the meaning of the old one) has been used in Wikidata as well. Another very common practice in Wikidata is the community-controlled mass edit: users offer to write conversion tools and the community decides how to use them. This can help to convert large amounts of data to a new format or to detect and cross-check errors. The constraint mechanism you see today is also a community-created way of avoiding unintended uses (whether caused by changed definitions or by something else).
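To make the "new property plus deprecation" approach a bit more concrete, here is a minimal sketch in Python. It uses a toy in-memory model, not the Wikidata or Freebase API: the classes `Property` and `Statement`, the function `split_property`, and the IDs `P_OLD` and `P_NEW` are all made up for illustration. The `belongs_to_new` callback stands in for whatever rule the community agrees on for deciding which statements move.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Property:
    pid: str
    label: str
    deprecated: bool = False
    replaced_by: Optional[str] = None  # points readers at the successor


@dataclass
class Statement:
    subject: str
    pid: str
    value: str


def split_property(
    properties: Dict[str, Property],
    statements: List[Statement],
    old_pid: str,
    new_prop: Property,
    belongs_to_new: Callable[[Statement], bool],
) -> int:
    """Register new_prop, deprecate old_pid, and move matching statements.

    Returns the number of statements that were moved.
    """
    properties[new_prop.pid] = new_prop
    old = properties[old_pid]
    old.deprecated = True
    old.replaced_by = new_prop.pid

    moved = 0
    for statement in statements:
        if statement.pid == old_pid and belongs_to_new(statement):
            statement.pid = new_prop.pid
            moved += 1
    return moved


# Hypothetical example data.
properties = {"P_OLD": Property("P_OLD", "country")}
statements = [
    Statement("band A", "P_OLD", "origin:DE"),
    Statement("band B", "P_OLD", "residence:FR"),
]

moved = split_property(
    properties,
    statements,
    "P_OLD",
    Property("P_NEW", "country of residence"),
    belongs_to_new=lambda s: s.value.startswith("residence:"),
)
```

Statements that the rule does not match stay on the deprecated property, so they remain visible for manual review instead of being silently reinterpreted.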
Finally, some basic things that you probably know already:
* Property datatypes in Wikidata cannot be changed after creation (ensuring that the data always remains structurally valid for this type). Technically speaking, this is the only "schema" we have. You mentioned SQL, and things are similar there: the database schema does not include a definition of what you mean by the country column in the band table; at best it tells you that the country should be a key in the countries table.
* The experience with Wikipedia has led to many mechanisms for counteracting spam and vandalism. Patrolling, (semi)protection of pages, spam-fighting robots and scripts, etc. can similarly be used on Wikidata to fight against deliberate attacks on high-visibility pages such as properties. In a sense, structured data makes it much easier to detect problems with automated tools. This should be effective against most deliberate attempts to cause trouble by changing property labels etc. (but we will need to do more there, I guess).
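The SQL comparison in the first point above can be tried out directly. The following is only a sketch using Python's built-in `sqlite3` module. The table and column names follow the example in the text, while the sample rows and the helper `try_insert` are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE countries (code TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE band (name TEXT, country TEXT REFERENCES countries(code))"
)
conn.execute("INSERT INTO countries VALUES ('DE')")


def try_insert(name: str, country: str) -> bool:
    """Return True if the row was accepted, False if the schema rejected it."""
    try:
        conn.execute("INSERT INTO band VALUES (?, ?)", (name, country))
        return True
    except sqlite3.IntegrityError:
        return False


# Accepted: 'DE' is a key in countries. The schema cannot tell whether we
# meant country of origin, of residence, or something else entirely.
accepted = try_insert("Band A", "DE")

# Rejected: 'XX' is not a key in countries. This structural check is all
# the schema gives us.
rejected = not try_insert("Band B", "XX")
```

The database checks structure only; what "country" means for a band is a convention that lives outside the schema, just as a property's intended meaning does in Wikidata.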
Cheers,
Markus
On 08.01.2015 20:37, Thad Guidry wrote: ...
And that is my worry. That the Schema model is publicly editable at any time.
...
We have the same problem in Freebase, where if by public agreement, we change the meaning of a Property so much that it might cause erroneous data statements, then we deprecate that Property and create a new one, splitting off the various statements into their proper form and letting the Community know, and also performing the data tasks to subscribe the old data to the new Schema.
The pollution of data would happen if by agreement P17's Discussion page drastically changed the intended meaning of it, then all the data that used P17 would need to be cleaned up.
How does Wikidata intend to deal with those kinds of changes to Property meanings in the future, and the data cleanup involved?
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l