Aubrey,
Thanks for the explanation and enlightening examples.
Here are my two cents’ worth.

Having an open community define the semantics of cross-domain items is both powerful and challenging.
To mention one problem, I have come across several statements where ‘instance of’ has been confused with ‘about’, which is harmful when you want to query for a type.
Perhaps manual curation or bots can help out more?
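
To make the problem concrete, here is a small Python sketch. It is only a toy under my own assumptions: the item names are made up, and I am taking "about" to mean the "main subject" property (P921), with "instance of" being P31.

```python
# Toy statements as (item, property, value) triples.
# Item names are invented for illustration only.
INSTANCE_OF = "P31"    # Wikidata "instance of"
MAIN_SUBJECT = "P921"  # Wikidata "main subject" (assumed to be what "about" means here)

statements = [
    ("Moby-Dick", INSTANCE_OF, "novel"),
    ("a paper on whaling", MAIN_SUBJECT, "whale"),  # fine: the paper is about whales
    ("a paper on whales", INSTANCE_OF, "whale"),    # confused: the paper is not a whale
    ("Moby Dick (the whale)", INSTANCE_OF, "whale"),
]

def items_of_type(triples, type_name):
    """Return every item that is stated to be an instance of type_name."""
    return sorted(s for s, p, o in triples if p == INSTANCE_OF and o == type_name)

# The type query now returns the mis-modelled paper alongside the real whale.
whales = items_of_type(statements, "whale")
print(whales)
```

A query for the type cannot tell the confused statement from a correct one, which is why this kind of error has to be caught by curation rather than at query time.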

The challenge is that as Wikidata's complexity grows, it becomes more difficult for the community to choose the right properties and classes.
A somewhat related example from Wikipedia: the article about Amy_Brand was written using the infobox template “adult biography” instead of “person”, which resulted in the entity being classified as “adult actor” in DBpedia.

Best regards
Joakim

On May 17, 2017, at 8:17 AM, Andrea Zanni <zanni.andrea84@gmail.com> wrote:

[Sorry for cross-posting. I just sent this mail to the Wikicite mailing list, but I thought it could be of interest also here. Moreover, the community here could provide some important insight to better explain wikidata to "newbies" and set the right path for a meaningful conference.]


Dear all,
sorry for the messy email, but I'd like to express a small concern I have regarding the awesome Wikicite conference we'll have in a few days.

My main point is that Wikidata is *complex*.
It's not just the data model (which is not easy, per se), but the whole idea behind Wikidata, the policies, the community, the workflow, the tools, and the whole halo of vagueness that surrounds it.

My favourite example about this is this little story:
https://en.wikipedia.org/wiki/Blind_men_and_an_elephant
(which was actually a very good metaphor used by my professors introducing the Information Science course...)

Different people with different backgrounds work with data in different manners: the word "data" (like "information") can mean anything and its opposite, so we must be careful. You know better than me that everyone projects her own dreams and delusions upon Wikidata, so we must work towards a good understanding at the beginning, to avoid the painful and time-consuming job of making people un-learn things they think they know.
This was quite evident especially last year, when a lot of librarians and wikimedians were in the same room, and everyone knew many things about metadata and their metadata model: librarians had difficulties grasping things about Wikibase, Wikidata, policies and communities, and wikimedians about bibliographic models and complexities.

We spent at least 30 minutes explaining Wikidata from the beginning, also adding some "color" about strategies, policies and community.

I'll try to make examples.

* we need to explain that in Wikidata (and Wikibase) everything is an *item*. Everything. Every item has properties, values, qualifiers. What's not possible is to have sets or clusters of items, and to give properties to such sets and clusters. This is somehow related to the book topic.

* we need to explain that there are at least 2 different possible strategies: create a few general properties and many specific items,
or create many specific properties and fewer, more general items. Wikidata chose the former.

* we need to explain the "scaffolding principle", meaning that we don't need to put *all the info in all the items*. We need to create and organize items that are *queryable*, in such a way that I can make a query that gets all the data and details I need, scattered among different items. If the items in question are built "on top of each other", this is doable. It is actually very important to understand this, because people get confused about how many things to say inside one item. This principle (and the former) was explained to me by Tobias1984, and helped me a lot in my understanding of Wikidata.
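
To give an idea of what I mean by scaffolding, here is a rough Python sketch. It is not how Wikidata works internally: the item IDs are made up, the properties are spelled out as plain names instead of real property numbers, and the `follow` helper is my own invention for the example. The point is only that the edition item says nothing about the author, yet a query can still reach the author by walking from item to item.

```python
# Hypothetical items keyed by made-up IDs. Each item stores only what
# belongs to it and links to other items for everything else.
items = {
    "Q_edition": {"label": "1851 London edition", "edition of": "Q_work",
                  "publication year": 1851},
    "Q_work": {"label": "Moby-Dick", "author": "Q_author"},
    "Q_author": {"label": "Herman Melville", "place of birth": "Q_city"},
    "Q_city": {"label": "New York City"},
}

def follow(item_id, path):
    """Walk a chain of properties from item to item and return the final value."""
    value = item_id
    for prop in path:
        value = items[value][prop]
    return value

def label(item_id):
    """Return the human-readable label of an item."""
    return items[item_id]["label"]

# The edition has no "author" statement of its own; the query finds it
# through the work item that the edition is built on top of.
author_id = follow("Q_edition", ["edition of", "author"])
birthplace_id = follow("Q_edition", ["edition of", "author", "place of birth"])
print(label(author_id), "-", label(birthplace_id))
```

Because each fact lives in exactly one item, correcting the author's birthplace once fixes it for every edition that links back to the work.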

I think that this kind of insight is crucial for working with Wikidata in a meaningful way, because Wikidata offers *one product*, with *one data model*, and it is simply impossible to adapt and stretch the Complexity Of The World to Wikidata without loss of information.
I'm sure many of you know this perfectly, but other people maybe don't, or at least they will struggle with it.
We all come from different backgrounds and are emotionally very attached to our models and our crucial pieces of information we don't want to lose: in this sense, Wikidata is much more a "negotiation" than Wikipedia, because Wikipedia is not structured and allows for much more messiness.
 
Every model we choose and every decision we make will be a trade-off, and I think it's worth saving time and effort by establishing these boundaries at the beginning.
Of course, there are many insights about Wikidata I don't have, but those are kinda the things we want to understand first in this kind of conference.
Hope this makes sense.

Aubrey







_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata