Hi all,
On Saturday, I facilitated a workshop at the U.S. National Archives entitled "An Ambitious Wikidata Tutorial" as part of WikiConference USA 2015.
Slides are available at: http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial and https://commons.wikimedia.org/wiki/File:An_Ambitious_Wikidata_Tutorial.pdf
The demo of Wikidata's new SPARQL endpoint at https://query.wikidata.org/ was the most exciting part of the workshop for me, and caught the audience's attention too [1]. Stas Malyshev et al., thank you for developing Wikidata Query Service -- it's an incredibly awesome tool!
I've added the query we used to the list of examples at https://www.mediawiki.org/wiki/Wikibase/Indexing/SPARQL_Query_Examples#Polit... .
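For readers who want to try the endpoint from a script, here is a minimal sketch of how a request URL could be built. It is not the query shown at the workshop: the query text, the function name and the choice of Q12136 (disease) are illustrative assumptions, and the prefixes are the ones the query service documents today, which may differ from those in use at the time of this thread.

```python
from urllib.parse import urlencode

# SPARQL endpoint behind https://query.wikidata.org/
ENDPOINT = "https://query.wikidata.org/sparql"


def build_subclass_query_url(class_qid: str, limit: int = 10) -> str:
    """Return a GET URL listing classes that are (transitively) subclasses of class_qid.

    Hypothetical helper for illustration; it only builds the URL and sends nothing.
    """
    sparql = (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT ?item WHERE {\n"
        # wdt:P279* follows "subclass of" zero or more times (a SPARQL property path).
        f"  ?item wdt:P279* wd:{class_qid} .\n"
        "}\n"
        f"LIMIT {limit}"
    )
    return ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


url = build_subclass_query_url("Q12136")  # Q12136 is "disease"
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the service returns JSON when `format=json` is passed.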
We also:
- Live-edited the item about Nobel laureate Barbara McClintock [2]
- Saw how Wikidata is used in Histropedia [3]
- Discussed *instance of* (P31), *subclass of* (P279), and *part of* (P361) and how to avoid "bad smells" [4]
- Learned about the RDF/OWL exports and how to explore them locally with Protege [5]
- Talked about modeling causation on Wikidata, in the context of the American Civil War [6]
- Covered Wikidata vocabulary, the Wikidata API, where to find things, unit quantity properties (e.g. area, length, GDP per capita), etc.
I'd estimate we had 30 to 40 attendees.
Later the same day, Katie (Aude) presented on integrating Wikidata into Wikipedia through Lua. Elvira (Emitraka) also presented on how the Gene Wiki project has been enhancing Wikidata with items about genes, items about diseases, and their causal connection -- and how the Gene Wiki group is working to make corresponding infoboxes on Wikipedia more relevant to the layperson. Both talks were well-attended and excellent.
Cheers, Eric https://www.wikidata.org/wiki/User:Emw
1. SPARQL on Wikidata. Slides 39 - 43. http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial#39
2. Barbara McClintock live edit. Slide 18. http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial#18
3. Histropedia. Slide 22. http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial#22
4. Avoiding bad smells in classification. Slide 35. http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial#35
5. How to explore Wikidata RDF/OWL dumps locally. Slides 44 - 48. http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial#44
6. Causation on Wikidata. Slides 49 - 51. http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial#50
Great that you could make a presentation.
One remark, however, on what I could read: "instance of" is NOT transitive.
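The distinction can be illustrated with a small sketch. The data below is a toy example, not copied from Wikidata, and the function names are made up for illustration. "subclass of" (P279) is followed transitively, while "instance of" (P31) is followed exactly once, so an item does not inherit the classes that its class is itself an instance of.

```python
from typing import Dict, Set

# Toy statements for illustration only.
INSTANCE_OF: Dict[str, Set[str]] = {      # P31
    "Barbara McClintock": {"human"},
    "human": {"species"},
}
SUBCLASS_OF: Dict[str, Set[str]] = {      # P279
    "human": {"mammal"},
    "mammal": {"animal"},
}


def superclasses(cls: str) -> Set[str]:
    """All classes reachable from cls via subclass of (transitive closure)."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classes_of(item: str) -> Set[str]:
    """Direct P31 values of item plus all of their superclasses."""
    result: Set[str] = set()
    for cls in INSTANCE_OF.get(item, set()):
        result.add(cls)
        result |= superclasses(cls)
    return result


print(sorted(classes_of("Barbara McClintock")))  # ['animal', 'human', 'mammal']
```

Note that "species" is absent from the result: "human" is an instance of "species" in this toy data, but that does not make Barbara McClintock a species.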
2015-10-12 20:47 GMT+02:00 Emw emw.wiki@gmail.com:
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Thanks for the catch, Thomas! I've fixed that on slide 36 and uploaded the corrected version to http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial and Commons.
Best, Eric
On Mon, Oct 12, 2015 at 2:56 PM, Thomas Douillard < thomas.douillard@gmail.com> wrote:
Great that you could make a presentation.
One remark, however, on what I could read: "instance of" is NOT transitive.
Hi!
The demo of Wikidata's new SPARQL endpoint at https://query.wikidata.org/ was the most exciting part of the workshop for me, and caught the audience's attention too [1]. Stas Malyshev et al., thank you for developing Wikidata Query Service -- it's an incredibly awesome tool!
Thank you! I am glad to see the interest in it in the community.
Thanks a lot to you and the others for representing Wikidata at the conference and sharing your experience!
Cheers Lydia
It's very pleasant to hear from someone else who thinks of Wikidata as a knowledge base (or at least hopes that Wikidata can be considered as a knowledge base). Did you get any pushback on this or on your stated Wikidata goal of structuring the sum of all human knowledge?
Did you get any pushback on your section on classification in Wikidata? It seems to me that some of that is rather controversial in the Wikidata community. I was a bit surprised to see class reasoning used on diseases. This depends on a particular modelling methodology.
peter
I was a bit surprised to see class reasoning used on diseases.
I was not aware of that. Do you have links?
I was a bit surprised to see class reasoning used on diseases.
This depends on a particular modelling methodology.
It's not surprising, as the meaning of properties is community-defined (or sub-community-defined), so any community can use whatever reasoning technology it wants, as long as it is consistent with the intended meaning of the properties. As Wikidata only stores statements, anyone can apply community-accepted reasoning technologies on top of them. The drawback of this approach was discussed in another thread some days ago: it could become tricky for an ordinary user to understand the path that led to a statement being added, so we have to be careful to always record which bot added an inferred statement, using which reasoning technology or rule, and from which data.
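As a rough illustration of the record-keeping described above, the sketch below attaches to each inferred statement the rule that produced it, the statements it was derived from, and the account that added it. The class names, field names, the "case-1" item and "ExampleBot" are all hypothetical; this is not how Wikidata stores references, only one possible shape for such a record.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Statement:
    subject: str
    prop: str
    value: str

    def text(self) -> str:
        return f"{self.subject} {self.prop} {self.value}"


@dataclass(frozen=True)
class InferredStatement:
    statement: Statement               # the statement that was added
    rule: str                          # reasoning rule that produced it
    premises: Tuple[Statement, ...]    # statements the rule was applied to
    added_by: str                      # account (for example a bot) that made the edit

    def explain(self) -> str:
        """One-line trace of how the statement came to exist."""
        used = " and ".join(p.text() for p in self.premises)
        return (f"{self.statement.text()} "
                f"(rule: {self.rule}; premises: {used}; added by: {self.added_by})")


record = InferredStatement(
    statement=Statement("case-1", "P31", "Q12136"),
    rule="rdfs9",
    premises=(Statement("case-1", "P31", "Q128581"),
              Statement("Q128581", "P279", "Q12136")),
    added_by="ExampleBot",
)
print(record.explain())
```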
However, I noticed in recent heated debates that some users on frwiki were receptive to the argument that Wikidata only stores statements. These users feared that Wikidata would force the semantics of words and items to align with enwiki's semantics. They believe in the linguistic hypothesis that words in a language carry some kind of language-dependent meaning of their own, and they feared a kind of "cultural contagion" in which the specific meaning of an English word would contaminate the French word. It has of course been said many times that Wikidata is focused not on words and linguistics but mainly on definitions, that one definition equals one item, and that Wikidata is the sum of all knowledge. But the argument that finally seemed to be effective was that Wikidata only stores statements and does not enforce constraints. It seems to be effective in convincing them that Wikidata is indeed POV-agnostic.
2015-10-16 19:14 GMT+02:00 Peter F. Patel-Schneider pfpschneider@gmail.com :
It's very pleasant to hear from someone else who thinks of Wikidata as a knowledge base (or at least hopes that Wikidata can be considered as a knowledge base). Did you get any pushback on this or on your stated Wikidata goal of structuring the sum of all human knowledge?
Did you get any pushback on your section on classification in Wikidata? It seems to me that some of that is rather controversial in the Wikidata community. I was a bit surprised to see class reasoning used on diseases. This depends on a particular modelling methodology.
peter
On 10/17/2015 12:55 AM, Thomas Douillard wrote:
I was a bit surprised to see class reasoning used on diseases.
I was not aware of that. Do you have links?
See slide 38 of http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial
I was a bit surprised to see class reasoning used on diseases. This depends on a particular modelling methodology.
It's not surprising, as the meaning of properties is community-defined (or sub-community-defined), so any community can use whatever reasoning technology it wants, as long as it is consistent with the intended meaning of the properties. As Wikidata only stores statements, anyone can apply community-accepted reasoning technologies on top of them. The drawback of this approach was discussed in another thread some days ago: it could become tricky for an ordinary user to understand the path that led to a statement being added, so we have to be careful to always record which bot added an inferred statement, using which reasoning technology or rule, and from which data.
What is the community-defined meaning of subclass of and diseases then?
Here is what I see in Wikidata.
breast cancer (https://www.wikidata.org/wiki/Q128581) has a subclass of (https://www.wikidata.org/wiki/Property:P279) link to both disease (https://www.wikidata.org/wiki/Q12136) and thoracic cancer (https://www.wikidata.org/wiki/Q18556617).
subclass of (https://www.wikidata.org/wiki/Property:P279) is linked via equivalent property (https://www.wikidata.org/wiki/Property:P1628) to http://www.w3.org/2000/01/rdf-schema#subClassOf, which is the subclass relationship between classes.
subclass of (https://www.wikidata.org/wiki/Property:P279) has the English description "all of these items are instances of those items; this item is a class of that item. Not to be confused with Property:P31 (instance of).", which is rather confusing, but appears to be a gloss of the RDFS meaning of http://www.w3.org/2000/01/rdf-schema#subClassOf
Someone looking at all this is thus led to believe that subclass of (https://www.wikidata.org/wiki/Property:P279) is the same as the RDFS meaning of http://www.w3.org/2000/01/rdf-schema#subClassOf
So diseases are classes. They then have instances. They can be reasoned with using techniques borrowed from RDFS.
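To make "techniques borrowed from RDFS" concrete, here is a minimal sketch of two RDFS entailment rules applied to the breast cancer statements listed above: rdfs11 (subclass of is transitive) and rdfs9 (an instance of a class is an instance of its superclasses). The item "case-1" is a hypothetical instance added only so that rdfs9 has something to act on; Wikidata itself is not being claimed to hold such an item, and the function is an illustration rather than a full RDFS reasoner.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples: Set[Triple]) -> Set[Triple]:
    """Apply RDFS rules rdfs9 and rdfs11 until no new triple appears."""
    closed = set(triples)
    while True:
        sub = [(s, o) for s, p, o in closed if p == SUBCLASS]
        new: Set[Triple] = set()
        for a, b in sub:                          # a subClassOf b
            for s, p, o in closed:
                if p == SUBCLASS and s == b:      # rdfs11: b subClassOf o
                    new.add((a, SUBCLASS, o))
                if p == TYPE and o == a:          # rdfs9: s type a
                    new.add((s, TYPE, b))
        if new <= closed:
            return closed
        closed |= new


facts: Set[Triple] = {
    ("Q128581", SUBCLASS, "Q12136"),     # breast cancer -> disease
    ("Q128581", SUBCLASS, "Q18556617"),  # breast cancer -> thoracic cancer
    ("case-1", TYPE, "Q128581"),         # hypothetical instance of breast cancer
}
inferred = rdfs_closure(facts) - facts
for triple in sorted(inferred):
    print(triple)
```

With these three input statements, the only new triples are the two rdfs9 conclusions that "case-1" is also an instance of disease and of thoracic cancer.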
This is a particular modelling methodology. It has its benefits. It requires a certain view of disease and diseases. The particular instantiation of this modelling methodology, where there is a redundant link to the top of the disease hierarchy and that top loops back to itself, has its own benefits and drawbacks.
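The two structural features mentioned here, a redundant link to the top of the hierarchy and a top class that links back to itself, can be surfaced mechanically. The sketch below is one possible check over a toy graph. The link from thoracic cancer to disease is an assumption made so that the direct breast cancer to disease link is actually redundant; the real path in Wikidata may be longer or different. The check takes no position on whether such links are desirable.

```python
from typing import Dict, List, Set, Tuple


def reachable(graph: Dict[str, Set[str]], start: str, skip: Tuple[str, str]) -> Set[str]:
    """Classes reachable from start via subclass links, ignoring the edge `skip` and self-loops."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in graph.get(node, set()):
            if (node, parent) == skip or parent == node or parent in seen:
                continue
            seen.add(parent)
            stack.append(parent)
    return seen


def audit(graph: Dict[str, Set[str]]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (classes with a self-loop, direct links already implied by a longer path)."""
    self_loops = sorted(c for c, parents in graph.items() if c in parents)
    redundant = sorted(
        (c, p)
        for c, parents in graph.items()
        for p in parents
        if p != c and p in reachable(graph, c, skip=(c, p))
    )
    return self_loops, redundant


# Toy subclass-of graph; the thoracic cancer -> disease link is assumed.
subclass_of = {
    "breast cancer": {"disease", "thoracic cancer"},
    "thoracic cancer": {"disease"},
    "disease": {"disease"},
}
loops, redundant_links = audit(subclass_of)
print(loops)            # ['disease']
print(redundant_links)  # [('breast cancer', 'disease')]
```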
A bigger problem than the one you state, I think, is how outsiders can determine that this modelling methodology is in place and understand it adequately to effectively use the information or to contribute more information. There is nothing on the discussion pages for the various diseases that I looked at.
The modelling methodology used here is useful in many other places, including human occupations, creative work genres, cuisines, and sports. Is Wikidata uniform in applying this methodology? If this is not the case, then how is the use of this methodology signalled?
However, I noticed in recent heated debates that some users on frwiki were receptive to the argument that Wikidata only stores statements. These users feared that Wikidata would force the semantics of words and items to align with enwiki's semantics. They believe in the linguistic hypothesis that words in a language carry some kind of language-dependent meaning of their own, and they feared a kind of "cultural contagion" in which the specific meaning of an English word would contaminate the French word. It has of course been said many times that Wikidata is focused not on words and linguistics but mainly on definitions, that one definition equals one item, and that Wikidata is the sum of all knowledge. But the argument that finally seemed to be effective was that Wikidata only stores statements and does not enforce constraints. It seems to be effective in convincing them that Wikidata is indeed POV-agnostic.
In my discussion above, I tried to stay away from using the human-language descriptions, preferring an external formal definition. Unfortunately, Wikidata does not have an internal formal definition beyond the simple description of the data structures. This lack, I think, is what makes the human-language descriptions so important in Wikidata. My view is that a stronger formal basis for Wikidata would help to reduce the possibility that descriptions in dominant human languages do indeed push out the other descriptions.
You're right, I tend to think that having a metaclass for types of diseases would be really useful.
Please submit your suggestions and corrections on https://fr.wikipedia.org/w/index.php?search=Help%3AClassification&title=... :) There is an open RfC on adopting these basic classification principles as a help page: https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Adopt_Help:Class...
2015-10-17 16:25 GMT+02:00 Peter F. Patel-Schneider pfpschneider@gmail.com :
Hoi, So in order to understand all this and participate, it is necessary to submit to the FRENCH Wikipedia... Why not Thai?
The mind boggles. Thanks, GerardM
On 17 October 2015 at 16:42, Thomas Douillard thomas.douillard@gmail.com wrote:
Oops, I meant https://www.wikidata.org/wiki/Help:Classification, of course. We made the page translatable even though it's not an accepted policy; at least it's well-founded and solid.
2015-10-17 17:06 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi, So in order to understand all this and participate it is necessary to submit to the FRENCH Wikipedia.. Why not Thai ?
The mind boggles Thanks, GerardM
On 17 October 2015 at 16:42, Thomas Douillard thomas.douillard@gmail.com wrote:
You're right, I tend to think having a metaclass for types of deaseases would be useful, really.
Please submit your suggestions and correction on https://fr.wikipedia.org/w/index.php?search=Help%3AClassification&title=... :) There is an opened RfC on adopting such basic classification basic principles things as an help page : https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Adopt_Help:Class...
2015-10-17 16:25 GMT+02:00 Peter F. Patel-Schneider < pfpschneider@gmail.com>:
On 10/17/2015 12:55 AM, Thomas Douillard wrote:
I was a bit surprised to see class reasoning used on diseases.
I was not aware of that, do you have links ?
See slide 38 of http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial
I was a bit surprised to see class reasoning used on diseases. This depends on a particular modelling methodology.
It's not surprising, as the meaning of properties is community defined (or sub-community defined), so any community can use whatever reasoning technology it wants, as long as it is consistent with the intended meaning of the properties. As Wikidata only stores statements, anyone can use community-accepted reasoning technologies on top of it. The drawback of this approach was discussed on another thread some days ago: it could become tricky for an ordinary user to understand the path that led to a statement's addition, and we have to be careful to always provide information on which bot added inferred statements, with which reasoning technology or rule, and from which data.
What is the community-defined meaning of subclass of and diseases then?
Here is what I see in Wikidata.
breast cancer (https://www.wikidata.org/wiki/Q128581) has a subclass of (https://www.wikidata.org/wiki/Property:P279) link to both disease (https://www.wikidata.org/wiki/Q12136) and thoracic cancer (https://www.wikidata.org/wiki/Q18556617).
subclass of (https://www.wikidata.org/wiki/Property:P279) is linked via equivalent property (https://www.wikidata.org/wiki/Property:P1628) to http://www.w3.org/2000/01/rdf-schema#subClassOf, which is the subclass relationship between classes.
subclass of (https://www.wikidata.org/wiki/Property:P279) has the English description "all of these items are instances of those items; this item is a class of that item. Not to be confused with Property:P31 (instance of).", which is rather confusing, but appears to be a gloss of the RDFS meaning of http://www.w3.org/2000/01/rdf-schema#subClassOf
Someone looking at all this is thus led to believe that subclass of (https://www.wikidata.org/wiki/Property:P279) is the same as the RDFS meaning of http://www.w3.org/2000/01/rdf-schema#subClassOf
So diseases are classes. They then have instances. They can be reasoned with using techniques borrowed from RDFS.
This is a particular modelling methodology. It has its benefits. It requires a certain view of disease and diseases. The particular instantiation of this modelling methodology, where there is a redundant link to the top of the disease hierarchy and that top loops back to itself, has its own benefits and drawbacks.
A bigger problem than the one you state, I think, is how outsiders can determine that this modelling methodology is in place and understand it adequately to effectively use the information or to contribute more information. There is nothing on the discussion pages for the various diseases that I looked at.
The modelling methodology used here is useful in many other places, including human occupations, creative work genres, cuisines, and sports. Is Wikidata uniform in applying this methodology? If this is not the case, then how is the use of this methodology signalled?
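To make the RDFS-style reasoning described above concrete, here is a minimal Python sketch over a toy set of statements. It is only an illustration under stated assumptions: the labels stand in for Wikidata items, the link from thoracic cancer to disease and the instance `case-1` are hypothetical (they are not taken from Wikidata), and `superclasses` and `inferred_types` are names invented for this example, not part of any Wikidata or RDF library.

```python
# Toy "subclass of" statements. The two breast cancer links are the ones
# described in the message; the thoracic cancer -> disease link is assumed.
SUBCLASS_OF = {
    ("breast cancer", "disease"),
    ("breast cancer", "thoracic cancer"),
    ("thoracic cancer", "disease"),  # assumption for illustration
}

# Hypothetical "instance of" statement: one particular case of the disease.
INSTANCE_OF = {("case-1", "breast cancer")}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass-of links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def inferred_types(item, instance_of, subclass_of):
    """Apply the RDFS rule: x type C and C subClassOf D imply x type D."""
    types = set()
    for inst, cls in instance_of:
        if inst == item:
            types.add(cls)
            types |= superclasses(cls, subclass_of)
    return types


if __name__ == "__main__":
    print(sorted(inferred_types("case-1", INSTANCE_OF, SUBCLASS_OF)))
```

Under this reading, a single instance statement on the most specific class is enough for a reasoner to conclude membership in every class above it, which is why the redundant direct link to disease is not needed for inference.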
I however noticed in recent heated debates that some users on frwiki were receptive to the argument that Wikidata only stores statements. These users feared that Wikidata would induce an alignment of the semantics of words and items to the enwiki semantics. They believe in the linguistic hypothesis that words in a language carry some kind of language-dependent meaning of their own, and feared some kind of "cultural contagion" by a mechanism where the specific meaning of an English word would contaminate the French word. It has of course been said many times that Wikidata is not focused on words and linguistics but mainly on definitions, that one definition equals one item, and that Wikidata is the sum of all knowledge, but the argument that finally seemed to be effective was that Wikidata only stores statements and does not enforce constraints. It seems to be effective in convincing them that Wikidata is indeed POV agnostic.
In my discussion above, I tried to stay away from using the human-language descriptions, preferring an external formal definition. Unfortunately, Wikidata does not have an internal formal definition beyond the simple description of the data structures. This lack, I think, is what makes the human-language descriptions so important in Wikidata. My view is that a stronger formal basis for Wikidata would help to reduce the possibility that descriptions in dominant human languages do indeed push out the other descriptions.
The main problem is that Instance Of is not being used properly sometimes. In general, wrong classifications across Wikidata lead to weird assumptions.
Better documentation, and even helper rules to help prevent wrong classifications, are what is needed, and they are forthcoming.
Lydia has mentioned that these kinds of problems will gradually become less frequent as the Roadmap features land in production.
I am looking forward to next year, and the year after, to see the quality improve.
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Yes, in the pizza example Emw showed, the definition of "food" is important. If what I ate this morning is a food, then pizza is a subclass of food. This is consistent with the first sentence of https://fr.wikipedia.org/wiki/Nourriture of frwiki. And the fact that "pizza" is an instance of food is a mistake, unfortunately a pretty common one on Wikidata. We should write a query to find all such examples where an item is both an instance and a subclass of the same class.
Now, pizza is clearly a type of meal, which could be relevant in a food classification, and pizza could very well be an instance of that, as it's a preparation ordinary people have used to put whatever they can on, similarly to https://www.wikidata.org/wiki/Q12486
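The check proposed above (finding items that are both an instance and a subclass of the same class) can be sketched on toy data as follows. This is a minimal Python illustration, not a query against the real service: the intermediate class `dish` and the `human` rows are assumptions added for contrast, and `ancestors` and `instance_and_subclass_conflicts` are hypothetical helper names.

```python
# Toy statements; labels stand in for Wikidata items.
INSTANCE_OF = {
    ("pizza", "food"),    # the modelling discussed in this thread
    ("human", "taxon"),   # assumed example: instance of one class...
}
SUBCLASS_OF = {
    ("pizza", "dish"),    # assumed intermediate class
    ("dish", "food"),
    ("human", "Homo"),    # ...and subclass of a different one
}


def ancestors(cls, subclass_of):
    """All classes reachable from cls via one or more subclass-of links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def instance_and_subclass_conflicts(instance_of, subclass_of):
    """Pairs (item, cls) where item is an instance of cls and also,
    directly or transitively, a subclass of that same cls."""
    return [
        (item, cls)
        for item, cls in sorted(instance_of)
        if cls in ancestors(item, subclass_of)
    ]


if __name__ == "__main__":
    print(instance_and_subclass_conflicts(INSTANCE_OF, SUBCLASS_OF))
```

Only the pizza row is reported; the human row is left alone because its instance-of class and its superclasses are different classes. On a SPARQL endpoint the same pattern would presumably combine an instance-of triple with a transitive subclass-of property path, which may be expensive on the full data set.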
I tried this to find out whether there are items that are both an instance of a subclass of human and an instance of human; unfortunately the query times out :/ https://query.wikidata.org/#prefix%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.o...
Hoi, Semantics prove what? Your exercise with pizza has me wondering what you call food. The definition of food as in the French Wikipedia is not an argument for me. My understanding of French is not sufficient and, arguably, any language has its own approach. Thanks, GerardM
The coherence between the modelling principles we adopt and common language definitions is a strong argument to me. This means we're not that wrong, and that any person can understand. It's just essential that the two are consistent, as we're trying in Wikidata to define topics the best we can, just as Wikipedia articles do. This implies that we will also get some kind of international language, with statements that reflect the subtlety of the definitions in the different languages.
I also strongly value the "1 item = 1 definition" principle, even at the cost of item disconnections if the definitions of some close notions differ between languages.
Hi Peter,
The community-defined meaning of *subclass of* (P279) is that of rdfs:subClassOf [1]. Similarly, the community-defined meaning of *instance of* (P31) is that of rdf:type [2, 3].
There are some open problems with how to handle qualifiers on *instance of* and *subclass of* in RDF/OWL exports of P31 as rdf:type and P279 as rdfs:subClassOf, but that does not negate the community's decision to tie its two most basic membership properties to those W3C standard properties. In the current RDF/OWL exports that follow the community interpretation of P31 and P279, e.g. wikidata-taxonomy.nt.gz and wikidata-instances.nt.gz in [4], statements that have qualifiers on either of those properties are simply omitted.
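The export behaviour described above (P31 mapped to rdf:type, P279 mapped to rdfs:subClassOf, qualified statements omitted) could look roughly like the following Python sketch. It is not the code behind the actual exports: the record layout, the function name `export_unqualified`, and the qualifier id `P0000` are hypothetical, and the entity URI prefix is an assumption about how items are identified in RDF.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
ENTITY = "http://www.wikidata.org/entity/"  # assumed URI prefix for items

# The two membership properties and the W3C predicates they are tied to.
PROPERTY_TO_PREDICATE = {"P31": RDF_TYPE, "P279": RDFS_SUBCLASS_OF}

# Hypothetical, simplified statement records (not the real JSON format).
STATEMENTS = [
    {"subject": "Q12156", "property": "P279", "value": "Q12136", "qualifiers": {}},
    {"subject": "Q128581", "property": "P279", "value": "Q12136", "qualifiers": {}},
    # Made-up qualifier, present only to show that the statement is skipped.
    {"subject": "Q128581", "property": "P279", "value": "Q18556617",
     "qualifiers": {"P0000": "placeholder"}},
]


def export_unqualified(statements):
    """Return N-Triples lines for P31/P279 statements without qualifiers."""
    lines = []
    for st in statements:
        predicate = PROPERTY_TO_PREDICATE.get(st["property"])
        if predicate is None or st["qualifiers"]:
            continue  # other property, or qualified statement: omit
        lines.append(
            f"<{ENTITY}{st['subject']}> <{predicate}> <{ENTITY}{st['value']}> ."
        )
    return lines


if __name__ == "__main__":
    for line in export_unqualified(STATEMENTS):
        print(line)
```

The trade-off is visible in the toy data: the qualified statement simply disappears from the output, so whatever it expressed is not available to a reasoner working on the export.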
The community's definition of disease is less established. However, there is consensus that diseases like cancer (Q12078) and malaria (Q12156) are classes. An instance of disease would be a particular case of a disease, i.e. a particular case of an abnormal condition in a particular organism. For example, it would be the particular case of throat cancer that caused U.S. President Ulysses S. Grant to die, as reflected in the Wikidata statement "Ulysses S. Grant *cause of death* throat cancer" [5].
Wikidata has no items on actual instances of disease to my knowledge -- although it does have at least one item about an instance of a symptom [6]. That of course does not mean that such instances of disease do not exist or that they could not theoretically be modeled in some local Wikibase installation (e.g. in a physician's office or a hospital) that uses Wikidata vocabulary to track actual instances of disease, e.g. a particular case of pancreatic cancer in a patient.
If you have questions or concerns regarding how diseases are modeled, I would recommend contacting Wikidata editor and disease ontologist Elvira Mitraka (Emitraka) [7], as well as WikiProject Medicine [8] or WikiProject Molecular Biology [9].
Regarding how outsiders can become aware of modeling methodology, I recommend reading https://www.wikidata.org/wiki/Help:Basic_membership_properties and engaging with particular domain modeling groups on Wikidata, e.g. the wikiprojects mentioned above. This mailing list and Wikidata Project Chat [10] are also good places to ask questions.
Finally, regarding your question "Is Wikidata uniform in applying this methodology?", the answer is no. Wikidata's use of *subclass of* and *instance of* varies among (and sometimes within) different domains of knowledge like human occupations, creative work genres, cuisines, and sports. The basic difference in usage among those domains is using *instance of* where others would use *subclass of*.
For example, pizza (https://www.wikidata.org/wiki/Q177) is currently modeled as an instance of food and (transitively) a subclass of food. Problematic indeed! Disease modeling achieves the same goal of easy queryability by making statements like "malaria *subclass of* disease" and "malaria *subclass of* parasitic protozoa infectious disease" [11], where the latter value transitively resolves to disease. This is not only rather redundant, but also makes the *subclass of* hierarchy cyclic and thus not a directed acyclic graph (DAG) due to the situation you note in the item about disease itself. But at least it avoids the more severe problem of being ontologically incorrect as seen in the item on pizza -- and all chemical elements, e.g. hydrogen (Q556) [12].
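The two structural issues mentioned for the disease hierarchy (a redundant direct link, and a cycle that breaks the DAG property) can both be detected mechanically. Below is a minimal Python sketch on toy data; the direct link from the protozoa class to disease and the exact form of the loop on disease (shown here as a self-loop) are assumptions, and the function names are invented for this illustration.

```python
SUBCLASS_OF = {
    ("malaria", "disease"),
    ("malaria", "parasitic protozoa infectious disease"),
    # Assumed to be a direct link; the message only says it resolves
    # transitively to disease.
    ("parasitic protozoa infectious disease", "disease"),
    # Assumed shape of the loop at the top of the hierarchy.
    ("disease", "disease"),
}


def reachable(start, edges, skip=None):
    """Classes reachable from start via one or more links, ignoring `skip`."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for sub, sup in edges:
            if sub == node and (sub, sup) != skip and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def redundant_links(edges):
    """Links whose target stays reachable when the link itself is ignored."""
    return sorted(
        e for e in edges
        if e[0] != e[1] and e[1] in reachable(e[0], edges, skip=e)
    )


def classes_on_cycles(edges):
    """Classes that can reach themselves, i.e. that sit on a cycle."""
    return sorted({sub for sub, _ in edges if sub in reachable(sub, edges)})


if __name__ == "__main__":
    print(redundant_links(SUBCLASS_OF))
    print(classes_on_cycles(SUBCLASS_OF))
```

A report like this only flags candidates; whether a redundant link is kept for easy queryability or removed is still a modelling decision for the relevant WikiProject.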
Regards, Eric https://www.wikidata.org/wiki/User:Emw
1. http://www.w3.org/TR/rdf-schema/#ch_subclassof
2. http://www.w3.org/TR/rdf-schema/#ch_type
3. is a -> instance of. https://www.wikidata.org/w/index.php?title=Property_talk:P31&oldid=25407...
4. http://tools.wmflabs.org/wikidata-exports/rdf/index.php?content=dump_downloa...
5. Ulysses S. Grant: cause of death. https://www.wikidata.org/wiki/Q34836#P509
6. George H. W. Bush vomiting incident. https://www.wikidata.org/wiki/Q5540112
7. https://www.wikidata.org/wiki/User:Emitraka
8. WikiProject Medicine on Wikidata. https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Medicine
9. WikiProject Molecular Biology on Wikidata. https://www.wikidata.org/wiki/Wikidata:WikiProject_Molecular_biology
10. https://www.wikidata.org/wiki/Wikidata:Project_chat
11. Malaria: subclass of. https://www.wikidata.org/w/index.php?title=Q12156&oldid=259072228#P279
12. Hydrogen. https://www.wikidata.org/w/index.php?title=Q556&oldid=258289050
Hi!
The community-defined meaning of /subclass of/ (P279) is that of rdfs:subClassOf [1]. Similarly, the community-defined meaning of /instance of/ (P31) is that of rdf:type [2, 3].
Are you sure it is always correct? AFAIK there are some specific rules and meanings in OWL that classes should adhere to, also same thing can not be an individual and a class, and others (not completely sure of the whole list, as I don't have enough background in RDF/OWL). But I'm not sure existing data actually follows that.
There are some open problems with how to handle qualifiers on /instance of/ and /subclass of/ in RDF/OWL exports of P31 as rdf:type and P279 as rdfs:subClassOf, but that does not negate the community's decision to tie its two most basic membership properties to those W3C standard properties. In the current RDF/OWL exports that follow the community
I'm not sure I understand how that works in practice. I.e., if we say that P31 *is* rdf:type, then it can't be qualified in RDF/OWL and we can not represent part (albeit small, qualified properties are about 0.2% of all such properties) of our data.
I mean, we can certainly have data sets which include P31 statements from the data translated to rdf:type unless they have qualifiers, and that can be very useful pragmatically, no question about it. But can we really say P31 is the same as rdf:type and use it whenever we choose to represent Wikidata data as RDF? I'm not sure about that.
For example, pizza (https://www.wikidata.org/wiki/Q177) is currently modeled as an instance of food and (transitively) a subclass of food.
Here we have another practical issue - if we adhere to the strict notion that pizza is only a subclass, then we would practically never have any instances in the database for wide categories of things. I.e. since a particular food item is rarely notable enough to be featured in Wikidata, no food would have instances. It may be formally correct but I'm afraid it's not how most people think - for most people, pizza is a food, not a "subclass of food". Same with chemistry - as virtually no actual physical chemical compound (as in "this brown liquid in my test tube I prepared this morning by mixing contents of those three other test tubes") would be notable enough to gain entry in Wikidata, nothing in chemistry would ever be an instance. Theoretically it may be sound, but practically I'm not sure it would work well, and even less sure that it is *already* the consensus on Wikidata.
On 10/18/2015 01:59 PM, Stas Malyshev wrote:
[Emw]
Hi!
The community-defined meaning of /subclass of/ (P279) is that of rdfs:subClassOf [1]. Similarly, the community-defined meaning of /instance of/ (P31) is that of rdf:type [2, 3].
Are you sure [that] is always correct? AFAIK there are some specific rules and meanings in OWL that classes should adhere to, also same thing can not be an individual and a class, and others (not completely sure of the whole list, as I don't have enough background in RDF/OWL). But I'm not sure existing data actually follows that.
OWL does not currently allow classes to be directly treated as individuals. This is more of an engineering decision than a philosophical one, however. In RDFS classes are also individuals.
There are some open problems with how to handle qualifiers on /instance of/ and /subclass of/ in RDF/OWL exports of P31 as rdf:type and P279 as rdfs:subClassOf, but that does not negate the community's decision to tie its two most basic membership properties to those W3C standard properties. In the current RDF/OWL exports that follow the community
I'm not sure I understand how that works in practice. I.e., if we say that P31 *is* rdf:type, then it can't be qualified in RDF/OWL and we can not represent part (albeit small, qualified properties are about 0.2% of all such properties) of our data.
I mean, we can certainly have data sets which include P31 statements from the data translated to rdf:type unless they have qualifiers, and that can be very useful pragmatically, no question about it. But can we really say P31 is the same as rdf:type and use it whenever we choose to represent Wikidata data as RDF? I'm not sure about that.
Nor am I.
For example, pizza (https://www.wikidata.org/wiki/Q177) is currently modeled as an instance of food and (transitively) a subclass of food.
Here we have another practical issue - if we adhere to the strict notion that pizza is only a subclass, then we would practically never have any instances in the database for wide categories of things. I.e. since a particular food item is rarely notable enough to be featured in Wikidata, no food would have instances. It may be formally correct but I'm afraid it's not how most people think - for most people, pizza is a food, not a "subclass of food".
Well pizza is a kind of food, and a kind that is important enough to get a name in some languages. I agree that it would be nice, however, to be able to model the way that we think that people think, and thus be able to make pizza an instance of some food class instead of requiring that it be (only) a subclass of some general class.
Same with chemistry - as virtually no actual physical chemical compound (as in "this brown liquid in my test tube I prepared this morning by mixing contents of those three other test tubes") would be notable enough to gain entry in Wikidata, [nearly] nothing in chemistry would ever be an instance. Theoretically it may be sound, but practically I'm not sure it would work well, and even less sure that it is *already* the consensus on Wikidata.
I have come around to the position that it is preferable to model these sorts of domains using multiple levels of the class hierarchy. For food, there would be a class (possibly called food) whose instances are those things that are actually eaten (like the pizza I ate in Bethlehem last week). There would also be a class (possibly also called food, but maybe food type) whose instances are the (notable?) classes of food (like pizza, but maybe also like bad pizza from a hole-in-the-wall restaurant). This lets you have your cake and describe it too.
I have also come around to the position that this situation is very common. Also, people seem to be generally capable of working with such modelling, at least informally in their heads.
However, this modelling methodology needs to be described to users, as even things that people do well internally can cause problems when they are being externalized. For example, it would be a problem if users put things in the wrong place (pizza as an instance of the non-food-type food) or make other modelling errors. There also should be tool support, for example to ensure that all instances of the food-type food are subclasses of the non-food-type food (and maybe vice-versa).
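The tool support suggested above could start from a consistency check like the following. This is a minimal Python sketch under stated assumptions: the class names `food` and `food type` follow the naming floated in this message, the `cake` row is a hypothetical entry added to trigger a warning, and both function names are invented for the example.

```python
# Toy two-level model: "food" has things actually eaten as instances,
# "food type" has kinds of food as instances.
INSTANCE_OF = {
    ("pizza", "food type"),
    ("cake", "food type"),                  # hypothetical, lacks a subclass link
    ("pizza eaten in Bethlehem", "pizza"),  # an actual eaten thing
}
SUBCLASS_OF = {
    ("pizza", "food"),
    ("bad pizza", "pizza"),
}


def superclasses(cls, subclass_of):
    """All classes reachable from cls via one or more subclass-of links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def instances_missing_subclass_link(metaclass, base_class, instance_of, subclass_of):
    """Instances of the metaclass that are not subclasses of the base class."""
    return sorted(
        item for item, cls in instance_of
        if cls == metaclass and base_class not in superclasses(item, subclass_of)
    )


def subclasses_missing_metaclass(metaclass, base_class, instance_of, subclass_of):
    """The 'vice-versa' direction: subclasses of the base class that are
    not declared instances of the metaclass."""
    typed = {item for item, cls in instance_of if cls == metaclass}
    candidates = {sub for sub, _ in subclass_of}
    return sorted(
        c for c in candidates
        if base_class in superclasses(c, subclass_of) and c not in typed
    )


if __name__ == "__main__":
    print(instances_missing_subclass_link("food type", "food", INSTANCE_OF, SUBCLASS_OF))
    print(subclasses_missing_metaclass("food type", "food", INSTANCE_OF, SUBCLASS_OF))
```

Whether the second check should be enforced is left open, matching the "maybe vice-versa" above: not every subclass of food need be notable enough to be listed as a food type.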
But what else can be done? Every other approach that I have seen has what I consider to be worse problems.
Stas Malyshev smalyshev@wikimedia.org
Peter F. Patel-Schneider
Hi Stas,
Yes, P31 is always rdf:type and P279 is always rdfs:subClassOf in RDF/OWL exports that use the community interpretation of those properties. If those exports want to be decidable then they'll need to omit P31 and P279 claims that have qualifiers -- which some of Markus's RDF/OWL exports already do, as mentioned in my previous message.
Even if exports don't do that, they would still be in conformance with RDF, RDFS, and OWL 2 Full. But they would not be valid OWL 2 DL, which is what most target when they want to query large ontologies because of the performance guarantee of decidability.
If we want to pick up those items in OWL 2 DL exports, then the solution is simple: don't use qualifiers in *instance of* (P31) or *subclass of* (P279) claims. We could capture that information in other ways, e.g. by using non-P31/P279 properties for it on Wikidata. This would probably be a good idea regardless of OWL 2 DL conformance.
same thing can not be an individual and a class
That is incorrect. That old restriction of OWL 1 DL was lifted by OWL 2 DL, which became a W3C Recommendation in 2009 (second edition in 2012). This enables class-individual punning while maintaining decidability [2]. So we could say something like "subclass of: Homo" and "instance of: taxon" in the item about human (Q5) while ensuring that queries that use the community-agreed W3C semantics could theoretically terminate. This is a big deal.
if we adhere to the strict notion that pizza is only a subclass, then we
would practically never have any instances in the database for wide categories of things
Why would it be a problem to not have instances in Wikidata for wide categories of things? Let's consider what would happen if we don't explicitly model classes as instances, and thus don't have any explicit instances for a wide category of things where Wikidata does not have items for "strictly interpreted" (i.e. actual) instances.
We would not only still have items and tons of meaningful data on those items -- e.g. hydrogen and cancer -- but omitting *instance of* there would also help ensure ontological correctness and easy interoperability with many major third-party ontologies.
In fact, this is how a vast swath of ontologies already are -- they are often mostly comprised of classes and very few or no instances. This is the case with the Disease Ontology, a vetted third-party ontology which is currently being used as the semantic backbone for diseases on Wikidata [3], as well as ontologies like ChEBI for chemistry and many other scientific ontologies. See also how the Stanford group that develops Protege models pizza as a class and not an instance: http://protege.stanford.edu/ontologies/pizza/pizza.owl.
There is an extra layer here to consider. Pizza could be explicitly modeled as both a class and an instance, albeit rather awkwardly. Similar to how we could say "Porsche 356 *subclass of* car" and "Porsche 356 *instance of* car model", we could also theoretically state something like "pizza *subclass of* food" and "pizza *instance of* food class" or somesuch.
That is called metamodeling. Opinions differ within the Wikidata community on how widely such explicit metamodeling should be used, but there is consensus that statements like "pizza *instance of* food" and "Porsche 356 *instance of* car" are incorrect and not suitable for Wikidata.
for most people, pizza is a food, not a "subclass of food"
The phrase "is a" is in no way mutually exclusive with "subclass of". "Is a" is ambiguous -- it can mean the subject is either a class or an instance. In other words, "is a" can mean either *instance of* (P31) or *subclass of* (P279).
New Wikidata editors often oversimplify *instance of* (P31) to "is a" because P31 is so widely used where the everyday phrase "is a" fits. However, in many ontologies, like Disease Ontology or any of the other Open Biomedical Ontologies (OBO) [3], *is_a* actually resolves to rdfs:subClassOf, i.e. *subclass of* (P279). To avoid confusion, when talking in an ontological context as we do with Wikidata classification, it's best to avoid ambiguous phrases like "is a" and favor more precise phrases like "instance of" and "subclass of".
that it is *already* what the consensus on Wikidata is
That "consensus" directly conflicts with other consensuses which have established that chemical compounds, diseases, and genes should use *subclass of* instead of *instance of*. Wikidata should not be a disjointed patchwork of knowledge fiefdoms where each community has its own insular, incompatible usage of *subclass of* and *instance of*. This problem is especially acute in chemistry on Wikidata, where chemical elements use "*instance of* chemical element" even though it has been established that chemical compounds should not use "*instance of* chemical compound" [4].
More importantly, having "*instance of* chemical element" and (transitively) "*subclass of* chemical element" in items as we do now is ontologically incorrect. Hydrogen (Q556) has an example of such modeling. That state of affairs has been widely recognized as an error in other discussions, e.g. the "Item both instance and subclass" thread that Markus and Denny chimed in on in September 2014 [5].
Eric
1. http://www.w3.org/TR/owl2-new-features/#F12:_Punning
2. http://www.comlab.ox.ac.uk/people/boris.motik/pubs/motik07metamodeling-journ...
3. http://www.obofoundry.org/
4. https://lists.wikimedia.org/pipermail/wikidata-l/2014-October/004695.html
5. https://lists.wikimedia.org/pipermail/wikidata-l/2014-September/004650.html
On Sun, Oct 18, 2015 at 4:59 PM, Stas Malyshev smalyshev@wikimedia.org wrote:
Hi!
The community-defined meaning of /subclass of/ (P279) is that of rdfs:subClassOf [1]. Similarly, the community-defined meaning of /instance of/ (P31) is that of rdf:type [2, 3].
Are you sure it is always correct? AFAIK there are some specific rules and meanings in OWL that classes should adhere to, also same thing can not be an individual and a class, and others (not completely sure of the whole list, as I don't have enough background in RDF/OWL). But I'm not sure existing data actually follows that.
There are some open problems with how to handle qualifiers on /instance of/ and /subclass of/ in RDF/OWL exports of P31 as rdf:type and P279 as rdfs:subClassOf, but that does not negate the community's decision to tie its two most basic membership properties to those W3C standard properties. In the current RDF/OWL exports that follow the community
I'm not sure I understand how that works in practice. I.e., if we say that P31 *is* rdf:type, then it can't be qualified in RDF/OWL and we can not represent part (albeit small, qualified properties are about 0.2% of all such properties) of our data.
I mean, we can certainly have data sets which include P31 statements from the data translated to rdf:type unless they have qualifiers, and that can be very useful pragmatically, no question about it. But can we really say P31 is the same as rdf:type and use it whenever we choose to represent Wikidata data as RDF? I'm not sure about that.
For example, pizza (https://www.wikidata.org/wiki/Q177) is currently modeled as an instance of food and (transitively) a subclass of food.
Here we have another practical issue - if we adhere to the strict notion that pizza is only a subclass, then we would practically never have any instances in the database for wide categories of things. I.e. since a particular food item is rarely notable enough to be featured in Wikidata, no food would have instances. It may be formally correct but I'm afraid it's not like most people think - for most people, pizza is a food, not a "subclass of food". Same with chemistry - as virtually no actual physical chemical compound (as in "this brown liquid in my test tube I prepared this morning by mixing contents of those three other test tubes") would be notable enough to gain entry in Wikidata, nothing in chemistry would ever be an instance. Theoretically it may be sound, but practically I'm not sure it would work well, even more - that it is *already* what the consensus on Wikidata is.
-- Stas Malyshev smalyshev@wikimedia.org
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
2015-10-19 4:44 GMT+02:00 Emw emw.wiki@gmail.com:
This problem is especially acute in chemistry on Wikidata, where chemical elements use "*instance of* chemical element" even though it has been established that chemical compounds should not use "*instance of* chemical compound" [
As we have discussed many times, it's natural to model "chemical element" as "type of substance" or "type of atom", depending on the definition you take. There is no question, however, that elements are either a subclass of substance or a subclass of atom. It's a natural case of metamodeling, similar to the "car model" example. And no, using instance of has not been proven to be incorrect at all.
On 19.10.2015 04:44, Emw wrote: ...
The phrase "is a" is in no way mutually exclusive with "subclass of". "Is a" is ambiguous -- it can mean the subject is either a class or an instance. In other words, "is a" can mean either /instance of/ (P31) or /subclass of/ (P279).
Indeed. Alas, some languages on Wikidata use the literal translation of "is a" as a label for "instance of". German is among them.
Markus
If anyone argues that pizza is an <instance of: food>, just ask them what toppings it has.
They will immediately acknowledge that pizza is a class of similar but distinct items. Just think about what statements you can make about an item and it will usually guide you quickly to what type of item you have.
Similarly the "Diary of Anne Frank" is an instance of a memoir or a literary work but is a subclass of book (because there are lots of physical books with that name). Literary works have authors and publishers. Books have numbers of pages and printers and physical locations.
This brings us to 'Chicago style deep dish pepperoni pizza'. Instance or class? The answer is that it is an instance of a 'recipe', with ingredients, cooking methods etc., and it is a subclass of pizza. There may even be a case for going to the four-level FRBR model that librarians developed for books:
* Work - a literary work, with properties like author and language
* Expression - an edition of this work, with a publisher, publication date, translator etc.
* Manifestation - is this the hardback, paperback or ebook version of an expression?
* Item - a physical book, with a purchase date, shelf position etc.
In practice Wikidata uses "edition" to mean both Expression and Manifestation, and we often combine the edition and the work in one item, especially where there is only one edition of a work.
Translated into pizza:
* Work = recipe: American Hot Italian style pepperoni pizza
* Expression = "Pizza Express" American Hot pizza
* Manifestation = did you get the eat-in-the-restaurant version or the take-home-from-the-supermarket version?
* Item = the pizza I ate last night.
OK?
Joe
Joe... you just HAD to mention FRBR !!! argh.... let it die, let it die, let it die....lololol :)
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Hi!
Similarly the "Diary of Anne Frank" is an instance of a memoir or a literary work but is a subclass of book (because there are lots of physical books with that name). Literary works have authors and publishers. Books have numbers of pages and printers and physical locations.
I'm not sure I understand this. What is the difference between "instance of memoir" and "subclass of book"? You could literally argue with the same words that it is also "subclass of memoir" and again since very rarely any specific physical book is notable enough (maybe excluding things like first Gutenberg Bible, etc.) we would have virtually no instances of book at all. I do not think people think that way - if you ask somebody is "Diary of Anne Frank" an example of a book or a class I think most people would say it's an example of a book and not a class. Unless we plan to seek out and record every printed physical copy of that book, I don't see any practical reason to describe it as a class. This class - and hundreds of thousands of other book titles, maybe with rare exceptions of the Gutenberg Bible, etc. - would never have any instances. So my question is - what is the use of modeling something as a class if there won't be ever any instances of the class modeled?
Consistency and well-foundedness. It's actually pretty comfortable that the basic objects we're classifying are concrete things. The Diary of Anne Frank as a Work is more of an abstract object. This ensures that we are always trying to classify concrete objects at the base of the classification system. Classes and higher-order classes are then always at some level: one for classes, as they are collections of concrete objects or events; two for car models or chemical elements, as they are classes of classes of objects or events; and so on.
This avoids having to ask questions like "will my class ever have instances?" or "oh shit, we do have articles on the concrete objects after all, what do we do?"
Let's get the basic principles we use right the first time. https://www.wikidata.org/wiki/Help:Classification simplifies things overall because we use a regular and consistent scheme across the vast range of domains that Wikidata is a database for. Read the "Metaclass" article and follow the sources it cites; you'll see that this is actually a widely accepted scheme. For good reason(s) :)
Hi Peter,
The community-defined meaning of subclass of (P279) is that of rdfs:subClassOf [1]. Similarly, the community-defined meaning of instance of (P31) is that of rdf:type [2, 3].
That's encouraging. I do note that there was quite a bit of discussion on these two properties. I am assuming that this has all died down and that the meaning of these two properties is now stable.
The meaning of these two properties is not defined in http://www.w3.org/TR/rdf-schema but instead in RDF 1.1 Semantics [http://www.w3.org/TR/rdf-semantics]. There, you will find a full formal definition of the RDFS meaning of instance of and subclass of.
This definition states that objects that are instances of a class are instances of superclasses of the class. Do Wikidata tools show these implied instance relationships? If not, then there is something decidedly lacking.
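To make the implied relationships concrete, here is a minimal sketch of that entailment: an item is an instance of every class it is asserted to belong to, plus all superclasses of those classes. The data and the function name `implied_types` are hypothetical, and this is not a description of how any existing Wikidata tool works.

```python
def implied_types(item, instance_of, subclass_of):
    """Asserted classes of item plus every superclass of those classes."""
    result = set()
    stack = list(instance_of.get(item, ()))
    while stack:
        cls = stack.pop()
        if cls not in result:  # also guards against cyclic hierarchies
            result.add(cls)
            stack.extend(subclass_of.get(cls, ()))
    return result


# Hypothetical toy data; labels stand in for item identifiers.
instance_of = {"a particular case of throat cancer": {"throat cancer"}}
subclass_of = {"throat cancer": {"cancer"}, "cancer": {"disease"}}

print(sorted(implied_types("a particular case of throat cancer",
                           instance_of, subclass_of)))
```

Only the membership in throat cancer is asserted; the memberships in cancer and disease are derived.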
There are some open problems with how to handle qualifiers on instance of and subclass of in RDF/OWL exports of P31 as rdf:type and P279 as rdfs:subClassOf, but that does not negate the community's decision to tie its two most basic membership properties to those W3C standard properties. In the current RDF/OWL exports that follow the community interpretation of P31 and P279, e.g. wikidata-taxonomy.nt.gz and wikidata-instances.nt.gz in [4], statements that have qualifiers on either of those properties are simply omitted.
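The omission rule mentioned above could look roughly like this. It is a sketch only: the `Statement` structure, the function name, and the sample data are assumptions made for illustration, not the actual code or data model behind those export files.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Hypothetical, simplified stand-in for a Wikidata statement."""
    subject: str
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


# Community interpretation of the two basic membership properties.
MEMBERSHIP = {"P31": "rdf:type", "P279": "rdfs:subClassOf"}


def export_simple_triples(statements):
    """Map unqualified P31/P279 statements to triples; skip everything else."""
    triples = []
    for st in statements:
        predicate = MEMBERSHIP.get(st.prop)
        if predicate is None or st.qualifiers:
            continue  # not a membership property, or qualified: omitted
        triples.append((st.subject, predicate, st.value))
    return triples


statements = [
    Statement("pizza", "P279", "food"),
    Statement("human", "P31", "taxon", qualifiers={"hypothetical qualifier": "x"}),
    Statement("pizza", "some other property", "Italian cuisine"),
]

print(export_simple_triples(statements))
```

The qualified statement is dropped entirely, which is exactly the information loss being discussed in this thread.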
The treatment of qualified subclass of and instance of is not addressed in RDF 1.1 Semantics, and is independent of how to export Wikidata information as RDF or OWL. Is there a theory of how qualifiers (and other aspects of Wikidata) are supposed to interact with these two properties in Wikidata? If not, how can there be a true community understanding of these two properties?
The community's definition of disease is less established. However, there is consensus that diseases like cancer (Q12078) and malaria (Q12156) are classes. An instance of disease would be a particular case of a disease, i.e. a particular case of an abnormal condition in a particular organism. For example, it would be the particular case of throat cancer that caused U.S. President Ulysses S. Grant to die, as reflected in the Wikidata statement "Ulysses S. Grant cause of death throat cancer" [5].
That's fine and this modelling methodology does have its advantage. Having this written down somewhere that is easy to find would be helpful.
Wikidata has no items on actual instances of disease to my knowledge -- although it does have at least one item about an instance of a symptom [6]. That of course does not mean that such instances of disease do not exist or that they could not theoretically be modeled in some local Wikibase installation (e.g. in a physician's office or a hospital) that uses Wikidata vocabulary to track actual instances of disease, e.g. a particular case of pancreatic cancer in a patient.
If you have questions or concerns regarding how diseases are modeled, I would recommend contacting Wikidata editor and disease ontologist Elvira Mitraka (Emitraka) [7], as well as WikiProject Medicine [8] or WikiProject Molecular Biology [9].
Thanks.
Regarding how outsiders can become aware of modeling methodology, I recommend reading https://www.wikidata.org/wiki/Help:Basic_membership_properties and engaging with particular domain modeling groups on Wikidata, e.g. the wikiprojects mentioned above. This mailing list and Wikidata Project Chat [10] are also good places to ask questions.
I have indeed asked the question on this mailing list and received quite some useful responses. Is there a way of getting from the appropriate Wikidata pages to the domain modelling groups without having to ask? It seems to me that this would be helpful.
Finally, regarding your question "Is Wikidata uniform in applying this methodology?", the answer is no. Wikidata's use of subclass of and instance of varies among (and sometimes within) different domains of knowledge like human occupations, creative work genres, cuisines, and sports. The basic difference in usage among those domains is using instance of where others would use subclass of.
I was unable to find this information for diseases without using my expertise in this area. Is there some easier way of determining which approach is being used in which domain?
For example, pizza (https://www.wikidata.org/wiki/Q177) is currently modeled as an instance of food and (transitively) a subclass of food. Problematic indeed!
Well, not really problematic per se, but a modelling methodology that can easily lead to incorrect determinations. In essence, food is the union of those things that we might eat (the pizza I ate in Bethlehem last week) and categories of these things (bad pizza from a hole-in-the-wall restaurant).
Disease modeling achieves the same goal of easy queryability by making statements like "malaria subclass of disease" and "malaria subclass of parasitic protozoa infectious disease" [11], where the latter value transitively resolves to disease. This is not only rather redundant, but also makes the subclass of hierarchy cyclic and thus not a directed acyclic graph (DAG) due to the situation you note in the item about disease itself. But at least it avoids the more severe problem of being ontologically incorrect as seen in the item on pizza -- and all chemical elements, e.g. hydrogen (Q556) [12].
This modelling methodology does not fall prey to the above problem, but is not itself problem-free. Does the redundant information have any impact? If not, then why include it? If so, then everyone needs to know about it.
I don't see any problems with cyclic hierarchies per se, but stating cyclic hierarchies is often a signal of a modelling error. In the disease domain it appears that the only cyclicity is from disease to itself. This appears to be a part of the modelling methodology, but is this stated anywhere?
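Since a stated cycle is often a signal worth surfacing, a checker could look for one in the subclass-of graph. The following depth-first sketch is one way to do that. The hierarchy is a simplified, hypothetical rendering of the disease example (the intermediate class is linked directly to disease here), and `find_cycle` is an illustrative name.

```python
def find_cycle(subclass_of):
    """Return one cycle as a list of classes, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for parent in subclass_of.get(node, ()):
            state = color.get(parent, WHITE)
            if state == GREY:  # parent is on the current path: cycle found
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in list(subclass_of):
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


# Simplified, hypothetical version of the hierarchy discussed above.
hierarchy = {
    "malaria": ["parasitic protozoa infectious disease", "disease"],
    "parasitic protozoa infectious disease": ["disease"],
    "disease": ["disease"],
}

print(find_cycle(hierarchy))
```

On this toy hierarchy the only cycle reported is the self-loop on disease.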
As has been stated elsewhere, it would be better to have a higher-order class or some other signal that this modelling methodology is in use for disease and its subclasses.
Regards, Eric https://www.wikidata.org/wiki/User:Emw
1. http://www.w3.org/TR/rdf-schema/#ch_subclassof
2. http://www.w3.org/TR/rdf-schema/#ch_type
3. is a -> instance of. https://www.wikidata.org/w/index.php?title=Property_talk:P31&oldid=25407...
4. http://tools.wmflabs.org/wikidata-exports/rdf/index.php?content=dump_downloa...
5. Ulysses S. Grant: cause of death. https://www.wikidata.org/wiki/Q34836#P509
6. George H. W. Bush vomiting incident. https://www.wikidata.org/wiki/Q5540112
7. https://www.wikidata.org/wiki/User:Emitraka
8. WikiProject Medicine on Wikidata. https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Medicine
9. WikiProject Molecular Biology on Wikidata. https://www.wikidata.org/wiki/Wikidata:WikiProject_Molecular_biology
10. https://www.wikidata.org/wiki/Wikidata:Project_chat
11. Malaria: subclass of. https://www.wikidata.org/w/index.php?title=Q12156&oldid=259072228#P279
Hi Peter,
Wikidata is thought of as a knowledge base by not only you and me, but also Denny Vrandecic and Markus Kroetzsch [1] and many others.
I said that one of Wikidata's goals is "structuring the sum of all human knowledge" to suggest how Wikidata fits into the vision set out for Wikipedia by Jimmy Wales: "Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That's what we're doing."[2] One could certainly nitpick both of those statements -- the projects will never have information on how much coffee I have left in my mug as I write this, which is certainly human knowledge -- but neither statement is meant to be especially precise. They are statements of ambitious vision.
To answer your questions: no, I don't recall getting any pushback from the workshop audience on any of the subjects you asked about.
Best, Eric https://www.wikidata.org/wiki/User:Emw
1. Denny Vrandecic, Markus Kroetzsch (2014). "Wikidata: A Free Collaborative Knowledgebase". Communications of the ACM. http://cacm.acm.org/magazines/2014/10/178785-wikidata/fulltext
2. Jimmy Wales (2004). "Wikipedia Founder Jimmy Wales Responds". Slashdot. http://slashdot.org/story/04/07/28/1351230/wikipedia-founder-jimmy-wales-res...
On Fri, Oct 16, 2015 at 1:14 PM, Peter F. Patel-Schneider < pfpschneider@gmail.com> wrote:
It's very pleasant to hear from someone else who thinks of Wikidata as a knowledge base (or at least hopes that Wikidata can be considered as a knowledge base). Did you get any pushback on this or on your stated Wikidata goal of structuring the sum of all human knowledge?
Did you get any pushback on your section on classification in Wikidata? It seems to me that some of that is rather controversial in the Wikidata community. I was a bit surprised to see class reasoning used on diseases. This depends on a particular modelling methodology.
peter