Dear Sir,

I thank you for your efforts. It was an honour for me to meet you in Stockholm. Concerning the points you raised:

1. Concerning the problematic use of Instance of and SubClass of, I invite you to see https://tinyurl.com/y29lx9o4. This does not seem accurate, as drugs differ from drug classes from a pharmacological point of view. For example, Artemisinin, Ibuprofen and Sofosbuvir are drugs, whereas drug classes should include antibiotics and antivirals. Concerning WikiProject Ontology, I know it. The project has succeeded in solving many matters related to the structure of Wikidata. However, the project is slow.
2. Concerning SQID and ShEx, they are interesting projects. However, this is not what I mean. I ask if Wikidata can include a description logics layer for properties that can later be used to find data inconsistencies within Wikidata.
3. The principle of enriching Wikidata with labels, statements and statement references is simple when using citation indexes such as PubMed. To add a reference to a Wikidata statement X(A,B), we only need to search PubMed for (X AND A AND B) and take the most relevant result.
4. We should either change the licence from CC-0 to CC-BY or try to convince the creators of ontologies to relicense their works under CC-0.
5. Using a constraint is an excellent idea. I ask if sensitivity and specificity can be used as constraints (https://en.m.wikipedia.org/wiki/Sensitivity_and_specificity). The difficulty with these two metrics is that they can vary from one study to another.
6. Concerning CC-0 Lexicons, I can cite Onto.PT https://www.researchgate.net/publication/259935306_OntoPT_Recent_development....
7. I think that there should be an ontology for the stages of diseases.
8. Reasonator is based on POS tagging of Wikipedia (https://tools.wmflabs.org/reasonator/?&q=1339&lang=en). However, this is not important. The most important question is how to use Wikidata labels to improve translations of MediaWiki messages.
9. I agree that the creation of new properties can solve this problem.

Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
Undergraduate Researcher, UR12SP36
GLAM and Education Coordinator, Wikimedia TN User Group
Member, WikiResearch Tunisia
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
Co-Founder, WikiLingua Maghreb
Founder, TunSci
____________________
+21629499418
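Editorial sketch for point 3 of the message above: the search (X AND A AND B) for a statement X(A,B) could be assembled roughly as follows. This is only an illustration under stated assumptions. The function names and the example labels are hypothetical, the endpoint is the NCBI E-utilities `esearch` service, and the `retmax` and `sort` parameters should be checked against the E-utilities documentation before use. The code only builds the query and URL; it does not contact PubMed.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_query(prop_label: str, subject_label: str, object_label: str) -> str:
    """Build the boolean query (X AND A AND B) for a statement X(A, B)."""
    terms = [prop_label, subject_label, object_label]
    # Quote each label so that multi-word labels are searched as phrases.
    return " AND ".join(f'"{term}"' for term in terms)


def esearch_url(query: str, max_results: int = 1) -> str:
    """Return an esearch URL requesting the top hits (parameters are assumptions)."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "sort": "relevance",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical labels for a statement X(A, B); not taken from Wikidata.
query = pubmed_query("drug used for treatment", "malaria", "artemisinin")
url = esearch_url(query)
```

The most relevant hit returned for such a query would still need human review before being attached as a reference, since a co-occurrence of the three labels does not by itself show that the article supports the statement.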
-------- Original message -------- From: fn@imm.dtu.dk Date: 2019/08/21 15:49 (GMT+01:00) To: wikidata@lists.wikimedia.org Subject: Re: [Wikidata] Important, Critical issues related to Wikidata
On 21/08/2019 13:20, Houcemeddine A. Turki wrote:
Dear Ms., I thank you for your efforts. I tried to contact Ms. Léa Lacroix and Ms. Lydia. However, I failed because I was taking part in many sessions. I saw Ms. Lydia when returning to the hotel on the third day, but I had to leave for the old town. I had nine points to raise:
- Instance of and Subclass of are not well defined for users although they are quite different. I ask if these two properties can be merged as is-a. This will be easier to process by users and developers.
I would say in most Semantic Web applications there is a distinction between class and instance. I wonder if you can give examples of the problem. I see that diseases are instances of diseases, which may be contested. Are you aware of WikiProject Ontology?
https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology
- There is a large misuse of Wikidata properties. I ask if it will be possible to add description logics to Wikidata properties so that such matters will not happen.
We have property suggestions and ShEx, which might help. There is also the SQID web service https://tools.wmflabs.org/sqid which is of help.
- There is a lack of several labels, statements and references in Wikidata. I ask if it will be interesting for Wikimedia Deutschland to work with us on using citation indexes to enrich and verify the information provided by Wikidata.
I wonder if you could be more concrete here.
It sounds more like an editor issue?
- Legal issues related to data integration into Wikidata still exist. There was a session at Wikimania about this. However, this should be improved.
In which way should it be improved? Should we leave CC0?
- Some statements are more important than others. For example, medical signs that rarely occur or that are not specific should be given less weight than important signs. Using fuzzy logic is important here.
That has also bothered me. To a certain extent qualifiers may be used.
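Editorial sketch: one way to make the weight of a sign concrete is through the sensitivity and specificity mentioned in point 5 of the reply at the top of this thread. The example below computes both metrics from a study's confusion counts and attaches them to a statement as qualifier-like values. The counts, labels and dictionary layout are hypothetical, and whether Wikidata should store such values as qualifiers is an assumption, not an existing convention.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of diseased cases in which the sign is present: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no diseased cases in the study")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of healthy cases in which the sign is absent: TN / (TN + FP)."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no healthy cases in the study")
    return true_neg / total


# Hypothetical counts from a single study; other studies may report other values.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=160, false_pos=40)

# Hypothetical statement with the metrics stored as qualifier-like values.
sign_statement = {
    "subject": "disease D",
    "property": "symptoms and signs",
    "value": "sign S",
    "qualifiers": {"sensitivity": sens, "specificity": spec},
}
```

Because the two metrics vary from one study to another, each pair of values would need its own reference to the study it was taken from.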
- Several CC-0 Online Lexicons already exist. I ask if there is a possibility to speed up integrating them to LexData.
Could you give some pointers to this? For modern Danish, I know Q66371001 which is CC0, but not much more. I am under the impression that Basque has been set up automatically, presumably from an existing resource.
- Wikidata property constraints should be improved. For example, medical signs of a disease should be associated with the corresponding statuses of the disease.
I suppose one could create items for stages.
- Wikidata Labels can be used to translate MediaWiki messages using the principle of Reasonator.
Perhaps to a certain extent. Currently translatewiki is used. That has a review option. I am not sure that Wikidata can be used to translate messages such as "Write a helpful note for {{GENDER:$1|$1}} and future reviewers."
I am not sure what "the principle of Reasonator" is.
- In a Wikidata statement, the object can be a triple. For example, X Known for (X Co-founder of Y). I ask if this fact can become supported by Wikidata.
Qualifiers to a certain extent make that possible. Perhaps new properties are necessary.
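Editorial sketch of how a qualifier can approximate the nested statement X Known for (X Co-founder of Y). The `Statement` class, the property labels and the qualifier name `role` are hypothetical and serve only to illustrate the idea; they are not Wikidata's data model or actual property names.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Statement:
    """A simplified statement: subject, property, value, plus qualifiers."""
    subject: str
    prop: str
    value: str
    qualifiers: Dict[str, str] = field(default_factory=dict)


def as_text(statement: Statement) -> str:
    """Render a statement and its qualifiers as one readable line."""
    base = f"{statement.subject} {statement.prop} {statement.value}"
    if not statement.qualifiers:
        return base
    details = ", ".join(f"{key}: {val}" for key, val in statement.qualifiers.items())
    return f"{base} ({details})"


# Instead of using a whole triple as the object, the main value is Y and
# the relation between X and Y is carried by a qualifier.
known_for = Statement(
    subject="X",
    prop="known for",
    value="Y",
    qualifiers={"role": "co-founder"},
)
text = as_text(known_for)
```

The qualifier keeps the object a single item, which is why it only works "to a certain extent": a qualifier cannot itself carry further qualifiers or references of its own.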
best regards Finn
I ask if we can have an online meeting to discuss these nine points. Yours Sincerely, Houcemeddine Turki
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
On Wed, Aug 21, 2019 at 8:44 PM Houcemeddine A. Turki turkiabdelwaheb@hotmail.fr wrote:
I thank you for your efforts. It was an honour for me to meet you in Stockholm. Concerning the points you raised:
- Concerning the problematic use of Instance of and SubClass of, I invite you to see https://tinyurl.com/y29lx9o4. This does not seem accurate, as drugs differ from drug classes from a pharmacological point of view. For example, Artemisinin, Ibuprofen and Sofosbuvir are drugs, whereas drug classes should include antibiotics and antivirals. Concerning WikiProject Ontology, I know it. The project has succeeded in solving many matters related to the structure of Wikidata. However, the project is slow.
To me this is actually a nice example of *why* we need the distinction.
Sofosbuvir as an instance is often the active ingredient. Sofosbuvir as a class is often the formulation, which is not a single entity, but a class of formulations with different amounts (and concentrations) of active ingredient.
It's just that in street/common language (not wrong, just different!) we don't use different words for it. Instance-vs-subclass are tools that help us make this distinction.
Now, related to this, plz have a look at https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry where class and instance are both used, for good reasons.
Egon
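Editorial sketch of the distinction discussed in the message above, using a toy graph. P31 ("instance of") and P279 ("subclass of") are the Wikidata property IDs; the items and the way they are modelled here are illustrative assumptions only, not Wikidata's actual data, and the function names are hypothetical.

```python
# Toy graph: P31 = "instance of", P279 = "subclass of".
# The modelling below is an assumption made for illustration only.
P31 = {"sofosbuvir": {"antiviral drug"}}
P279 = {"antiviral drug": {"drug"}}


def superclasses(cls: str) -> set:
    """Return cls plus every class reachable from it through P279."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(P279.get(current, ()))
    return seen


def is_instance_of(item: str, cls: str) -> bool:
    """True if item has a P31 class that is cls or a subclass of cls."""
    return any(cls in superclasses(direct) for direct in P31.get(item, ()))


def is_subclass_of(sub: str, sup: str) -> bool:
    """True if sup is reachable from sub through one or more P279 links."""
    return sub != sup and sup in superclasses(sub)


def is_a(x: str, y: str) -> bool:
    """The merged relation proposed in the thread: instance or subclass."""
    return is_instance_of(x, y) or is_subclass_of(x, y)
```

With two properties, `is_instance_of("sofosbuvir", "drug")` and `is_subclass_of("antiviral drug", "drug")` answer different questions. With the merged `is_a`, both "sofosbuvir" and "antiviral drug" simply are drugs, and a query can no longer tell an individual drug from a drug class.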
- I ask if Wikidata can include a description logics layer for properties that can later be used to find data inconsistencies within Wikidata.
This is exactly one of the use cases we use ShEx for, i.e. inconsistency or disagreement between resources. Wikidata does not reflect the truth, but (partly) reflects what other resources describe on a selected matter. There are many examples in Wikidata where different items make opposing statements. This is fine. Any issue known on Wikidata, provided the references are correct and the values are not distorted in the ingestion process, basically needs to be discussed with the primary source, not with Wikidata. ShEx provides the instrument to identify items relevant to your view, versus items disagreeing with your view on the matter.
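Editorial sketch of the idea of splitting items into those that agree with a view and those that do not. This is plain Python, not ShEx syntax and not a ShEx implementation: the "shape" is just a dictionary of checks, and all item names, property labels and values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, List[str]]                 # property label -> list of values
Shape = Dict[str, Callable[[List[str]], bool]]

# A view expressed as checks: the item is a drug and has exactly one formula.
drug_shape: Shape = {
    "instance of": lambda values: "drug" in values,
    "chemical formula": lambda values: len(values) == 1,
}

# Hypothetical items; "item B" carries two disagreeing formula values.
items: Dict[str, Item] = {
    "item A": {"instance of": ["drug"], "chemical formula": ["F1"]},
    "item B": {"instance of": ["drug"], "chemical formula": ["F1", "F2"]},
    "item C": {"instance of": ["disease"]},
}


def conforms(item: Item, shape: Shape) -> bool:
    """True if every check in the shape passes; missing properties count as empty."""
    return all(check(item.get(prop, [])) for prop, check in shape.items())


def partition(data: Dict[str, Item], shape: Shape) -> Tuple[List[str], List[str]]:
    """Split item names into (conforming, non-conforming) for the given shape."""
    agreeing = [name for name, item in data.items() if conforms(item, shape)]
    disagreeing = [name for name, item in data.items() if not conforms(item, shape)]
    return agreeing, disagreeing


agreeing, disagreeing = partition(items, drug_shape)
```

As the message above points out, an item landing in the non-conforming list is not necessarily wrong; it may faithfully report what its sources say, in which case the disagreement belongs to the sources.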
4. We should either change the licence from CC-0 to CC-BY or try to convince the creators of ontologies to relicense their works under CC-0.
I am excited you are joining us in our quest for more CC0 data. More and more data is adopting the CC0 license. I really like the following blog on the matter https://osc.universityofcalifornia.edu/2016/09/cc-by-and-data-not-always-a-g...
7. I think that there should be an ontology for the stages of diseases.
Wikidata is not a primary source. Point me to an ontology of stages of disease and I will add it to Wikidata.