Hello all,
I’m very happy to announce that another important feature for Lexicographical Data https://www.wikidata.org/wiki/Wikidata:Lexicographical_data has been deployed: the ability to *query Lexemes in the Query Service*.
Here are a few examples:
- List of the longest words in English https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fl%20%3Fword%20%3Flen%20WHERE%20%7B%0A%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ1860%20%3B%20wikibase%3Alemma%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%20UNION%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ1860%20%3B%20ontolex%3AlexicalForm%2Fontolex%3Arepresentation%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%0A%7D%20%0Aorder%20by%20DESC%28%3Flen%29%20%0ALIMIT%2020, or in German https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fl%20%3Fword%20%3Flen%20WHERE%20%7B%0A%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ188%20%3B%20wikibase%3Alemma%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%20UNION%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ188%20%3B%20ontolex%3AlexicalForm%2Fontolex%3Arepresentation%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%0A%7D%20%0Aorder%20by%20DESC%28%3Flen%29%20%0ALIMIT%2020
- Graph of all Lexemes https://query.wikidata.org/#%23defaultView%3AGraph%0ASELECT%20%3Flexeme%20%3FlexemeLabel%20%3Ftarget%20%3FtargetLabel%20WHERE%20%7B%0A%20%20%3Flexeme%20wdt%3AP5191%20%3Ftarget%3B%20wikibase%3Alemma%20%3FlexemeLabel.%0A%20%20%3Ftarget%20wdt%3AP5191*%20wd%3AL2087%3B%20wikibase%3Alemma%20%3FtargetLabel.%0A%7D derived from *wódr̥ (L2087) https://www.wikidata.org/wiki/Lexeme:L2087
- Grammatical genders that are most used in lexicographical data in Wikidata https://query.wikidata.org/#%23%20most%20common%20grammatical%20genders%0ASELECT%20%3Fgender%20%3FgenderLabel%20%3Fcount%20WITH%20%7B%0A%20%20SELECT%20%3Fgender%20%28COUNT%28%3Flexeme%29%20AS%20%3Fcount%29%20WHERE%20%7B%0A%20%20%20%20%3Flexeme%20a%20ontolex%3ALexicalEntry%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP5185%20%3Fgender.%0A%20%20%7D%0A%20%20GROUP%20BY%20%3Fgender%0A%7D%20AS%20%25results%20WHERE%20%7B%0A%20%20INCLUDE%20%25results.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D%0AORDER%20BY%20DESC%28%3Fcount%29
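The links above are hard to read because the whole SPARQL query is percent-encoded after the `#`. As a rough sketch, the Python below spells out the "longest words" query in readable form and shows how such a link can be built or decoded. The helper names (`longest_words_query`, `query_link`, `sparql_from_link`) are made up for this illustration and are not part of any Wikidata tool. The query text is transcribed from the English/German links with whitespace and keyword case tidied, so the generated link will not be byte-identical to the ones above.

```python
from urllib.parse import quote, unquote

# Base of the Query Service links used in this message; the query follows '#'.
QUERY_UI = "https://query.wikidata.org/#"


def longest_words_query(language_qid: str, limit: int = 20) -> str:
    """Return the 'longest words' query for one language item.

    language_qid is the Wikidata item of the language, e.g. "Q1860"
    (English) or "Q188" (German), the only part that differs between
    the two links above.
    """
    return f"""SELECT DISTINCT ?l ?word ?len WHERE {{
  {{
    ?l a ontolex:LexicalEntry ; dct:language wd:{language_qid} ; wikibase:lemma ?word
    BIND(strlen(?word) AS ?len)
  }} UNION {{
    ?l a ontolex:LexicalEntry ; dct:language wd:{language_qid} ; ontolex:lexicalForm/ontolex:representation ?word
    BIND(strlen(?word) AS ?len)
  }}
}}
ORDER BY DESC(?len)
LIMIT {limit}"""


def query_link(sparql: str) -> str:
    """Percent-encode a query and append it to the Query Service URL."""
    return QUERY_UI + quote(sparql, safe="")


def sparql_from_link(link: str) -> str:
    """Recover the readable query from a Query Service link."""
    _, _, fragment = link.partition("#")
    return unquote(fragment)


if __name__ == "__main__":
    german = longest_words_query("Q188")
    print(german)
    print(query_link(german))
```

Passing any of the links above to `sparql_from_link` should give back the query in readable form, which may be a convenient starting point for writing your own.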
The queries are based on the RDF mapping that you can find here https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/RDF_mapping. Feel free to help improve the documentation, so that people can understand how to build queries on Lexemes.
Thank you very much to Tpt https://www.wikidata.org/wiki/User:Tpt, who did a huge part of the work by mapping Lexemes to RDF, and to Smalyshev (WMF) https://www.wikidata.org/wiki/User:Smalyshev_(WMF), who made the RDF dumps available and integrated them into the Query Service.
Feel free to play with it, bring some of these query ideas https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Ideas_of_queries to life, and let us know if you find any issues or bugs. These can be filed as subtasks of this task https://phabricator.wikimedia.org/T193645 on Phabricator.
If you have questions about Lexicographical Data in general, feel free to write on the talk page of the project https://www.wikidata.org/wiki/Wikidata_talk:Lexicographical_data. If you have specific questions about the integration in the Query Service, you can also ping Stas on wiki or on IRC.
Cheers,