[Wiktionary-l] Fwd: [Wikidata] First experiment of lexicographical data is out

23 May 2018

      -------- Forwarded Message --------
Subject: 	[Wikidata] First experiment of lexicographical data is out
Date: 	Wed, 23 May 2018 14:33:39 +0200
From: 	Léa Lacroix lea.lacroix@wikimedia.de
Reply-To: 	Discussion list for the Wikidata project wikidata@lists.wikimedia.org
To: 	Discussion list for the Wikidata project wikidata@lists.wikimedia.org

Hello all,
After several years of discussing it, and one year of development and 
discussion with the communities, the development team has now released 
the first version of lexicographical data support on Wikidata 
https://www.wikidata.org/wiki/Wikidata:Lexicographical_data.
Since the start of Wikidata in 2012, the multilingual knowledge base was 
mainly focused on concepts: Q-items are related to a thing or an idea, 
not to the word describing it. Starting now, Wikidata stores a new type 
of data: words, phrases and sentences, in many languages, described in 
many languages. This information will be stored in new types of 
entities, called Lexemes, Forms and Senses. It will allow editors to 
describe precisely all words in all languages. Like all other content 
in Wikidata, it will be reusable by tools, queries, and anything else 
that the community creates to play with words. 
Lexicographical data can be reused inside and outside the Wikimedia 
projects, and can provide support for Wiktionary.
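
As a rough illustration of how the three entity types relate to each other, here is a minimal Python sketch. It is not the actual Wikibase data model or its API: the class names mirror the entity names in this announcement, but every field name (lemma, language, lexical_category, representation, grammatical_features, gloss) and the sample data are assumptions made for illustration. The documentation linked in this message remains the authoritative description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Form:
    """One concrete spelling of a word, e.g. a plural (hypothetical fields)."""
    representation: str
    grammatical_features: List[str] = field(default_factory=list)


@dataclass
class Sense:
    """One meaning of a word, described by a short gloss (hypothetical fields)."""
    gloss: str


@dataclass
class Lexeme:
    """A word or phrase in one language, grouping its Forms and Senses."""
    lemma: str
    language: str
    lexical_category: str
    forms: List[Form] = field(default_factory=list)
    senses: List[Sense] = field(default_factory=list)

    def representations(self) -> List[str]:
        """Return the spellings of all Forms, in insertion order."""
        return [f.representation for f in self.forms]


# Illustrative data only; not copied from Wikidata.
book = Lexeme(lemma="book", language="English", lexical_category="noun")
book.forms.append(Form("book", ["singular"]))
book.forms.append(Form("books", ["plural"]))
book.senses.append(Sense("a written work made of bound pages"))

print(book.representations())
```

The sketch keeps Forms and Senses as lists owned by one Lexeme, which is only one possible way to picture the grouping.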

The first release

A new namespace and several new entity types have been created in order 
to model words and phrases. If you’re new to this project, you can learn 
more by looking at the documentation 
https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Documentation, 
briefly describing the data model and the interface. The technical 
structure is set, but the editors remain free to model and organize data 
as they prefer, with the usual open discussions and community processes 
that we apply on Wikidata. Some discussions about new properties to 
create https://www.wikidata.org/wiki/Wikidata:Property_proposal/Lexemes 
have already started. If you want to be involved in the early stage of 
the project and help shape it, please participate!
Please note that the version now deployed is a first experiment that 
will be continuously improved in the future. Some features are missing, 
and bugs are likely to occur. Here are the features that are included 
in the first release:
   * Add, edit and delete Lexemes, Forms, statements, qualifiers, references
   * Link between the different entity types (Item to Lexeme, Form to
     Item, etc.)
   * Entity suggestion when adding a property or a value
And the following features will not be included in the first version, 
but are planned for the future:
   * Find Lexemes and Forms via Special:Search
   * RDF support (which also means: the ability to query it with
     http://query.wikidata.org)
   * Support for Senses
   * Merging of Lexemes
   * Including the data on other Wikimedia projects, such as Wiktionary

How to try it?

The features described above are now deployed on Wikidata.org. Here are 
some suggestions of what you can do to explore this new territory:
   * If you’re not familiar with the structure of Lexemes, have a look at
     the documentation
     https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Documentation
   * Look at what already exists
     https://www.wikidata.org/wiki/Special:AllPages?from=&to=&namespace=146.
     Please note that Special:Search and the search bar in the top right
     corner of pages do not support Lexemes yet. We’re working on this.
   * Create a new Lexeme with Special:NewLexeme
     https://www.wikidata.org/wiki/Special:NewLexeme
   * If a property that you need is missing, you can suggest it here
     https://www.wikidata.org/wiki/Wikidata:Property_proposal/Lexemes
   * Discuss how to model words and ask questions on Wikidata
     talk:Lexicographical data
   * Report bugs or issues that you may encounter: either on the talk
     page or on Phabricator, if you’re comfortable using it (create a
     task, add the tag "Lexicographical data", and add Lea_Lacroix_(WMDE)
     as a subscriber)
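
For readers who want to reproduce the "what already exists" link by hand, the following small Python sketch rebuilds it from its parts. The namespace number 146 and the query parameters from, to and namespace are taken from the link in the list above; the function name all_pages_url and its argument names are hypothetical and exist only for this example. It only builds a URL for browsing and does not call any API.

```python
from urllib.parse import urlencode

# Base URL and namespace number as they appear in the links of this message.
WIKI_BASE = "https://www.wikidata.org/wiki/"
LEXEME_NAMESPACE = 146


def all_pages_url(start: str = "", end: str = "",
                  namespace: int = LEXEME_NAMESPACE) -> str:
    """Build a Special:AllPages link (hypothetical helper for illustration)."""
    query = urlencode({"from": start, "to": end, "namespace": namespace})
    return f"{WIKI_BASE}Special:AllPages?{query}"


# With the defaults this prints the same link as in the list above.
print(all_pages_url())
```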

About mass imports and tools

We kindly ask you to *not plan any mass import from any source for the 
moment*. There are several reasons behind that. First of all, as 
mentioned above, the release is a first version and we need to observe 
how our system reacts to manual edits before we start considering 
automatic ones. The system may not be ready for massive imports at the 
beginning. The second reason is legal. Lexicographical data in Wikidata 
is released under CC0, and each editor is responsible for making sure 
that the data they add is compatible with CC0. For more information, 
you can have a look at the advice of the WMF Legal team 
https://meta.wikimedia.org/wiki/Wikilegal/Lexicographical_Data. 
Finally, we strongly encourage you to discuss with the communities 
before considering any import from the Wiktionaries. Wiktionary editors 
have put a lot of effort over the years into building definitions. We 
should be respectful of this work and discuss with them to find common 
solutions for working on lexicographical data and enjoying the use of 
it together.
We also suggest that you wait a bit before building tools or scripts on 
top of lexicographical data. The interface and its API are probably 
going to evolve over the next months, and the system may not be stable 
enough to support such tools. We will inform you as soon as it becomes 
possible.

Next steps

After this first release, improvements will be made on a very regular 
basis (new deployments every week). Once you have tried playing with 
the new data, feel free to give us feedback. We especially want to know 
which features are the most important for you, so that we can work on 
them next.
   * What did you experience while editing lexicographical data? What
     went wrong or was unexpected?
   * What bugs or problems did you encounter during the process?
   * Which features are, in your opinion, the most important?
     Which one should we work on next?
If you’re interested in following the discussions and further 
announcements about lexicographical data, I encourage you to follow 
Wikidata:Lexicographical data 
https://www.wikidata.org/wiki/Wikidata:Lexicographical_data and its 
talk page, where we will discuss how to organize and structure data, 
new features to be added, ideas for tools and queries, and a lot of 
other things.
Additional note: with this new kind of data enabled on Wikidata, we 
expect some new editors to take an interest in it, edit Lexemes, suggest 
properties or ask questions. They may not be familiar with all of our 
community processes and our ways to organize content. They will need 
help and support as well as links to useful resources to understand how 
the Wikidata community works. I hope that we will all be kind and 
patient, both with other editors and with the software that may not work 
exactly as we want it to at the beginning :)
Thanks to the people who tested the model and the interface before the 
release, and who showed support and curiosity about lexicographical 
data on Wikidata!
If you have any questions or ideas, feel free to write on Wikidata 
talk:Lexicographical data or contact me.
-- 
Léa Lacroix
Project Manager Community Communication for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de http://www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht 
Berlin-Charlottenburg under the number 23855 Nz. Recognized as a 
charitable organization by the Finanzamt für Körperschaften I Berlin, 
tax number 27/029/42207.
