Re: [Wikidata] First experiment of lexicographical data is out

24 May 2018

I was too busy testing yesterday, and I forgot to share my love for your
efforts. :) Today I have another lesson in which I'll talk about Wikidata,
and I'll be sure to say that LD has finally started! Thank you!
L.
On Thu, 24 May 2018, 10:19 Nicolas VIGNERON vigneron.nicolas@gmail.com
wrote:
...
THANK YOU!
I've been expecting this for so long.
@Eran: Q17117425 could make nice lexemes :P
@Micru: even without senses, there are a lot of possible applications (I'd
love to have a spell-checker for Wikisource, no sense needed here).
Cheers,
~nicolas
2018-05-24 10:01 GMT+02:00 David Cuenca Tudela dacuetu@gmail.com:
...
Congratulations on releasing this!
I think the usefulness will improve after releasing Senses, but it is
good that we have something to play with and start suggesting improvements.
Regards,
Micru
On Wed, May 23, 2018 at 11:25 PM, Eran Rosenthal eranroz89@gmail.com
wrote:
...
Kudos to the development team. It's great to see it is finally in
production.
There is still a long way to go to have a more friendly user interface,
but Q17117425 :)
On Wed, May 23, 2018 at 9:12 PM, Gerard Meijssen <
gerard.meijssen@gmail.com> wrote:
...
Hoi,
Sorry, the Q-number refers to a thing or an idea absolutely. The labels
don't; they describe the idea and, with the possible exception of special
cases like templates, lists and categories, these labels can be associated
with lexical information. It is vitally important to have lexical
information about them because it will spark the development of better
use of the information inherently available in the Q-items.
Thanks,
       GerardM
On 23 May 2018 at 14:33, Léa Lacroix lea.lacroix@wikimedia.de wrote:
...
Hello all,
After several years of discussing it, and one year of development
and discussion with the communities, the development team has now released
the first version of lexicographical data support on Wikidata
https://www.wikidata.org/wiki/Wikidata:Lexicographical_data.
Since the start of Wikidata in 2012, the multilingual knowledge base
was mainly focused on concepts: Q-items are related to a thing or an idea,
not to the word describing it. Starting now, Wikidata stores a new type of
data: words, phrases and sentences, in many languages, described in many
languages. This information will be stored in new types of entities, called
Lexemes, Forms and Senses. It will allow editors to describe precisely all
words in all languages, and will be reusable, just like the whole content
of Wikidata, by multiple tools and queries, everything that the community
creates to play with words. Lexicographical data can be reused inside and
outside the Wikimedia projects, and can provide support for Wiktionary.
The first release
A new namespace and several new entity types have been created in
order to model words and phrases. If you’re new to this project, you can
learn more by looking at the documentation
https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Documentation,
briefly describing the data model and the interface. The technical
structure is set, but the editors remain free to model and organize data as
they prefer, with the usual open discussions and community processes that
we apply on Wikidata. Some discussions about new properties
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Lexemes to
create have already started: if you want to be involved in the early stage
of the project to shape it, please participate!
Please note that the version that is now deployed is a first
experiment that will be continuously improved in the future. Some features
are missing, and some bugs will likely occur. Here are the features that are
included in the first release:

- Add, edit and delete Lexemes, Forms, statements, qualifiers, references
- Link between the different entity types (Item to Lexeme, Form to Item, etc.)
- Entity suggestion when adding a property or a value

And the following features will not be included in the first version,
but are planned for the future:

- Find Lexemes and Forms via Special:Search
- RDF support (which also means: the ability to query it with query.wikidata.org)
- Support for Senses
- Merging of Lexemes
- Including the data on other Wikimedia projects, such as Wiktionary

How to try it?
The features described above are now deployed on Wikidata.org. Here
are some suggestions of what you can do to explore this new territory:

- If you’re not familiar with the structure of Lexemes, have a look at the
  documentation
  https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Documentation
- Look at what already exists
  https://www.wikidata.org/wiki/Special:AllPages?from=&to=&namespace=146.
  Please note that Special:Search and the search bar in the top right corner
  of pages do not support Lexemes yet. We’re working on this.
- Create a new Lexeme with Special:NewLexeme
  https://www.wikidata.org/wiki/Special:NewLexeme
- If a property that you need is missing, you can suggest it here
  https://www.wikidata.org/wiki/Wikidata:Property_proposal/Lexemes
- Discuss how to model words and ask questions on Wikidata
  talk:Lexicographical data
- Report bugs or issues that you may encounter: either on the talk page or
  on Phabricator, if you’re comfortable using it (create a task, add the tag
  Lexicographical data, and add Lea_Lacroix_(WMDE) as a subscriber)

About mass imports and tools
We kindly ask you to *not plan any mass import from any source for
the moment*. There are several reasons behind that: first of all,
as mentioned above, the release is a first version and we need to observe
how our system reacts to manual edits before we start considering
automatic ones. The system may not be ready for massive imports at the
beginning. The second reason is legal. Lexicographical data in Wikidata is
released under CC0, and it is the responsibility of each editor to make sure
that the data they add is compatible with CC0. For more information,
you can have a look at the advice of the WMF Legal team
https://meta.wikimedia.org/wiki/Wikilegal/Lexicographical_Data.
Finally, we strongly encourage you to discuss with the communities before
considering any import from the Wiktionaries. Wiktionary editors have been
putting a lot of effort over the years into building definitions, and we
should be respectful of this work, and discuss with them to find common
solutions to work on lexicographical data and enjoy the use of it together.
We also suggest that you wait a bit before building tools or scripts on
top of lexicographical data. The interface and its API are probably
going to evolve during the next months, and the system may not be stable
enough to support such tools. We will inform you as soon as it is
possible.
Next steps
After this first release, some improvements will be made on a very
regular basis (new deployments every week). Once you have tried playing with
the new data, feel free to give us feedback. We would especially like to know
which features are most important for us to work on next.

- What did you experience while editing lexicographical data? What went
  wrong or was unexpected?
- What bugs or troubles did you encounter during the process?
- What are the features that are, in your opinion, the most important?
  Which one should we work on next?

If you’re interested in following the discussions and further
announcements about lexicographical data, I encourage you to follow
Wikidata:Lexicographical data
https://www.wikidata.org/wiki/Wikidata:Lexicographical_data
and its talk page, where we will discuss how to organize and
structure data, new features to be added, ideas for tools and queries, and a
lot of other things.
Additional note: with this new kind of data enabled on Wikidata, we
expect some new editors to take an interest in it, edit Lexemes, suggest
properties or ask questions. They may not be familiar with all of our
community processes and our ways to organize content. They will need help
and support as well as links to useful resources to understand how the
Wikidata community works. I hope that we will all be kind and patient, both
with other editors and with the software that may not work exactly as we
want it to at the beginning :)
Thanks to the people who tested the model and the interface before the
release, who showed support and curiosity about lexicographical data on
Wikidata!
If you have any questions or ideas, feel free to write on Wikidata
talk:Lexicographical data or contact me.
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg (district court) under number 23855 Nz. Recognized as
a charitable organization by the Finanzamt für Körperschaften I Berlin (tax
office), tax number 27/029/42207.

Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata

--
Etiamsi omnes, ego non
