I have a blog, http://ultimategerardm.blogspot.com/, that deals almost
exclusively with the Ultimate Wiktionary project.
Today I got help adding a "Site Feed" to my blog, which makes it more
convenient to follow.
I am extremely happy to tell you, on Boxing Day, that a tired Erik has
produced the first tangible result of the Wikidata / Ultimate
Wiktionary project. It shows that we want great content in many
languages, that we want to include thesaurus information, and that we
gratefully include content like the GEMET thesaurus.
:) I want to thank Erik for working really hard to make this happen :)
This is the text of Erik's e-mail to the Wikitech mailing list:
Ho ho ho,
we now have our first read-only prototype of Ultimate Wiktionary /
Wikidata, using a subset of the final UW design. This subset is a
complex, versioned relational database that can model
- lexical items (words, short phrases) with multiple meanings
- synonyms and translations, on the meaning level
- other relationships between them, on the meaning level.
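As a rough illustration of the three points above (not Erik's actual schema, which is not shown in this e-mail), the idea of hanging synonyms, translations and relations on the meaning rather than on the word could be sketched in Python as follows. All class, field and method names here are hypothetical, and versioning is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Expression:
    """A lexical item (word or short phrase) in one language."""
    spelling: str
    language: str


@dataclass
class Lexicon:
    # meaning_id -> all expressions that express that meaning
    expressions_by_meaning: dict = field(default_factory=dict)
    # (meaning_id, relation_name, meaning_id) triples, e.g. (2, "broader", 1)
    relations: set = field(default_factory=set)

    def attach(self, expression, meaning_id):
        """Record that an expression can carry the given meaning."""
        self.expressions_by_meaning.setdefault(meaning_id, set()).add(expression)

    def relate(self, source_meaning, relation_name, target_meaning):
        """Record a relationship between two meanings."""
        self.relations.add((source_meaning, relation_name, target_meaning))

    def meanings_of(self, expression):
        """An expression may have several meanings."""
        return {m for m, exprs in self.expressions_by_meaning.items()
                if expression in exprs}

    def synonyms(self, expression, meaning_id):
        """Same meaning, same language, different expression."""
        return {e for e in self.expressions_by_meaning.get(meaning_id, set())
                if e.language == expression.language and e != expression}

    def translations(self, expression, meaning_id):
        """Same meaning, different language."""
        return {e for e in self.expressions_by_meaning.get(meaning_id, set())
                if e.language != expression.language}
```

Because everything is keyed by meaning, a word with two senses gets separate synonym and translation sets for each sense, which is the point of working "on the meaning level".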
The prototype is at:
It contains over 70,000 words in 22 languages; many of them have
definitions. The definitions usually come in 4 languages. As I
understand it, we can have this data under the GFDL, but it's just one
of many building blocks we will use in seeding the UW.
There will be at least one significant upgrade to this prototype before
the end of the year. All the tables and fields for versioning complex
relations without ballooning up the database are already there. I'm not
sure if the model I have in mind for versioning works yet, and I hope to
test and demonstrate it soon. (Versioning, in my opinion, is the single
greatest challenge for Wikidata.) I also want to show how we try to "eat
our own dogfood" in Ultimate Wiktionary by localizing the user interface
using the content of the dictionary.
All the records are already hooked up to pages and revisions, so you can
use [[Special:Allpages]] and the like to navigate. When there are
identical words in different languages, all the translations and
definitions are shown on the same page.
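The page behaviour described above amounts to grouping records by spelling, regardless of language. A minimal sketch in Python, assuming a made-up record shape of (spelling, language, definition) that is not taken from the prototype:

```python
from collections import defaultdict


def build_pages(entries):
    """Group (spelling, language, definition) records into one page per spelling.

    Returns {spelling: {language: [definitions, ...]}}.
    """
    pages = defaultdict(lambda: defaultdict(list))
    for spelling, language, definition in entries:
        pages[spelling][language].append(definition)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {spelling: dict(by_language) for spelling, by_language in pages.items()}


# Illustrative data only, not from the prototype.
example_entries = [
    ("gift", "en", "a present"),
    ("gift", "sv", "poison"),
    ("water", "en", "a clear liquid"),
]
example_pages = build_pages(example_entries)
```

Here the English and Swedish records for "gift" end up on the same page, each under its own language.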
Our goal with Ultimate Wiktionary is to provide an even more complex
application that will make this data collaboratively editable, to add
dynamic user-based views, APIs, and crucial features such as
inflections, etymologies, complex relations and attributes, and much
more. This will be a huge challenge. Fortunately, more funding seems to
be on the horizon, allowing us to put more developers on the job.
Ultimate Wiktionary is just one application of Wikidata, and we will try
to generalize as much functionality as possible, so that it will be
reasonably straightforward to build new Wikidata apps. In particular,
versioning and all basic relation types should be handled on the
Wikidata level. There are thousands of possible new applications for
Wikimedia and other MediaWiki users if we get this right.
Please take a look at the prototype. The view component is a quick and
dirty hack, but the backend is approaching some stability. There are
some small inconsistencies in the data here and there, some of them
inherited from GEMET. Due to time constraints, I also had to stop the
import at about 80%, leading to a few red links; I'll try to import the
remaining terms in the next few days.
Finally, expect a paper explaining some of the key ideas of Wikidata and
UW, showing the first user interface prototypes, and defining future
development milestones and applications. I will also try to describe
some of the forthcoming changes to the MediaWiki core that follow from
Wikidata's need to handle multiple languages in one installation; these
changes will benefit multi-language projects like Meta and Commons.
I will be at the 22C3 on December 30 to demonstrate this prototype as
well as the completed namespace manager, and to answer questions.
a very tired Erik
Wikitech-l mailing list