Hoi, I am extremely happy to tell you on Boxing Day that a tired Erik has produced the first tangible result of the Wikidata / Ultimate Wiktionary project. It shows that we want great content in many languages, that we want to include thesaurus information, and that we gratefully include content like the GEMET thesaurus.
:) I want to thank Erik for working really hard to make this happen :)
Thanks, Gerard ..
This is the text of Erik's e-mail to the Wikitech mailing list:
*****************************
Ho ho ho,
we now have our first read-only prototype of Ultimate Wiktionary / Wikidata, using a subset of the final UW design. This subset is a complex, versioned relational database that can model
- lexical items (words, short phrases) with multiple meanings
- synonyms and translations, on the meaning level
- other relationships between them, on the meaning level.
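As a rough illustration of what "on the meaning level" implies, here is a minimal in-memory sketch in Python. It is not the actual UW schema: the names `Expression`, `Meaning` and `page_entries`, the fields, and the sample words and definition are all assumptions made up for this example, and versioning is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Expression:
    """A lexical item: one spelling in one language (hypothetical name)."""
    spelling: str
    language: str  # language code, e.g. "en"


@dataclass
class Meaning:
    """A concept that expressions from any language can attach to."""
    meaning_id: int
    definitions: Dict[str, str] = field(default_factory=dict)  # language -> text
    expressions: Set[Expression] = field(default_factory=set)
    # Other relationships, also on the meaning level: (relation type, target id)
    relations: List[Tuple[str, int]] = field(default_factory=list)

    def synonyms(self, expr: Expression) -> Set[Expression]:
        # Same meaning, same language, different spelling.
        return {e for e in self.expressions
                if e.language == expr.language and e != expr}

    def translations(self, expr: Expression) -> Set[Expression]:
        # Same meaning, different language.
        return {e for e in self.expressions if e.language != expr.language}


def page_entries(spelling: str, meanings: List[Meaning]) -> List[Tuple[str, int]]:
    """All (language, meaning_id) pairs that share one spelling."""
    return sorted((e.language, m.meaning_id)
                  for m in meanings for e in m.expressions
                  if e.spelling == spelling)


# Made-up sample data, not taken from GEMET.
car = Expression("car", "en")
automobile = Expression("automobile", "en")
auto_nl = Expression("auto", "nl")
auto_it = Expression("auto", "it")
vehicle = Meaning(1,
                  {"en": "A road vehicle with an engine (made-up definition)."},
                  {car, automobile, auto_nl, auto_it})
```

Because synonyms and translations hang off the meaning rather than the word, `vehicle.synonyms(car)` and `vehicle.translations(car)` both fall out of the same set of expressions, and a word with several meanings would simply appear in several `Meaning` objects. `page_entries("auto", [vehicle])` groups identically spelled words from different languages under one spelling.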
The prototype is at: http://epov.org/wd-gemet/index.php/Main_Page
It contains over 70,000 words in 22 languages; many of them have definitions. The definitions usually come in 4 languages. As I understand it, we can have this data under the GFDL, but it's just one of many building blocks we will use in seeding the UW.
There will be at least one significant upgrade to this prototype before the end of the year. All the tables and fields for versioning complex relations without ballooning up the database are already there. I'm not sure yet whether the model I have in mind for versioning works, and I hope to test and demonstrate it soon. (Versioning, in my opinion, is the single greatest challenge for Wikidata.) I also want to show how we try to "eat our own dogfood" in Ultimate Wiktionary by localizing the user interface using the content of the dictionary.
All the records are already hooked up to pages and revisions, so you can use [[Special:Allpages]] and the like to navigate. When there are identical words in different languages, all the translations and definitions are shown on the same page.
Our goal with Ultimate Wiktionary is to provide an even more complex application that will make this data collaboratively editable, to add dynamic user-based views, APIs, and crucial features such as inflections, etymologies, complex relations and attributes, and much more. This will be a huge challenge. Fortunately, more funding seems to be on the horizon, allowing us to put more developers on the job.
Ultimate Wiktionary is just one application of Wikidata, and we will try to generalize as much functionality as possible, so that it will be reasonably straightforward to build new Wikidata apps. In particular, versioning and all basic relation types should be handled on the Wikidata level. There are thousands of possible new applications for Wikimedia and other MediaWiki users if we get this right.
Please take a look at the prototype. The view component is a quick and dirty hack, but the backend is approaching some stability. There are some small inconsistencies in the data here and there, some of them inherited from GEMET. Due to time constraints, I also had to stop the import at about 80%, which leaves a few red links; I'll try to import the remaining terms in the next few days.
Finally, expect a paper explaining some of the key ideas of Wikidata and UW, showing the first user interface prototypes, and defining future development milestones and applications. I will also try to describe some of the forthcoming changes to the MediaWiki core that follow from Wikidata's need to handle multiple languages in one installation; these changes will benefit multi-language projects like Meta and Commons.
I will be at the 22C3 on December 30 to demonstrate this prototype as well as the completed namespace manager, and to answer questions.
Best,
a very tired Erik
_______________________________________________
Wikitech-l mailing list
Wikitech-l@wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikitech-l