Many people have asked me about the status of Wikidata. My response is in CVS ;-) (wikidata branch). It's largely a redesign of the existing namespace system to be more flexible and powerful. Namespaces form the foundation of Wikidata, as each namespace will be associated with a virtual table schema. This has necessitated the changes so far, but these will also be useful for most other Wikimedia projects.
For example, these changes make it possible to define an arbitrary number of synonyms for each namespace, so that we can finally rename "Image:" to "File:" without breaking links (and add further synonyms like "Video:" and "Sound:"). Another new feature is default prefixes, which make it possible to define that any unprefixed link [[Bla]] in one namespace will be prefixed by "Foo:". This could be useful for Wikibooks, esp. large spaces like Cookbook: and Jokebook:, so that you can simply type "Carrots" and it goes to [[Cookbook:Carrots]].
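To make the default-prefix idea concrete, here is a minimal sketch of how an unprefixed link might be resolved. It is not the code in the wikidata branch: the function name, the parameters and the two lookup arrays are all hypothetical, and prefix matching is kept case-sensitive for brevity.

```php
<?php
// Hypothetical sketch; none of these names come from Namespace.php.

/**
 * Apply a namespace's default prefix to an unprefixed link target.
 *
 * @param string $target          Link target as typed, e.g. "Carrots"
 * @param string $sourceNamespace Namespace of the page that contains the link
 * @param array  $defaultPrefixes Map: source namespace => default prefix
 * @param array  $knownNamespaces List of recognized namespace names
 */
function applyDefaultPrefix(
    string $target,
    string $sourceNamespace,
    array $defaultPrefixes,
    array $knownNamespaces
): string {
    $colon = strpos($target, ':');
    if ($colon !== false) {
        $prefix = substr($target, 0, $colon);
        if (in_array($prefix, $knownNamespaces, true)) {
            // The link already names a namespace; leave it alone.
            return $target;
        }
    }
    if (isset($defaultPrefixes[$sourceNamespace])) {
        // Unprefixed link: prepend the default prefix of the source namespace.
        return $defaultPrefixes[$sourceNamespace] . ':' . $target;
    }
    return $target;
}

// Assumed example data.
$known    = ['Cookbook', 'Jokebook', 'Help'];
$defaults = ['Cookbook' => 'Cookbook'];

echo applyDefaultPrefix('Carrots', 'Cookbook', $defaults, $known), "\n";      // Cookbook:Carrots
echo applyDefaultPrefix('Help:Editing', 'Cookbook', $defaults, $known), "\n"; // Help:Editing
```

A link typed in a namespace without a configured default prefix is returned unchanged.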
Together with the existing namespace filtering, this effectively allows the creation of "wikis within a wiki". What namespaces are needed can be decided by each project community without developer interaction, as the new namespace manager is a special page that a user group can be given access to.
As you hopefully will agree, the changes that are part of Wikidata don't just bloat MediaWiki up - they add useful functionality for all projects.
See my CVS commit comment and the Namespace.php docs for further detail. There's still quite a bit of coding, hacking and testing to do, but I anticipate finalizing the first milestone in the coming days. This will also include a design whitepaper for the Wikidata table design, and some GUI demos.
From now on, I will commit more often -- this is important, I think, to ensure that people see that work is being done.
What's the timetable? My guesstimates so far have been mostly wrong, so I'm careful not to make too exact predictions, especially when certain design decisions haven't been made yet. But I'm fully committed to further implementing Wikidata and Ultimate Wiktionary in the coming months. I'll have to take a two-week break soon to make some extra money, however.
If you want to work on Wikidata with me, please contact me, and I can hand out a few tasks. Also, I always appreciate comments on my code -- what is ugly, what should be refactored, etc.
Best,
Erik
Does this major change allow writers to have flexible synonym handling, like an active list? If so (that would be great) I can stop bugging everybody with the redirect/synonyms/fuzzy-system question. Sorry, but I'm a real dummy when it comes to wiki system specs.
Regards V.F.
On 31/08/05, Valentina Faussone valentina_faussone@yahoo.it wrote:
> Does this major change allow writers to have flexible synonyms handling, like an active list?
This is only talking about *namespaces*, not titles - i.e. the part before the ":". So, "WP:" could become a proper alias for "Wikipedia:" such that "WP:Foo" meant the same as "Wikipedia:Foo". Similarly, "File:Foo", "Image:Foo", and "Sound:Foo" could all refer to the same location. So no, this doesn't help with ordinary redirects between articles.
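As a rough illustration of what namespace aliasing means, the sketch below rewrites only the part before the first ":" and leaves the page name untouched. The function name and the alias table are hypothetical, and the case-insensitive matching of the prefix is an assumption of this sketch, not something taken from the wikidata branch.

```php
<?php
// Hypothetical sketch of namespace alias resolution.

/**
 * Replace a namespace alias with its canonical namespace name.
 * Only the prefix before the first ':' is examined.
 *
 * @param string $title   Full title, e.g. "WP:Foo"
 * @param array  $aliases Map: lowercase alias => canonical namespace name
 */
function resolveNamespaceAlias(string $title, array $aliases): string {
    $parts = explode(':', $title, 2);
    if (count($parts) < 2) {
        // No prefix at all, so nothing to resolve.
        return $title;
    }
    [$prefix, $rest] = $parts;
    $key = strtolower($prefix);
    if (isset($aliases[$key])) {
        return $aliases[$key] . ':' . $rest;
    }
    return $title;
}

// Assumed alias table, following the examples in this thread.
$aliases = ['wp' => 'Wikipedia', 'image' => 'File', 'sound' => 'File'];

echo resolveNamespaceAlias('WP:Foo', $aliases), "\n";    // Wikipedia:Foo
echo resolveNamespaceAlias('Image:Foo', $aliases), "\n"; // File:Foo
echo resolveNamespaceAlias('Carrots', $aliases), "\n";   // Carrots
```

An ordinary article title such as "Carrots" passes through unchanged, which is why this mechanism is no substitute for redirects.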
Erik Moeller wrote:
> See my CVS commit comment and the Namespace.php docs for further detail. There's still quite a bit of coding, hacking and testing to do, but I anticipate finalizing the first milestone in the coming days. This will also include a design whitepaper for the Wikidata table design, and some GUI demos.
A couple DB-related notes on my first quick readthrough:
    # Pseudo-namespaces (title prefixes)
    $match = $dbs->addQuotes($name.":%");
    $res = $dbs->select(
        'page',
        array('page_title'),
        array('page_title LIKE '.$match),
        $fname,
        array('LIMIT'=>1)
    );
Two things you probably want to do here are to add a clause 'page_namespace' => 0 (otherwise it may end up an unindexed table scan which will be *very* slow and may get false matches on non-main namespaces), and to escape any _ or % chars in the name for that LIKE match.
If we don't already have a public function for LIKE escaping we probably should add one and use that consistently.
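A minimal sketch of such a helper, assuming MySQL's default LIKE escape character (backslash). The name escapeLike() is hypothetical and not an existing MediaWiki function; the result would still have to go through addQuotes() before being placed in the query.

```php
<?php
// Hypothetical helper; not part of the existing Database class.

/**
 * Escape LIKE wildcards so the string matches literally.
 * Assumes backslash is the LIKE escape character (the MySQL default).
 */
function escapeLike(string $s): string {
    // strtr() with an array replaces each character once, so the
    // backslashes added for % and _ are not escaped a second time.
    return strtr($s, [
        '\\' => '\\\\',
        '%'  => '\\%',
        '_'  => '\\_',
    ]);
}

// Titles store spaces as underscores, so a multi-word prefix
// would otherwise act as a pattern with single-character wildcards.
$name    = 'Ultimate_Wiktionary';
$pattern = escapeLike($name) . ':%';

// Restricting to the main namespace keeps the lookup on the index.
$conds = [
    'page_namespace' => 0,
    'page_title'     => $pattern, // to be compared with LIKE, after quoting
];

echo $pattern, "\n"; // Ultimate\_Wiktionary:%
```

Only the trailing "%" added after escaping remains a wildcard; everything taken from $name matches literally.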
    $match=$dbs->addQuotes("%[[$name:%");

    # Query needs to be optimized/simplified,
    # but will generally be run very rarely.
    $res = $dbs->select(
        array('page', /* FROM */ 'pagelinks', 'revision', 'text'),
        array('DISTINCT page_title', 'page_namespace'),
        array('pl_namespace='.$index,
              'page_id=pl_from',
              'rev_id=page_latest',
              'rev_text_id=old_id',
              'old_text like '.$match),
        array('LIMIT'=>1)
I'm afraid this one just isn't going to work at all... Text entries in text.old_text will often be either compressed or indirection records pointing to the secondary text storage cluster. You also have the issues of case sensitivity, spaces vs underscores, etc.
> From now on, I will commit more often -- this is important, I think, to ensure that people see that work is being done.
Cool!
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org