[Foundation-l] [Commons-l] Wikidata

Jan Kucera (Kozuch) garbage5 at seznam.cz
Sat Nov 27 11:23:54 UTC 2010


Hi there,

So how do we move forward with Wikidata? There are a number of proposals on both Strategy and Meta, but I think we need a clearly dedicated place for serious discussion of the topic. So let's either create a wiki at data.wikimedia.org or a dedicated mailing list here... or both.

Kozuch

> ------------ Original message ------------
> From: Erik Moeller <erik at wikimedia.org>
> Subject: Re: [Foundation-l] [Commons-l] Wikidata
> Date: 24.11.2010 20:25:37
> ----------------------------------------
> Hi all,
> 
> as you may know I've been involved in the structured data community
> for a few years (through the original "Wikidata" proposal in 2004 as
> well as architecting and developing OmegaWiki, together with the
> OpenProgress team and others from 2005-2007). I've been following
> Semantic MediaWiki, Freebase and other projects from the beginning.
> You don't need to sell me on the value or importance of structured
> data.
> 
> The problem space is very complex, especially when taking into account
> that Wikimedia is a fully multilingual system. There is still
> low-hanging fruit, especially for a project like Wikimedia Commons, but I
> agree with Michael that a more holistic approach to how to access and
> manage data in WMF projects is much preferable to, for example,
> throwing SMW into some wikis and not others, etc.
> 
> When I joined WMF, I couldn't justify arguing for higher priority on
> data tech projects more so than, for example, the 2009-10 usability
> initiative and continuing efforts in this area, especially given that
> we still have only a tiny engineering staff. I don't believe that
> structured data is going to be the principal driver of participation
> -- that problem space is more about social and technical barriers to
> entry, interaction with new users, mentoring, etc. And we're
> continuing to fall behind the rest of the web in terms of usability.
> 
> That being said, it's clear that it's a key enabling technology
> (including for _some_ usability improvements, although many of them
> can be made without a full-fledged structured data support system). I
> particularly think it has huge potential in bootstrapping small
> languages by more closely interconnecting useful and translatable bits
> of information (start a page about "Germany" in a new language and
> immediately pull all relevant data, possibly including translations of
> labels if available).
> 
> Danese and I have been working on a "Data Summit" this year to bring
> together both the key players in the structured data field (DBPedia,
> SMW, etc.), as well as some of the research and analytics community.
> Unfortunately we've had to reschedule it, but it'll happen in Q1 2011.
> We're not going to be able to dedicate lots of resources to
> engineering in this area in the near future, but since there are
> already so many disparate efforts that focus on making WP data usable,
> we do hope that we can partner up with others to move things forward.
> 
> In a nutshell, I think we should aim to establish a “Wikidata Commons”
> project at data.wikimedia.org which serves all Wikimedia projects with
> structured data in a language-neutral fashion, analogous to “Wikimedia
> Commons” for multimedia files, and which becomes the central location
> to curate, maintain and discuss such data. Wikidata Commons should
> provide standard interfaces for querying, importing, and exporting
> data. This project could be built incrementally (starting with clunky
> but reasonably future-proof ways to manage and retrieve data).
> 
> The key challenges as I see them continue to be, as ever: 1)
> maintaining predictable and reasonable system performance as the DB
> scales, more and increasingly complex queries are performed, etc., 2)
> consistently improving rather than degrading user experience, 3)
> handling multilingual representations of all translatable content well
> without giving undue prominence to any one language, 4) effectively
> caching and purging data wherever it's used, 5)
> versioning/transactioning relational data to be maximally useful and
> conducive to collaboration.
> 
> Earlier this week, Danese and I met with Denny Vrandecic from SMW,
> who's recently put together a prototype called "Shortipedia" that
> allows language-independent (using multilingual labels) annotation of
> concepts with SMW-style properties through a minimal form-based
> interface, interfacing with whichever triple store is configured for
> SMW. It's still very much a hack, and he's aiming to clean it up for
> the summit. But it looks potentially very interesting, and like a
> concept we could rally energy behind. The data from such a repository
> could then be pulled into WP templates, accessed through "wizards"
> that auto-generate template data for new articles, etc.
> 
> Anyone who wants to advance the thinking in this space should also
> consider what can be done today with Wikimedia Commons and SMW. Since
> Wikimedia Commons is an intrinsically multilingual database with focus
> on annotating individual files, its operational requirements are
> somewhat different from those of most other projects. It would be
> useful to have an instance of SMW running using a copy of the
> Wikimedia Commons database and possibly Semantic Forms to see what
> such annotation could look like in practice. Anyone with time and
> technical skills can put together prototypes like this that'll help us
> move forward.
> 
> Again, I think the likely path forward here is for us to ally
> effectively with the key players in the space, rather than doing all
> the work ourselves.
> 
> -- 
> Erik Möller
> Deputy Director, Wikimedia Foundation
> 
> Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
> 


