[Wikimedia-l] The case for supporting open source machine translation

Andrew Gray andrew.gray at dunelm.org.uk
Wed Apr 24 11:36:36 UTC 2013


On 24 April 2013 11:35, Denny Vrandečić <denny.vrandecic at wikimedia.de> wrote:

> If we constrain b) a lot, we could just go and develop "pages to display
> for pages that do not exist yet based on Wikidata" in the smaller
> languages. That's a far cry from machine translating the articles, but it
> would be low-hanging fruit. And it might help meet a desire that is
> evidently strongly expressed by the mass creation of articles through bots
> in a growing number of languages.

There has historically been a lot of tension around mass-creation of
articles because of the maintenance problem - we can create two
hundred thousand stubs in Tibetan or Tamil, but who will maintain
them? Wikidata gives us the potential to square that circle, and in
fact you bring it up here...

> II ) develop a feature that blends into Wikipedia's search if an article
> about a topic does not exist yet, but we have data on Wikidata about that
> topic

I think this would be amazing. A software hook that says "we know X
article does not exist yet, but it is matched to Y topic on Wikidata"
and pulls out core information, along with a set of localised
descriptions... we gain all the benefit of having stub articles
(scope, coverage) without the problems of a small community having to
curate a million pages. It's not the same as hand-written content, but
it's immeasurably better than no content, or even an attempt at
machine-translating free text.

XXX is [a species of: fish] [in the: Y family]. It [is found in: Laos,
Vietnam]. It [grows to: 20 cm]. (pictures)
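The idea above could be sketched roughly as follows. This is a minimal illustration only, assuming claims have already been fetched and flattened into a plain dictionary; the property labels ("instance of", "family", and so on) and the function name are hypothetical, not the real Wikidata API.

```python
# Hypothetical sketch: render a one-paragraph placeholder from
# Wikidata-style claims. The claim keys and layout are illustrative;
# real Wikidata claims use property IDs (P31 etc.) and need fetching
# and label resolution first.

def render_placeholder(label, claims):
    """Build a short stub paragraph from structured claims."""
    parts = [f"{label} is a species of {claims['instance of']} "
             f"in the {claims['family']} family."]
    if "found in" in claims:
        parts.append(f"It is found in {', '.join(claims['found in'])}.")
    if "length" in claims:
        parts.append(f"It grows to {claims['length']}.")
    return " ".join(parts)

stub = render_placeholder("XXX", {
    "instance of": "fish",
    "family": "Y",
    "found in": ["Laos", "Vietnam"],
    "length": "20 cm",
})
print(stub)
# XXX is a species of fish in the Y family. It is found in
# Laos, Vietnam. It grows to 20 cm.
```

Localising the sentence templates per language would be the hard part, but the structured data itself stays shared.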

Wikidata Phase 4, perhaps :-)

-- 
- Andrew Gray
  andrew.gray at dunelm.org.uk
