On Sat, Aug 15, 2015 at 1:17 PM Lydia Pintscher <lydia.pintscher(a)wikimedia.de> wrote:

On Aug 15, 2015 14:06, "Magnus Manske" <magnusmanske(a)googlemail.com> wrote:

On Sat, Aug 15, 2015 at 7:38 AM Lydia Pintscher <lydia.pintscher(a)wikimedia.de> wrote:
>
> On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry <dgarry(a)wikimedia.org> wrote:
> > I've seen arguments on both sides here. Some say automatically generated
> > descriptions are not good enough. Some say they are. Why don't we gather
> > some data on this and use that to decide what's right? :-)
Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English, are we going to do the same for all the other languages? Especially those where the grammar is tricky and Wikidata doesn't even have the information needed to get the grammar right? The other tricky side is determining why something is actually notable; that's not a trivial thing to determine based on the data we have.
And you know very well that (AFAIK) I am the only one who has actually worked on this, in a tiny fraction of my spare time, and I only speak German and English.
The /real/ questions here are:

1. For the languages that are actually implemented, are the descriptions they return good/OK/bad/plain wrong?
2. What could be achieved, on the existing or similar infrastructure, in a short period of time, if we drive to get code snippets (or equivalent) for other languages from volunteers?
3. What could be achieved, medium/long term, if we had a proper linguist working on the problem? Or someone who has worked with multi-language text generation before?
I've just been winging it so far. Current auto-descriptions are not the best we can do. They are, frankly, the WORST we can do. This is a starting point, not the end product.
Yeah, I understand. And this is not a criticism of your work; I think it is actually rather cool. It is questioning whether it is a good idea to continue pushing it into production on Wikipedia on a large scale.
With that, I agree wholeheartedly.
There might be a point in doing an "extended prototype" though, before going to production (as much as I'd like that). Which languages would be easy, hard, or impossible? Would this work better as a stand-alone project (e.g. a dedicated VM), or as an extension of Wikibase (flexibility vs. convenient integration)? What open source code is already out there that we could use? Is there anyone in WMF/chapters who has experience in text generation? Anyone in WMF/chapters who speaks a "small" language and could help set up an example generator for it? What are the major item "classes" on Wikidata to be covered with special code, beyond the obvious "human bio"?
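To make the "special code per item class" idea concrete: a hypothetical minimal sketch (not the actual AutoDesc code, and the claim layout and labels here are simplified assumptions) of a template rule for the "human bio" class, which simply concatenates pre-resolved labels. This naive word-order approach works passably for English but is exactly what breaks in languages that need grammatical agreement:

```python
# Hypothetical sketch of a per-class description rule, NOT the real generator.
# Assumes claims are pre-simplified to {property_id: resolved_label} for one language.
def describe_human(claims):
    """Build a naive English description for a 'human bio' item,
    e.g. 'German physicist', from simplified Wikidata-style claims."""
    parts = []
    nationality = claims.get("P27")   # P27 = country of citizenship (as adjective label)
    occupation = claims.get("P106")   # P106 = occupation
    if nationality:
        parts.append(nationality)
    if occupation:
        parts.append(occupation)
    # Fall back to a generic noun when no usable claims exist.
    return " ".join(parts) if parts else "person"

print(describe_human({"P27": "German", "P106": "physicist"}))  # German physicist
print(describe_human({}))                                      # person
```

Each additional language would need its own rule (word order, agreement, gendered occupation nouns), which is why volunteer-contributed snippets per language are one plausible way to scale this.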
And we'd need someone to run this. As much as I'd like to, I'm stretched too thin as it is...