On Mon Feb 09 2015 at 12:08:01 Amir E. Aharoni <amir.aharoni@mail.huji.ac.il> wrote:
Manual descriptions are not an entire waste of time.

I never said that, so please don't put words in my mouth. I did say, on several occasions, that, for the vast majority of items, manual descriptions are a waste of volunteer time. For cases where "the point" of the item is not easily expressible as statements, manual descriptions are indeed useful. But the number of cases where this applies will only shrink, as we get more properties on Wikidata, and as the description generators get better.
 

Magnus writes in his post:
> And some people have seen my Reasonator tool, where (for some item types, and some languages) rather long descriptions can be generated.

It's not necessarily good that they are long. For the mobile app it's better if they are short.

Which is one reason my automatic description API defaults to short descriptions, unless you ask for a long one. Even then, it will default to short, unless the language/item type combination is covered. The short description generator covers more languages, and is easily extendable. Check out "stock" at
https://bitbucket.org/magnusmanske/autodesc/src/019b395c1bd5e13720e5cfda4df0311dd371ba60/www/js/short_autodesc.js?at=master
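
A rough sketch of the "short unless a long one is requested and covered" behaviour described above. This is not the actual autodesc code: the function names, the item layout, and the COVERED_LONG set are all assumptions made up for illustration.

```python
# Hypothetical sketch; NOT the real autodesc API or data model.
# Assumed set of (language, item type) pairs that have a long generator.
COVERED_LONG = {("en", "human")}


def join_list(words):
    """Join words as 'a', 'a and b', or 'a, b and c'."""
    if len(words) <= 1:
        return "".join(words)
    return ", ".join(words[:-1]) + " and " + words[-1]


def short_description(item, max_roles=2):
    """Nationality plus at most max_roles occupations."""
    roles = item.get("occupations", [])[:max_roles]
    parts = [item.get("nationality", ""), join_list(roles)]
    return " ".join(p for p in parts if p)


def long_description(item):
    """All occupations, plus the birth year when it is known."""
    text = short_description(item, max_roles=len(item.get("occupations", [])))
    born = item.get("born")
    return f"{text} (*{born})" if born else text


def describe(item, lang="en", mode="short"):
    """Default to short; go long only if asked AND the combination is covered."""
    if mode == "long" and (lang, item.get("type")) in COVERED_LONG:
        return long_description(item)
    return short_description(item)


# Made-up sample item, for illustration only.
sample = {
    "type": "human",
    "nationality": "Australian",
    "occupations": ["singer", "politician", "activist"],
    "born": 1953,
}
print(describe(sample))                          # short by default
print(describe(sample, mode="long"))             # long, since ("en", "human") is covered
print(describe(sample, lang="he", mode="long"))  # not covered, falls back to short
```

Capping the number of occupations in the short form is one simple way to keep the output suitable for a mobile app; the cap of two is an arbitrary choice here.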

I think the results are, at the very least, understandable by humans; and yes, these things tend to get better very quickly (developers! developers! <throws chair>).

 

But the "some item types, and some languages" part is the real problem. Only some. It's quite possible that in the future Reasonator will cover all languages and all data types and will also be tweaked to provide appropriate length, maybe even different lengths according to context. But today, Reasonator natural language sentence creation works for a very small number of languages. If it were as easy to translate it as it is to translate MediaWiki UI messages, I wouldn't object to its wider use, but AFAIK this is not the case now.

And it's not that good for English either. Reasonator is not smart enough at the moment to describe people with several qualifications. The current Reasonator-generated description of Peter Garrett is vastly inferior to the manually-written description. Compare:
1. "Australian singer and politician, Minister for School Education, Early Childhood and Youth, Minister for Sustainability, Environment, Water, Population and Communities (Australia), and Member of the Australian House of Representatives (*1953) ♂"
2. "Australian politician and Midnight Oil lead singer".
Basic human intuition tells me that for most Wikipedia readers, who simply want to know "Who is Peter Garrett?", #2 is far more useful. #1 has oversized descriptions of all his political roles, and doesn't have the name of the rock band that made him popular. This is just one example out of hundreds of thousands that could be brought up. For what it's worth, #2 is also easier to translate manually.

It's important to emphasize at this point that I have the utmost respect for Magnus's brilliant work. It's just not ready to completely replace the manual descriptions.

Thanks, and it should not. But a little developer time can save megahours (new unit!) of volunteers performing needless work.
 

A practical solution for now is to have a system for manual translation of descriptions, which shows the Reasonator descriptions as a translation aid, similarly to how the Translate extension shows translation memory suggestions. Also, a way to manually tweak descriptions can take Reasonator further, for example a way to tell it that for the Peter Garrett item there's no need to include a long list of all his roles in the Australian government.

Oh, and even if you can run away some day from manually translating descriptions, you cannot run away from manually translating labels. At most, some can be copied from Wikipedia, but even then many of them need post-import fixing.

So all of this brings me back to https://phabricator.wikimedia.org/T64695 .

Yes, labels are a different beast. Though there are some things we could automate; basically, all people with a German label and no English one could use the German one in English just as well. Those cases aside, automated translation can be helpful, but can also go horribly wrong in some cases.
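
The German-to-English case for people could be sketched roughly as follows. The item layout (a "type" field and a "labels" dict keyed by language code) and the function name are assumptions for illustration, not Wikidata's actual data model or API.

```python
# Hypothetical sketch; the item structure is assumed, not Wikidata's real format.
def english_label(item):
    """Return the English label, borrowing the German one for people only."""
    labels = item.get("labels", {})
    if "en" in labels:
        return labels["en"]
    # Personal names are usually written the same way in German and English,
    # so the German label is reused only when the item is a human.
    if item.get("type") == "human" and "de" in labels:
        return labels["de"]
    return None  # anything else still needs a manual (or reviewed) label


person = {"type": "human", "labels": {"de": "Peter Garrett"}}
city = {"type": "city", "labels": {"de": "Köln"}}
print(english_label(person))  # Peter Garrett
print(english_label(city))    # None
```

Restricting the rule to people is what keeps it safe: place names and common nouns are exactly where copying a label across languages tends to go wrong.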
 


--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬

2015-02-09 12:58 GMT+02:00 Daniel Kinzler <daniel.kinzler@wikimedia.de>:
@Gerard, @Magnus: please help me out here.

I agree that automatic descriptions are very useful. I also think that in *some*
cases, manual descriptions are more useful, and maybe even needed.

I definitely think that 3rd party consumers of wikidata should not have to think
about whether descriptions have been written manually or were created
automatically. This should be completely transparent.

So, if you want to help with making automated description a reality, please make
suggestions that take into account the above points, and also consider the
mechanisms for language fallback.

The only thing that I can think of right away is simply inserting automated
descriptions by bot. This isn't ideal, but I can't think of a better solution
that wouldn't be hugely complicated (and would thus not be implemented any time
soon). Maybe you have ideas?
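
One possible shape for the "transparent to consumers" requirement, combined with language fallback, is sketched below. Everything here is an assumption for illustration: the function name, the `generate` callback standing in for an automatic description generator, and in particular the ordering (manual first, then automatic, per language, before moving down the fallback chain), which is only one of several reasonable choices.

```python
# Hypothetical sketch; not an existing Wikibase or autodesc interface.
def resolve_description(manual, lang, fallbacks, generate):
    """Return one description string; the caller never learns its origin.

    manual    -- dict mapping language code to a manually written description
    lang      -- the language the consumer asked for
    fallbacks -- ordered list of fallback language codes
    generate  -- callable(language code) returning an automatic description or None
    """
    for code in [lang, *fallbacks]:
        # Prefer a manual description in this language, if one exists.
        if manual.get(code):
            return manual[code]
        # Otherwise try the automatic generator for the same language.
        auto = generate(code)
        if auto:
            return auto
    return None


# Made-up data: a manual English description and a generator that only covers German.
manual = {"en": "manual English description"}
generate = {"de": "automatic German description"}.get

print(resolve_description(manual, "de", ["en"], generate))  # automatic German description
print(resolve_description(manual, "fr", ["en"], generate))  # manual English description
```

A bot that writes automated descriptions into the items would make the `generate` step unnecessary at read time, at the cost of storing text that then needs regenerating when the generator improves.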

-- daniel


On 09.02.2015 at 11:41, Magnus Manske wrote:
> Manual descriptions are, in the vast majority of cases, a waste of volunteer
> time. Alternative:
> http://magnusmanske.de/wordpress/?p=265
>
> On Sun Feb 08 2015 at 17:37:42 Gerard Meijssen <gerard.meijssen@gmail.com
> <mailto:gerard.meijssen@gmail.com>> wrote:
>
>     Hoi,
>     How does that help? The point is exactly that there is no point to
>     descriptions. Why iterate on a dog? It will still be a mutt.
>     Thanks,
>         GerardM
>
>     On 8 February 2015 at 14:07, Amir E. Aharoni <amir.aharoni@mail.huji.ac.il
>     <mailto:amir.aharoni@mail.huji.ac.il>> wrote:
>
>         I'd rather see it not as something terribly disappointing, but as an
>         opportunity to find a way to fill item descriptions more efficiently.
>
>         Basically, to find some cycles to resolve
>         https://phabricator.wikimedia.org/T64695
>
>         On 8 Feb 2015 at 10:33, "Gerard Meijssen" <gerard.meijssen@gmail.com
>         <mailto:gerard.meijssen@gmail.com>> wrote:
>
>             Hoi,
>             I understand that item descriptions are going to be used in a mobile
>             app. In my opinion that is seriously disappointing because it is not
>             realistic to expect enough coverage in any language. Particularly in
>             the small languages it will not be really useful.
>
>             My question is: we have had automated descriptions for a long time.
>             What is it that prevents them from being used?
>
>             Thanks,
>                  GerardM
>
>             _______________________________________________
>             Wikidata-l mailing list
>             Wikidata-l@lists.wikimedia.org <mailto:Wikidata-l@lists.wikimedia.org>
>             https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>


--
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
