There is no question that there is a lot of room to improve autodesc. There are also some instances where a manual description is vastly superior to an automatic one, where the algorithm cannot catch the point of why that item is important.

However, consider this:
* volunteer time to manually update these descriptions: a few minutes? (load, read, understand, type, save...)
* volunteer time to have them generated automatically: none (well, mine, but distributed over 14M items times 250 languages, lim->0)

I noticed there are no biographies in your list, which is surprising, considering those are the most numerous "class" of items. It is also one of the few classes where autodesc does something more clever than a "generic description". I assume this was not intentional ;-)

The situation, for most languages, is this: no manual descriptions, on basically any item. And that will remain so for the (near) future. Automatic descriptions can change that, literally overnight, with a little programming and linguistic effort. Adding a manual description can help speakers of that one language; adding a statement, and thereby improving automatic descriptions in all languages, helps everyone, with essentially the same volunteer effort. This is a "force multiplier" of volunteer effort with a factor of 250. And we ignore that ... why, exactly?
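
To make the "force multiplier" point concrete, here is a minimal sketch. It is not the actual autodesc.js logic; the data model, the labels table, the per-language templates and the function names are all made up for illustration. It only shows the shape of the argument: statements are language-independent, so adding one statement changes the generated description in every language that has a template.

```javascript
// Hypothetical sketch, NOT how autodesc.js works internally.
// Labels for the values used in statements, per language (assumed data).
const labels = {
  villa:  { en: 'villa',  de: 'Villa',      sv: 'villa' },
  france: { en: 'France', de: 'Frankreich', sv: 'Frankrike' },
};

// One tiny phrase template per language (assumed; real grammar is harder).
const templates = {
  en: (type, place) => `${type} in ${place}`,
  de: (type, place) => `${type} in ${place}`,
  sv: (type, place) => `${type} i ${place}`,
};

// Build a description in one language from language-independent statements.
function describe(item, lang) {
  const type = labels[item.instanceOf]?.[lang];
  if (!type) return null; // nothing to say without at least a type
  const place = item.country ? labels[item.country]?.[lang] : undefined;
  return place ? templates[lang](type, place) : type;
}

// Build descriptions for every language that has a template.
function describeAll(item) {
  const out = {};
  for (const lang of Object.keys(templates)) out[lang] = describe(item, lang);
  return out;
}

// Before: only a type statement. After: one added "country" statement.
console.log(describeAll({ instanceOf: 'villa' }));
console.log(describeAll({ instanceOf: 'villa', country: 'france' }));
```

The single added `country` statement changes all three outputs at once; with 250 templates instead of three, the same edit would reach 250 languages.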



On Wed, Aug 19, 2015 at 3:25 AM Monte Hurd <mhurd@wikimedia.org> wrote:
Ok, so I just did what I proposed. I went to random enwiki articles and described the first ten I found which didn't already have descriptions:


- "Courage Under Fire", 1996 film about a Gulf War friendly-fire incident

- "Pebasiconcha immanis", largest known species of land snail, extinct

- "List of Kenyan writers", notable Kenyan authors

- "Solar eclipse of December 14, 1917", annular eclipse which lasted 77 seconds

- "Natchaug Forest Lumber Shed", historic Civilian Conservation Corps post-and-beam building

- "Sun of Jamaica (album)", debut 1980 studio album by Goombay Dance Band

- "E-1027", modernist villa in France by architect Eileen Gray

- "Daingerfield State Park", park in Morris County, Texas, USA, bordering Lake Daingerfield

- "Todo Lo Que Soy-En Vivo", 2014 Live album by Mexican pop singer Fey

- "2009 UEFA Regions' Cup", 6th UEFA Regions' Cup, won by Castile and Leon



And here are the respective descriptions from Magnus' (quite excellent) autodesc.js:



- "Courage Under Fire", 1996 film by Edward Zwick, produced by John Davis and David T. Friendly from United States of America

- "Pebasiconcha immanis", species of Mollusca

- "List of Kenyan writers", Wikimedia list article

- "Solar eclipse of December 14, 1917", solar eclipse

- "Natchaug Forest Lumber Shed", Construction in Connecticut, United States of America

- "Sun of Jamaica (album)", album

- "E-1027", villa in Roquebrune-Cap-Martin, France

- "Daingerfield State Park", state park and state park of a state of the United States in Texas, United States of America

- "Todo Lo Que Soy-En Vivo", live album by Fey

- "2009 UEFA Regions' Cup", none



Thoughts? 

Just trying to make my own bold assertions falsifiable :)



On Tue, Aug 18, 2015 at 6:32 PM, Monte Hurd <mhurd@wikimedia.org> wrote:
The whole human-vs-extracted descriptions quality question could be fairly easy to test I think:

- Pick some number of articles at random.
- Run them through a description extraction script.
- Have a human describe the same articles with, say, the app interface I demo'ed.

If nothing else this exercise could perhaps make what's thus far been a wildly abstract discussion more concrete.
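
A rough sketch of how such a comparison could be tabulated, assuming both kinds of description have already been collected by hand. The three sample pairs are copied from the lists in this thread; the function names are hypothetical and no real API is called. Note that this only counts coverage and prints the pairs side by side; judging which description is better still needs a human.

```javascript
// Sample pairs copied from this thread; null marks a missing description.
const samples = [
  { title: 'E-1027',
    human: 'modernist villa in France by architect Eileen Gray',
    auto: 'villa in Roquebrune-Cap-Martin, France' },
  { title: 'Sun of Jamaica (album)',
    human: 'debut 1980 studio album by Goombay Dance Band',
    auto: 'album' },
  { title: "2009 UEFA Regions' Cup",
    human: "6th UEFA Regions' Cup, won by Castile and Leon",
    auto: null },
];

// Pick n items at random (Fisher-Yates shuffle on a copy of the input).
function pickRandom(items, n) {
  const pool = items.slice();
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Count how many items lack a description from each source.
function summarize(items) {
  return {
    total: items.length,
    missingAuto: items.filter((s) => !s.auto).length,
    missingHuman: items.filter((s) => !s.human).length,
  };
}

// Render the pairs side by side for a human reviewer.
function sideBySide(items) {
  return items
    .map((s) => `${s.title}\n  human: ${s.human ?? '(none)'}\n  auto:  ${s.auto ?? '(none)'}`)
    .join('\n\n');
}

console.log(sideBySide(pickRandom(samples, 2)));
console.log(summarize(samples));
```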




On Tue, Aug 18, 2015 at 6:17 PM, Monte Hurd <mhurd@wikimedia.org> wrote:
If having the most elegant description extraction mechanism was the goal I would totally agree ;)

On Tue, Aug 18, 2015 at 5:19 PM, Dmitry Brant <dbrant@wikimedia.org> wrote:
IMO, allowing the user to edit the description is a missed opportunity to make the user edit the actual *data*, such that the description is generated correctly.



On Tue, Aug 18, 2015 at 8:02 PM, Monte Hurd <mhurd@wikimedia.org> wrote:
IMO, if the goal is quality, then human-curated descriptions are superior until such time as the auto-generation script passes the Turing test ;)

I see these empty descriptions as an amazing opportunity to give *everyone* an easy new way to edit. I whipped an app editing interface up at the Lyon hackathon:

I used it to add a couple hundred descriptions in a single day just by hitting "random" then adding descriptions for articles which didn't have them.

I'd love to try a limited test of this in production to get a sense for how effective human curation can be if the interface is easy to use...


On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali <jan.ainali@wikimedia.se> wrote:
Nice one! 

It does not appear to work on svwiki though. Is that because the wiki in question does not display that tagline?

Kind regards,
Jan Ainali

Executive Director, Wikimedia Sverige
0729 - 67 29 48


Imagine a world in which every human being has free access to the sum of all human knowledge. That is what we do.


2015-08-18 17:23 GMT+02:00 Magnus Manske <magnusmanske@googlemail.com>:
Show automatic description underneath "From Wikipedia...":
https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js

To use, add:
importScript('User:Magnus_Manske/autodesc.js');
to your common.js

On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell <jane023@gmail.com> wrote:
It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an "instance of" (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too.

On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
Jane Darnell, 15/08/2015 08:53:
Yes but even if the descriptions were just the contents of fields
separated by a pipe it would be better than nothing.

+1, item descriptions are mostly useless in my experience.

As for "get into production on Wikipedia" I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :)

Nemo

_______________________________________________
Mobile-l mailing list
Mobile-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mobile-l


--
Dmitry Brant
Mobile Apps Team (Android)
Wikimedia Foundation
https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering



