- The tool that generates the descriptions deserves a lot more development.
Magnus' tool is very much a prototype, and represents a tiny glimpse of
what's possible. Judging automatic descriptions by its current output is a
straw man.
- Auto-generated descriptions work for current articles, and *all future
articles*. They automatically adapt to updated data. They automatically
become more accurate as new data is added.
- When you edit the descriptions yourself, you're not really making a
meaningful contribution to the *data* that underpins the given Wikidata
entry; i.e. you're not contributing any new information. You're simply
paraphrasing the first sentence or two of the Wikipedia article. That can't
possibly be a productive use of contributors' time.
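To make the second point concrete, here is a minimal sketch, assuming a hypothetical, simplified item shape (`year`, `instanceOf`, `creator`, `location`). It is not how autodesc.js works and not the real Wikidata data model; it only shows how a generated description picks up new data without anyone rewriting the text.

```javascript
// Hypothetical, simplified item shape; real Wikidata entities look different.
function describeItem(item) {
  const parts = [];
  if (item.year) parts.push(String(item.year));
  // Fall back to a generic noun when no type is recorded.
  parts.push(item.instanceOf || "item");
  let text = parts.join(" ");
  if (item.creator) text += " by " + item.creator;
  if (item.location) text += " in " + item.location;
  return text;
}

const villa = { instanceOf: "villa", location: "Roquebrune-Cap-Martin, France" };
console.log(describeItem(villa));
// "villa in Roquebrune-Cap-Martin, France"

// Someone adds the architect to the data; the description improves by itself.
villa.creator = "Eileen Gray";
console.log(describeItem(villa));
// "villa by Eileen Gray in Roquebrune-Cap-Martin, France"
```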
As for Brian's suggestion:
It would be a step forward; we can even invent a whole template-type syntax
for transcluding bits of actual data into the description. But IMO, that
kind of effort would still be better spent on fully-automatic descriptions,
because that's the ideal that semi-automatic descriptions can only approach.
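A template-type syntax of this kind could look roughly like the sketch below. The `{{field}}` placeholder syntax, the function name, and the field names are all assumptions made up for illustration; nothing like this exists in the tools discussed here.

```javascript
// Hypothetical placeholder syntax: {{field}} is replaced by the item's current value.
// Placeholders with no matching data are left untouched so the gap stays visible.
function renderDescription(template, data) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, field) =>
    Object.prototype.hasOwnProperty.call(data, field) ? String(data[field]) : match
  );
}

const albumTemplate = "{{year}} live album by Mexican pop singer {{performer}}";
console.log(renderDescription(albumTemplate, { year: 2014, performer: "Fey" }));
// "2014 live album by Mexican pop singer Fey"
```

The human still writes the sentence; only the bits that come from data are filled in at render time.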
On Tue, Aug 18, 2015 at 10:36 PM, Brian Gerstle bgerstle@wikimedia.org wrote:
Could there be a way to have our nicely curated description cake and eat
it too? For example, interpolating data into the description and/or marking
data points which are referenced in the description (so as to mark it as
outdated when they change)?
I appreciate the potential benefits of generated descriptions (and other
things), but Monte's examples might have swayed me towards human
curated—when available.
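One way to read the "mark it as outdated" idea is sketched below, under assumptions: a curated description stores a snapshot of the data points it mentions, and is flagged when any of them later changes. The function names, the record shape, and the `winner`/`edition` fields are hypothetical.

```javascript
// Store the curated text together with the values of the fields it refers to.
function makeCuratedDescription(text, fields, data) {
  const snapshot = {};
  for (const field of fields) snapshot[field] = data[field];
  return { text, snapshot };
}

// A description is outdated when any referenced field no longer matches its snapshot.
// Strict equality only, so this sketch handles primitive values (strings, numbers).
function isOutdated(description, data) {
  return Object.keys(description.snapshot).some(
    (field) => description.snapshot[field] !== data[field]
  );
}

const cup = { edition: 6, winner: "Castile and Leon" };
const curated = makeCuratedDescription(
  "6th UEFA Regions' Cup, won by Castile and Leon",
  ["edition", "winner"],
  cup
);
console.log(isOutdated(curated, cup)); // false

// Hypothetical correction to the underlying data, purely for illustration.
cup.winner = "Some other team";
console.log(isOutdated(curated, cup)); // true
```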
On Tuesday, August 18, 2015, Monte Hurd mhurd@wikimedia.org wrote:
Ok, so I just did what I proposed. I went to random enwiki articles and
described the first ten I found which didn't already have descriptions:
- "Courage Under Fire", *1996 film about a Gulf War friendly-fire
incident*
- "Pebasiconcha immanis", *largest known species of land snail, extinct*
- "List of Kenyan writers", *notable Kenyan authors*
- "Solar eclipse of December 14, 1917", *annular eclipse which lasted 77
seconds*
- "Natchaug Forest Lumber Shed", *historic Civilian Conservation Corps
post-and-beam building*
- "Sun of Jamaica (album)", *debut 1980 studio album by Goombay Dance
Band*
- "E-1027", *modernist villa in France by architect Eileen Gray*
- "Daingerfield State Park", *park in Morris County, Texas, USA,
bordering Lake Daingerfield*
- "Todo Lo Que Soy-En Vivo", *2014 Live album by Mexican pop singer Fey*
- "2009 UEFA Regions' Cup", *6th UEFA Regions' Cup, won by Castile and
Leon*
And here are the respective descriptions from Magnus' (quite excellent)
autodesc.js:
- "Courage Under Fire", *1996 film by Edward Zwick, produced by John
Davis and David T. Friendly from United States of America*
- "Pebasiconcha immanis", *species of Mollusca*
- "List of Kenyan writers", *Wikimedia list article*
- "Solar eclipse of December 14, 1917", *solar eclipse*
- "Natchaug Forest Lumber Shed", *Construction in Connecticut, United
States of America*
- "Sun of Jamaica (album)", *album*
- "E-1027", *villa in Roquebrune-Cap-Martin, France*
- "Daingerfield State Park", *state park and state park of a state of
the United States in Texas, United States of America*
- "Todo Lo Que Soy-En Vivo", *live album by Fey*
- "2009 UEFA Regions' Cup", *none*
Thoughts?
Just trying to make my own bold assertions falsifiable :)
On Tue, Aug 18, 2015 at 6:32 PM, Monte Hurd mhurd@wikimedia.org wrote:
The whole human-vs-extracted descriptions quality question could be
fairly easy to test I think:
- Pick some number of articles at random.
- Run them through a description extraction script.
- Have a human describe the same articles with, say, the app interface I
demo'ed.
If nothing else this exercise could perhaps make what's thus far been a
wildly abstract discussion more concrete.
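The three steps above could be wired together roughly as follows. This is only a sketch under assumptions: the helper names are hypothetical, the candidate titles and both sets of descriptions are passed in as plain objects, and no call is made to any wiki API or to autodesc.js.

```javascript
// Step 1: pick n titles at random (Fisher-Yates shuffle on a copy).
// The random source is injectable so a run can be reproduced.
function sampleTitles(titles, n, random = Math.random) {
  const pool = titles.slice();
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Steps 2 and 3: line up the extracted and human descriptions per title.
// Missing entries become null so gaps are visible in the comparison.
function compareDescriptions(titles, human, automatic) {
  return titles.map((title) => ({
    title,
    human: title in human ? human[title] : null,
    automatic: title in automatic ? automatic[title] : null,
  }));
}

const candidates = ["Sun of Jamaica (album)", "E-1027", "List of Kenyan writers"];
const picked = sampleTitles(candidates, 2);
const rows = compareDescriptions(
  picked,
  { "Sun of Jamaica (album)": "debut 1980 studio album by Goombay Dance Band" },
  { "Sun of Jamaica (album)": "album", "E-1027": "villa in Roquebrune-Cap-Martin, France" }
);
console.log(rows);
```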
On Tue, Aug 18, 2015 at 6:17 PM, Monte Hurd mhurd@wikimedia.org wrote:
If having the most elegant description extraction mechanism was the
goal I would totally agree ;)
On Tue, Aug 18, 2015 at 5:19 PM, Dmitry Brant dbrant@wikimedia.org
wrote:
IMO, allowing the user to edit the description is a missed opportunity
to make the user edit the actual *data*, such that the description is
generated correctly.
On Tue, Aug 18, 2015 at 8:02 PM, Monte Hurd mhurd@wikimedia.org
wrote:
IMO, if the goal is quality, then human curated descriptions are
superior until such time as the auto-generation script passes the Turing
test ;)
I see these empty descriptions as an amazing opportunity to give
*everyone* an easy new way to edit. I whipped an app editing interface up
at the Lyon hackathon:
bluetooth720 https://www.youtube.com/watch?v=6VblyGhf_c8
I used it to add a couple hundred descriptions in a single day just
by hitting "random" then adding descriptions for articles which didn't have
them.
I'd love to try a limited test of this in production to get a sense
for how effective human curation can be if the interface is easy to use...
On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali jan.ainali@wikimedia.se
wrote:
> Nice one!
>
> Does not appear to work on svwiki though. Could that be because the
> wiki in question does not display that tagline?
>
>
> *Kind regards, Jan Ainali*
>
> Executive Director, Wikimedia Sverige http://wikimedia.se
> 0729 - 67 29 48
>
>
> *Imagine a world in which every person has free access to the sum of
> all human knowledge. That is what we do.*
> Become a member. http://blimedlem.wikimedia.se
>
>
> 2015-08-18 17:23 GMT+02:00 Magnus Manske <
> magnusmanske@googlemail.com>:
>
>> Show automatic description underneath "From Wikipedia...":
>> https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js
>>
>> To use, add:
>> importScript('User:Magnus_Manske/autodesc.js');
>> to your common.js
>>
>> On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane023@gmail.com
>> wrote:
>>
>>> It would be even better if this (short: 3 field max)
>>> pipe-separated list was available as a gadget to wikidatans on Wikipedia
>>> (like me). I can't see if a page I am on has an "instance of" (though it
>>> should) and I can see the description thanks to another gadget (sorry no
>>> idea which one that is). Often I will update empty descriptions, but if I
>>> was served basic fields (so for a painting, the creator field), I would
>>> click through to update that too.
>>>
>>> On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) <
>>> nemowiki@gmail.com> wrote:
>>>
>>>> Jane Darnell, 15/08/2015 08:53:
>>>>
>>>>> Yes but even if the descriptions were just the contents of fields
>>>>> separated by a pipe it would be better than nothing.
>>>>>
>>>>
>>>> +1, item descriptions are mostly useless in my experience.
>>>>
>>>> As for "get into production on Wikipedia" I don't know what it
>>>> means, I certainly don't like 1) mobile-specific features, 2) overriding
>>>> existing manually curated content; but it's good to 3) fill gaps. Mobile
>>>> folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :)
>>>>
>>>> Nemo
>>>>
>>>
>>> _______________________________________________
>>> Mobile-l mailing list
>>> Mobile-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
>>>
>>
--
Dmitry Brant
Mobile Apps Team (Android)
Wikimedia Foundation
https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
--
EN Wikipedia user page: https://en.wikipedia.org/wiki/User:Brian.gerstle
IRC: bgerstle