Oh, and as for examples, random-paging just got me this:
https://en.wikipedia.org/wiki/Jules_Malou
Manual description: Belgian politician
Automatic description: Belgian politician and lawyer, Prime Minister of Belgium, and member of the Chamber of Representatives of Belgium (1810–1886) ♂
I know which one I'd prefer...
On Wed, Aug 19, 2015 at 10:50 AM Magnus Manske magnusmanske@googlemail.com wrote:
Thank you Dmitry! Well phrased and to the point!
As for "templating", that might be the worst of both worlds: it lacks the flexibility and gradual improvement of automatic descriptions, yet is harder for people to enter than "free-style" text. We have a Visual Editor on Wikipedia for a reason :-)
On Wed, Aug 19, 2015 at 4:07 AM Dmitry Brant dbrant@wikimedia.org wrote:
My thoughts, as ever(!), are as follows:
- The tool that generates the descriptions deserves a lot more development. Magnus' tool is very much a prototype, and represents a tiny glimpse of what's possible. Looking at its current output is a straw man.
- Auto-generated descriptions work for current articles, and *all future articles*. They automatically adapt to updated data. They automatically become more accurate as new data is added.
- When you edit the descriptions yourself, you're not really making a meaningful contribution to the *data* that underpins the given Wikidata entry; i.e. you're not contributing any new information. You're simply paraphrasing the first sentence or two of the Wikipedia article. That can't possibly be a productive use of contributors' time.
As for Brian's suggestion: It would be a step forward; we can even invent a whole template-type syntax for transcluding bits of actual data into the description. But IMO, that kind of effort would still be better spent on fully-automatic descriptions, because that's the ideal that semi-automatic descriptions can only approach.
On Tue, Aug 18, 2015 at 10:36 PM, Brian Gerstle bgerstle@wikimedia.org wrote:
Could there be a way to have our nicely curated description cake and eat it too? For example, interpolating data into the description and/or marking data points which are referenced in the description (so as to mark it as outdated when they change)?
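One way to picture the interpolation and outdated-marking idea is the minimal sketch below. It is only an illustration: the `{field}` placeholder syntax, the function names, and the field names are all assumptions made up for this example, not anything that exists in Wikidata or in autodesc.js.

```javascript
// Hypothetical sketch: a curated description with {placeholders} that are
// filled from an item's data, plus a record of which fields it relies on.

// Fill every {field} placeholder from `data`; unknown fields are left as-is.
function renderDescription(template, data) {
  return template.replace(/\{(\w+)\}/g, (match, field) =>
    Object.prototype.hasOwnProperty.call(data, field) ? String(data[field]) : match
  );
}

// Snapshot the values of the fields a hand-written description refers to.
function snapshotFields(fields, data) {
  const snapshot = {};
  for (const field of fields) snapshot[field] = data[field];
  return snapshot;
}

// A description is flagged as outdated when any referenced field has changed.
function isOutdated(snapshot, data) {
  return Object.keys(snapshot).some((field) => snapshot[field] !== data[field]);
}

// Example data (field names are made up, not Wikidata property IDs).
const item = { year: 1996, genre: "film", director: "Edward Zwick" };

const text = renderDescription("{year} {genre} about a Gulf War friendly-fire incident", item);
console.log(text); // 1996 film about a Gulf War friendly-fire incident

const snapshot = snapshotFields(["year", "genre"], item);
console.log(isOutdated(snapshot, item)); // false
console.log(isOutdated(snapshot, { ...item, year: 1997 })); // true
```

The human-written part of the text stays untouched; only the interpolated values follow the data, and the snapshot gives a cheap signal for when a curated description deserves a second look.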
I appreciate the potential benefits of generated descriptions (and other things), but Monte's examples might have swayed me towards human curated—when available.
On Tuesday, August 18, 2015, Monte Hurd mhurd@wikimedia.org wrote:
Ok, so I just did what I proposed. I went to random enwiki articles and described the first ten I found which didn't already have descriptions:
- "Courage Under Fire", *1996 film about a Gulf War friendly-fire incident*
- "Pebasiconcha immanis", *largest known species of land snail, extinct*
- "List of Kenyan writers", *notable Kenyan authors*
- "Solar eclipse of December 14, 1917", *annular eclipse which lasted 77 seconds*
- "Natchaug Forest Lumber Shed", *historic Civilian Conservation Corps post-and-beam building*
- "Sun of Jamaica (album)", *debut 1980 studio album by Goombay Dance Band*
- "E-1027", *modernist villa in France by architect Eileen Gray*
- "Daingerfield State Park", *park in Morris County, Texas, USA, bordering Lake Daingerfield*
- "Todo Lo Que Soy-En Vivo", *2014 Live album by Mexican pop singer Fey*
- "2009 UEFA Regions' Cup", *6th UEFA Regions' Cup, won by Castile and Leon*
And here are the respective descriptions from Magnus' (quite excellent) autodesc.js:
- "Courage Under Fire", *1996 film by Edward Zwick, produced by John Davis and David T. Friendly from United States of America*
- "Pebasiconcha immanis", *species of Mollusca*
- "List of Kenyan writers", *Wikimedia list article*
- "Solar eclipse of December 14, 1917", *solar eclipse*
- "Natchaug Forest Lumber Shed", *Construction in Connecticut, United States of America*
- "Sun of Jamaica (album)", *album*
- "E-1027", *villa in Roquebrune-Cap-Martin, France*
- "Daingerfield State Park", *state park and state park of a state of the United States in Texas, United States of America*
- "Todo Lo Que Soy-En Vivo", *live album by Fey*
- "2009 UEFA Regions' Cup", *none*
Thoughts?
Just trying to make my own bold assertions falsifiable :)
On Tue, Aug 18, 2015 at 6:32 PM, Monte Hurd mhurd@wikimedia.org wrote:
The whole human-vs-extracted descriptions quality question could be fairly easy to test I think:
- Pick some number of articles at random.
- Run them through a description extraction script.
- Have a human describe the same articles with, say, the app interface I demo'ed.
If nothing else this exercise could perhaps make what's thus far been a wildly abstract discussion more concrete.
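The three steps could be wired together roughly as follows. This is a sketch under stated assumptions: `pickRandom`, `compare`, and the two describer callbacks are hypothetical names, and the callbacks simply look up sample answers quoted elsewhere in this thread instead of calling a real extraction script or the app.

```javascript
// Hypothetical harness for the three steps above. The describer callbacks are
// stand-ins that look up sample answers from this thread; no real script or
// app is called.

const manual = {
  "E-1027": "modernist villa in France by architect Eileen Gray",
  "Sun of Jamaica (album)": "debut 1980 studio album by Goombay Dance Band",
  "List of Kenyan writers": "notable Kenyan authors",
};
const automatic = {
  "E-1027": "villa in Roquebrune-Cap-Martin, France",
  "Sun of Jamaica (album)": "album",
  "List of Kenyan writers": "Wikimedia list article",
};

// Step 1: pick `count` titles at random (Fisher-Yates shuffle on a copy).
function pickRandom(titles, count, random = Math.random) {
  const pool = [...titles];
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

// Steps 2 and 3: collect both descriptions for each sampled title.
function compare(titles, describeAutomatically, describeManually) {
  return titles.map((title) => ({
    title,
    automatic: describeAutomatically(title),
    manual: describeManually(title),
  }));
}

const sample = pickRandom(Object.keys(manual), 2);
const rows = compare(sample, (t) => automatic[t], (t) => manual[t]);
for (const row of rows) {
  console.log(`${row.title}\n  automatic: ${row.automatic}\n  manual:    ${row.manual}`);
}
```

Printing the two descriptions side by side is enough for an informal read-through; scoring them would need reviewers and criteria that this sketch does not attempt to define.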
On Tue, Aug 18, 2015 at 6:17 PM, Monte Hurd mhurd@wikimedia.org wrote:
If having the most elegant description extraction mechanism was the goal I would totally agree ;)
On Tue, Aug 18, 2015 at 5:19 PM, Dmitry Brant dbrant@wikimedia.org wrote:
> IMO, allowing the user to edit the description is a missed
> opportunity to make the user edit the actual *data*, such that the
> description is generated correctly.
>
> On Tue, Aug 18, 2015 at 8:02 PM, Monte Hurd mhurd@wikimedia.org
> wrote:
>
>> IMO, if the goal is quality, then human curated descriptions are
>> superior until such time as the auto-generation script passes the Turing
>> test ;)
>>
>> I see these empty descriptions as an amazing opportunity to give
>> *everyone* an easy new way to edit. I whipped an app editing interface up
>> at the Lyon hackathon:
>> bluetooth720 https://www.youtube.com/watch?v=6VblyGhf_c8
>>
>> I used it to add a couple hundred descriptions in a single day just
>> by hitting "random" then adding descriptions for articles which didn't have
>> them.
>>
>> I'd love to try a limited test of this in production to get a sense
>> for how effective human curation can be if the interface is easy to use...
>>
>> On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali <jan.ainali@wikimedia.se> wrote:
>>
>>> Nice one!
>>>
>>> Does not appear to work on svwiki though. Does it have something
>>> to do with that the wiki in question does not display that tagline?
>>>
>>> *Kind regards, Jan Ainali*
>>>
>>> Executive Director, Wikimedia Sverige http://wikimedia.se
>>> 0729 - 67 29 48
>>>
>>> *Imagine a world in which every human being has free access to
>>> the sum of all human knowledge. That is what we do.*
>>> Become a member. http://blimedlem.wikimedia.se
>>>
>>> 2015-08-18 17:23 GMT+02:00 Magnus Manske <magnusmanske@googlemail.com>:
>>>
>>>> Show automatic description underneath "From Wikipedia...":
>>>> https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js
>>>>
>>>> To use, add:
>>>> importScript ( 'User:Magnus_Manske/autodesc.js' ) ;
>>>> to your common.js
>>>>
>>>> On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane023@gmail.com
>>>> wrote:
>>>>
>>>>> It would be even better if this (short: 3 field max)
>>>>> pipe-separated list was available as a gadget to wikidatans on Wikipedia
>>>>> (like me). I can't see if a page I am on has an "instance of" (though it
>>>>> should) and I can see the description thanks to another gadget (sorry no
>>>>> idea which one that is). Often I will update empty descriptions, but if I
>>>>> was served basic fields (so for a painting, the creator field), I would
>>>>> click through to update that too.
>>>>>
>>>>> On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
>>>>>
>>>>>> Jane Darnell, 15/08/2015 08:53:
>>>>>>
>>>>>>> Yes but even if the descriptions were just the contents of
>>>>>>> fields separated by a pipe it would be better than nothing.
>>>>>>
>>>>>> +1, item descriptions are mostly useless in my experience.
>>>>>>
>>>>>> As for "get into production on Wikipedia" I don't know what it
>>>>>> means, I certainly don't like 1) mobile-specific features, 2) overriding
>>>>>> existing manually curated content; but it's good to 3) fill gaps. Mobile
>>>>>> folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :)
>>>>>>
>>>>>> Nemo
>>>>>
>>>>> _______________________________________________
>>>>> Mobile-l mailing list
>>>>> Mobile-l@lists.wikimedia.org
>>>>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
>
> --
> Dmitry Brant
> Mobile Apps Team (Android)
> Wikimedia Foundation
> https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
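The pipe-separated fallback Jane describes in the quoted messages (at most three fields) might look something like this. It is a rough sketch only: `pipeDescription`, the field names, and their ordering are assumptions for illustration, not a real Wikidata schema or gadget API.

```javascript
// Hypothetical sketch of a pipe-separated fallback description.
// Field names and their order are illustrative, not a real Wikidata schema.
function pipeDescription(fields, preferredOrder, maxFields = 3) {
  return preferredOrder
    .filter((name) => fields[name] !== undefined && fields[name] !== "")
    .slice(0, maxFields)
    .map((name) => fields[name])
    .join(" | ");
}

// Sample values taken from descriptions quoted in this thread.
const villa = {
  "instance of": "villa",
  architect: "Eileen Gray",
  location: "Roquebrune-Cap-Martin",
  country: "France",
};

console.log(pipeDescription(villa, ["instance of", "architect", "location", "country"]));
// villa | Eileen Gray | Roquebrune-Cap-Martin
```

Missing fields are skipped rather than shown as blanks, so an item with only an "instance of" value still yields a short, usable line.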
--
EN Wikipedia user page: https://en.wikipedia.org/wiki/User:Brian.gerstle
IRC: bgerstle

--
Dmitry Brant
Mobile Apps Team (Android)
Wikimedia Foundation
https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l