On Mon Feb 09 2015 at 12:08:01 Amir E. Aharoni <amir.aharoni(a)mail.huji.ac.il>
wrote:
Manual descriptions are not an entire waste of time.
I never said that, so please don't put words in my mouth. I did say, on
several occasions, that, for the vast majority of items, manual
descriptions are a waste of volunteer time. For cases where "the point" of
the item is not easily expressible as statements, manual descriptions are
indeed useful. But the number of cases where this applies will only shrink,
as we get more properties on Wikidata, and as the description generators
get better.
Magnus writes in his post <http://magnusmanske.de/wordpress/?p=265>:
And some people have seen my Reasonator
<http://tools.wmflabs.org/reasonator/?q=Q1339> tool, where (for some item
types, and some languages) rather long descriptions can be generated.
It's not necessarily good that they are long. For the mobile app it's better
if they are short.
Which is one reason my automatic description API defaults to short
descriptions, unless you ask for a long one. Even then, it will default to
short, unless the language/item type combination is covered. The short
description generator covers more languages, and is easily extendable.
Check out "stock" at
https://bitbucket.org/magnusmanske/autodesc/src/019b395c1bd5e13720e5cfda4df…
I think the results are, at the very least, understandable by humans; and
yes, these things tend to get better very quickly (developers! developers!
<throws chair>).
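The defaulting behaviour described above (short unless a long description is requested, and short anyway when the language/item type combination is not covered) could be sketched roughly as follows. This is only an illustration under stated assumptions: `LONG_GENERATORS`, `short_description` and `describe` are hypothetical names, not the actual autodesc code or API.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: (language, item type) -> long-description generator.
# Only combinations registered here count as "covered" for long descriptions.
LONG_GENERATORS: Dict[Tuple[str, str], Callable[[dict], str]] = {}


def short_description(item: dict, lang: str) -> str:
    """Stand-in for a short generator that works for any language."""
    return f"{item['type']} [{lang}]"


def describe(item: dict, lang: str, mode: str = "short") -> str:
    """Return a short description unless a long one is requested AND covered."""
    if mode == "long":
        generator = LONG_GENERATORS.get((lang, item["type"]))
        if generator is not None:
            return generator(item)
    # Default, and also the fallback for uncovered language/type combinations.
    return short_description(item, lang)
```

With this shape, adding coverage for a new language/item type pair is just another entry in the registry; callers never need to know whether a long generator exists.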
But the "some item types, and some languages" part is the real problem.
Only some. It's quite possible that in the future Reasonator will cover all
languages and all data types and will also be tweaked to provide appropriate
length, maybe even different lengths according to context. Reasonator
natural language sentence creation works for a very small number of
languages. If it was as easy to translate it as it is to translate
MediaWiki UI messages, I wouldn't object to its wider use, but AFAIK this is
not the case now.
And it's not that good for English either. Reasonator is not smart enough
at the moment to describe people with several qualifications. The current
Reasonator-generated description of Peter Garrett
<https://tools.wmflabs.org/reasonator/?find=peter+garrett> is vastly
inferior to the manually-written description. Compare:
1. "Australian singer and politician, Minister for School Education, Early
Childhood and Youth, Minister for Sustainability, Environment, Water,
Population and Communities (Australia), and Member of the Australian House
of Representatives (*1953) ♂"
2. "Australian politician and Midnight Oil lead singer".
Basic human intuition tells me that for most Wikipedia readers, who simply
want to know "Who is Peter Garrett?", #2 is far more useful. #1 has
oversize descriptions of all his political roles, and *doesn't* have the
name of the rock band that made him popular. This is just one example out of
hundreds of thousands that could be brought up. For what it's worth, #2 is
also easier to translate manually.
It's important to emphasize at this point that I have the utmost respect
for Magnus's brilliant work. It's just not ready to completely replace the
manual descriptions.
Thanks, and it should not. But a little developer time can save megahours
(new unit!) of volunteers performing needless work.
A practical solution for now is to have a system for manual translation of
descriptions, which shows the Reasonator descriptions as a translation aid,
similarly to how the Translate extension shows translation memory
suggestions. Also, a way to manually tweak descriptions can take Reasonator
further, for example a way to tell it that for the Peter Garrett item
there's no need to include a long list of all his roles in the Australian
government.
Oh, and even if you can run away some day from manually translating
descriptions, you cannot run away from manually translating labels. At
most, some can be copied from Wikipedia, but even then many of them need
post-import fixing.
So all of this brings me back to
https://phabricator.wikimedia.org/T64695
Yes, labels are a different beast. Though there are some things we could
automate; basically, all people with a German label and no English one
could use the German one in English just as well. Those cases aside,
automated translation can be helpful, but can also go horribly wrong in
some cases.
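The label idea above (reuse the German label in English for people who have no English label yet) could look something like the sketch below. It assumes the item's labels arrive as a plain language-to-text dict and its "instance of" values as a set of item IDs; the function name is hypothetical, and Q5 is the Wikidata item for "human".

```python
from typing import Dict, Optional, Set

HUMAN = "Q5"  # Wikidata item for "human"


def suggest_english_label(labels: Dict[str, str],
                          instance_of: Set[str]) -> Optional[str]:
    """Suggest an English label copied from German, for people only."""
    if "en" in labels:
        return None  # an English label already exists, nothing to suggest
    if HUMAN not in instance_of:
        return None  # only personal names are assumed safe to copy as-is
    return labels.get("de")  # None when there is no German label either
```

Returning a suggestion rather than writing the label directly leaves room for the post-import fixing mentioned earlier in the thread.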
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
“We're living in pieces,
I want to live in peace.” – T. Moore
2015-02-09 12:58 GMT+02:00 Daniel Kinzler <daniel.kinzler(a)wikimedia.de>:
@Gerard, @Magnus: please help me out here.
I agree that automatic descriptions are very useful. I also think that in
*some* cases, manual descriptions are more useful, and maybe even needed.
I definitely think that 3rd party consumers of wikidata should not have to
think about whether descriptions have been written manually or were created
automatically. This should be completely transparent.
So, if you want to help with making automated description a reality, please
make suggestions that take into account the above points, and also consider
the mechanisms for language fallback.
The only thing that I can think of right away is simply inserting automated
descriptions by bot. This isn't ideal, but I can't think of a better solution
that wouldn't be hugely complicated (and would thus not be implemented any
time soon). Maybe you have ideas?
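One possible shape for the requirements listed above (manual descriptions win where they exist, consumers never see whether a text was manual or generated, and language fallback is respected) is sketched below. This is not a description of how Wikibase works; `resolve_description`, the `auto` callback and the per-language ordering are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional


def resolve_description(manual: Dict[str, str],
                        fallback_chain: List[str],
                        auto: Callable[[str], Optional[str]]) -> Optional[str]:
    """Walk the language fallback chain and return the first usable text.

    For each language, a manual description is preferred over a generated
    one. Only the text is returned, so the caller cannot tell the two apart.
    """
    for lang in fallback_chain:
        text = manual.get(lang)
        if text:
            return text
        text = auto(lang)  # hypothetical generator; None if not covered
        if text:
            return text
    return None
```

The ordering is a design choice: here a generated description in the user's own language beats a manual one in a fallback language. Trying all manual descriptions first would be an equally plausible policy.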
-- daniel
Am 09.02.2015 um 11:41 schrieb Magnus Manske:
Manual descriptions are, in the vast majority of cases, a waste of volunteer
time. Alternative:
http://magnusmanske.de/wordpress/?p=265
On Sun Feb 08 2015 at 17:37:42 Gerard Meijssen <gerard.meijssen(a)gmail.com>
wrote:
Hoi,
How does that help? The point is exactly that there is no point to
descriptions. Why iterate on a dog? It will still be a mutt.
Thanks,
GerardM
On 8 February 2015 at 14:07, Amir E. Aharoni <amir.aharoni(a)mail.huji.ac.il>
wrote:
I'd rather see it not as something terribly disappointing, but as an
opportunity to find a way to fill item descriptions more efficiently.
Basically, to find some cycles to resolve
https://phabricator.wikimedia.org/T64695
On 8 Feb 2015 10:33, "Gerard Meijssen" <gerard.meijssen(a)gmail.com> wrote:
Hoi,
I understand that item descriptions are going to be used in a mobile
app. In my opinion that is seriously disappointing because it is not
realistic to expect enough coverage in any language. Particularly in
the small languages it will not be really useful.
My question is: we have had automated descriptions for a long time.
What is it that keeps them from being used?
Thanks,
GerardM
_______________________________________________
Wikidata-l mailing list
Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
--
Daniel Kinzler
Senior Software Developer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.