On Aug 15, 2015 14:06, "Magnus Manske" <magnusmanske@googlemail.com> wrote:
>
>
>
> On Sat, Aug 15, 2015 at 7:38 AM Lydia Pintscher <lydia.pintscher@wikimedia.de> wrote:
>>
>> On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry <dgarry@wikimedia.org> wrote:
>> > I've seen arguments on both sides here. Some say automatically generated
>> > descriptions are not good enough. Some say they are. Why don't we gather
>> > some data on this and use that to decide what's right? :-)
>>
>> Please do. Especially pay attention to languages other than English
>> though. Because even if we get algorithms to write good descriptions
>> for English, are we going to do the same for all the other languages?
>> Especially those where grammar is tricky and Wikidata doesn't even
>> have the necessary information to make the grammar right? The other
>> tricky side is determining why something is actually notable. That's
>> not a trivial thing to determine based on the data we have.
>>
>
> And you know very well that (AFAIK) I am the only one who actually worked on this, in a tiny fraction of my spare time, and I only speak German and English.
>
> The /real/ questions here are:
> 1. The languages that are actually implemented: are they returning descriptions that are good/OK/bad/plain wrong?
> 2. What could be achieved, on the existing or similar infrastructure, in a short period of time, if we drive to get code snippets (or equivalent) for other languages from volunteers?
> 3. What could be achieved, medium/long term, if we had a proper linguist to work on the problem? Or someone who has worked with multi-language text generation before?
>
> I've just been winging it so far. Current auto-descriptions are not the best we can do. They are, frankly, the WORST we can do. This is a starting point, not the end product.

Yeah, I understand. And this is not a criticism of your work. I think it is actually rather cool. The question is whether it is a good idea to keep pushing to get it into production on Wikipedia on a large scale.