On Mon Feb 09 2015 at 11:27:06 Daniel Kinzler <daniel.kinzler@wikimedia.de> wrote:
On 09.02.2015 at 12:17, Magnus Manske wrote:
> My autodesc API serves both at the moment, so the consumer can decide which one
> they want to use. Automatic descriptions can "miss the point" sometimes, but are
> generally more up-to-date.

Can you post a link for us to play with?

Interface at
https://tools.wmflabs.org/autodesc/

Example JSONFM:
https://tools.wmflabs.org/autodesc/?q=Q3184929&lang=&mode=short&links=text&redlinks=&format=jsonfm
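For scripting against the API, the example link above can be rebuilt from its query parameters. The following is only a sketch: the parameter names (q, lang, mode, links, redlinks, format) and their values are taken from that link, while the function name `autodesc_url` and its defaults are made up for illustration. Nothing here is documented API behaviour.

```python
from urllib.parse import urlencode

# Base URL as given in this mail.
AUTODESC_BASE = "https://tools.wmflabs.org/autodesc/"


def autodesc_url(item_id, lang="", mode="short", links="text",
                 redlinks="", fmt="jsonfm"):
    """Build a query URL shaped like the example link (hypothetical helper).

    The defaults simply mirror the values in the example link; what each
    value means is up to the service, not this sketch.
    """
    params = {
        "q": item_id,
        "lang": lang,
        "mode": mode,
        "links": links,
        "redlinks": redlinks,
        "format": fmt,
    }
    # urlencode keeps empty values as "key=", matching the example link.
    return AUTODESC_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(autodesc_url("Q3184929"))
```

With the defaults, this reproduces the JSONFM link above character for character.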
 

In any case, the mobile app would need a production grade service, so it would
have to wait until this is fully integrated with wikibase and live on wikidata.

I understand that. In my blog post about the API:
http://magnusmanske.de/wordpress/?p=265
I point out that it is not exactly production-quality yet :-)
 

>     So, if you want to help with making automated description a reality, please make
>     suggestions that take into account the above points, and also consider the
>     mechanisms for language fallback.
>
>
> From my point of view, this is the "evolution" of automatic descriptions (ADs):
> 1. web-based tools as proof-of-concept. This is done.
> 2. web-based API to standardise automatic descriptions, and make them easily
> accessible for everyone. I am trying to do that now.
> 3. WMF/Wikibase-team picks up the API code, or writes their own; integration
> into MediaWiki/extension, with proper language generation in many languages,
> good caching/invalidation, API integration etc. Waiting for that :-)

As Markus points out, this does not address the needs of dump consumers. If the
UI and API generate automatic summaries on the fly, there is very little
incentive for users to enter descriptions manually (which is the point, of
course). This means few descriptions in dumps.

To have the automatic summaries in the dumps, we would need to either
materialize them in the database (and then invalidate/update them when
appropriate), or generate them on the fly when creating the dump.
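To make the two options concrete, here is a rough sketch of each. Everything in it is hypothetical: `DescriptionStore`, `dump_on_the_fly`, and the `generate` callable stand in for whatever Wikibase would actually use, and invalidation is keyed on the item's own revision only. That is a simplification, since a generated description may also depend on the labels of other items, which this sketch ignores.

```python
class DescriptionStore:
    """Option 1 (hypothetical): materialize descriptions, regenerate on change."""

    def __init__(self, generate):
        self._generate = generate  # callable(item_id) -> str, assumed generator
        self._rows = {}            # item_id -> (revision, description)
        self.generated = 0         # how often the generator actually ran

    def get(self, item_id, revision):
        row = self._rows.get(item_id)
        # Invalidate when nothing is stored or the item has a newer revision.
        if row is None or row[0] != revision:
            self.generated += 1
            row = (revision, self._generate(item_id))
            self._rows[item_id] = row
        return row[1]


def dump_on_the_fly(item_ids, generate):
    """Option 2 (hypothetical): generate each description while writing the dump."""
    for item_id in item_ids:
        yield item_id, generate(item_id)


if __name__ == "__main__":
    def fake_generate(item_id):
        # Placeholder generator; real descriptions would come from item data.
        return "description of " + item_id

    store = DescriptionStore(fake_generate)
    store.get("Q1", revision=1)
    store.get("Q1", revision=1)  # served from the store, no regeneration
    store.get("Q1", revision=2)  # revision changed, regenerated
    print(store.generated)
    print(list(dump_on_the_fly(["Q1", "Q2"], fake_generate)))
```

The first option pays for storage and invalidation logic but keeps dump creation cheap; the second avoids stored state but runs the generator for every item on every dump.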

Just put them into wb_terms and not into the JSON. They could then be displayed, added to search results, and put into "description dumps". Maybe these could even be SQLite databases: there is little point in analysing automatic descriptions for their wording; you'd need them to display alongside an item, so SQLite could be a way of getting them quickly.
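As an illustration of what such a "description dump" might look like as an SQLite file, here is a minimal sketch. The table name, column names, and both helper functions are assumptions made up for this example, and the sample rows are placeholders rather than real descriptions.

```python
import sqlite3


def build_description_db(rows, path=":memory:"):
    """Write (item_id, lang, description) rows into a hypothetical dump schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS descriptions ("
        " item_id TEXT NOT NULL,"
        " lang TEXT NOT NULL,"
        " description TEXT NOT NULL,"
        " PRIMARY KEY (item_id, lang))"
    )
    # INSERT OR REPLACE lets a regenerated description overwrite the old one.
    conn.executemany(
        "INSERT OR REPLACE INTO descriptions VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def lookup(conn, item_id, lang):
    """Fetch one description by item and language, or None if absent."""
    row = conn.execute(
        "SELECT description FROM descriptions WHERE item_id = ? AND lang = ?",
        (item_id, lang),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # Placeholder rows, not real Wikidata content.
    conn = build_description_db([
        ("Q1", "en", "placeholder description"),
        ("Q1", "de", "Platzhalter-Beschreibung"),
    ])
    print(lookup(conn, "Q1", "en"))
    print(lookup(conn, "Q2", "en"))
```

The primary key on (item_id, lang) is what makes the "display with an item" lookup a single indexed read.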
 


In summary, I understand the issue, but it seems tricky to get the solution
right, both conceptually, and in terms of engineering.

--
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.

_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l