On Mon Feb 09 2015 at 11:27:06 Daniel Kinzler daniel.kinzler@wikimedia.de wrote:
On 09.02.2015 at 12:17, Magnus Manske wrote:
My autodesc API serves both at the moment, so the consumer can decide which one they want to use. Automatic descriptions can "miss the point" sometimes, but are generally more up-to-date.
Can you post a link for us to play with?
Interface at https://tools.wmflabs.org/autodesc/
Example JSONFM: https://tools.wmflabs.org/autodesc/?q=Q3184929&lang=&mode=short&...
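For anyone who wants to script against the endpoint, here is a minimal sketch of building such a request URL. It uses only the q, lang and mode parameters visible in the example above. The example URL is truncated, so any further parameters are left out rather than guessed. The helper name build_autodesc_url is hypothetical and not part of the tool.

```python
from urllib.parse import urlencode

# Base URL taken from the interface link above.
AUTODESC_BASE = "https://tools.wmflabs.org/autodesc/"


def build_autodesc_url(item_id, lang="", mode="short"):
    """Build a request URL for one item (hypothetical helper).

    Only the parameters shown in the example URL are set; the
    remaining ones are elided in the source and are not filled in here.
    """
    params = {"q": item_id, "lang": lang, "mode": mode}
    return AUTODESC_BASE + "?" + urlencode(params)


# Reproduces the visible part of the example URL.
print(build_autodesc_url("Q3184929"))
```

The function only assembles the URL and does not perform a request, so it can be tried without touching the service.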
In any case, the mobile app would need a production-grade service, so it would have to wait until this is fully integrated with Wikibase and live on Wikidata.
I understand that. In my blog post about the API (http://magnusmanske.de/wordpress/?p=265), I point out that it is not exactly production-quality yet :-)
So, if you want to help with making automated description a reality, please make suggestions that take into account the above points, and also consider the mechanisms for language fallback.
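To make the language fallback question concrete, here is a rough sketch of one possible policy: try manual descriptions along the fallback chain first, and only then fall back to an automatic description. The fallback chains, the function names and the ordering itself are all assumptions for illustration; the thread does not settle which policy is right, and the real chains would come from MediaWiki's language configuration.

```python
# Hypothetical fallback chains, for illustration only.
FALLBACKS = {
    "de-at": ["de", "en"],
    "de": ["en"],
    "pt-br": ["pt", "en"],
}


def fallback_chain(lang):
    """Return the requested language followed by its fallbacks."""
    default = ["en"] if lang != "en" else []
    return [lang] + FALLBACKS.get(lang, default)


def pick_description(manual, lang, generate):
    """Pick a description for `lang`.

    manual:   dict mapping language code to a manually entered description
    generate: callable taking a language code and returning an automatic
              description, or None if none can be generated
    Returns (text, language_used, source), or (None, None, None).
    """
    # Policy assumed here: any manual description in the chain wins.
    for candidate in fallback_chain(lang):
        text = manual.get(candidate)
        if text:
            return text, candidate, "manual"
    # Otherwise try automatic descriptions along the same chain.
    for candidate in fallback_chain(lang):
        text = generate(candidate)
        if text:
            return text, candidate, "automatic"
    return None, None, None
```

The opposite ordering (an automatic description in the requested language before a manual one in a fallback language) only requires swapping the two loops, which is exactly the kind of choice that would need to be decided.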
From my point of view, this is the "evolution" of automatic descriptions (ADs):
1. web-based tools as proof-of-concept. This is done.
2. web-based API to standardise automatic descriptions, and make them easily accessible for everyone. I am trying to do that now.
3. WMF/Wikibase-team picks up the API code, or writes their own; integration into MediaWiki/extension, with proper language generation in many languages, good caching/invalidation, API integration etc. Waiting for that :-)
As Markus points out, this does not address the needs of dump consumers. If the UI and API generate automatic summaries on the fly, there is very little incentive for users to enter descriptions manually (which is the point, of course). This means few descriptions in dumps.
To have the automatic summaries in the dumps, we would need to either materialize them in the database (and then invalidate/update them when appropriate), or generate them on the fly when creating the dump.
Just put them into wb_terms and not into the JSON. They could be displayed, added to search results, and put into "description dumps". Maybe these could even be sqlite databases, as there is little point analysing automatic descriptions for wording; you'd need these descriptions to display with an item, so sqlite could be a way of getting them quickly.
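As a rough illustration of the sqlite idea, a "description dump" could be as small as one table keyed by item and language. The table name, column names and helper functions below are assumptions made up for this sketch; they do not reflect the wb_terms layout or any existing dump format.

```python
import sqlite3


def build_description_dump(path, rows):
    """Write (item_id, lang, description) rows into a sqlite file.

    The schema is hypothetical: one row per item and language.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS descriptions ("
        " item_id TEXT NOT NULL,"
        " lang TEXT NOT NULL,"
        " description TEXT NOT NULL,"
        " PRIMARY KEY (item_id, lang))"
    )
    # INSERT OR REPLACE lets a regenerated description overwrite a stale one.
    conn.executemany(
        "INSERT OR REPLACE INTO descriptions VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def lookup(conn, item_id, lang):
    """Return the stored description, or None if there is none."""
    row = conn.execute(
        "SELECT description FROM descriptions WHERE item_id = ? AND lang = ?",
        (item_id, lang),
    ).fetchone()
    return row[0] if row else None


# In-memory database with a placeholder row, just to show the lookup.
conn = build_description_dump(
    ":memory:", [("Q42", "en", "placeholder description")]
)
print(lookup(conn, "Q42", "en"))
```

The primary key on (item_id, lang) keeps the lookup for display cheap, which is the only access pattern the suggestion above asks for.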
In summary, I understand the issue, but it seems tricky to get the solution right, both conceptually, and in terms of engineering.
-- Daniel Kinzler Senior Software Developer
Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l