On vacation, internet spotty. Quick thoughts:
* RESTBase seems a good way to cache automatic descriptions; it would need either a "wrapper" that generates a description on the fly or serves the cached version, or a bot that generates, stores, and updates automatic descriptions for all items in RESTBase
* Maybe only generate/update automatic descriptions for item types that have dedicated generator code (e.g. biographies), and for supported languages, at least initially? "Generic" English description should be understandable for most items, don't know for other languages
* Dmitry has done some work on a /proper/ AutoDesc implementation; couldn't try it out yet, sadly, but looks great so far
* It should be straightforward to (optionally) link words/names in descriptions to the #statement/property in the described item, to quickly edit wrong statements
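
A minimal sketch of the "wrapper" option from the first bullet, assuming a plain in-memory dict as a stand-in for RESTBase storage. The class name, the `generate` callback, and its signature are all hypothetical; nothing here reflects an actual RESTBase or AutoDesc API.

```python
from typing import Callable, Dict, Optional, Tuple


class DescriptionCache:
    """Serve a cached automatic description, generating it on a cache miss."""

    def __init__(self, generate: Callable[[str, str], Optional[str]]):
        # `generate` is a hypothetical generator: (item_id, lang) -> description,
        # returning None when no generator supports that item type or language.
        self._generate = generate
        # Stand-in for the real store (an assumption, not RESTBase itself).
        self._store: Dict[Tuple[str, str], str] = {}

    def get(self, item_id: str, lang: str) -> Optional[str]:
        key = (item_id, lang)
        if key not in self._store:
            description = self._generate(item_id, lang)
            if description is None:
                return None  # unsupported: cache nothing, let the caller fall back
            self._store[key] = description
        return self._store[key]

    def invalidate(self, item_id: str) -> None:
        """Drop every cached language variant for one item."""
        for key in [k for k in self._store if k[0] == item_id]:
            del self._store[key]
```

The bot option would instead call `get` for every item ahead of time; the lookup path stays the same either way.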

Love to see how people take this idea and run with it. That's the spirit! :-)


On Sat, Aug 22, 2015 at 11:07 AM Jane Darnell <jane023@gmail.com> wrote:
Yes. This should be a client feature, not a Wikidata feature (so something that is on Wikipedia and Commons)

On Fri, Aug 21, 2015 at 10:54 PM, Jan Ainali <jan.ainali@wikimedia.se> wrote:
I am with Ryan here, and I believe that is Magnus's idea too: the autodescription should not be a field in the database; it should be queried on the fly from the statements.

Kind regards,
Jan Ainali

Executive Director, Wikimedia Sverige
0729 - 67 29 48


Imagine a world where every person has free access to humanity's collected knowledge. That is what we do.


2015-08-21 21:26 GMT+02:00 Ryan Kaldari <rkaldari@wikimedia.org>:
If the way to 'edit' the autodescription is by changing the claims for the item, I support the idea. I would oppose, however, the autodescription being another text field you can edit directly. I think this would be very confusing for Wikidata editors, as each item would effectively just have 2 interchangeable description fields.

On Aug 21, 2015, at 11:21 AM, Jon Katz <jkatz@wikimedia.org> wrote:

This is a really interesting discussion and it seems that there is near-consensus that an automated description for entities without a manual description is not a bad idea, particularly if they are kept in a separate field.  Speak now if you feel that is not correct.

To S's suggestion: what steps do we need to take to put autodesc into wikis?
  • establish consensus with stakeholders outside this thread?
  • create new field?
  • rule out/protect against edge cases (are there length limits, for instance?)
  • ways to edit (explaining to a user how they can edit or override is going to be important)

Who should own it and create an epic to track?  Wikidata, Search, Reading?....

On Fri, Aug 21, 2015 at 10:27 AM, Monte Hurd <mhurd@wikimedia.org> wrote:
This is why the automatic description cache and the manual description need to be kept separate; just "pasting" the autodesc into the manual description field would mean it could never be updated automatically. That would be very bad indeed.

+1000!!!! Exactly! I was operating under the assumption we were talking about the existing "description" field. Separate auto and manual description fields completely avoid *all* of the issues/concerns I raised :)

On Thu, Aug 20, 2015 at 2:48 AM, Magnus Manske <magnusmanske@googlemail.com> wrote:
So it turns out that ValterVBot alone has created over 1.8 MILLION "manual" descriptions. And there are other bots that do this. We already HAVE automatic descriptions, we just store them in the "manual" field.

The worst of both worlds.

On Thu, Aug 20, 2015 at 9:24 AM Magnus Manske <magnusmanske@googlemail.com> wrote:
On Thu, Aug 20, 2015 at 1:43 AM Monte Hurd <mhurd@wikimedia.org> wrote:
True about algorithms never being finished, but aren't we essentially "stuck" with the first run output, unless I misunderstand how you envision this working?

(assuming you don't want to over-write non-blank descriptions the next time you improve and re-run the process)

Of course we're not "stuck" with the initial automatic descriptions! Whatever gave you that idea? Ideally, each description would be computed on-the-fly, but that won't scale; output needs to be cached, and invalidated when necessary.

Possible reasons for cache invalidation:
* The item statements have changed
* Items referenced in the description (e.g. country for nationality) have changed
* The algorithm has been improved
* After cache reached a certain age, just to make sure

This is why the automatic description cache and the manual description need to be kept separate; just "pasting" the autodesc into the manual description field would mean it could never be updated automatically. That would be very bad indeed.
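
The invalidation reasons listed above can be sketched as a single staleness check. This is only an illustration under stated assumptions: the record layout, the revision-number comparison, `ALGORITHM_VERSION`, and `MAX_AGE_SECONDS` are all hypothetical and not taken from any existing implementation.

```python
from dataclasses import dataclass
from typing import Dict

ALGORITHM_VERSION = 2             # hypothetical: bumped when the generator improves
MAX_AGE_SECONDS = 30 * 24 * 3600  # hypothetical maximum cache age (30 days)


@dataclass
class CachedDescription:
    text: str
    item_revision: int                    # revision of the described item
    referenced_revisions: Dict[str, int]  # revisions of items used in the text
    algorithm_version: int                # generator version that produced it
    created_at: float                     # Unix timestamp of generation


def is_stale(entry: CachedDescription,
             current_item_revision: int,
             current_referenced_revisions: Dict[str, int],
             now: float,
             algorithm_version: int = ALGORITHM_VERSION,
             max_age: float = MAX_AGE_SECONDS) -> bool:
    """Return True if any of the four invalidation reasons applies."""
    # 1. The item's statements have changed.
    if entry.item_revision != current_item_revision:
        return True
    # 2. An item referenced in the description has changed (or disappeared).
    for item_id, revision in entry.referenced_revisions.items():
        if current_referenced_revisions.get(item_id) != revision:
            return True
    # 3. The algorithm has been improved.
    if entry.algorithm_version != algorithm_version:
        return True
    # 4. The entry has passed the maximum age.
    return now - entry.created_at > max_age
```

Because the cached entry lives apart from the manual description, a stale entry can simply be regenerated without touching anything a human wrote.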


_______________________________________________
Mobile-l mailing list
Mobile-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mobile-l

