On Thu, Aug 20, 2015 at 1:43 AM Monte Hurd <mhurd@wikimedia.org> wrote:

> True about algorithms never being finished, but aren't we essentially "stuck" with the first run output, unless I misunderstand how you envision this working?
>
> (assuming you don't want to over-write non-blank descriptions the next time you improve and re-run the process)

Of course we're not "stuck" with the initial automatic descriptions! Whatever gave you that idea? Ideally, each description would be computed on-the-fly, but that won't scale; output needs to be cached, and invalidated when necessary.

Possible reasons for cache invalidation:

* The item statements have changed

* Items referenced in the description (e.g. country for nationality) have changed

* The algorithm has been improved

* The cache entry has reached a certain age (a periodic refresh, just to make sure)
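
As a rough illustration, the four reasons above could be checked along these lines. This is only a sketch: the field names, the version counter, and the 30-day age limit are assumptions made up for the example, not how any existing code works.

```python
import time
from dataclasses import dataclass

# Hypothetical values, chosen only for this example.
ALGORITHM_VERSION = 2               # bumped whenever the generator is improved
MAX_AGE_SECONDS = 30 * 24 * 3600    # arbitrary 30-day age limit


@dataclass
class CachedDescription:
    text: str
    item_revision: int              # revision of the item when the text was generated
    referenced_revisions: dict      # referenced item id -> revision at generation time
    algorithm_version: int
    created_at: float               # Unix timestamp


def invalidation_reason(entry, item_revision, referenced_revisions, now=None):
    """Return why the cached entry is stale, or None if it is still valid."""
    now = time.time() if now is None else now
    if entry.item_revision != item_revision:
        return "item statements changed"
    if entry.referenced_revisions != referenced_revisions:
        return "referenced item changed"
    if entry.algorithm_version != ALGORITHM_VERSION:
        return "algorithm improved"
    if now - entry.created_at > MAX_AGE_SECONDS:
        return "cache entry too old"
    return None
```

Comparing stored revision numbers is just one cheap way to detect changes; an event-driven purge on edit would serve the same purpose.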

This is why the automatic description cache and the manual description need to be kept separate; just "pasting" the autodesc into the manual description field would mean it could never be updated automatically. That would be very bad indeed.
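
To make the separation concrete, here is a minimal sketch with two independent stores. The function names and the plain dictionaries are hypothetical, and letting a manual description take precedence when one exists is an assumption of the example.

```python
def description_for(item_id, manual, auto_cache, generate):
    """Return the manual description if present, else the cached automatic one."""
    manual_text = manual.get(item_id)
    if manual_text:
        return manual_text
    if item_id not in auto_cache:
        # Cache miss: compute the automatic description and remember it.
        auto_cache[item_id] = generate(item_id)
    return auto_cache[item_id]


def refresh_auto(item_id, auto_cache, generate):
    """Regenerate the automatic description; the manual store is never touched."""
    auto_cache[item_id] = generate(item_id)


# Usage: an improved generator replaces the cached text on refresh.
manual = {"Q1": "hand-written description"}
auto_cache = {}
first = description_for("Q2", manual, auto_cache, lambda i: "v1 description of " + i)
refresh_auto("Q2", auto_cache, lambda i: "v2 description of " + i)
second = description_for("Q2", manual, auto_cache, lambda i: "unused")
```

Because the automatic text only ever lives in `auto_cache`, re-running an improved algorithm can replace it freely without overwriting anything a human wrote.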