Well, another use case for nearly-immediate updates:
I'll give a presentation next week, in which I intend to demonstrate that I can add a
Wikidata value online, which is then immediately available to my application - as well as
to the whole rest of the world. (In Library Land, that's a real blast, because
business processes related to authority data often take weeks or months ...)
That is a rather exotic and very infrequent use. Similar to James' use case (if I
didn't get him wrong), it is not necessary to run this kind of query in
production-strength settings. Perhaps a current, un-cached "experimental" /
"unstable" endpoint could serve these kinds of uses, too.
Cheers, Joachim
-----Original Message-----
From: Wikidata [mailto:wikidata-bounces@lists.wikimedia.org] On behalf of James Heald
Sent: Wednesday, 17 February 2016 17:21
To: Discussion list for the Wikidata project.
Subject: Re: [Wikidata] SPARQL endpoint caching
On 17/02/2016 06:48, Markus Krötzsch wrote:
some random comments:
(1) Are there any concrete cases of applications that need
"super-up-to-date" results (where 120 sec is too old)? I do not
currently run or foresee running any such application. Moreover, I
think that you have to allow at least 60 sec for an update to make
it into the RDF database, so 120 sec already seems very close to the
best freshness you could get at all. My applications would be fine
with getting updates every 10 min.
Personally, I have quite often used WDQS to generate lists of things needing to
be fixed on Wikidata.
Having then done some fixes (typically by hand), I'll then re-run the query to see
what still needs to be done.
At this point it's quite frustrating if the database is lagging -- what I want is an
up-to-date representation of what still needs to be fixed; or whether everything is now
done.
So for this kind of use, the quicker an edit gets propagated to the search results the
better.
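Incidentally, one way to tell whether the store has caught up before re-running a query
is to ask it when its data was last updated. A minimal Python sketch follows; it assumes
the public endpoint at https://query.wikidata.org/sparql and the schema:dateModified
timestamp the service publishes for <http://www.wikidata.org> - both are my assumptions
about the current setup, not anything guaranteed by the service:

import requests
from datetime import datetime, timezone

# Assumed: the public WDQS endpoint, and the schema:dateModified
# timestamp it publishes for its copy of the data.
QUERY = "SELECT * WHERE { <http://www.wikidata.org> schema:dateModified ?date }"

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "lag-check-example/0.1"},
)
r.raise_for_status()

# Compare the store's last-update time against the current time.
stamp = r.json()["results"]["bindings"][0]["date"]["value"]
last_update = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
lag = datetime.now(timezone.utc) - last_update
print(f"Store last updated {stamp}; lag is roughly {lag.total_seconds():.0f} s")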
That said, I'm okay putting up with some occasional lag -- for example, if I know the
lag is ten minutes, I can go away and make a cup of coffee, or check the Wikidata mailing
list, or see where the latest "knowledge engine" paranoia has got to. But (for
this kind of use anyway), more than the occasional ten-minute delay starts to get
annoying. (Which is why big props are due for all the time that the SPARQL service
has been so responsive to recent edits.)
How relevant this mode of use is for caching I am not sure, because typically I'd do a
certain amount of editing before re-running the query.
But possibly if I found there was one edit I had missed, made the edit, then re-ran the
query to see if I'd finally got the output to look just as it should -- that might
happen within a 120-second turnaround; so one would want at least to be able to purge the
cached results and re-run.
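Something like the following is what I have in mind -- a minimal Python sketch of the
edit/re-run loop, again assuming the public endpoint at
https://query.wikidata.org/sparql. The Cache-Control header is the standard HTTP way to
ask for a fresh result; whether any cache placed in front of the service would honor it
is of course exactly the open question in this thread:

import requests

# Hypothetical maintenance query: a few humans missing an English label.
# (Illustrative only -- substitute whatever "to fix" query you actually run.)
QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 .
  FILTER NOT EXISTS {
    ?item rdfs:label ?label .
    FILTER(LANG(?label) = "en")
  }
} LIMIT 20
"""

def fetch_todo_list():
    # Cache-Control: no-cache asks any intermediate cache to revalidate;
    # whether a WDQS front-end cache would honor it is an open question.
    r = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={
            "User-Agent": "todo-recheck-example/0.1",
            "Cache-Control": "no-cache",
        },
    )
    r.raise_for_status()
    return [b["item"]["value"] for b in r.json()["results"]["bindings"]]

# Typical loop: fix a few items by hand, then re-run to see what is left.
remaining = fetch_todo_list()
print(f"{len(remaining)} items still to fix")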
-- James.
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata