Re: [Wikidata] SPARQL endpoint caching

18 Feb 2016

Well, another use case for nearly-immediate updates: 

I'll do a presentation next week, in which I intend to demonstrate that I can add a
Wikidata value online, which then is available immediately for my application - as well as
for the whole rest of the world. (In Library Land, that's a real blast, because
business processes related to authority data often take weeks or months ...)

That is a rather exotic and very infrequent use. Similar to James' use case (if I
didn't get him wrong), it is not necessary to run these kinds of queries in
production-strength settings. Perhaps a current, un-cached "experimental" /
"unstable" endpoint could serve these kinds of use, too.

Cheers, Joachim

-----Original Message-----
From: Wikidata [mailto:wikidata-bounces@lists.wikimedia.org] On behalf of James Heald
Sent: Wednesday, 17 February 2016 17:21
To: Discussion list for the Wikidata project.
Subject: Re: [Wikidata] SPARQL endpoint caching

On 17/02/2016 06:48, Markus Krötzsch wrote:
...

 some random comments:

 (1) Are there any concrete cases of applications that need 
 "super-up-to-date" results (where 120 sec is too old)? I do not 
 currently run or foresee to run any such application. Moreover, I 
 think that you have to allow for at least 60sec for an update to make 
 it into the RDF database, so 120sec seems to be already very close to 
 the freshness you could get at all. My applications would be fine with 
 getting updates every 10min.

Personally, I have quite often used WDQS to generate lists of things needing to
be fixed on Wikidata.

Having then done some fixes (typically by hand), I'll then re-run the query to see
what still needs to be done.

At this point it's quite frustrating if the database is lagging -- what I want is an
up-to-date representation of what still needs to be fixed; or whether everything is now
done.

So for this kind of use, the quicker an edit gets propagated to the search results the
better.

That said, I'm okay to put up with some occasional lag -- for example, if I know the
lag is ten minutes, I can go away and make a cup of coffee, or check the Wikidata email
list, or wherever the latest "knowledge engine" paranoia has got to.  But (for
this kind of use anyway), more than the occasional ten-minute delay starts to get
annoying.  (Which is why there should be big props for all the time that the SPARQL service
has usually been very responsive to recent edits).

How relevant this mode of use is for caching I am not sure, because typically I'd do a
certain amount of editing before re-running the query.

But possibly if I found there was one edit I had missed, made the edit, then re-ran the
query to see if I'd finally got the output to look all just as it should -- that might
happen within a 120 second turnaround; so one would want at least to be able to purge the
results and re-run.

   -- James.

_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
