Hello,
When we issue queries using the MWAPI [1] programatically, we sometimes get a 500 error. I would guess that happens in more than 30% of the cases. The following SPARQL query causes the error:
SELECT DISTINCT * WHERE { SERVICE wikibase:mwapi { bd:serviceParam wikibase:api "EntitySearch" . bd:serviceParam wikibase:endpoint "www.wikidata.orghttp://www.wikidata.org/" . bd:serviceParam wikibase:limit "once" . bd:serviceParam mwapi:search "integre" . bd:serviceParam mwapi:language "en" . ?subj wikibase:apiOutputItem mwapi:item . } } LIMIT 2000
When calling the SPARQL endpoint from e.g. Python (sparqlwrapper) or Java (rdf4j), I sometimes get the following stack trace:
SPARQL-QUERY: queryStr= SELECT DISTINCT * WHERE { SERVICE wikibase:mwapi { bd:serviceParam wikibase:api "EntitySearch" . bd:serviceParam wikibase:endpoint "www.wikidata.orghttp://www.wikidata.org/" . bd:serviceParam wikibase:limit "once" . bd:serviceParam mwapi:search "integre" . bd:serviceParam mwapi:language "en" . ?subj wikibase:apiOutputItem mwapi:item . } } LIMIT 2000 java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:206) at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:292) at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:678) at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:290) at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240) at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:273) at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655) at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:320) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at org.wikidata.query.rdf.blazegraph.throttling.SystemOverloadFilter.doFilter(SystemOverloadFilter.java:82) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at org.wikidata.query.rdf.blazegraph.filters.QueryEventSenderFilter.doFilter(QueryEventSenderFilter.java:93) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at org.wikidata.query.rdf.blazegraph.filters.RealAgentFilter.doFilter(RealAgentFilter.java:33) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) at org.eclipse.jetty.server.Server.handle(Server.java:503) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765) at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:889) at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:695) at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ... 1 more Caused by: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:188) at info.aduna.iteration.IterationWrapper.hasNext(IterationWrapper.java:68) at org.openrdf.query.QueryResults.report(QueryResults.java:155) at org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:76) at com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1722) at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1579) at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1544) at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:757) ... 4 more Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.checkFuture(BlockingBuffer.java:1523) at com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator._hasNext(BlockingBuffer.java:1710) at com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.hasNext(BlockingBuffer.java:1563) at com.bigdata.striterator.AbstractChunkedResolverator._hasNext(AbstractChunkedResolverator.java:365) at com.bigdata.striterator.AbstractChunkedResolverator.hasNext(AbstractChunkedResolverator.java:341) at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:134) ... 11 more Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.checkFuture(BlockingBuffer.java:1454) ... 16 more Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:59) at com.bigdata.rdf.sail.RunningQueryCloseableIterator.close(RunningQueryCloseableIterator.java:73) at com.bigdata.rdf.sail.RunningQueryCloseableIterator.hasNext(RunningQueryCloseableIterator.java:82) at com.bigdata.striterator.ChunkedWrappedIterator.hasNext(ChunkedWrappedIterator.java:197) at com.bigdata.striterator.AbstractChunkedResolverator$ChunkConsumerTask.call(AbstractChunkedResolverator.java:222) at com.bigdata.striterator.AbstractChunkedResolverator$ChunkConsumerTask.call(AbstractChunkedResolverator.java:197) ... 4 more Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at com.bigdata.util.concurrent.Haltable.get(Haltable.java:273) at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:1516) at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:104) at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:46) ... 9 more Caused by: java.lang.Exception: task=ChunkTask{query=66f912d3-e0e8-4bb7-93fe-1ebb044e5ade,bopId=1,partitionId=-1,sinkId=2,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1367) at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTaskWrapper.run(ChunkedRunningQuery.java:926) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63) at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkFutureTask.run(ChunkedRunningQuery.java:821) ... 3 more Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1347) ... 8 more Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:206) at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask.doServiceCallWithConstant(ServiceCallJoin.java:351) at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask.call(ServiceCallJoin.java:303) at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask.call(ServiceCallJoin.java:215) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1346) ... 8 more Caused by: java.lang.RuntimeException: java.lang.RuntimeException: MWAPI request failed at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask$ServiceCallTask.doServiceCall(ServiceCallJoin.java:757) at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask$ServiceCallTask.call(ServiceCallJoin.java:616) at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask$ServiceCallTask.call(ServiceCallJoin.java:552) at java.util.concurrent.FutureTask.run(FutureTask.java:266) ... 3 more Caused by: java.lang.RuntimeException: MWAPI request failed at org.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceCall$ContinueIterator.doSearchRequest(MWApiServiceCall.java:488) at org.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceCall$ContinueIterator.<init>(MWApiServiceCall.java:453) at org.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceCall$MultiSearchIterator.doNextSearch(MWApiServiceCall.java:391) at org.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceCall$MultiSearchIterator.<init>(MWApiServiceCall.java:352) at org.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceCall.call(MWApiServiceCall.java:160) at org.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceCall.call(MWApiServiceCall.java:66) at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask$ServiceCallTask.doBigdataServiceCall(ServiceCallJoin.java:770) at com.bigdata.bop.controller.ServiceCallJoin$ChunkTask$ServiceCallTask.doServiceCall(ServiceCallJoin.java:707) ... 6 more Caused by: java.util.concurrent.TimeoutException at org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216) at org.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceCall$ContinueIterator.doSearchRequest(MWApiServiceCall.java:477) ... 13 more
I use the following Python script to call the service, I assume http maybe also works:
from SPARQLWrapper import SPARQLWrapper, JSON
queryString = """ SELECT DISTINCT * WHERE { SERVICE wikibase:mwapi { bd:serviceParam wikibase:api "EntitySearch" . bd:serviceParam wikibase:endpoint "www.wikidata.orghttp://www.wikidata.org/" . bd:serviceParam wikibase:limit "once" . bd:serviceParam mwapi:search "integre" . bd:serviceParam mwapi:language "en" . ?subj wikibase:apiOutputItem mwapi:item . } } LIMIT 2000 """
sparql = SPARQLWrapper("https://query.wikidata.org/sparql", returnFormat=JSON) sparql.setQuery(queryString) results = sparql.query().convert() print(results)
Can we do something about this? Why does it throw a "java.util.concurrent.TimeoutException" on your side? Thank you very much for this service, it is very useful for us.
Best,
[1]: https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI
Jan-Christoph Klie, M. Sc. Doctoral Researcher Ubiquitous Knowledge Processing (UKP) Lab FB 20 / Department of Computer Science Technische Universität Darmstadt Hochschulstr. 10, 64289 Darmstadt, Germany phone +49 (0)6151 16-25297 room S2|02|B111 https://www.ukp.tu-darmstadt.dehttps://www.ukp.tu-darmstadt.de/
Jan-Christoph Klie, M. Sc. Doctoral Researcher
Ubiquitous Knowledge Processing (UKP) Lab FB 20 / Department of Computer Science Technische Universität Darmstadt Hochschulstr. 10, 64289 Darmstadt, Germany phone +49 (0)6151 16-25297 room S2|02|B111 https://www.ukp.tu-darmstadt.dehttps://www.ukp.tu-darmstadt.de/
On Tue, Apr 7, 2020 at 8:16 PM Jan-Christoph Klie < klie@ukp.informatik.tu-darmstadt.de> wrote:
Hello,
When we issue queries using the MWAPI [1] programatically, we sometimes get a 500 error. I would guess that happens in more than 30% of the cases. The following SPARQL query causes the error:
SELECT DISTINCT * WHERE { SERVICE wikibase:mwapi { bd:serviceParam wikibase:api "EntitySearch" . bd:serviceParam wikibase:endpoint "www.wikidata.org" . bd:serviceParam wikibase:limit "once" . bd:serviceParam mwapi:search "integre" . bd:serviceParam mwapi:language "en" . ?subj wikibase:apiOutputItem mwapi:item . } } LIMIT 2000
Hi,
unless I'm mistaken you don't seem to need to query this endpoint nor need SPARQL to search entities in wikidata. Have you looked at using the wbsearchentities[0] API endpoint. The SPARQL request you pasted in your message seems to be fulfilled using this API request: https://www.wikidata.org/w/api.php?action=wbsearchentities&format=json&a... in a much more efficient way. As to why the service is returning timeouts so often this is something we will be looking into.
0: https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities