Thank you very much, Laurence.
From what I read on the web, I understand that this limit comes from
Elasticsearch and is meant, as you mentioned, to prevent execution-time
and memory issues.
Considering this, we have decided to query our data using a SPARQL
request instead. It seems to work better, as we get results almost
immediately even with an offset of 80000.
Below is the SPARQL request we send:
PREFIX entity: <http://pfcnoemigration-wiki.bnf.fr/entity/>
PREFIX prop: <http://pfcnoemigration-wiki.bnf.fr/prop/direct/>
SELECT ?ent
WHERE {
  ?ent prop:P1 entity:Q58.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],fr". }
}
LIMIT 50 OFFSET 80000
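A minimal sketch of how this paging could be scripted, assuming Python. The names `build_query`, `iter_pages` and `fetch` are hypothetical; `fetch` stands in for whatever HTTP call sends the query text to the SPARQL endpoint, whose URL is not given in this thread. The sketch leaves out the label service for brevity and adds an ORDER BY clause that is not in the query above, since SPARQL does not guarantee a stable order between pages without one.

```python
from typing import Callable, Iterator, List

PREFIXES = (
    "PREFIX entity: <http://pfcnoemigration-wiki.bnf.fr/entity/>\n"
    "PREFIX prop: <http://pfcnoemigration-wiki.bnf.fr/prop/direct/>\n"
)


def build_query(limit: int, offset: int) -> str:
    """Return the query text for one page (ORDER BY is an addition)."""
    return (
        PREFIXES
        + "SELECT ?ent\n"
        + "WHERE {\n"
        + "  ?ent prop:P1 entity:Q58.\n"
        + "}\n"
        + "ORDER BY ?ent\n"
        + f"LIMIT {limit} OFFSET {offset}"
    )


def iter_pages(
    fetch: Callable[[str], List[str]], page_size: int = 50
) -> Iterator[List[str]]:
    """Yield non-empty pages until a short or empty page is returned.

    `fetch` is a hypothetical callable: it takes the query text and
    returns the list of ?ent values for that page.
    """
    offset = 0
    while True:
        rows = fetch(build_query(page_size, offset))
        if rows:
            yield rows
        if len(rows) < page_size:
            break
        offset += page_size
```

Keeping the HTTP call behind `fetch` means the paging logic can be tried out with a stub before it is pointed at a real endpoint.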
Best regards,
Pascal Lefeuvre
Scrum master for the French National Library
From: "Laurence Parry" <greenreaper(a)hotmail.com>
To: "Wikibase Community User Group" <wikibaseug(a)lists.wikimedia.org>
Date: 29/06/2021 15:13
Subject: [Wikibase] Re: : Querying a wikibase with api.php - problem when
offset greater than 10000
Hello Pascal,
Unfortunately the search offset limit does not seem to be modifiable via
standard configuration settings (I'd be glad to be corrected here).
But I think you may be able to adjust it by editing
extensions/CirrusSearch/includes/Searcher.php
Specifically, there is a constant there, MAX_OFFSET_LIMIT, which is set to
10000:
https://github.com/wbstack/mediawiki/blob/15862d7af0c6b32e288a76c77aeae8e99…
(Code may differ slightly depending on version.)
You could try setting that to a different value and see if it helps, after
doing whatever is appropriate to your setup to ensure that you are not
getting a cached version of the old code after editing it.
Bear in mind that an offset-based query might be slow, especially 50 at a
time, as it may process earlier entries each time you make a request.
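
To give a rough sense of scale, here is a small Python calculation. It rests on an assumption that is not confirmed for CirrusSearch in this thread: that each request has to step over every entry before its offset. The function name `skipped_entries` is made up for illustration.

```python
def skipped_entries(total: int, page_size: int) -> int:
    """Sum of the offsets of all requests needed to read `total` entries.

    Assumes (hypothetically) that a request at offset N steps over N
    entries before returning its page.
    """
    return sum(range(0, total, page_size))


# Reading the first 10000 results 50 at a time takes 200 requests.
print(skipped_entries(10000, 50))  # prints 995000
```

Under that assumption, a larger page size reduces the total work, because fewer requests repeat the skipping.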
Let us know if that is doable and works for you - if not, perhaps others
have ideas (or maybe there could be interest in making it configurable).
I imagine the advice might be to use WDQS, but it might not provide the
snippets you're looking for.
Best regards,
--
Laurence "GreenReaper" Parry - Curator, WikiFur
From: pascal.lefeuvre(a)bnf.fr <pascal.lefeuvre(a)bnf.fr>
Sent: Tuesday, June 29, 2021 1:10:15 PM
To: wikibaseug(a)lists.wikimedia.org <wikibaseug(a)lists.wikimedia.org>
Subject: [Wikibase] : Querying a wikibase with api.php - problem when
offset greater than 10000
Dear Wikibase users,
I am writing because I have just encountered a problem with my Wikibase.
I am querying my wikibase using api.php. The query returns more than 10000
items. I process the results page by page using srlimit and sroffset
parameters.
The problem appears when sroffset reaches 10000. I then get this error:
{"batchcomplete":"","warnings":{"search":{"*":"Could not retrieve results. Up to 10000 search results are supported, but results starting at 10000 were requested."}},"query":{"searchinfo":{"totalhits":0},"search":[]}}
The request is:
<My Wikibase URL>/w/api.php?action=query&format=json&list=search&srsearch=haswbstatement:%22P338=Q58%22&srprop=snippet|titlesnippet|redirecttitle&srlimit=50&sroffset=10000
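For reference, a short Python sketch of how such a request could be built and how the warning above could be detected in the decoded JSON. The function names `search_url` and `search_warning` and the base URL used in the usage lines are hypothetical; the parameters and the response shape are taken from the request and error quoted above.

```python
from typing import Optional
from urllib.parse import urlencode


def search_url(base: str, offset: int, limit: int = 50) -> str:
    """Build the api.php search URL for one page of results."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": 'haswbstatement:"P338=Q58"',
        "srprop": "snippet|titlesnippet|redirecttitle",
        "srlimit": limit,
        "sroffset": offset,
    }
    return f"{base}/w/api.php?{urlencode(params)}"


def search_warning(response: dict) -> Optional[str]:
    """Return the search warning text from a decoded response, if any."""
    return response.get("warnings", {}).get("search", {}).get("*")


# Usage with a placeholder base URL (an assumption, not the real wiki).
print(search_url("https://wikibase.example", 10000))
```

Checking `search_warning` on each page lets a paging script stop cleanly instead of treating the empty result list at the limit as the end of the data.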
Is there a way to go beyond this limit?
Thank you for your answers.
Pascal Lefeuvre
Scrum master for the French National Library
Visit the exhibitions at the François-Mitterrand site and find the
cultural events for June, on site or remotely.
The general public library is open Tuesday to Saturday from 10 a.m. to
7 p.m.
The research libraries are open at the François-Mitterrand site on Mondays
from 2 p.m. to 7 p.m. and Tuesday to Saturday from 10 a.m. to 7 p.m.
The Richelieu, Arsenal and Opéra sites are returning to their usual hours.
See the access details.
Before printing, think of the environment.
_______________________________________________
Wikibaseug mailing list -- wikibaseug(a)lists.wikimedia.org
To unsubscribe send an email to wikibaseug-leave(a)lists.wikimedia.org