Thank you very much, Laurence.
From what I have read on the web, I understand
that this limit comes from Elasticsearch and is there, as you mentioned,
to prevent execution-time and memory issues.
Considering this, we have decided to
query our data with a SPARQL request instead. It seems to work better, as we get
results almost immediately even with an offset of 80000.
Below is the SPARQL request we send:
PREFIX entity: <http://pfcnoemigration-wiki.bnf.fr/entity/>
PREFIX prop: <http://pfcnoemigration-wiki.bnf.fr/prop/direct/>
SELECT ?ent
WHERE {
?ent prop:P1 entity:Q58.
SERVICE wikibase:label
{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],fr". }
}
LIMIT 50 OFFSET 80000
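
For anyone wanting to script this, here is a minimal Python sketch of paging through such a query. The endpoint URL is a placeholder, and the helper names (build_query, fetch_page, iter_entities) are invented for illustration. Two things differ from the query above, both as assumptions rather than requirements: the label service is left out because only ?ent is selected, and an ORDER BY is added so that successive pages are more likely to stay consistent with each other.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; substitute the query service URL of your own Wikibase.
SPARQL_ENDPOINT = "https://example.org/query/sparql"

# Doubled braces are literal braces once str.format() has run.
QUERY_TEMPLATE = """\
PREFIX entity: <http://pfcnoemigration-wiki.bnf.fr/entity/>
PREFIX prop: <http://pfcnoemigration-wiki.bnf.fr/prop/direct/>
SELECT ?ent
WHERE {{
  ?ent prop:P1 entity:Q58.
}}
ORDER BY ?ent
LIMIT {limit} OFFSET {offset}
"""


def build_query(limit, offset):
    """Return the query text for one page of results."""
    return QUERY_TEMPLATE.format(limit=limit, offset=offset)


def fetch_page(limit, offset):
    """Fetch one page and return the entity IRIs it contains."""
    params = urllib.parse.urlencode({"query": build_query(limit, offset)})
    request = urllib.request.Request(
        f"{SPARQL_ENDPOINT}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return [row["ent"]["value"] for row in data["results"]["bindings"]]


def iter_entities(page_size=50, start=0):
    """Yield entity IRIs page by page, stopping after the first short page."""
    offset = start
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
```

Calling iter_entities(start=80000) would resume from the offset mentioned above.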
Best regards,
Pascal Lefeuvre
Scrum master for the French National Library
From: "Laurence Parry" <greenreaper@hotmail.com>
To: "Wikibase Community User Group" <wikibaseug@lists.wikimedia.org>
Date: 29/06/2021 15:13
Subject: [Wikibase] Re: : Querying a wikibase with api.php - problem when offset greater than 10000
Hello Pascal,
Unfortunately the search offset limit does not seem to
be modifiable via standard configuration settings (I'd be glad to be corrected
here).
But I think you may be able to adjust it by editing
extensions/CirrusSearch/includes/Searcher.php
Specifically, there is a constant there, MAX_OFFSET_LIMIT,
which is set to 10000:
https://github.com/wbstack/mediawiki/blob/15862d7af0c6b32e288a76c77aeae8e994f8de39/extensions/CirrusSearch/includes/Searcher.php#L71
(Code may differ slightly depending on version.)
You could try setting that to a different value and see
if it helps, after doing whatever is appropriate to your setup to ensure
that you are not getting a cached version of the old code after editing
it.
Bear in mind that an offset-based query might be slow,
especially 50 at a time, as it may process earlier entries each time you
make a request.
Let us know if that is doable and works for you - if not,
perhaps others have ideas (or maybe there could be interest in making it
configurable). I imagine the advice might be to use WDQS, but it might
not provide the snippets you're looking for.
Best regards,
--
Laurence "GreenReaper" Parry - Curator, WikiFur
From: pascal.lefeuvre@bnf.fr <pascal.lefeuvre@bnf.fr>
Sent: Tuesday, June 29, 2021 1:10:15 PM
To: wikibaseug@lists.wikimedia.org <wikibaseug@lists.wikimedia.org>
Subject: [Wikibase] : Querying a wikibase with api.php - problem when
offset greater than 10000
Dear Wikibase users,
I am writing to you because I have just encountered a problem with my Wikibase.
I am querying my Wikibase using api.php. The query returns more than 10000
items. I process the results page by page using the srlimit and sroffset
parameters.
The problem appears once sroffset reaches 10000.
Then I get this error:
{"batchcomplete":"","warnings":{"search":{"*":"Could not retrieve results. Up to 10000 search results are supported, but results starting at 10000 were requested."}},"query":{"searchinfo":{"totalhits":0},"search":[]}}
The request is <My Wikibase URL>/w/api.php?action=query&format=json&list=search&srsearch=haswbstatement:%22P338=Q58%22&srprop=snippet|titlesnippet|redirecttitle&srlimit=50&sroffset=10000
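
The paging described above can be sketched in Python as follows. This is only an illustration: the base URL is a placeholder, the function names are invented, and the cut-off assumes the limit behaves exactly as the warning states (results starting at 10000 are refused).

```python
import urllib.parse

# Hypothetical base URL; replace with your own Wikibase's api.php address.
API_URL = "https://example.org/w/api.php"
# Limit reported by the warning quoted above.
MAX_SEARCH_RESULTS = 10000


def build_search_url(sroffset, srlimit=50):
    """Build the list=search request URL for one page of results."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": 'haswbstatement:"P338=Q58"',
        "srprop": "snippet|titlesnippet|redirecttitle",
        "srlimit": srlimit,
        "sroffset": sroffset,
    }
    return f"{API_URL}?{urllib.parse.urlencode(params)}"


def reachable_offsets(srlimit=50):
    """Return every sroffset that stays below the search result limit."""
    return range(0, MAX_SEARCH_RESULTS, srlimit)
```

With srlimit=50, reachable_offsets() ends at 9950, so the last page that can be fetched this way covers results 9950 to 9999.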
Is there a way to go beyond this limit?
Thank you for your answers.
Pascal Lefeuvre
Scrum master for the French National Library
Visit the exhibitions at the François-Mitterrand site and discover the cultural events for the month of June, on site or remotely.
The general public library is open Tuesday to Saturday, 10 a.m. to 7 p.m.
The research libraries are open at the François-Mitterrand site on Monday from 2 p.m. to 7 p.m. and Tuesday to Saturday from 10 a.m. to 7 p.m.
The Richelieu, Arsenal and Opéra sites are returning to their usual opening hours. See the access arrangements.
Before printing, think of the environment.
_______________________________________________
Wikibaseug mailing list -- wikibaseug@lists.wikimedia.org
To unsubscribe send an email to wikibaseug-leave@lists.wikimedia.org