Thank you very much, Laurence.
From what I have read on the web, I understand that this limit comes from Elasticsearch and, as you mentioned, is there to prevent execution-time and memory issues. Given this, we have decided to query our data with a SPARQL request instead. It seems to work better: we get results almost immediately, even with an offset of 80000.
Below is the SPARQL request we send:

PREFIX entity: <http://pfcnoemigration-wiki.bnf.fr/entity/>
PREFIX prop: <http://pfcnoemigration-wiki.bnf.fr/prop/direct/>
SELECT ?ent WHERE {
  ?ent prop:P1 entity:Q58.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],fr". }
}
LIMIT 50 OFFSET 80000
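For readers who want to page through such a query from a script, here is a minimal Python sketch. It is an illustration under assumptions, not the setup used at the BnF: the endpoint URL is hypothetical, the function names are mine, the label service clause is left out because only ?ent is selected, and an ORDER BY is added because SPARQL does not guarantee a stable order across LIMIT/OFFSET pages without one (it may cost some speed).

```python
import json
import urllib.parse
import urllib.request

# Prefixes and triple pattern are taken from the request quoted above.
# ORDER BY is an addition of this sketch, not part of the original query.
QUERY_TEMPLATE = """\
PREFIX entity: <http://pfcnoemigration-wiki.bnf.fr/entity/>
PREFIX prop: <http://pfcnoemigration-wiki.bnf.fr/prop/direct/>
SELECT ?ent WHERE {{
  ?ent prop:P1 entity:Q58 .
}}
ORDER BY ?ent
LIMIT {limit} OFFSET {offset}
"""


def build_query(limit, offset):
    """Return the SPARQL text for one page of results."""
    return QUERY_TEMPLATE.format(limit=int(limit), offset=int(offset))


def extract_entities(sparql_json):
    """Pull the ?ent values out of a SPARQL JSON results document."""
    return [row["ent"]["value"] for row in sparql_json["results"]["bindings"]]


def fetch_page(endpoint, limit, offset):
    """Send one page request to a SPARQL endpoint (endpoint URL is an assumption)."""
    params = urllib.parse.urlencode({"query": build_query(limit, offset)})
    request = urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_entities(json.load(response))


# Hypothetical usage; replace the URL with your own query service:
# entities = fetch_page("https://query.example.org/sparql", 50, 80000)
```

A loop that increases the offset by the page size until fetch_page returns an empty list would walk the whole result set.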
Best regards,
Pascal Lefeuvre
Scrum master for the French National Library
From: "Laurence Parry" <greenreaper@hotmail.com>
To: "Wikibase Community User Group" <wikibaseug@lists.wikimedia.org>
Date: 29/06/2021 15:13
Subject: [Wikibase] Re: : Querying a wikibase with api.php - problem when offset greater than 10000
Hello Pascal,
Unfortunately the search offset limit does not seem to be modifiable via standard configuration settings (I'd be glad to be corrected here).
But I think you may be able to adjust it by editing extensions/CirrusSearch/includes/Searcher.php
Specifically, there is a constant there, MAX_OFFSET_LIMIT, which is set to 10000: https://github.com/wbstack/mediawiki/blob/15862d7af0c6b32e288a76c77aeae8e994... (Code may differ slightly depending on version.)
You could try setting that to a different value and see if it helps, after doing whatever is appropriate to your setup to ensure that you are not getting a cached version of the old code after editing it.
Bear in mind that an offset-based query might be slow, especially 50 at a time, as it may process earlier entries each time you make a request.
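To give a rough sense of that cost, the small Python sketch below counts how many entries would be processed in total if every request had to walk through all the entries before its offset. That cost model is an assumption made for illustration; nothing in this thread establishes how the search backend actually behaves, and the function name is hypothetical.

```python
import math


def rows_touched(total_results, page_size):
    """Total entries processed over all pages under the assumed cost model."""
    pages = math.ceil(total_results / page_size)
    touched = 0
    for page in range(pages):
        offset = page * page_size
        # Assumed model: each request walks the skipped entries plus its own page.
        touched += min(offset + page_size, total_results)
    return touched


# 80000 results fetched 50 at a time versus 500 at a time.
small_pages = rows_touched(80000, 50)   # 64040000
large_pages = rows_touched(80000, 500)  # 6440000
```

Under this model a tenfold larger page size cuts the total work roughly tenfold, which is why small pages are the expensive case.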
Let us know if that is doable and works for you. If not, perhaps others have ideas (or maybe there could be interest in making it configurable). I imagine the advice might be to use WDQS, but it might not provide the snippets you're looking for.
Best regards,
-- 
Laurence "GreenReaper" Parry - Curator, WikiFur
From: pascal.lefeuvre@bnf.fr
Sent: Tuesday, June 29, 2021 1:10:15 PM
To: wikibaseug@lists.wikimedia.org
Subject: [Wikibase] : Querying a wikibase with api.php - problem when offset greater than 10000
Dear Wikibase users,
I am writing because I have just encountered a problem with my Wikibase. I am querying it through api.php. The query matches more than 10000 items, and I process the results page by page using the srlimit and sroffset parameters. The problem appears when sroffset reaches 10000. I then get this error:
{"batchcomplete":"","warnings":{"search":{"*":"Could not retrieve results. Up to 10000 search results are supported, but results starting at 10000 were requested."}},"query":{"searchinfo":{"totalhits":0},"search":[]}}
The request is <My Wikibase URL>/w/api.php?action=query&format=json&list=search&srsearch=haswbstatement:%22P338=Q58%22&srprop=snippet|titlesnippet|redirecttitle&srlimit=50&sroffset=10000
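As an illustration of the request and the error described above, here is a minimal Python sketch that builds the same api.php search URL and checks a response for the search warning. The base URL and the function names are hypothetical, and the warning lookup only covers the response shape shown in this message (the default JSON format, where the text sits under the "*" key).

```python
from urllib.parse import urlencode


def build_search_url(base_url, statement, limit, offset):
    """Build the list=search URL with the parameters used in the request above."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": 'haswbstatement:"{}"'.format(statement),
        "srprop": "snippet|titlesnippet|redirecttitle",
        "srlimit": limit,
        "sroffset": offset,
    }
    return "{}/w/api.php?{}".format(base_url.rstrip("/"), urlencode(params))


def search_warning(response):
    """Return the search warning text from a parsed response, or None if absent."""
    return response.get("warnings", {}).get("search", {}).get("*")


# Hypothetical usage; wiki.example.org stands in for <My Wikibase URL>:
# url = build_search_url("https://wiki.example.org", "P338=Q58", 50, 10000)
```

A paging loop can call search_warning on each parsed response and stop, or switch to another query method, as soon as it returns a message instead of None.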
Is there a way to get past this limit?
Thank you for your answers.
Pascal Lefeuvre
Scrum master for the French National Library

Visit the exhibitions at the François-Mitterrand site and find the cultural events for the month of June, on site or remotely. The general public library is open Tuesday to Saturday from 10 a.m. to 7 p.m. The research libraries are open, at the François-Mitterrand site, on Monday from 2 p.m. to 7 p.m. and Tuesday to Saturday from 10 a.m. to 7 p.m. The Richelieu, Arsenal and Opéra sites are returning to their usual hours. See the access arrangements. Before printing, think of the environment.
_______________________________________________
Wikibaseug mailing list -- wikibaseug@lists.wikimedia.org
To unsubscribe send an email to wikibaseug-leave@lists.wikimedia.org