Hi all!
We have to impose a fixed limit on search results: since search results cannot be ordered by a unique ID, paging is expensive.
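A minimal sketch of why paging by offset gets expensive when there is no unique key to resume from. This is one reading of the point above, not a description of how the Wikidata search backend actually works. The functions `offset_page` and `keyset_page` and the toy data are made up for illustration.

```python
import bisect


def offset_page(ranked_rows, offset, limit):
    """Page by offset: walk past `offset` rows before collecting any.

    Returns the page and the number of rows that had to be visited.
    """
    page = []
    visited = 0
    for row in ranked_rows:
        visited += 1
        if visited > offset:
            page.append(row)
            if len(page) == limit:
                break
    return page, visited


def keyset_page(sorted_ids, after_id, limit):
    """Page by unique ID: jump straight to the first ID greater than `after_id`.

    Only possible when results are ordered by a unique key. The count
    returned ignores the few comparisons made by the binary search.
    """
    start = bisect.bisect_right(sorted_ids, after_id)
    page = sorted_ids[start:start + limit]
    return page, len(page)


# Toy data: 10,000 results already in rank order (the IDs are hypothetical).
rows = list(range(1, 10_001))

page_a, visited_a = offset_page(rows, offset=9_000, limit=50)
page_b, visited_b = keyset_page(rows, after_id=9_000, limit=50)
print(visited_a, visited_b)
```

Both calls return the same 50 rows, but the offset version has to walk past everything before the requested page, so deep pages cost more and more. A fixed cap on the offset bounds that cost.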
The default for this limit is 50, but it SHOULD be 500 for bots. The higher limit for bots is currently not applied by the wbsearchentities module, though. That's a bug, see https://bugzilla.wikimedia.org/show_bug.cgi?id=54096.
We should be able to fix this soon. Please poke us again if nothing happens for a couple of weeks.
-- daniel
On 12.09.2013 12:12, Merlijn van Deen wrote:
On 11 September 2013 20:31, Chinmay Naik chin.naik26@gmail.com wrote:
Can I retrieve more than 100 items using this? I notice the 'search-continue' returned by the search result disappears after 50 items. For example: https://www.wikidata.org/wiki/Special:ApiSandbox#action=wbsearchentities&...
The api docs at https://www.wikidata.org/w/api.php explicitly state the highest value for 'continue' is 50:
limit - Maximal number of results
    The value must be between 0 and 50
    Default: 7
continue - Offset where to continue a search
    The value must be between 0 and 50
    Default: 0
which indeed suggests there is a hard limit of 100 entries. Maybe someone in the Wikidata dev team can explain the reason behind this?
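The arithmetic behind that figure can be sketched as follows. This only builds request URLs and sends nothing. The bounds for `limit` and `continue` are the ones quoted from the API docs above. The `search`, `language` and `format` parameters, the helper names, and the search term "gene" are assumptions added for illustration.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"
MAX_LIMIT = 50      # documented upper bound for 'limit'
MAX_CONTINUE = 50   # documented upper bound for 'continue'


def search_url(term, limit, offset, language="en"):
    """Build a wbsearchentities request URL (no request is sent)."""
    if not 0 <= limit <= MAX_LIMIT:
        raise ValueError("limit must be between 0 and 50")
    if not 0 <= offset <= MAX_CONTINUE:
        raise ValueError("continue must be between 0 and 50")
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "continue": offset,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def reachable_results():
    """Largest result index reachable: highest offset plus largest page."""
    return MAX_CONTINUE + MAX_LIMIT


# Two requests cover everything the documented bounds allow.
urls = [search_url("gene", MAX_LIMIT, offset) for offset in (0, 50)]
for url in urls:
    print(url)
```

With both parameters capped at 50, the second request (offset 50, page size 50) is the last one that can be made, which is where the 100-entry ceiling comes from.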
Merlijn
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Daniel,
Even 500 seems like a very low limit for this system unless I'm misunderstanding something. Unless there is another way to execute queries that return more rows than that, this would rule out a huge number of applications, all of ours in particular. If we want to, say, request something like "all human genes" (about 20,000 items), how would we do that?
Within Wikipedia, we do this via the MediaWiki API based on contains-template or category queries without any issue. Certainly Wikidata will be more useful for queries than raw MediaWiki?
I'm certain I am missing something, please clarify.
This is currently standing in the way of our GSoC student completing his summer project, which is due next week. A little disappointing for him.
thanks -Ben
On 13.09.2013 18:24, Benjamin Good wrote:
Daniel,
Even 500 seems like a very low limit for this system unless I'm misunderstanding something. Unless there is another way to execute queries that return more rows than that, this would rule out a huge number of applications, all of ours in particular. If we want to, say, request something like "all human genes" (about 20,000 items), how would we do that?
You are looking for actual *query* support, not just a "search by name". This is on the road map, and I hope we will be able to deploy it by the end of the year. But it's not possible yet.
Supporting queries like "all people born in Hamburg" or "all cities in Europe" is an obvious goal for Wikidata, and we are working on it, but it's not trivial to make this scale to the number of entries, queries and different properties we are dealing with.
Within Wikipedia, we do this via the MediaWiki API based on contains-template or category queries without any issue. Certainly Wikidata will be more useful for queries than raw MediaWiki?
See above.
I'm certain I am missing something, please clarify.
This is currently standing in the way of our GSoC student completing his summer project, which is due next week. A little disappointing for him.
Sorry, but we have never hidden the fact that our query interface is not ready yet. wbsearchentities is a label lookup designed for find-as-you-type suggestions. It's not a query interface, and was never supposed to be.
I understand the disappointment, but there is little we can do about this now.
All I can suggest is working from a dump right now. Sadly, we only have MediaWiki's raw JSON-in-XML dumps at the moment; I'm working on native JSON and RDF dumps, but they are not ready.
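As a rough illustration of what "working from a dump" could look like, here is a sketch that pulls the JSON blob out of each page of an XML export and filters on it. The sample document, its element layout, the JSON keys (`label`, `claims`) and the property ID `P1` are all hypothetical stand-ins. Real dumps are much larger, use an XML namespace, and their JSON layout may well differ.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a JSON-in-XML dump.
SAMPLE_DUMP = """
<mediawiki>
  <page>
    <title>Q1</title>
    <revision><text>{"label": {"en": "example gene"}, "claims": ["P1"]}</text></revision>
  </page>
  <page>
    <title>Q2</title>
    <revision><text>{"label": {"en": "example city"}, "claims": []}</text></revision>
  </page>
</mediawiki>
"""


def iter_entities(xml_text):
    """Yield (title, decoded JSON) for every page that carries a text blob."""
    root = ET.fromstring(xml_text)
    for page in root.iter("page"):
        title = page.findtext("title")
        raw = page.findtext("revision/text")
        if raw:
            yield title, json.loads(raw)


def select(xml_text, predicate):
    """Return the titles of all entities for which `predicate` is true."""
    return [title for title, entity in iter_entities(xml_text) if predicate(entity)]


# Example filter: every entity that has a (made-up) claim on property P1.
matches = select(SAMPLE_DUMP, lambda entity: "P1" in entity.get("claims", []))
print(matches)
```

For a full-size dump, loading everything with `ET.fromstring` would not be practical; a streaming parser such as `xml.etree.ElementTree.iterparse` would be the more likely choice.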
-- daniel
OK, thanks for your reply. We will watch for new developments and incorporate them into our work as they are ready.
Keep up the good work on this important project! -Ben
wikidata-tech@lists.wikimedia.org