Greetings All,
Hope this e-mail finds you well. I am currently doing a master's project in NLP at JKU under the supervision of Prof. Bruno Buchberger, the famous Austrian mathematician.
I am facing a problem where I can't get enough data for my project. Is there anything that can be done to extend the limit on queries, as they time out?
Thanks in advance, Mamdouh
Hello!
It isn't entirely clear from your email what kind of data you are looking for, or what endpoint you are using to get it. If you need to extract large amounts of data from Wikidata, you should probably start from the dumps [1], not from API calls. Without knowing more about your context, it is hard to recommend anything.
Good luck with your project!
Guillaume
[1] https://www.wikidata.org/wiki/Wikidata:Database_download
On Wed, Apr 10, 2019 at 9:14 AM Ahmed Mamdouh ahmed.mamdouh24@yahoo.com wrote:
Greetings All,
Hope this e-mail finds you well. I am currently doing a master's project in NLP at JKU under the supervision of Prof. Bruno Buchberger, the famous Austrian mathematician.
I am facing a problem where I can't get enough data for my project. Is there anything that can be done to extend the limit on queries, as they time out?
Thanks in advance, Mamdouh
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
Hi!
I am facing a problem where I can't get enough data for my project. Is there anything that can be done to extend the limit on queries, as they time out?
If you have queries that run longer than the timeout permits, the usual options are:
1. Working with Wikidata dumps, as mentioned before
2. Looking into optimizing your query - maybe the timeout happens because your query is inefficient. Check out https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_optimizati... and https://www.wikidata.org/wiki/Wikidata:Request_a_query .
3. Downloading the information in smaller chunks using LIMIT/OFFSET clauses. Note that this doesn't speed up the query itself.
4. Using the LDF server: https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual#Linked_Dat...
Depending on what data you need, one of these options will probably work.
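For option 3, here is a minimal sketch of paginated querying against the public endpoint, using only the Python standard library. The query (instances of human) and page size are just placeholder examples; swap in your own pattern and a User-Agent that identifies you:

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint
ENDPOINT = "https://query.wikidata.org/sparql"

def build_query(limit, offset):
    """Build an example SPARQL query page; the WHERE pattern is a placeholder."""
    return (
        "SELECT ?item WHERE { ?item wdt:P31 wd:Q5 . } "
        f"LIMIT {limit} OFFSET {offset}"
    )

def fetch_page(limit, offset):
    """Fetch one page of results and return the parsed JSON bindings."""
    params = urllib.parse.urlencode(
        {"query": build_query(limit, offset), "format": "json"}
    )
    req = urllib.request.Request(
        f"{ENDPOINT}?{params}",
        # Identify your tool; anonymous clients may be throttled.
        headers={"User-Agent": "example-fetcher/0.1 (mailto:you@example.org)"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]["bindings"]

if __name__ == "__main__":
    page_size, offset = 1000, 0
    while True:
        rows = fetch_page(page_size, offset)
        if not rows:
            break
        # ... process rows here ...
        offset += page_size
```

Note that each page still has to fit within the timeout, so this only helps when the per-page query itself is fast enough.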
Hello Mamdouh,
As far as I know that is not possible, but you can download the whole dataset as a dump and process it yourself (e.g. query over the raw data or set up your own SPARQL endpoint): https://www.wikidata.org/wiki/Wikidata:Database_download (the JSON or RDF dumps are probably most helpful). Depending on your use case, this might be the right direction to look into :)
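As a concrete sketch of the JSON-dump route: the dump is one large JSON array with one entity per line, so it can be streamed without loading everything into memory. The file name and the label-extraction helper below are placeholders, not part of any official API:

```python
import gzip
import json

def iter_entities(path):
    """Stream entities from a gzipped Wikidata JSON dump (one entity per line)."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip().rstrip(",")
            if line in ("[", "]", ""):
                continue  # skip the surrounding array brackets and blank lines
            yield json.loads(line)

def english_label(entity):
    """Return the English label of an entity, or None if it has none."""
    return entity.get("labels", {}).get("en", {}).get("value")

if __name__ == "__main__":
    # The path is a placeholder; download the dump from the page above first.
    for entity in iter_entities("latest-all.json.gz"):
        label = english_label(entity)
        if label:
            print(entity["id"], label)
```

The full dump is large (tens of GB compressed), so filtering entities as you stream is usually much more practical than trying to hold them all in memory.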
Feel free to reach out if there are more questions. Best, Lucie
On Wed, 10 Apr 2019 at 09:14, Ahmed Mamdouh ahmed.mamdouh24@yahoo.com wrote:
Greetings All,
Hope this e-mail finds you well. I am currently doing a master's project in NLP at JKU under the supervision of Prof. Bruno Buchberger, the famous Austrian mathematician.
I am facing a problem where I can't get enough data for my project. Is there anything that can be done to extend the limit on queries, as they time out?
Thanks in advance, Mamdouh