Valerio Bozzolan <boz+wiki(a)reyboz.it> wrote on 28 January 2020 at 17:43:
To help you further, may I ask for a link to your Wikidata bot approval discussion?
Dear Valerio,
Thanks for the quick answer. If I understood you correctly, it seems we are using an inappropriate approach to parallelizing the edit process.
In this case we are aiming to have the data available as soon as possible; once we have it, we will switch to another approach.
The question I asked is about the need to load large data sets: for private instances we need to load 20,000,000 items for private use, and at a rate of 10 items per second the approach we are following would take roughly 23 days (20,000,000 items ÷ 10 items/s ≈ 2,000,000 s), with a script writing 24 hours a day. In big-data terms, 20 M is a small data set.
So I leave an open question: does anyone have experience achieving a higher edit rate?
Best regards
Valerio Bozzolan <boz+wiki(a)reyboz.it> wrote on 28 January 2020 at 09:28:
Please note that - AFAIK - parallel requests are not well accepted.
https://www.mediawiki.org/wiki/API:Etiquette
(You may have a bigger problem now :^)
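The etiquette page linked above asks clients, among other things, to pass the `maxlag` parameter and to back off when the servers report lag instead of retrying immediately. A minimal sketch of that retry pattern in Python, assuming a `do_request` callable that performs one API call (with `maxlag` set in its parameters) and returns the decoded JSON response; `do_request` is a stand-in for your own HTTP code, not part of any library:

```python
import time


def call_with_maxlag(do_request, max_retries=5, base_delay=1.0):
    """Call the API once, retrying with a backoff when MediaWiki is lagged.

    When the servers are behind, a request sent with the maxlag parameter
    is answered with error code 'maxlag'; the etiquette page asks clients
    to wait and retry rather than keep hammering the servers.
    """
    for attempt in range(max_retries):
        resp = do_request()
        if resp.get("error", {}).get("code") != "maxlag":
            return resp
        # Simple increasing delay between retries; a production client
        # would honour the Retry-After header instead.
        time.sleep(base_delay * (attempt + 1))
    raise RuntimeError("server stayed lagged after %d retries" % max_retries)
```

With this wrapper, a lagged server slows the bot down automatically instead of the bot competing with the replication backlog.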
On Tue, 2020-01-28 at 08:13 +0100, wp1080397-lsrs wp1080397-lsrs
wrote:
Dear friends,
We have been working for some months on a Wikidata project and have run into an issue with edit performance. I began with the Wikidata Java API, but when I tried to increase the edit speed the Java library held edits back and inserted delays, which reduced the edit throughput as well.
I then chose to edit with Pywikibot, but in my experience that reduced the edit rate even more.
In the end we used the procedure indicated here:
https://www.mediawiki.org/wiki/API:Edit#Example
with multithreading, and we reached a maximum of 10.6 edits per second.
My question is: does anyone have experience achieving a higher edit speed?
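For what it's worth, the pattern described above (a thread pool of editing workers behind a shared rate cap) can be sketched like this in Python; `edit_one` stands in for whatever performs a single API edit and is not a real library call:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Cap the aggregate request rate across all worker threads."""

    def __init__(self, max_per_second):
        self.interval = 1.0 / max_per_second
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def acquire(self):
        # Reserve the next send slot under the lock, then sleep outside it
        # so waiting threads do not serialize on the lock itself.
        with self.lock:
            now = time.monotonic()
            wait = self.next_slot - now
            self.next_slot = max(self.next_slot, now) + self.interval
        if wait > 0:
            time.sleep(wait)


def edit_items(items, edit_one, max_per_second=10, workers=4):
    """Run edit_one(item) over items from a thread pool while a shared
    limiter keeps the total edit rate at or below the cap."""
    limiter = RateLimiter(max_per_second)

    def worker(item):
        limiter.acquire()
        return edit_one(item)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worker, items))
```

The point of the shared limiter is that adding more threads hides request latency without raising the aggregate rate seen by the servers, which is what the etiquette guidelines care about.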
Currently we need to write 1,500,000 items, and at that rate the task would take about 5 working days.
Best regards
Luis Ramos
Senior Java Developer
(Semantic Web Developer)
PST.AG
Jena, Germany.
_______________________________________________
Mediawiki-api mailing list
Mediawiki-api(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
Luis Ramos
Senior Java Developer
(Semantic Web Developer)
PST.AG
Jena, Germany.