Dear all,
First, let me introduce myself: I am Pascal Lefeuvre from the French
national library (BnF).
I am the scrum master on a project that aims to build the new software for
cataloguing documents at the BnF.
For this software, we have decided to use a private Wikibase instance to
store and manage the data we will produce.
We are very interested in what you have done, Jesper, since we will need to
initialize our Wikibase with more than 50 million items.
We have run some experiments using the MediaWiki/Wikibase API.
We developed a bot that calls the item creation API from several threads
in order to speed up the process, but this is not sufficient.
We plan to run the bot on several servers at the same time to see
whether it goes faster.
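To make the approach concrete, here is a minimal sketch of such a
multi-threaded bot. It is not our actual code: the function names and the
`send` callable (which would POST the parameters to api.php and return the
decoded JSON) are hypothetical. Only the `wbeditentity` action with
`new=item` comes from the Wikibase API itself.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def build_create_item_params(label, language, token):
    """Build the form parameters for one wbeditentity call creating an item."""
    data = {"labels": {language: {"language": language, "value": label}}}
    return {
        "action": "wbeditentity",
        "new": "item",
        "data": json.dumps(data),
        "token": token,  # CSRF token obtained after login
        "format": "json",
    }


def create_items(labels, send, token, language="fr", workers=8):
    """Create one item per label, issuing the requests from a thread pool.

    `send` is a hypothetical callable (assumption): it takes the parameter
    dict, POSTs it to the wiki's api.php and returns the decoded response.
    Results come back in the same order as `labels`.
    """
    def create(label):
        return send(build_create_item_params(label, language, token))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(create, labels))
```

Keeping `send` as a parameter means the threading logic can be tried with a
stub before pointing it at a real instance. The threads only overlap the
network wait on the client side, so the wiki's own write path remains the
limiting factor.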
I have a question about what you did: are the Elasticsearch index and
Blazegraph automatically synchronized when you create your items directly
in the database without using the API, or do you have to run specific
scripts to synchronize everything?
Thank you again for your experiments.
Pascal
From: "Jesper Zedlitz" <jesper(a)zedlitz.de>
To: wikibaseug(a)lists.wikimedia.org
Date: 11/06/2020 15:17
Subject: Re: [Wikibase] propographical data and insert performance
Sent by: "Wikibaseug" <wikibaseug-bounces(a)lists.wikimedia.org>
After I have cleaned up my demo code a bit I am going to share it
via GitHub.
Yes, please! I'm really interested in such a speedup. We need to load
eight million items and any speedup is appreciated :)
You can find it here:
https://github.com/jze/wikibase-insert
Jesper
_______________________________________________
Wikibaseug mailing list
Wikibaseug(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibaseug