@Renat, thank you for starting the Phabricator task!
@Jesper, this looks really great! Do you know if this takes care of the
secondary indexing for labels, specifically the wbt_item_terms,
wbt_term_in_lang, wbt_text, and wbt_text_in_lang tables? I notice that
these tables are not mentioned in the code.
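
For concreteness, here is a rough sketch (in Java/JDBC, but not code
from wikibase-insert) of the chain I understand a single label insert
to follow through those four tables. The connection URL, credentials,
and item id are placeholders, the column names follow the normalized
term store and may differ between versions, and the type id would
really have to be looked up in wbt_type:

import java.sql.*;

public class TermStoreSketch {
    // Generic insert helper returning the auto-generated primary key.
    static long insert(Connection c, String sql, Object... args)
            throws SQLException {
        try (PreparedStatement ps =
                 c.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS)) {
            for (int i = 0; i < args.length; i++) ps.setObject(i + 1, args[i]);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                return keys.getLong(1);
            }
        }
    }

    public static void main(String[] argv) throws SQLException {
        try (Connection c = DriverManager.getConnection(
                "jdbc:mysql://localhost/wikibase", "wikiuser", "secret")) {
            c.setAutoCommit(false);
            // 1. wbt_text: the deduplicated text itself (real code would
            //    first look for an existing row; plain INSERT for brevity)
            long textId = insert(c,
                "INSERT INTO wbt_text (wbx_text) VALUES (?)", "Douglas Adams");
            // 2. wbt_text_in_lang: that text in a given language
            long textInLangId = insert(c,
                "INSERT INTO wbt_text_in_lang (wbxl_language, wbxl_text_id)"
                + " VALUES (?, ?)", "en", textId);
            // 3. wbt_term_in_lang: the typed term; the type id must match
            //    the 'label' row of wbt_type on the target install
            long termInLangId = insert(c,
                "INSERT INTO wbt_term_in_lang (wbtl_type_id,"
                + " wbtl_text_in_lang_id) VALUES (?, ?)", 1, textInLangId);
            // 4. wbt_item_terms: link to the item (numeric part of the Q-id)
            insert(c,
                "INSERT INTO wbt_item_terms (wbit_item_id,"
                + " wbit_term_in_lang_id) VALUES (?, ?)", 42L, termInLangId);
            c.commit();
        }
    }
}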
Also one doubt: when you say "if you do this without a transaction",
do you mean doing it without an explicit transaction, i.e., with a
transaction per operation? I guess the default behaviour is that each
operation forms its own transaction and does the corresponding
logging, latching, etc., for each insert, update, and so on?
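
To illustrate the distinction I am asking about, here is a minimal
JDBC sketch of my own (reusing the wbt_text column name from above,
with the same caveats): with autocommit on, which is JDBC's default,
every statement is its own transaction with its own log flush, whereas
switching it off and committing once amortizes that cost over the
whole batch.

import java.sql.*;
import java.util.List;

class AutocommitDemo {
    // With autocommit on (JDBC's default), each executeUpdate() is its
    // own transaction, with its own redo-log write and flush. Turning it
    // off and committing once amortizes that cost over the whole batch.
    static void insertAll(Connection c, List<String> texts)
            throws SQLException {
        c.setAutoCommit(false);            // otherwise: one commit per row
        try (PreparedStatement ps = c.prepareStatement(
                "INSERT INTO wbt_text (wbx_text) VALUES (?)")) {
            for (String t : texts) {
                ps.setString(1, t);
                ps.addBatch();             // queue locally, no round-trip
            }
            ps.executeBatch();
            c.commit();                    // one flush for all rows
        }
    }
}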
@Dennis, agreed that this is part of the issue. Some of the scripts do
provide options for batching, which certainly helps significantly, but
importing at scale can still generate lots of transactions / tasks /
requests. It seems that some of the scripts for secondary indexing,
however, do not support batching.
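
If it helps, the kind of batching I have in mind is something like the
following (again just a sketch with illustrative names, not taken from
any of the maintenance scripts): commit every batchSize rows, which
gives far fewer transactions than committing per row, without building
up one enormous transaction.

import java.sql.*;
import java.util.Iterator;

class ChunkedImport {
    // Commit every batchSize rows: far fewer transactions than per-row
    // commits, without the log/lock pressure of one giant transaction.
    static void importTexts(Connection c, Iterator<String> texts,
                            int batchSize) throws SQLException {
        c.setAutoCommit(false);
        try (PreparedStatement ps = c.prepareStatement(
                "INSERT INTO wbt_text (wbx_text) VALUES (?)")) {
            int pending = 0;
            while (texts.hasNext()) {
                ps.setString(1, texts.next());
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    c.commit();            // bounded transaction size
                    pending = 0;
                }
            }
            if (pending > 0) {             // flush the final partial batch
                ps.executeBatch();
                c.commit();
            }
        }
    }
}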
Best,
Aidan
On 2021-07-23 10:39, Jesper Zedlitz wrote:
>> Does anyone have experience, tips or pointers on converting
>> and loading large-ish scale legacy data into Wikibase? Is there no
>> complete solution (envisaged) for this right now?
>
> Even though this topic is a few days old, I would like to add some of my
> experiences. I had the same problem about a year ago and wrote a Java
> program to insert millions of items pretty fast. It works for the LTS
> version. I don't know if it also works with the current version.
>
> You can find the code here:
> https://github.com/jze/wikibase-insert
>
> Best wishes,
> Jesper