@Renat, thank you for creating the Phabricator ticket!
@Jesper, this looks really great! Do you know if this takes care of the secondary indexing for labels, specifically the wbt_item_terms, wbt_term_in_lang, wbt_text, and wbt_text_in_lang tables? I notice that these tables are not mentioned in the code.
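For anyone following along: as far as I understand the term store schema, a label is resolved by chaining those tables together. The sketch below (in Java/JDBC, since that is what Jesper's tool uses) shows the join as I understand it; the column names, the wbt_type join, the connection details, and the item id are all my own assumptions from reading the schema documentation, not something taken from Jesper's code:

    import java.sql.*;

    public class LabelLookupDemo {
        public static void main(String[] args) throws SQLException {
            // Join chain (as I understand it): wbt_item_terms ->
            // wbt_term_in_lang -> wbt_text_in_lang -> wbt_text, with
            // wbt_type distinguishing labels from descriptions/aliases.
            String sql =
                "SELECT wbxl.wbxl_language, wbx.wbx_text "
                + "FROM wbt_item_terms wbit "
                + "JOIN wbt_term_in_lang wbtl ON wbit.wbit_term_in_lang_id = wbtl.wbtl_id "
                + "JOIN wbt_type wby ON wbtl.wbtl_type_id = wby.wby_id "
                + "JOIN wbt_text_in_lang wbxl ON wbtl.wbtl_text_in_lang_id = wbxl.wbxl_id "
                + "JOIN wbt_text wbx ON wbxl.wbxl_text_id = wbx.wbx_id "
                + "WHERE wbit.wbit_item_id = ? AND wby.wby_name = 'label'";
            // Placeholder connection details; assumes the MariaDB Connector/J driver.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mariadb://localhost:3306/wikibase", "wikiuser", "secret");
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setLong(1, 42L);  // numeric part of the item id, e.g. Q42
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + ": " + rs.getString(2));
                    }
                }
            }
        }
    }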
Also, one question: when you say "if you do this without a transaction", do you mean doing it without an explicit transaction, i.e., with a transaction per operation? I assume the default behaviour is that each operation forms its own transaction, incurring the corresponding logging, latching, etc., for each individual insert, update, and so on?
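Just to make the distinction concrete, here is a minimal JDBC sketch of the two behaviours I have in mind; the wbt_text target table, connection URL, and credentials are placeholders, and it again assumes the MariaDB Connector/J driver:

    import java.sql.*;

    public class TransactionDemo {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mariadb://localhost:3306/wikibase", "wikiuser", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO wbt_text (wbx_text) VALUES (?)")) {

                // autoCommit is true by default: each executeUpdate() is
                // its own transaction, with its own logging, latching, etc.
                ps.setString(1, "row in its own implicit transaction");
                ps.executeUpdate();

                // One explicit transaction around many statements:
                conn.setAutoCommit(false);
                for (int i = 0; i < 1000; i++) {
                    ps.setString(1, "row " + i);
                    ps.executeUpdate();
                }
                conn.commit();  // single commit for all 1000 inserts
            }
        }
    }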
@Dennis, agreed that this is part of the issue. Some of the scripts do provide options for batching, which certainly helps significantly, but importing at scale can still generate a lot of transactions / tasks / requests. Some of the scripts for secondary indexing, however, do not appear to support batching at all.
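For comparison, batching at the client level would look roughly like the sketch below; again the table and connection details are placeholder assumptions, and this is not how any particular maintenance script is implemented. One commit per batch bounds both the number of round trips and the per-transaction overhead:

    import java.sql.*;

    public class BatchInsertDemo {
        private static final int BATCH_SIZE = 1000;

        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mariadb://localhost:3306/wikibase", "wikiuser", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO wbt_text (wbx_text) VALUES (?)")) {
                conn.setAutoCommit(false);
                for (int i = 0; i < 1_000_000; i++) {
                    ps.setString(1, "text " + i);
                    ps.addBatch();
                    if ((i + 1) % BATCH_SIZE == 0) {
                        ps.executeBatch();  // one round trip per 1000 rows
                        conn.commit();      // one transaction per batch
                    }
                }
                ps.executeBatch();          // flush any remainder
                conn.commit();
            }
        }
    }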
Best, Aidan
On 2021-07-23 10:39, Jesper Zedlitz wrote:
Does anyone have experience, tips, or pointers on converting and loading largish-scale legacy data into Wikibase? Is there no complete solution (or one envisaged) for this right now?
Even though this topic is a few days old, I would like to add some of my experience. I had the same problem about a year ago and wrote a Java program that inserts millions of items pretty fast. It works with the LTS version; I don't know whether it also works with the current version.
You can find the code here: https://github.com/jze/wikibase-insert
Best wishes, Jesper