It's great, but I'm trying to optimize even further. If I'm not mistaken, there is no way to reduce the number of requests needed to import my items, so the next step would be to implement some form of multiprocessing, as suggested by Pellegrino Prevete.
I tried to implement that, but unfortunately Pywikibot raises an APIError: invalid CSRF token. It looks as if the multiple processes share the same CSRF token to create / edit an item, which is a bit weird.
All in all, it got me thinking: is it even possible to use multiprocessing with Pywikibot at all?
Here's some pseudo-code to show what I did:
'''
from multiprocessing.dummy import Pool  # thread pool, despite the module name

# Run self.process_line on every parsed CSV row, four rows at a time.
pool = Pool(processes=4)
results = pool.map(self.process_line, csv)
pool.close()
pool.join()
'''
Here `csv` is a list (the already-parsed CSV file), and `self.process_line` is my method that reads the data in the current CSV line and creates the item from it.
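To give an idea of the work per row, here is a simplified sketch of that kind of per-line creation with Pywikibot (the site, the row layout and the property/target IDs below are placeholders, not my actual data); each edit call in it is a separate write request:

'''
import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')  # placeholder; point this at your own Wikibase
repo = site.data_repository()

def process_line(row):
    # One request to create the item with its label...
    item = pywikibot.ItemPage(repo)
    item.editLabels(labels={'en': row[0]}, summary='Create item from CSV row')
    # ...then one more request per claim.
    claim = pywikibot.Claim(repo, 'P31')             # placeholder property
    claim.setTarget(pywikibot.ItemPage(repo, 'Q5'))  # placeholder target
    item.addClaim(claim, summary='Add claim from CSV row')
    return item.getID()
'''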
1/ The idea is to read a CSV file and create an item with its properties for each line. So I have to loop over thousands of lines and, for each one, create an item plus several associated claims, and it takes quite some time to do so (at least 1 hour to create 1000 items). I guess that's because for each line I create a new entity and new claims, which means multiple requests per line.
There is an API call, wbeditentity [1], that allows preparing an item with multiple claims which are then written to Wikibase in one call. Are you aware of wikibase universal bot [2] and wikibase-tools [3]? Both cover functionality that should allow you to do what you describe above, and both use the wbeditentity call.
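For illustration, here is a rough sketch of how that single-call approach can look from Pywikibot itself, via ItemPage.editEntity(), which goes through wbeditentity under the hood (the property and target IDs are placeholders):

'''
import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')  # placeholder; point this at your own Wikibase
repo = site.data_repository()

# Build the claims locally -- no API traffic yet.
claim = pywikibot.Claim(repo, 'P31')             # placeholder property
claim.setTarget(pywikibot.ItemPage(repo, 'Q5'))  # placeholder target

data = {
    'labels': {'en': {'language': 'en', 'value': 'Example item'}},
    'claims': [claim.toJSON()],
}

# A single wbeditentity call creates the item with its label and all its claims.
item = pywikibot.ItemPage(repo)
item.editEntity(data, summary='Create item with label and claims in one request')
'''

Batching a whole CSV row into one data dict like this should cut each line down to a single write request instead of one per claim.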