On Mon, Feb 4, 2019 at 16:36, Kévin Bois kevin.bois@biblissima-condorcet.fr wrote:
Hello,
I'm trying to write a pywikibot script which reads and creates items/properties on my Wikibase instance. Following pieces of tutorials and script examples, I managed to write something that works.
1/ The idea is to read a CSV file and create an item, with its properties, for each line. So I have to loop over thousands of lines, creating an item and multiple associated claims for each one, and that takes quite some time (at least 1 hour to create 1,000 items). I guess it's because every line involves a new entity and new claims, which means multiple requests per line. Some pseudo-code from my script:

To create a new item, I use: repo.editEntity({}, {}, summary='new item'), assuming repo = site.data_repository()
To create a new claim, I use: self.user_add_claim_unless_exists(item, claim), assuming my bot inherits from WikidataBot
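In full, the per-line logic is roughly this (a simplified sketch; the class name, method name, and the way CSV columns map to property IDs are made up for illustration):

    import pywikibot
    from pywikibot.bot import WikidataBot

    class CsvImportBot(WikidataBot):  # hypothetical name
        def import_row(self, row):
            # One API request just to create the empty item...
            created = self.repo.editEntity({}, {}, summary='new item')
            item = pywikibot.ItemPage(self.repo, created['entity']['id'])
            # ...then one more API request for every claim on the line.
            for prop_id, value in row.items():
                claim = pywikibot.Claim(self.repo, prop_id)
                claim.setTarget(value)
                self.user_add_claim_unless_exists(item, claim)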
Is there a better way to optimize that kind of bulk import?
Not sure about this, but you might consider using the low-level API functions directly, or even crafting your API calls by hand. That kind of defeats the purpose of using pwb, but oh well...
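For instance, something along these lines should create an item together with all of its claims in a single wbeditentity request, instead of one request per claim (untested sketch; the label, property P1 and the string value are placeholders for whatever your CSV provides):

    import json
    import pywikibot
    from pywikibot.data import api

    site = pywikibot.Site()
    repo = site.data_repository()

    # Build the whole entity -- label plus every statement -- as one
    # JSON document, so a single API call replaces N+1 requests per row.
    data = {
        'labels': {'en': {'language': 'en', 'value': 'Example item'}},
        'claims': [{
            'mainsnak': {
                'snaktype': 'value',
                'property': 'P1',  # placeholder property ID
                'datavalue': {'value': 'custom-id-001', 'type': 'string'},
            },
            'type': 'statement',
            'rank': 'normal',
        }],
    }

    req = api.Request(site=repo, parameters={
        'action': 'wbeditentity',
        'new': 'item',
        'data': json.dumps(data),
        'token': repo.tokens['edit'],
        'summary': 'bulk import',
        'bot': 1,
    })
    reply = req.submit()
    print(reply['entity']['id'])  # QID of the newly created item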
--
2/ I kind of have the same problem if I want to check whether an item already exists, because first I need to fetch all existing items and check whether they are in my CSV or not. (The CSV does not contain QIDs, but it does contain a "custom" ID that I've created and added as a property to each item.)
This sounds like a great job for a SPARQL query (see https://query.wikidata.org for the public endpoint for Wikidata). Is it feasible to add such an interface to your instance?
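Once you have an endpoint, a single query can return every (item, custom ID) pair, and the existence check for each CSV line becomes a dictionary lookup. A rough sketch using pywikibot's SPARQL helper (the endpoint URL and property P1 are made up; this also assumes your query service defines the wdt: prefix, as WDQS does):

    from pywikibot.data import sparql

    ENDPOINT = 'https://query.example.org/sparql'  # placeholder endpoint

    # P1 stands in for your "custom ID" property.
    QUERY = '''
    SELECT ?item ?customId WHERE {
      ?item wdt:P1 ?customId .
    }
    '''

    client = sparql.SparqlQuery(endpoint=ENDPOINT)
    rows = client.select(QUERY)  # list of dicts, one per result row

    # Map custom ID -> QID, so each CSV line is a single lookup.
    existing = {r['customId']: r['item'].rsplit('/', 1)[-1] for r in rows}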
Regards, Strainu
--
I hope I was clear enough; any relevant example, idea, or advice would be much appreciated. Bear in mind that I'm a beginner with the whole ecosystem, so I'm open to any recommendation. Thanks!

_______________________________________________
pywikibot mailing list
pywikibot@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikibot