We have been facing the same issue, because we need to upload 20 million items, and I wonder if you were able to update those tables at least? Perhaps we could work on the others together?
I have now been able to insert the data directly into the database.
The process consists of these steps:
- generate the data for an item in JSON
- determine the next Q number and update the JSON item data accordingly
- insert the data into the various database tables
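For illustration, a minimal sketch of the first two steps might look like this (the JSON is drastically shortened, and reading the first free numeric id from Wikibase's `wb_id_counters` table is an assumption, not necessarily what the actual program does):

```java
// Sketch of steps 1 and 2: build the item JSON and assign the next Q number.
// Real items also need claims, sitelinks, etc.; this is only the skeleton.
public class ItemJsonBuilder {

    private static final String TEMPLATE =
            "{\"type\":\"item\",\"id\":\"%s\"," +
            "\"labels\":{\"en\":{\"language\":\"en\",\"value\":\"%s\"}}}";

    private long nextNumericId;

    public ItemJsonBuilder(long firstFreeId) {
        // determined once at start-up (assumption: read from wb_id_counters)
        this.nextNumericId = firstFreeId;
    }

    /** Returns the JSON for one item and advances the Q counter. */
    public String nextItemJson(String englishLabel) {
        String qId = "Q" + nextNumericId++;
        // note: englishLabel would need proper JSON escaping in real code
        return String.format(TEMPLATE, qId, englishLabel);
    }
}
```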
There are several foreign keys in the database. Therefore, you can either
a) use MySQL's auto-increment and read the assigned ids after an insert, or
b) assume that you are the only process writing to the database and assign the keys in advance.
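As a rough JDBC sketch of the two options (table and column names are simplified; a real MediaWiki `page` row needs more columns than shown here):

```java
import java.sql.*;

public class KeyStrategies {

    // Option a): let MySQL assign the id and read it back after the insert.
    static long insertWithAutoIncrement(Connection con, String title) throws SQLException {
        // simplified: a real page row also needs page_latest, page_len, page_random, ...
        String sql = "INSERT INTO page (page_namespace, page_title) VALUES (?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS)) {
            ps.setInt(1, 120);              // item namespace (assumption: default Wikibase setup)
            ps.setString(2, title);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                return keys.getLong(1);     // the auto-increment page_id
            }
        }
    }

    // Option b): we are the only writer, so we hand out the page_id ourselves.
    static void insertWithPredefinedId(Connection con, long pageId, String title) throws SQLException {
        String sql = "INSERT INTO page (page_id, page_namespace, page_title) VALUES (?, ?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, pageId);
            ps.setInt(2, 120);
            ps.setString(3, title);
            ps.executeUpdate();
        }
    }
}
```

With option b) the same pre-computed id can also be used directly for the rows in the other tables that reference it via foreign keys.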
I have written a Java program to perform these steps. The process is still terribly slow. Generating and updating the JSON item data is blazing fast; I can process thousands of items per second.
The bottleneck is the MySQL database. Up to now I have used the standard docker-compose setup. With the Java program I can write only one item per second.
So I tried option b) with pre-defined ids and wrote the SQL statements into a text file. I fed these SQL statements into the MySQL command-line tool, but after two hours I lost patience.
Now I will try to optimize the MySQL database setup.
Jesper
I suspect that I have found the solution. The trick is to put the insert operations into a transaction.
In a simple experiment with 1,000 items it took 760 ms per item. Within a transaction an item could be inserted in less than 3 ms. What a speed-up!
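The core of the change, as a rough JDBC sketch (the real code inserts into several tables per item; the chunk size is just an example):

```java
import java.sql.*;
import java.util.List;

public class TransactionalInsert {

    // Insert many items inside one transaction instead of one autocommit per statement.
    static void insertAll(Connection con, List<String> itemJsonList) throws SQLException {
        con.setAutoCommit(false);            // the decisive step
        try {
            int count = 0;
            for (String json : itemJsonList) {
                insertItem(con, json);       // the per-item INSERT statements
                if (++count % 10_000 == 0) {
                    con.commit();            // commit in chunks so the transaction stays manageable
                }
            }
            con.commit();
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(true);
        }
    }

    private static void insertItem(Connection con, String json) throws SQLException {
        // placeholder for the actual inserts into the Wikibase tables
    }
}
```

Most of the difference seems to come from the fact that with autocommit every single INSERT is its own transaction and is flushed to disk on commit.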
Now the insertion of my 40 million items seems to be doable. :-)
After I have cleaned up my demo code a bit, I am going to share it via GitHub.
Jesper
Hi Jesper,
Yes, please! I'm really interested in such a speedup. We need to load eight million items and any speedup is appreciated :)
Cheers, Hans-Jürgen
You can find it here: https://github.com/jze/wikibase-insert
Jesper
Dear all,
First, let me introduce myself: I am Pascal Lefeuvre from the French national library (BnF). I am the scrum master on a project which aims at building the new software for cataloguing documents at the BnF. For this software, we have decided to use a private Wikibase instance to store and manage the data we will produce.
We are very interested in what you have done, Jesper, since we will need to initialize our Wikibase with more than 50 million items. We have made some experiments using the MediaWiki/Wikibase API: we developed a bot that calls the create-item API in several threads in order to speed up the process, but it is not sufficient. We plan to run the bot on several servers at a time to see if it goes faster.
I have some questions about what you did: are the Elasticsearch index and Blazegraph automatically synchronized when you create your items directly in the database without using the API, or do you have to run specific scripts to synchronize everything?
Thank you again for your experiments.
Pascal
De : "Jesper Zedlitz" jesper@zedlitz.de A : wikibaseug@lists.wikimedia.org Date : 11/06/2020 15:17 Objet : Re: [Wikibase] propographical data and insert performance Envoyé par : "Wikibaseug" wikibaseug-bounces@lists.wikimedia.org
After I have cleaned up my demo code a bit I am going to share it via GitHub.
yes, please! I'm really interested in such a speedup. We need to load eight million of items and any speedup is appreciated :)
You can find it here: https://github.com/jze/wikibase-insert
Jesper
_______________________________________________ Wikibaseug mailing list Wikibaseug@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikibaseug
En raison de la situation sanitaire en France concernant le Covid-19, et suite aux instructions du Gouvernement, tous les sites de la Bibliothèque nationale de France sont fermés au public jusqu’à nouvel ordre. Avant d'imprimer, pensez à l'environnement.
After a little modification to my code at https://github.com/jze/wikibase-insert/ the `recentchanges` table is also updated correctly.
Inserted items can be found via the SPARQL interface. You can see the updates in the log output of docker-compose. I have not yet measured how long it takes until all inserted items have been written to Blazegraph.
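For a quick check from Java, something like this queries the local query service (the endpoint URL and port are assumptions based on the standard docker-compose setup and may differ in your installation):

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SparqlCheck {
    public static void main(String[] args) throws Exception {
        // Assumed local query service endpoint of the docker-compose setup; adjust if yours differs.
        String endpoint = "http://localhost:8989/bigdata/namespace/wdq/sparql";
        String query = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(endpoint + "?query=" +
                        URLEncoder.encode(query, StandardCharsets.UTF_8)))
                .header("Accept", "application/sparql-results+json")
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```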
However, the items do not appear in Elasticsearch. For my purposes it is enough to query the data via the SPARQL endpoint. That's why I did not look at the synchronization to Elasticsearch.
Jesper