It's generally advised to reply to the replies to your original mailing list post, rather than creating a very similar post a few days later...
On Thu, 11 Jun 2020 at 13:33, Leandro Tabares Martín <leandro.tabaresmartin@uhasselt.be> wrote:
Hi,
I have downloaded Blazegraph already compiled from [1]. I also made the optimizations indicated at [2].
For the loading process I'm following the instructions given in the "getting-started.md" file in the "docs" folder of the compiled distribution [1]. That means:
1- Munge the data with:
   ./munge.sh -f data/wikidata-20150427-all-BETA.ttl.gz -d data/split -l en -s
2- Start the loading process with:
   ./loadRestAPI.sh -n wdq -d `pwd`/data/split
The loading process starts at a rate of 84,352, but the rate has progressively dropped to 3,362 after 36 hours of loading.
I'm running the process on an HPC node with SSD storage, and I'm giving the loading process 3 cores and 120 GB of RAM. However, I notice that average processor usage never goes above 1.6 cores and maximum RAM usage is 14 GB.
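Those numbers suggest the load is neither CPU- nor memory-bound, so it may be worth confirming where the time actually goes. A minimal sketch using standard Linux/JDK tools (here `<pid>` is a placeholder for the Blazegraph server's process id, which you would look up yourself):

```shell
# Sample extended disk statistics every 5 s; a saturated SSD shows up
# as high %util and growing await times on the device holding the journal.
iostat -x 5

# Sample JVM garbage-collection activity for the Blazegraph server;
# steadily climbing full-GC counts (FGC) and time (FGCT) would point
# at the reduced 4 GB heap being too small for the growing indices.
jstat -gcutil <pid> 5000
```

If the disk is mostly idle and GC is quiet, the slowdown is more likely inside Blazegraph's B+Tree maintenance as the journal grows than in the hardware.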
I have also seen [3], and I'm running the load natively (without containers). One difference from [3] is that I've reduced the JVM heap to 4 GB, as [2] suggests.
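Since loadRestAPI.sh only drives the REST endpoint, the heap that matters is the one the Blazegraph server itself was started with. As a sketch, under the assumption that your distribution's runBlazegraph.sh honors a HEAP_SIZE environment variable (the Wikidata query service scripts do; check the script if yours differs):

```shell
# Assumption: runBlazegraph.sh reads HEAP_SIZE to size the server JVM.
# Restart the server with a larger heap before retrying the load.
HEAP_SIZE=16g ./runBlazegraph.sh
```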
So what else could I do to improve the loading performance?
Thanks,
Leandro
[1] http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.wikidata.query.rdf%22...
[2] https://github.com/blazegraph/database/wiki/IOOptimization
[3] https://addshore.com/2019/10/your-own-wikidata-query-service-with-no-limits-...

_______________________________________________
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech