Hi,
Please find attached a picture of Blazegraph's performance during the load
of the Wikidata dataset. This is after increasing the resources assigned to
the job to 24 cores and 240 GB of RAM. Do you think this is normal behaviour?
Thanks,
Leandro
On Thu, Jun 11, 2020 at 2:33 PM Leandro Tabares Martín <
leandro.tabaresmartin(a)uhasselt.be> wrote:
Hi,
I have downloaded Blazegraph already compiled from [1]. I also made the
optimizations indicated in [2].
For the loading process I'm following the instructions given in the
"getting-started.md" file that comes in the "docs" folder of the compiled
distribution [1]. That means:
1- Munge the data with: ./munge.sh -f
data/wikidata-20150427-all-BETA.ttl.gz -d data/split -l en -s
2- Start the loading process with: ./loadRestAPI.sh -n wdq -d
`pwd`/data/split
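In full, the two steps above look like this (a sketch, not my exact invocation: the paths and dump filename are the ones from my run, and my understanding is that -l en keeps only English labels while -s skips site links; the sketch only prints the commands so it is safe to run anywhere):

```shell
#!/usr/bin/env bash
set -u

# Paths from my run; adjust to your own layout.
DUMP="data/wikidata-20150427-all-BETA.ttl.gz"
SPLIT_DIR="data/split"
NAMESPACE="wdq"

# Step 1: munge the dump into loadable chunks
# (-l en: keep only English labels; -s: skip site links)
MUNGE_CMD="./munge.sh -f $DUMP -d $SPLIT_DIR -l en -s"

# Step 2: load the munged chunks into the namespace via the REST API
LOAD_CMD="./loadRestAPI.sh -n $NAMESPACE -d $PWD/$SPLIT_DIR"

# Echo rather than execute, so the sketch can be inspected first.
echo "$MUNGE_CMD"
echo "$LOAD_CMD"
```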
The loading process starts at a rate of 84352. However, the rate has
progressively dropped to 3362 after 36 hours of loading.
I'm running the process on an HPC node with SSD storage, and I've assigned
the loading process 3 cores and 120 GB of RAM. On the other hand, I notice
that the average processor usage never goes above 1.6 and the maximum RAM
usage is 14 GB.
I also saw [3], and I'm running the load natively (without containers).
One difference from [3] is that I've reduced the JVM heap to 4 GB, as [2]
suggests.
So what else could I do to improve the loading performance?
Thanks,
Leandro
[1]
http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.wikidata.query.rdf%2…
[2]
https://github.com/blazegraph/database/wiki/IOOptimization
[3]
https://addshore.com/2019/10/your-own-wikidata-query-service-with-no-limits…