Hi
I was able to reduce the load time to approx. 9.1 hours (32890338 msec) in
Virtuoso 7.
I used six SSDs of 1 TB each in RAID 0 (mdadm software RAID; I have not
tried hardware RAID).
The virtuoso.ini for 256G RAM is
on August 30th,
The size is 387G uncompressed, and the resulting virtuoso.db file is 362G.
The total number of triples is 9 470 700 617.
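
For anyone who wants to sanity-check these figures, here is a minimal sketch that converts the reported load time to hours and derives an average load rate. The two constants are the numbers from this message; the function name load_stats is only illustrative, and the rate is a plain average over the whole run, not a measured throughput.

```python
# Figures reported in this message.
LOAD_TIME_MS = 32_890_338        # total load time in milliseconds
TOTAL_TRIPLES = 9_470_700_617    # total number of triples loaded


def load_stats(load_time_ms, triples):
    """Return (hours, average triples per second) for a bulk load."""
    seconds = load_time_ms / 1000
    hours = seconds / 3600
    return hours, triples / seconds


hours, rate = load_stats(LOAD_TIME_MS, TOTAL_TRIPLES)
print(f"{hours:.2f} hours, about {rate:,.0f} triples/s on average")
```

The same function can be applied to the 43 hours quoted below (43 * 3600 * 1000 msec), keeping in mind that that run used a different machine and dump.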
Have a look at the simple patch here (it is just a workaround).
Check the Dockerfile, which retrieves the patch from my forked Virtuoso Git
repository.
Best,
On Sun, Sep 1, 2019 at 13:38, Edgar Meij <edgar.meij(a)gmail.com> wrote:
Thanks for this, Kingsley.
Based on
https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5…
(copy-pasted below), it seems that it takes 43 hours to load, is that
correct?
Also, what is the "patch for geometry" mentioned there? I'm assuming that
is the patch meant to address
https://github.com/openlink/virtuoso-opensource/issues/295 and
https://community.openlinksw.com/t/non-terrestrial-geo-literals/359,
correct? Is it simply disabling the data validation code? Can you share the
patch?
Thanks,
Edgar
Other Information

Architecture                     x86_64
CPU op-mode(s)                   32-bit, 64-bit
Byte Order                       Little Endian
CPU(s)                           12
On-line CPU(s) list              0-11
Thread(s) per core               2
Core(s) per socket               6
Socket(s)                        1
NUMA node(s)                     1
Vendor ID                        GenuineIntel
CPU family                       6
Model                            63
Model name                       Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
Stepping                         2
CPU MHz                          1,199.92
CPU max MHz                      3,800.00
CPU min MHz                      1,200.00
BogoMIPS                         6,984.39
Virtualization                   VT-x
L1d cache                        32K
L1i cache                        32K
L2 cache                         256K
L3 cache                         15360K
NUMA node0 CPU(s)                0-11
RAM                              128G
wikidata-20190610-all-BETA.ttl   383G
Virtuoso version                 07.20.3230 (with patch for geometry)
Time to load                     43 hours
virtuoso.db                      340G
On Wed, Aug 14, 2019 at 12:10 AM Kingsley Idehen <kidehen(a)openlinksw.com>
wrote:
Hi Everyone,
A little FYI.
We have loaded Wikidata into a Virtuoso instance accessible via SPARQL
[1]. One benefit is helping to understand Wikidata using our Faceted
Browsing Interface for Entity Relationship Types [2][3].
Links:
[1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint
[2] http://wikidata.demo.openlinksw.com/fct -- Faceted Browsing Interface
[3] About New York
<https://wikidata.demo.openlinksw.com/describe/?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ60&gp=16&go=&lp=940&invfp=IFP_OFF&sas=SAME_AS_OFF&distinct=1>
Enjoy!
Feedback always welcome too :)
--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Home Page:
http://www.openlinksw.com
Community Support:
https://community.openlinksw.com
Weblogs (Blogs):
Company Blog:
https://medium.com/openlink-software-blog
Virtuoso Blog:
https://medium.com/virtuoso-blog
Data Access Drivers Blog:
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
Personal Weblogs (Blogs):
Medium Blog:
https://medium.com/@kidehen
Legacy Blogs:
http://www.openlinksw.com/blog/~kidehen/
http://kidehen.blogspot.com
Profile Pages:
Pinterest:
https://www.pinterest.com/kidehen/
Quora:
https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter:
https://twitter.com/kidehen
Google+:
https://plus.google.com/+KingsleyIdehen/about
LinkedIn:
http://www.linkedin.com/in/kidehen
Web Identities (WebID):
Personal:
http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
:
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata