topic: removed semantic-web, guile, and added wikidata-tech.
Let's move the conversation to wikidata-tech@
Please remove wikidata(a)lists.wikimedia.org next time you reply.
On Sun, Dec 22, 2019 at 23:35, Ted Thibodeau Jr
<tthibodeau(a)openlinksw.com> wrote:
On Dec 22, 2019, at 03:17 PM, Amirouche Boubekki <amirouche.boubekki(a)gmail.com> wrote:
Hello all ;-)
I ported the code to Chez Scheme to do an apples-to-apples comparison
between GNU Guile and Chez, and took the time to run a few queries
against the Virtuoso package available in Ubuntu 18.04 (LTS).
Hi, Amirouche --
Kingsley's points about tuning Virtuoso to use available
RAM [1] and other system resources are worth looking into,
but a possibly more important first question is --
Exactly what version of Virtuoso are you testing?
If you followed the common script on Ubuntu 18.04, i.e., --
sudo apt update
sudo apt install virtuoso-opensource
-- then you likely have version 6.1.6 of VOS, the Open Source
Edition of Virtuoso, which shipped 2012-08-02 [2], and is far
behind the latest version of both VOS (v7.2.5+) and Enterprise
Edition (v8.3+)!
The easiest way to confirm what you're running is to review
the first "paragraph" of output from the command corresponding
to the name of your Virtuoso binary --
virtuoso-t -?
$ virtuoso-t -?
Virtuoso Open Source Edition (multi threaded)
Version 6.1.6.3127-pthreads as of Feb 6 2018
virtuoso-iodbc-t -?
I do not have that command. I use isql-vt:
$ isql-vt --help
OpenLink Interactive SQL (Virtuoso), version 0.9849b.
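A minimal Python sketch of the version check described above: it extracts the version number from a banner line like the one printed by `virtuoso-t -?` and compares it with 7.2. The function names and the 7.2 threshold are illustrative assumptions, not part of Virtuoso.

```python
import re

def parse_virtuoso_version(banner):
    """Return the leading numeric version as a tuple of ints, or None."""
    match = re.search(r"Version\s+(\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))

def is_older_than(version, minimum=(7, 2)):
    """True if the version's leading components sort before `minimum`."""
    return version[:len(minimum)] < minimum

# Banner line copied from the output quoted in this thread.
banner = "Version 6.1.6.3127-pthreads as of Feb 6 2018"
version = parse_virtuoso_version(banner)
print(version, is_older_than(version))  # (6, 1, 6, 3127) True
```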
If I'm right, and you're running 6.x, you'll get much better
test results just by running a current version of Virtuoso.
You can build VOS 7.2.6+ from source [3] (we'd recommend the
develop/7 branch [4] for the absolute latest), or download a
precompiled binary [5] of VOS 7.2.5.1 or 7.2.6.dev.
You can also try Enterprise Edition at no cost for 30 days [5].
Next round I will try the develop branch.
As I said previously, these benchmarks must be taken with a grain of
salt. First, the Virtuoso timings are reported by Virtuoso itself.
Second, on the nomunofu side, I do not convert the internal
representation into the external representation. Third, and most
important, this is just a glimpse of the full picture.
My mails are mainly meant to spark some interest or discussion with
Wikidata and Wikimedia, so that I can work full time on this. I have
already described my intent, which is to create a benchmark tool based
on Wikidata SPARQL logs [*], then use those to realistically benchmark
Virtuoso, the current solution, and a new solution (nomunofu) that I am
working on.
[*] https://iccl.inf.tu-dresden.de/web/Wissensbasierte_Systeme/WikidataSPARQL/en
Raw benchmarks would not tell the whole truth, because nomunofu can
rely on both WiredTiger and FoundationDB, which, as far as I know,
claim stronger guarantees than Virtuoso. The only way to know whether
Virtuoso is comparable to FoundationDB or WiredTiger would be for
Virtuoso to pass the Jepsen test harness (https://jepsen.io/).
I have not put all my eggs in one basket; I am considering other
options. But I think working for Wikimedia, on contract or in a
permanent position, would be best overall.
I will make another WDQS proposal, based on some feedback I have been
given on IRC to add more technical details (and improve the road map).
[1] http://vos.openlinksw.com/owiki/wiki/VOS/VirtRDFPerformanceTuning
[2] http://vos.openlinksw.com/owiki/wiki/VOS/VOSNews2012#2012-08-02%20--%20Anno….
[3] http://vos.openlinksw.com/owiki/wiki/VOS/VOSBuild
[4] https://github.com/openlink/virtuoso-opensource/tree/develop/7
[5] https://sourceforge.net/projects/virtuoso/files/virtuoso/
Spoiler: the new code is always faster.
The hard disk is SATA, and the CPU is an Intel(R) Xeon(R) CPU
E3-1220 V2 @ 3.10GHz.
I imported latest-lexeme.nt (6GB) using guile-nomunofu, chez-nomunofu
and Virtuoso:
- Chez takes 40 minutes to import 6GB
- Chez is 3 to 5 times faster than Guile
- Chez is 11% faster than Virtuoso
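As a rough worked example of what the import figure means, the sketch below converts "6GB in 40 minutes" into a throughput. It assumes that 6GB means 6 x 10^9 bytes and that the 40 minutes is wall-clock time; neither is stated in the thread, and the function name is made up for illustration.

```python
def throughput_mb_per_s(size_bytes, minutes):
    """Average throughput in megabytes (10^6 bytes) per second."""
    return size_bytes / 1e6 / (minutes * 60)

# Assumption: 6GB = 6e9 bytes, 40 minutes of wall-clock time.
chez = throughput_mb_per_s(6e9, 40)
print(f"{chez:.1f} MB/s")  # 2.5 MB/s
```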
How did you load the data? Did you use Virtuoso's bulk-load
facilities? This is the recommended method [6].
[6] http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader
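For readers who have not used the bulk loader, here is a hedged Python sketch that only builds the isql statements usually involved: `ld_dir` to register the files, `rdf_loader_run` to load them, and `checkpoint` to persist the result. The helper name and the `/data` directory are hypothetical, and the directory must also be listed under `DirsAllowed` in virtuoso.ini. See [6] for the authoritative procedure.

```python
def bulk_load_statements(directory, file_mask, graph_iri):
    """Build the isql statements for a one-directory bulk load."""
    def quote(value):
        # SQL string literal: double any embedded single quote.
        return "'" + value.replace("'", "''") + "'"

    return [
        "ld_dir(%s, %s, %s);" % (quote(directory), quote(file_mask), quote(graph_iri)),
        "rdf_loader_run();",
        "checkpoint;",
    ]

# "/data" is a made-up path; <http://fu> is the graph used in this thread.
for statement in bulk_load_statements("/data", "latest-lexeme.nt", "http://fu"):
    print(statement)
```

The printed statements are meant to be pasted into an isql session. The sketch deliberately does not run them itself.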
Regarding query time, Chez is still faster than Virtuoso, with or
without cache. The query I am testing is the following:
SELECT ?s ?p ?o
FROM <http://fu>
WHERE {
  ?s <http://purl.org/dc/terms/language> <http://www.wikidata.org/entity/Q150> .
  ?s <http://wikiba.se/ontology#lexicalCategory> <http://www.wikidata.org/entity/Q1084> .
  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?o
};
The first Virtuoso query takes 1295 msec.
The second takes 331 msec.
Then it stabilizes around 200 msec.
chez nomunofu takes around 200ms without cache.
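Because the Virtuoso timings are self-reported, one way to put both systems on equal footing is to time every query from the client side. The following is a minimal Python sketch of that idea, not the method used for the numbers above. `run_sparql`, `time_runs`, and the endpoint URL in the usage comment (Virtuoso's usual default, `http://localhost:8890/sparql`) are assumptions for illustration.

```python
import time
import urllib.parse
import urllib.request

def run_sparql(endpoint, query):
    """Send a SPARQL query over HTTP GET and return the raw response body."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(request) as response:
        return response.read()

def time_runs(run, repeats=5):
    """Call run() `repeats` times; return wall-clock durations in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings

# Usage (needs a running endpoint, so it is left commented out):
#   query = "SELECT ?s ?p ?o FROM <http://fu> WHERE { ?s ?p ?o } LIMIT 10"
#   timings = time_runs(lambda: run_sparql("http://localhost:8890/sparql", query))
#   print("first: %.0f ms, best of rest: %.0f ms" % (timings[0], min(timings[1:])))
```

Reporting the first run separately from the later ones keeps the cold-cache and warm-cache cases apart, which matches the 1295 / 331 / 200 msec progression above.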
There is still an optimization I can do to speed up nomunofu a little.
Happy hacking!
I'll be interested to hear your new results, with a current build,
and with proper INI tuning in place.
What will be the INI options I need to use? Thanks!
Regards,
Ted
Regards,
Amirouche ~ zig ~
https://hyper.dev