Dear Kingsley,
Let me start by saying that I appreciate the effort of loading the
complete Wikidata into a graph database and making a SPARQL endpoint
available. I know it is not an easy task.
I just tried out the new Virtuoso-hosted SPARQL endpoint with some
queries. My experiments are not exhaustive at all, but I wanted to
raise two concerns that I noticed.
Consider a (very simple) query that counts all humans:
'''
SELECT (count(?human) as ?c)
WHERE
{
?human wdt:P31 wd:Q5 .
}
'''
I get a result of 10396057, which is plausible considering the dataset
that you are using.
But if we try to export all instances of human (to a TSV file) with the
following query:
'''
SELECT ?human
WHERE
{
?human wdt:P31 wd:Q5 .
}
'''
Then I only get 100000 results. Is there a limit on the number of
results that a query can return?
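If such a cap is intended, I guess one could page through the results
with LIMIT and OFFSET. The following is only a sketch that I have not
run against the endpoint: the page size of 100000 is my assumption,
taken from the number of rows I got back, and without an ORDER BY the
pages are not guaranteed to be stable or disjoint. This would fetch the
second page:
'''
SELECT ?human
WHERE
{
?human wdt:P31 wd:Q5 .
}
LIMIT 100000
OFFSET 100000
'''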
Furthermore, if we want to get all humans ordered by id, then the
endpoint times out. The following is the query:
'''
SELECT ?human
WHERE
{
?human wdt:P31 wd:Q5 .
}
ORDER BY DESC(?human)
'''
Thank you again for all your efforts. I am looking forward to seeing how
this new endpoint works :)
Are you planning to update the dataset regularly?
All the best!
Larry
https://iccl.inf.tu-dresden.de/web/Larry_Gonzalez
On 11.01.23 21:51, Kingsley Idehen via Wikidata wrote:
> All,
>
> We are pleased to announce immediate availability of a new
> Virtuoso-hosted Wikidata instance based on the most recent datasets.
> This instance comprises 17 billion+ RDF triples.
>
> Host Machine Info:
>
> Item     Value
> CPU      2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
> Cores    24
> Memory   378 GB
> SSD      4x Crucial M4 SSD 500 GB
>
>
> Cloud related costs for a self-hosted variant, assuming:
>
> * dedicated machine for 1 year without upfront costs
> * 128 GiB memory
> * 16 cores or more
> * 512GB SSD for the database
> * 3T outgoing internet traffic (based on our DBpedia statistics)
>
>
> vendor  machine type  memory   vCPUs  monthly machine  monthly disk  monthly network  monthly total
> Amazon  r5a.4xlarge   128 GiB  16     $479.61          $55.96        $276.48          $812.05
> Google  e2highmem-16  128 GiB  16     $594.55          $95.74        $255.00          $945.30
> Azure   D32a          128 GiB  32     $769.16          $38.40        $252.30          $1,060.06
>
>
> SPARQL Query and Full Text Search service endpoints:
>
> * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services Endpoint
> * https://wikidata.demo.openlinksw.com/fct -- Faceted Search & Browsing
>
>
> Additional Information
>
> * Loading the Wikidata dataset 2022/12 into Virtuoso Open Source -
>   Announcements - OpenLink Software Community (openlinksw.com)
>   <https://community.openlinksw.com/t/loading-the-wikidata-dataset-2022-12-into-virtuoso-open-source/3580>
>
>
> Happy New Year!
>
> --
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Home Page: http://www.openlinksw.com
> Community Support: https://community.openlinksw.com
> Weblogs (Blogs):
> Company Blog: https://medium.com/openlink-software-blog
> Virtuoso Blog: https://medium.com/virtuoso-blog
> Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>
> Personal Weblogs (Blogs):
> Medium Blog: https://medium.com/@kidehen
> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>               http://kidehen.blogspot.com
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
>         : http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
>
> _______________________________________________
> Wikidata mailing list -- wikidata(a)lists.wikimedia.org
> Public archives at https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/me…
> To unsubscribe send an email to wikidata-leave(a)lists.wikimedia.org