On 8/18/21 5:07 PM, Mike Pham wrote:
Wikidata community members,
Thank you for all of your work helping Wikidata grow and improve over the years. In the spirit of better communication, we would like to take this opportunity to share some of the current challenges Wikidata Query Service (WDQS) is facing, and some strategies we have for dealing with them.
WDQS currently risks failing to provide acceptable service quality due to the following reasons:
Graph size. WDQS uses Blazegraph as our graph backend. While Blazegraph can theoretically support 50 billion edges, in practice Wikidata (~13 billion triples) is the largest graph we know of running on Blazegraph, and there is a risk that we will reach the limit of what it can realistically support. Once Blazegraph is maxed out, WDQS can no longer be updated. This will also break Wikidata tools that rely on WDQS.
Software support. Blazegraph is end-of-life software that is no longer actively maintained, making it an unsustainable backend in the long term.
Blazegraph maxing out in size poses the greatest risk of catastrophic failure, as it would effectively prevent WDQS from being updated, causing it to inevitably fall out of date. Our long-term strategy to address this is to move to a new graph backend that best meets our WDQS needs and is actively maintained, and to begin migrating off Blazegraph as soon as a viable alternative is identified.
Do bear in mind that both before and after the selection of Blazegraph for Wikidata, we've always offered an RDF-based DBMS that can handle Wikidata's current and future requirements, just as we do for DBpedia.
At the time of our first rendezvous, handling 50 billion triples would typically have required our Cluster Edition, which is a commercial-only offering -- basically, that was the deal-breaker back then.
Anyway, in recent times, our Open Source Edition has evolved to handle some 80 billion+ triples (exemplified by the live UniProt instance), where performance and scale are primarily a function of available memory.
I hope this helps.
 https://wikidata.demo.openlinksw.com/sparql -- Our Live Wikidata SPARQL Query Endpoint
 https://docs.google.com/spreadsheets/d/15AXnxMgKyCvLPil_QeGC0DiXOP-Hu8Ln97fZ683ZQF0/edit#gid=0 -- Google Spreadsheet about various Virtuoso Configurations associated with some well-known public endpoints
 https://t.co/EjAAO73wwE -- this query doesn't complete with the current Blazegraph-based Wikidata endpoint
 https://t.co/GTATPPJNBI -- same query completing when applied to the Virtuoso-based endpoint
 https://t.co/X7mLmcYC69 -- about loading Wikidata's datasets into a Virtuoso instance
 https://twitter.com/search?q=%23Wikidata%20%23VirtuosoRDBMS%20%40kidehen&src=typed_query&f=live -- various demos shared via Twitter over the years regarding Wikidata
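For anyone who wants to script against endpoints like the ones linked above, here is a minimal sketch of speaking the standard SPARQL 1.1 Protocol from the Python standard library. The triple-counting query, the function names, and the parsing helper are illustrative assumptions on my part, not anything from the mail itself; the results parsing assumes the standard SPARQL JSON results format.

```python
# Minimal sketch of the SPARQL 1.1 Protocol over HTTP, stdlib only.
# The query and helper names below are hypothetical, for illustration.
import json
import urllib.parse
import urllib.request

# Counts all triples in the store -- the "graph size" metric discussed above.
COUNT_QUERY = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_count(results_json: str) -> int:
    """Pull the ?triples value out of a SPARQL JSON results document."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return int(bindings[0]["triples"]["value"])
```

Sending the built request with `urllib.request.urlopen` against any SPARQL 1.1-compliant endpoint (Virtuoso or Blazegraph alike) and passing the response body to `parse_count` would give the triple count, subject to the endpoint's query timeout.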
--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
  Company Blog: https://medium.com/openlink-software-blog
  Virtuoso Blog: https://medium.com/virtuoso-blog
  Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
Personal Weblogs (Blogs):
  Medium Blog: https://medium.com/@kidehen
  Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
                http://kidehen.blogspot.com
Profile Pages:
  Pinterest: https://www.pinterest.com/kidehen/
  Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
  Twitter: https://twitter.com/kidehen
  Google+: https://plus.google.com/+KingsleyIdehen/about
  LinkedIn: http://www.linkedin.com/in/kidehen
Web Identities (WebID):
  Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
          : http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
Wikidata mailing list -- email@example.com
To unsubscribe send an email to firstname.lastname@example.org