Hi all,

As many of you already know, one of the Search team’s priorities this year is scaling Wikidata Query Service (WDQS). Specifically, this conversation has centered around the need to move off of the Blazegraph backend that WDQS currently uses.

As part of this process, we want to get input/feedback from our community of users, and better understand some of the use cases and needs you have. As mentioned in our Jan 2022 scaling update, Andrea Westerinen has joined our team as a Contract Graph Consultant, and this provides an opportunity to meet her (and others on the WMF Search team working on WDQS) and give us direct feedback about your needs.

There will be two feedback sessions (more information on each session below) that you are welcome and encouraged to join:

  1. WDQS scaling community meeting 1/2: SPARQL query features
  2. WDQS scaling community meeting 2/2: RDF store backend needs

The purpose of these meetings is primarily to give us a chance to meet each other, and to gather requirements and use cases around WDQS. While this information will be used to plan future scaling, no decisions will be made during the meetings themselves.

While we have a rough outline of the topics we intend to cover in each meeting, we also welcome relevant feedback that may not be covered below, though we encourage and prioritize ideas that are also valuable to others. We ask that you please be mindful of allowing others to express their thoughts and perspectives, and helping facilitate a constructive conversation.

As always, thanks for your time, energy and patience, and we look forward to seeing you in a couple of weeks!

Best,

Mike


Meeting details

WDQS scaling community meeting 1/2: SPARQL query features

SPARQL is a powerful query language, and the SPARQL endpoint is how users access information on Wikidata. The flexibility and power of SPARQL also make it possible for WDQS to be strained by complex or computationally expensive queries, affecting all users. In considering how to balance the usability of SPARQL against limitations on it that can help service reliability, we want to better understand which SPARQL features you use most frequently, which are most important to you, and how often you use them.

The following list of features indicates most of the SPARQL features of interest, but is not exhaustive, and anything else that comes to mind is also valuable:

WDQS scaling community meeting 2/2: RDF store backend needs

In addition to SPARQL query features, we are interested in knowing more about what functionality is important to you from an RDF store and SPARQL endpoint. For example, many of you reported in the August 2021 WDQS user survey that the 60 second timeout limit was a top priority. This meeting will be about discussing how scaling the backend engineering of WDQS can be most valuable to your interests and needs. Other possible topics (non-exhaustive) may include:





Mike Pham (he/him)
Sr Product Manager, Search
Wikimedia Foundation