Happy Earth Day all,
This is an update that the Wikimedia Foundation (WMF) Search team, which
currently owns and maintains Wikidata Query Service (WDQS), will be
lowering our WDQS update lag Service Level Objective (SLO): the target of
<10 minutes of update lag will change from 99% of the time to 95% of the
time.
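
To make the change concrete, here is a rough sketch of the arithmetic. It
assumes the SLO is evaluated over a 30-day window (an assumption taken from
the slo_period=30d parameter in the dashboard link, not a stated policy);
the function name is purely illustrative.

```python
def allowed_lag_minutes(target_percent: int, window_days: int = 30) -> float:
    """Minutes per window in which update lag may exceed the threshold.

    window_days=30 is an assumption, not a confirmed SLO definition.
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - target_percent) / 100


for target in (99, 95):
    budget = allowed_lag_minutes(target)
    print(f"{target}% target: {budget:.0f} minutes (~{budget / 60:.1f} hours) per 30 days")
```

Under that assumption, the time allowed above the 10 minute lag threshold
grows from about 7.2 hours to about 36 hours per 30 days.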
This change comes roughly 6 months after the release of the Streaming
Updater, which has significantly helped with WDQS update lag
<https://grafana.wikimedia.org/d/yCBd7Tdnk/wdqs-wcqs-lag-slo?orgId=1&from=now-6M&to=now&var-cluster_name=wdqs&var-lag_threshold=600&var-slo_period=30d>.
By itself, however, it cannot offset the impact of outages caused by
Blazegraph failures, memory issues, and other modes of failure related to
scaling challenges.
For users, lowering the SLO to 95% may mean more frequent and/or longer
downtime of WDQS before the Search team is able to address these issues. We
recognize that this is not ideal, but it reduces how often alarm bells go
off (on weekends) for our Site Reliability Engineers, who have many other
priorities, and gives the Search team more space to focus on addressing
long-term scaling issues rather than only putting out fires.
We appreciate your patience and understanding as we continue our work to
provide a functional, stable WDQS and to scale it for future resiliency.
Best,
Mike
—
*Mike Pham* (he/him)
Sr Product Manager, Search
Wikimedia Foundation <https://wikimediafoundation.org/>