Re: [Wikidata] WDQS status

9 Jul 2020

On Thu, Jul 9, 2020 at 3:35 PM Gerard Meijssen <gerard.meijssen(a)gmail.com>
wrote:

...
  Hoi,
 Is this different from Special:MediaSearch ??

I'm assuming that you are asking if the new WCQS is different from the
Special:MediaSearch prototype [1].

And yes, it is quite different. WCQS is a low-level SPARQL interface,
oriented toward power users and tools, allowing federation with WDQS and
the Wikidata dataset. Special:MediaSearch is a higher-level search
interface, backed by Elasticsearch. It uses the same underlying data,
but in a very different way.
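
To make "federation with WDQS" a bit more concrete, here is a minimal
sketch of what a federated query sent to WCQS could look like. It only
builds the query text and sends nothing. The function name is hypothetical,
the WDQS endpoint URL is the public one and is not taken from this thread,
and the sketch assumes that "depicts" statements on Commons files are
exposed as wdt:P180, which should be checked against the actual dataset.

```python
# Public WDQS SPARQL endpoint (assumption: not stated in this thread).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_federated_query(wdqs_endpoint=WDQS_ENDPOINT, limit=10):
    """Return SPARQL text that joins Commons data with labels from WDQS.

    Hypothetical helper: the outer pattern would run on WCQS, the
    SERVICE block is delegated to the remote WDQS endpoint.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"""PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?file ?depicted ?label WHERE {{
  # Evaluated locally: files and the items they depict (assumed wdt:P180).
  ?file wdt:P180 ?depicted .
  # Evaluated remotely on WDQS: English labels of the depicted items.
  SERVICE <{wdqs_endpoint}> {{
    ?depicted rdfs:label ?label .
    FILTER(LANG(?label) = "en")
  }}
}}
LIMIT {limit}"""


if __name__ == "__main__":
    print(build_federated_query(limit=5))
```

The SERVICE clause is the standard SPARQL 1.1 federation mechanism; which
remote endpoints WCQS accepts is up to the service configuration.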

Somewhat unrelated: we are also planning some work on Special:MediaSearch
to better integrate it with our current search infrastructure [2].

[1] https://commons.wikimedia.org/wiki/Special:MediaSearch
[2] https://phabricator.wikimedia.org/T257043

Thanks,
...
        GerardM

 On Thu, 9 Jul 2020 at 15:23, Guillaume Lederrey <glederrey(a)wikimedia.org>
 wrote:

  Hello all!

 The Search Platform team will join the Wikidata office hours on July 21st
 16:00 UTC [1]. We are looking forward to discussing Wikidata Query Service
 and anything else you might find of interest.

 We've been hard at work on Wikimedia Commons Query Service (WCQS) [2].
 This will be a SPARQL endpoint similar to WDQS, but serving the Structured
 Data on Commons dataset. Our goal is to open a beta service, hosted on
 Wikimedia Cloud Service (WMCS) by the end of July. The service will require
 an account on Commons for authentication and will allow federation with
 WDQS. We don't have a streaming update process ready yet, so for a start
 the data will be reloaded from Commons dumps weekly.

 As part of that work, the dumps for Structured Data on Commons are now
 available [3]. Note that the prefix used in the TTL dumps is "wd", which
 does not make much sense. We are working with WMDE on renaming the
 prefixes, but this is more complex than expected since "wd" is hardcoded in
 more places than it should be. Those prefixes should only be valid in the
 local context of the dumps, so renaming them is technically a non-breaking
 change. That being said, if you start using those dumps, make sure you
 don't rely on this prefix, or that you are ready for a rename [4].
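
One way to avoid relying on the prefix is to expand prefixed names to full
IRIs using the @prefix declarations found in the dump itself, rather than
hard-coding "wd". The sketch below is a minimal illustration with the
standard library only: the function names are hypothetical, the sample
IRIs use example.org and are not the real dump namespaces, and a real
consumer would more likely use a full Turtle parser.

```python
import re

# Matches Turtle prefix declarations such as: @prefix wd: <https://...> .
PREFIX_RE = re.compile(r"^\s*@prefix\s+([A-Za-z][\w-]*)?:\s*<([^>]*)>\s*\.\s*$")


def read_prefixes(lines):
    """Collect prefix -> namespace IRI from the declarations in the dump."""
    prefixes = {}
    for line in lines:
        match = PREFIX_RE.match(line)
        if match:
            prefixes[match.group(1) or ""] = match.group(2)
    return prefixes


def expand(prefixed_name, prefixes):
    """Turn a prefixed name like 'wd:M42' into a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


# Illustrative data only: same namespace, declared under two prefix names.
before_rename = ["@prefix wd: <https://example.org/entity/> ."]
after_rename = ["@prefix sdc: <https://example.org/entity/> ."]

iri_before = expand("wd:M42", read_prefixes(before_rename))
iri_after = expand("sdc:M42", read_prefixes(after_rename))
# Code that compares full IRIs sees no difference after the rename.
assert iri_before == iri_after
```

Working with full IRIs keeps downstream code unaffected by a rename, since
only the local shorthand in the dump changes.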

 We are planning to dig more into the data we have to get a better
 understanding of the use cases around WDQS [5] (not much content on that
 task yet, but it is coming). Some very preliminary analysis indicates that
 fewer than 2% of the queries on WDQS generate more than 90% of the load.
 This is definitely something we need to better understand. We will be
 working on defining the kind of questions we need to answer, and improving
 our data collection to be able to answer those questions.
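
As an illustration of the kind of measurement behind a statement like
"2% of the queries generate 90% of the load", here is a small sketch that
computes the smallest share of queries accounting for a given share of
total cost. The function name, the notion of a per-query "cost" and the
sample numbers are all hypothetical; this is not the analysis that was
actually run.

```python
def share_of_queries_for_load(costs, load_fraction=0.9):
    """Smallest fraction of queries whose summed cost reaches load_fraction
    of the total cost, counting the most expensive queries first."""
    if not costs:
        raise ValueError("costs must not be empty")
    total = sum(costs)
    if total <= 0:
        raise ValueError("total cost must be positive")
    target = load_fraction * total
    running = 0.0
    for count, cost in enumerate(sorted(costs, reverse=True), start=1):
        running += cost
        if running >= target:
            return count / len(costs)
    return 1.0


# Synthetic example: 2 heavy queries and 98 light ones (made-up costs).
sample_costs = [450, 450] + [1] * 98
print(share_of_queries_for_load(sample_costs))  # 0.02
```

Which cost metric to use (execution time, CPU, memory) is exactly the sort
of question the data collection work would need to settle.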

 We have started an internal discussion around "planning for disaster"
 [6]. We want to better understand the potential failure scenarios around
 WDQS and have a plan if that worst case does happen. This will include some
 analytics work and some testing to better understand the constraints and
 what degraded mode we might still be able to provide in case of
 catastrophic failure.

 Thanks for reading!

    Guillaume

 [1] https://www.wikidata.org/wiki/Wikidata:Events#Office_hours
 [2] https://phabricator.wikimedia.org/T251488
 [3] https://dumps.wikimedia.org/other/wikibase/commonswiki/
 [4]
 https://dumps.wikimedia.org/other/wikibase/commonswiki/README_commonsrdfdum…
 [5] https://phabricator.wikimedia.org/T257045
 [6] https://phabricator.wikimedia.org/T257055

 --
 Guillaume Lederrey
 Engineering Manager, Search Platform
 Wikimedia Foundation
 UTC+1 / CET
 _______________________________________________
 Wikidata mailing list
 Wikidata(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata

-- 
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET
