Wikidata Query Service update lag

List overview All Threads
Download

newer

older

Call for proposals for Knowledge...

UMAP 2020 - Second call for...

Guillaume Lederrey

14 Nov 2019 14 Nov '19

9:50 a.m.

Hello all! As you've probably noticed, the update lag on the public WDQS endpoint [1] is not doing well [2], with lag climbing to > 12h for some servers. We are tracking this on phabricator [3], subscribe to that task if you want to stay informed. To be perfectly honest, we don't have a good short term solution. The graph database that we are using at the moment (Blazegraph [4]) does not easily support sharding, so even throwing hardware at the problem isn't really an option. We are working on a few medium term improvements: * A dedicated updater service in Blazegraph, which should help increase the update throughput [5]. Finger crossed, this should be ready for initial deployment and testing by next week (no promise, we're doing the best we can). * Some improvement in the parallelism of the updater [6]. This has just been identified. While it will probably also provide some improvement in throughput, we haven't actually started working on that and we don't have any numbers at this point. Longer term: We are hiring a new team member to work on WDQS. It will take some time to get this person up to speed, but we should have more capacity to address the deeper issues of WDQS by January. The 2 main points we want to address are: * Finding a triple store that scales better than our current solution. * Better understand what are the use cases on WDQS and see if we can provide a technical solution that is better suited. Our intuition is that some of the use cases that require synchronous (or quasi synchronous) updates would be better implemented outside of a triple store. Honestly, we have no idea yet if this makes sense and what those alternate solutions might be. Thanks a lot for your patience during this tough time! Guillaume [1] https://query.wikidata.org/ [2] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am… [3] https://phabricator.wikimedia.org/T238229 [4] https://blazegraph.com/ [5] https://phabricator.wikimedia.org/T212826 [6] https://phabricator.wikimedia.org/T238045 -- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET

Attachments:

attachment.htm (text/html — 3.1 KB)

Show replies by date

fn＠imm.dtu.dk

14 Nov 14 Nov

10:10 a.m.

Besides waiting for the new updater, it may be useful to tell us, what we as users can do too. It is unclear to me what the problem is. For instance, at one point I was worried that the many parallel requests to the SPARQL endpoint that we make in Scholia is a problem. As far as I understand it is not a problem at all. Another issue could be the way that we use Magnus Manske's Quickstatements and approve bots for high frequency editing. Perhaps a better overview and constraints on large-scale editing could be discussed? Yet another thought is the large discrepancy between Virginia and Texas data centers as I could see on Grafana [1]. As far as I understand the hardware (and software) are the same. So why is there this large difference? Rather than editing or BlazeGraph, could the issue be some form of network issue? [1] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… /Finn On 14/11/2019 10:50, Guillaume Lederrey wrote:

...

Lucas Werkmeister

11:37 a.m.

As the Wikitech WDQS Hardware section [1] explains, “due to how we route traffic with GeoDNS, the primary cluster (usually eqiad) sees most of the traffic.” So the clusters may all have the same hardware, but one cluster sees most of the query load, so it has a harder time keeping up with updates (since the update load is mostly the same everywhere). Cheers, Lucas [1]: https://wikitech.wikimedia.org/wiki/Wikidata_query_service#Hardware On 14.11.19 11:10, fn(a)imm.dtu.dk wrote:

...

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

Guillaume Lederrey

1:31 p.m.

Thanks for the feedback! On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote:

...

To be (again) completely honest, we don't entirely understand the issue either. There are clearly multiple related issues. In high level terms, we have at least: * Some part of the update process on Blazegraph is CPU bound and single threaded. Even with low query load, if we have a high edit rate, Blazegraph can't keep up, and saturates a single CPU (with plenty of available resources on other CPUs). This is a hard issue to fix, requiring either splitting the processing over multiple CPU or sharding the data over multiple servers. Neither of which Blazegraph supports (at least not in our current configuration). * There is a race for resources between edits and queries: a high query load will impact the update rate. This could to some extent be mitigated by reducing the query load: if no one is using the service, it works great! Obviously that's not much of a solution. What you can do (short term): * Keep bots usage well behaved (don't do parallel queries, provide a meaningful user agent, smooth the load over time if possible, ...). As far as I can see, most usage are already well behaved. * Optimize your queries: better queries will use less resources, which should help. Time to completion is a good approximation of the resources used. I don't really have any more specific advice, SPARQL is not my area of expertise. What you can do (longer term): * Help us think out of the box. Can we identify higher level use cases? Could we implement some of our workflows on a higher level API than SPARQL, which might allow for more internal optimizations? * Help us better understand the constraints. Document use cases on [1]. Sadly, we don't have the bandwidth right now to engage meaningfully in this conversation. Feel free to send thoughts already, but don't expect any timely response. Yet another thought is the large discrepancy between Virginia and Texas

...

data centers as I could see on Grafana [1]. As far as I understand the hardware (and software) are the same. So why is there this large difference? Rather than editing or BlazeGraph, could the issue be some form of network issue?

As pointed out by Lucas, this is expected. Due to how our GeoDNS works, we see more traffic on eqiad than on codfw. Thanks for the help! Guillaume [1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Usage

...

[1] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… /Finn On 14/11/2019 10:50, Guillaume Lederrey wrote:

https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am…

[3] https://phabricator.wikimedia.org/T238229 [4] https://blazegraph.com/ [5] https://phabricator.wikimedia.org/T212826 [6] https://phabricator.wikimedia.org/T238045 -- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET _______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

-- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET

Thad Guidry

4:01 p.m.

Is the Write Retention Queue adequate? Is the branching factor for the lexicon indices too large, resulting in a non-linear slowdown in the write rate over tim? Did you look into Small Slot Optimization? Are the Write Cache Buffers adequate? Is there a lot of Heap pressure? Is the MemoryManager have the maximum amount of RAM it can handle? 4TB? Is the RWStore handling the recycling well? Is the SAIL Buffer Capacity adequate? Are you not using exact range counts where you could be using fast range counts? Start at the Hardware side first however. Is the disk activity for writes really low...and CPU is very high? You have identified a bottleneck in that case, discover WHY that would be the case looking into any of the above. and a 100+ other things that should be looked at that all affect WRITE performance during UPDATES. https://wiki.blazegraph.com/wiki/index.php/IOOptimization https://wiki.blazegraph.com/wiki/index.php/PerformanceOptimization I would also suggest you start monitoring some of the internals of Blazegraph (JAVA) while in production with tools such as XRebel or AppDynamics. Thad https://www.linkedin.com/in/thadguidry/ On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey <glederrey(a)wikimedia.org> wrote:

...

Thanks for the feedback! On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote:

[1] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… /Finn On 14/11/2019 10:50, Guillaume Lederrey wrote:

not

easily support sharding, so even throwing hardware at the problem isn't really an option. We are working on a few medium term improvements: * A dedicated updater service in Blazegraph, which should help increase the update throughput [5]. Finger crossed, this should be ready for initial deployment and testing by next week (no promise, we're doing

the

best we can). * Some improvement in the parallelism of the updater [6]. This has just been identified. While it will probably also provide some improvement

throughput, we haven't actually started working on that and we don't have any numbers at this point. Longer term: We are hiring a new team member to work on WDQS. It will take some time to get this person up to speed, but we should have more capacity to address the deeper issues of WDQS by January. The 2 main points we want to address are: * Finding a triple store that scales better than our current solution. * Better understand what are the use cases on WDQS and see if we can provide a technical solution that is better suited. Our intuition is that some of the use cases that require synchronous (or quasi synchronous) updates would be better implemented outside of a triple store. Honestly, we have no idea yet if this makes sense and what those alternate solutions might be. Thanks a lot for your patience during this tough time! Guillaume [1] https://query.wikidata.org/ [2]

https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am…

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

-- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET _______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

Guillaume Lederrey

7:45 p.m.

Hello! Thanks for the suggestions! On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

...

Start at the Hardware side first however.

...

Is the disk activity for writes really low...and CPU is very high? You have identified a bottleneck in that case, discover WHY that would be the case looking into any of the above.

Sounds like good questions, but outside of my area of expertise. I've created https://phabricator.wikimedia.org/T238362 to track it, and I'll see if someone can have a look. I know that we did multiple passes at tuning Blazegraph properties, with limited success so far.

...

and a 100+ other things that should be looked at that all affect WRITE performance during UPDATES. https://wiki.blazegraph.com/wiki/index.php/IOOptimization https://wiki.blazegraph.com/wiki/index.php/PerformanceOptimization I would also suggest you start monitoring some of the internals of Blazegraph (JAVA) while in production with tools such as XRebel or AppDynamics.

Both XRebel and AppDynamics are proprietary, so no way that we'll deploy them in our environment. We are tracking a few JMX based metrics, but so far, we don't really know what to look for. Thanks! Guillaume Thad

...

https://www.linkedin.com/in/thadguidry/ On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey < glederrey(a)wikimedia.org> wrote:

Thanks for the feedback! On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote:

[1] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… /Finn On 14/11/2019 10:50, Guillaume Lederrey wrote:

Hello all! As you've probably noticed, the update lag on the public WDQS endpoint [1] is not doing well [2], with lag climbing to > 12h for some

servers.

We are tracking this on phabricator [3], subscribe to that task if you want to stay informed. To be perfectly honest, we don't have a good short term solution. The graph database that we are using at the moment (Blazegraph [4]) does

not

easily support sharding, so even throwing hardware at the problem

isn't

really an option. We are working on a few medium term improvements: * A dedicated updater service in Blazegraph, which should help

increase

the update throughput [5]. Finger crossed, this should be ready for initial deployment and testing by next week (no promise, we're doing

the

best we can). * Some improvement in the parallelism of the updater [6]. This has

just

been identified. While it will probably also provide some improvement

throughput, we haven't actually started working on that and we don't have any numbers at this point. Longer term: We are hiring a new team member to work on WDQS. It will take some

time

to get this person up to speed, but we should have more capacity to address the deeper issues of WDQS by January. The 2 main points we want to address are: * Finding a triple store that scales better than our current solution. * Better understand what are the use cases on WDQS and see if we can provide a technical solution that is better suited. Our intuition is that some of the use cases that require synchronous (or quasi synchronous) updates would be better implemented outside of a triple store. Honestly, we have no idea yet if this makes sense and what

those

alternate solutions might be. Thanks a lot for your patience during this tough time! Guillaume [1] https://query.wikidata.org/ [2]

https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am…

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

-- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET

Thad Guidry

10:38 p.m.

...

Hello! Thanks for the suggestions! On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

Start at the Hardware side first however.

Is the disk activity for writes really low...and CPU is very high? You have identified a bottleneck in that case, discover WHY that would be the case looking into any of the above.

https://www.linkedin.com/in/thadguidry/ On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey < glederrey(a)wikimedia.org> wrote:

Thanks for the feedback! On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote:

[1] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… /Finn On 14/11/2019 10:50, Guillaume Lederrey wrote: > Hello all! > > As you've probably noticed, the update lag on the public WDQS endpoint > [1] is not doing well [2], with lag climbing to > 12h for some servers. > We are tracking this on phabricator [3], subscribe to that task if you > want to stay informed. > > To be perfectly honest, we don't have a good short term solution. The > graph database that we are using at the moment (Blazegraph [4]) does not > easily support sharding, so even throwing hardware at the problem isn't > really an option. > > We are working on a few medium term improvements: > > * A dedicated updater service in Blazegraph, which should help increase > the update throughput [5]. Finger crossed, this should be ready for > initial deployment and testing by next week (no promise, we're doing the > best we can). > * Some improvement in the parallelism of the updater [6]. This has just > been identified. While it will probably also provide some improvement in > throughput, we haven't actually started working on that and we don't > have any numbers at this point. > > Longer term: > > We are hiring a new team member to work on WDQS. It will take some time > to get this person up to speed, but we should have more capacity to > address the deeper issues of WDQS by January. > > The 2 main points we want to address are: > > * Finding a triple store that scales better than our current solution. > * Better understand what are the use cases on WDQS and see if we can > provide a technical solution that is better suited. Our intuition is > that some of the use cases that require synchronous (or quasi > synchronous) updates would be better implemented outside of a triple > store. Honestly, we have no idea yet if this makes sense and what those > alternate solutions might be. > > Thanks a lot for your patience during this tough time! > > Guillaume > > > [1] https://query.wikidata.org/ > [2] > https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am… > [3] https://phabricator.wikimedia.org/T238229 > [4] https://blazegraph.com/ > [5] https://phabricator.wikimedia.org/T212826 > [6] https://phabricator.wikimedia.org/T238045 > > -- > Guillaume Lederrey > Engineering Manager, Search Platform > Wikimedia Foundation > UTC+1 / CET > > _______________________________________________ > Wikidata mailing list > Wikidata(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikidata > _______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

Denny Vrandečić

11:48 p.m.

Just wondering, is there a way to let volunteers look into the issue? (I guess no because it would give potentially access to the query stream, but maybe the answer is more optimistic) On Thu, Nov 14, 2019 at 2:39 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

...

Hello! Thanks for the suggestions! On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

Start at the Hardware side first however.

Is the disk activity for writes really low...and CPU is very high? You have identified a bottleneck in that case, discover WHY that would be the case looking into any of the above.

https://www.linkedin.com/in/thadguidry/ On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey < glederrey(a)wikimedia.org> wrote:

Thanks for the feedback! On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote: > > Besides waiting for the new updater, it may be useful to tell us, what > we as users can do too. It is unclear to me what the problem is. For > instance, at one point I was worried that the many parallel requests > to > the SPARQL endpoint that we make in Scholia is a problem. As far as I > understand it is not a problem at all. Another issue could be the way > that we use Magnus Manske's Quickstatements and approve bots for high > frequency editing. Perhaps a better overview and constraints on > large-scale editing could be discussed? > To be (again) completely honest, we don't entirely understand the issue either. There are clearly multiple related issues. In high level terms, we have at least: * Some part of the update process on Blazegraph is CPU bound and single threaded. Even with low query load, if we have a high edit rate, Blazegraph can't keep up, and saturates a single CPU (with plenty of available resources on other CPUs). This is a hard issue to fix, requiring either splitting the processing over multiple CPU or sharding the data over multiple servers. Neither of which Blazegraph supports (at least not in our current configuration). * There is a race for resources between edits and queries: a high query load will impact the update rate. This could to some extent be mitigated by reducing the query load: if no one is using the service, it works great! Obviously that's not much of a solution. What you can do (short term): * Keep bots usage well behaved (don't do parallel queries, provide a meaningful user agent, smooth the load over time if possible, ...). As far as I can see, most usage are already well behaved. * Optimize your queries: better queries will use less resources, which should help. Time to completion is a good approximation of the resources used. I don't really have any more specific advice, SPARQL is not my area of expertise. What you can do (longer term): * Help us think out of the box. Can we identify higher level use cases? Could we implement some of our workflows on a higher level API than SPARQL, which might allow for more internal optimizations? * Help us better understand the constraints. Document use cases on [1]. Sadly, we don't have the bandwidth right now to engage meaningfully in this conversation. Feel free to send thoughts already, but don't expect any timely response. Yet another thought is the large discrepancy between Virginia and Texas > data centers as I could see on Grafana [1]. As far as I understand the > hardware (and software) are the same. So why is there this large > difference? Rather than editing or BlazeGraph, could the issue be some > form of network issue? > As pointed out by Lucas, this is expected. Due to how our GeoDNS works, we see more traffic on eqiad than on codfw. Thanks for the help! Guillaume [1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Usage > > > [1] > > https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… > > /Finn > > > > On 14/11/2019 10:50, Guillaume Lederrey wrote: > > Hello all! > > > > As you've probably noticed, the update lag on the public WDQS > endpoint > > [1] is not doing well [2], with lag climbing to > 12h for some > servers. > > We are tracking this on phabricator [3], subscribe to that task if > you > > want to stay informed. > > > > To be perfectly honest, we don't have a good short term solution. > The > > graph database that we are using at the moment (Blazegraph [4]) does > not > > easily support sharding, so even throwing hardware at the problem > isn't > > really an option. > > > > We are working on a few medium term improvements: > > > > * A dedicated updater service in Blazegraph, which should help > increase > > the update throughput [5]. Finger crossed, this should be ready for > > initial deployment and testing by next week (no promise, we're doing > the > > best we can). > > * Some improvement in the parallelism of the updater [6]. This has > just > > been identified. While it will probably also provide some > improvement in > > throughput, we haven't actually started working on that and we don't > > have any numbers at this point. > > > > Longer term: > > > > We are hiring a new team member to work on WDQS. It will take some > time > > to get this person up to speed, but we should have more capacity to > > address the deeper issues of WDQS by January. > > > > The 2 main points we want to address are: > > > > * Finding a triple store that scales better than our current > solution. > > * Better understand what are the use cases on WDQS and see if we can > > provide a technical solution that is better suited. Our intuition is > > that some of the use cases that require synchronous (or quasi > > synchronous) updates would be better implemented outside of a triple > > store. Honestly, we have no idea yet if this makes sense and what > those > > alternate solutions might be. > > > > Thanks a lot for your patience during this tough time! > > > > Guillaume > > > > > > [1] https://query.wikidata.org/ > > [2] > > > https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am… > > [3] https://phabricator.wikimedia.org/T238229 > > [4] https://blazegraph.com/ > > [5] https://phabricator.wikimedia.org/T212826 > > [6] https://phabricator.wikimedia.org/T238045 > > > > -- > > Guillaume Lederrey > > Engineering Manager, Search Platform > > Wikimedia Foundation > > UTC+1 / CET > > > > _______________________________________________ > > Wikidata mailing list > > Wikidata(a)lists.wikimedia.org > > https://lists.wikimedia.org/mailman/listinfo/wikidata > > > > _______________________________________________ > Wikidata mailing list > Wikidata(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikidata > -- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET _______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

Guillaume Lederrey

15 Nov 15 Nov

1:16 p.m.

On Fri, Nov 15, 2019 at 12:49 AM Denny Vrandečić <vrandecic(a)google.com> wrote:

...

Just wondering, is there a way to let volunteers look into the issue? (I guess no because it would give potentially access to the query stream, but maybe the answer is more optimistic)

There are ways, none of them easy. There are precedents for volunteers having access to our production environment. I'm not really sure what the process looks like. There is at least some NDA to sign and some vetting process. As you pointed out, this would give access to sensitive information, and to the ability to do great damage (power, responsibility and those kind of things). More realistically, we could provide more information for analysis. Heap dumps do contain private information, but thread dumps are pretty safe, so we could publish those. We would need to automate this on our side, but that might be an option. Of course, having access to limited information and no way to experiment on changes seriously limits the ability to investigate. I'll check with the team if that's something we are ready to invest in.

...

On Thu, Nov 14, 2019 at 2:39 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

Hello! Thanks for the suggestions! On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

Start at the Hardware side first however.

Is the disk activity for writes really low...and CPU is very high? You have identified a bottleneck in that case, discover WHY that would be the case looking into any of the above.

https://www.linkedin.com/in/thadguidry/ On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey < glederrey(a)wikimedia.org> wrote: > Thanks for the feedback! > > On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote: > >> >> Besides waiting for the new updater, it may be useful to tell us, >> what >> we as users can do too. It is unclear to me what the problem is. For >> instance, at one point I was worried that the many parallel requests >> to >> the SPARQL endpoint that we make in Scholia is a problem. As far as I >> understand it is not a problem at all. Another issue could be the way >> that we use Magnus Manske's Quickstatements and approve bots for high >> frequency editing. Perhaps a better overview and constraints on >> large-scale editing could be discussed? >> > > To be (again) completely honest, we don't entirely understand the > issue either. There are clearly multiple related issues. In high level > terms, we have at least: > > * Some part of the update process on Blazegraph is CPU bound and > single threaded. Even with low query load, if we have a high edit rate, > Blazegraph can't keep up, and saturates a single CPU (with plenty of > available resources on other CPUs). This is a hard issue to fix, requiring > either splitting the processing over multiple CPU or sharding the data over > multiple servers. Neither of which Blazegraph supports (at least not in our > current configuration). > * There is a race for resources between edits and queries: a high > query load will impact the update rate. This could to some extent be > mitigated by reducing the query load: if no one is using the service, it > works great! Obviously that's not much of a solution. > > What you can do (short term): > > * Keep bots usage well behaved (don't do parallel queries, provide a > meaningful user agent, smooth the load over time if possible, ...). As far > as I can see, most usage are already well behaved. > * Optimize your queries: better queries will use less resources, which > should help. Time to completion is a good approximation of the resources > used. I don't really have any more specific advice, SPARQL is not my area > of expertise. > > What you can do (longer term): > > * Help us think out of the box. Can we identify higher level use > cases? Could we implement some of our workflows on a higher level API than > SPARQL, which might allow for more internal optimizations? > * Help us better understand the constraints. Document use cases on [1]. > > Sadly, we don't have the bandwidth right now to engage meaningfully in > this conversation. Feel free to send thoughts already, but don't expect any > timely response. > > Yet another thought is the large discrepancy between Virginia and >> Texas >> data centers as I could see on Grafana [1]. As far as I understand >> the >> hardware (and software) are the same. So why is there this large >> difference? Rather than editing or BlazeGraph, could the issue be >> some >> form of network issue? >> > > As pointed out by Lucas, this is expected. Due to how our GeoDNS > works, we see more traffic on eqiad than on codfw. > > Thanks for the help! > > Guillaume > > [1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Usage > > > >> >> >> [1] >> >> https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… >> >> /Finn >> >> >> >> On 14/11/2019 10:50, Guillaume Lederrey wrote: >> > Hello all! >> > >> > As you've probably noticed, the update lag on the public WDQS >> endpoint >> > [1] is not doing well [2], with lag climbing to > 12h for some >> servers. >> > We are tracking this on phabricator [3], subscribe to that task if >> you >> > want to stay informed. >> > >> > To be perfectly honest, we don't have a good short term solution. >> The >> > graph database that we are using at the moment (Blazegraph [4]) >> does not >> > easily support sharding, so even throwing hardware at the problem >> isn't >> > really an option. >> > >> > We are working on a few medium term improvements: >> > >> > * A dedicated updater service in Blazegraph, which should help >> increase >> > the update throughput [5]. Finger crossed, this should be ready for >> > initial deployment and testing by next week (no promise, we're >> doing the >> > best we can). >> > * Some improvement in the parallelism of the updater [6]. This has >> just >> > been identified. While it will probably also provide some >> improvement in >> > throughput, we haven't actually started working on that and we >> don't >> > have any numbers at this point. >> > >> > Longer term: >> > >> > We are hiring a new team member to work on WDQS. It will take some >> time >> > to get this person up to speed, but we should have more capacity to >> > address the deeper issues of WDQS by January. >> > >> > The 2 main points we want to address are: >> > >> > * Finding a triple store that scales better than our current >> solution. >> > * Better understand what are the use cases on WDQS and see if we >> can >> > provide a technical solution that is better suited. Our intuition >> is >> > that some of the use cases that require synchronous (or quasi >> > synchronous) updates would be better implemented outside of a >> triple >> > store. Honestly, we have no idea yet if this makes sense and what >> those >> > alternate solutions might be. >> > >> > Thanks a lot for your patience during this tough time! >> > >> > Guillaume >> > >> > >> > [1] https://query.wikidata.org/ >> > [2] >> > >> https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am… >> > [3] https://phabricator.wikimedia.org/T238229 >> > [4] https://blazegraph.com/ >> > [5] https://phabricator.wikimedia.org/T212826 >> > [6] https://phabricator.wikimedia.org/T238045 >> > >> > -- >> > Guillaume Lederrey >> > Engineering Manager, Search Platform >> > Wikimedia Foundation >> > UTC+1 / CET >> > >> > _______________________________________________ >> > Wikidata mailing list >> > Wikidata(a)lists.wikimedia.org >> > https://lists.wikimedia.org/mailman/listinfo/wikidata >> > >> >> _______________________________________________ >> Wikidata mailing list >> Wikidata(a)lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/wikidata >> > > > -- > Guillaume Lederrey > Engineering Manager, Search Platform > Wikimedia Foundation > UTC+1 / CET > _______________________________________________ > Wikidata mailing list > Wikidata(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikidata > _______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

-- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET

Denny Vrandečić

18 Nov 18 Nov

5:53 p.m.

I don't know if there is actually someone who would be capable and have the time to do so, I just would hope there are such people - but it probably makes sense to check if there are actually volunteers before doing work to enable them :) On Fri, Nov 15, 2019 at 5:17 AM Guillaume Lederrey <glederrey(a)wikimedia.org> wrote:

...

On Fri, Nov 15, 2019 at 12:49 AM Denny Vrandečić <vrandecic(a)google.com> wrote:

Just wondering, is there a way to let volunteers look into the issue? (I guess no because it would give potentially access to the query stream, but maybe the answer is more optimistic)

On Thu, Nov 14, 2019 at 2:39 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

Hello! Thanks for the suggestions! On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry <thadguidry(a)gmail.com> wrote: > Is the Write Retention Queue adequate? > Is the branching factor for the lexicon indices too large, resulting > in a non-linear slowdown in the write rate over tim? > Did you look into Small Slot Optimization? > Are the Write Cache Buffers adequate? > Is there a lot of Heap pressure? > Is the MemoryManager have the maximum amount of RAM it can handle? > 4TB? > Is the RWStore handling the recycling well? > Is the SAIL Buffer Capacity adequate? > Are you not using exact range counts where you could be using fast > range counts? > > Start at the Hardware side first however. > Is the disk activity for writes really low...and CPU is very high? > You have identified a bottleneck in that case, discover WHY that would be > the case looking into any of the above. > Sounds like good questions, but outside of my area of expertise. I've created https://phabricator.wikimedia.org/T238362 to track it, and I'll see if someone can have a look. I know that we did multiple passes at tuning Blazegraph properties, with limited success so far. > and a 100+ other things that should be looked at that all affect WRITE > performance during UPDATES. > > https://wiki.blazegraph.com/wiki/index.php/IOOptimization > https://wiki.blazegraph.com/wiki/index.php/PerformanceOptimization > > I would also suggest you start monitoring some of the internals of > Blazegraph (JAVA) while in production with tools such as XRebel or > AppDynamics. > Both XRebel and AppDynamics are proprietary, so no way that we'll deploy them in our environment. We are tracking a few JMX based metrics, but so far, we don't really know what to look for. Thanks! Guillaume Thad > https://www.linkedin.com/in/thadguidry/ > > > On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey < > glederrey(a)wikimedia.org> wrote: > >> Thanks for the feedback! >> >> On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote: >> >>> >>> Besides waiting for the new updater, it may be useful to tell us, >>> what >>> we as users can do too. It is unclear to me what the problem is. For >>> instance, at one point I was worried that the many parallel requests >>> to >>> the SPARQL endpoint that we make in Scholia is a problem. As far as >>> I >>> understand it is not a problem at all. Another issue could be the >>> way >>> that we use Magnus Manske's Quickstatements and approve bots for >>> high >>> frequency editing. Perhaps a better overview and constraints on >>> large-scale editing could be discussed? >>> >> >> To be (again) completely honest, we don't entirely understand the >> issue either. There are clearly multiple related issues. In high level >> terms, we have at least: >> >> * Some part of the update process on Blazegraph is CPU bound and >> single threaded. Even with low query load, if we have a high edit rate, >> Blazegraph can't keep up, and saturates a single CPU (with plenty of >> available resources on other CPUs). This is a hard issue to fix, requiring >> either splitting the processing over multiple CPU or sharding the data over >> multiple servers. Neither of which Blazegraph supports (at least not in our >> current configuration). >> * There is a race for resources between edits and queries: a high >> query load will impact the update rate. This could to some extent be >> mitigated by reducing the query load: if no one is using the service, it >> works great! Obviously that's not much of a solution. >> >> What you can do (short term): >> >> * Keep bots usage well behaved (don't do parallel queries, provide a >> meaningful user agent, smooth the load over time if possible, ...). As far >> as I can see, most usage are already well behaved. >> * Optimize your queries: better queries will use less resources, >> which should help. Time to completion is a good approximation of the >> resources used. I don't really have any more specific advice, SPARQL is not >> my area of expertise. >> >> What you can do (longer term): >> >> * Help us think out of the box. Can we identify higher level use >> cases? Could we implement some of our workflows on a higher level API than >> SPARQL, which might allow for more internal optimizations? >> * Help us better understand the constraints. Document use cases on >> [1]. >> >> Sadly, we don't have the bandwidth right now to engage meaningfully >> in this conversation. Feel free to send thoughts already, but don't expect >> any timely response. >> >> Yet another thought is the large discrepancy between Virginia and >>> Texas >>> data centers as I could see on Grafana [1]. As far as I understand >>> the >>> hardware (and software) are the same. So why is there this large >>> difference? Rather than editing or BlazeGraph, could the issue be >>> some >>> form of network issue? >>> >> >> As pointed out by Lucas, this is expected. Due to how our GeoDNS >> works, we see more traffic on eqiad than on codfw. >> >> Thanks for the help! >> >> Guillaume >> >> [1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Usage >> >> >> >>> >>> >>> [1] >>> >>> https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… >>> >>> /Finn >>> >>> >>> >>> On 14/11/2019 10:50, Guillaume Lederrey wrote: >>> > Hello all! >>> > >>> > As you've probably noticed, the update lag on the public WDQS >>> endpoint >>> > [1] is not doing well [2], with lag climbing to > 12h for some >>> servers. >>> > We are tracking this on phabricator [3], subscribe to that task if >>> you >>> > want to stay informed. >>> > >>> > To be perfectly honest, we don't have a good short term solution. >>> The >>> > graph database that we are using at the moment (Blazegraph [4]) >>> does not >>> > easily support sharding, so even throwing hardware at the problem >>> isn't >>> > really an option. >>> > >>> > We are working on a few medium term improvements: >>> > >>> > * A dedicated updater service in Blazegraph, which should help >>> increase >>> > the update throughput [5]. Finger crossed, this should be ready >>> for >>> > initial deployment and testing by next week (no promise, we're >>> doing the >>> > best we can). >>> > * Some improvement in the parallelism of the updater [6]. This has >>> just >>> > been identified. While it will probably also provide some >>> improvement in >>> > throughput, we haven't actually started working on that and we >>> don't >>> > have any numbers at this point. >>> > >>> > Longer term: >>> > >>> > We are hiring a new team member to work on WDQS. It will take some >>> time >>> > to get this person up to speed, but we should have more capacity >>> to >>> > address the deeper issues of WDQS by January. >>> > >>> > The 2 main points we want to address are: >>> > >>> > * Finding a triple store that scales better than our current >>> solution. >>> > * Better understand what are the use cases on WDQS and see if we >>> can >>> > provide a technical solution that is better suited. Our intuition >>> is >>> > that some of the use cases that require synchronous (or quasi >>> > synchronous) updates would be better implemented outside of a >>> triple >>> > store. Honestly, we have no idea yet if this makes sense and what >>> those >>> > alternate solutions might be. >>> > >>> > Thanks a lot for your patience during this tough time! >>> > >>> > Guillaume >>> > >>> > >>> > [1] https://query.wikidata.org/ >>> > [2] >>> > >>> https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am… >>> > [3] https://phabricator.wikimedia.org/T238229 >>> > [4] https://blazegraph.com/ >>> > [5] https://phabricator.wikimedia.org/T212826 >>> > [6] https://phabricator.wikimedia.org/T238045 >>> > >>> > -- >>> > Guillaume Lederrey >>> > Engineering Manager, Search Platform >>> > Wikimedia Foundation >>> > UTC+1 / CET >>> > >>> > _______________________________________________ >>> > Wikidata mailing list >>> > Wikidata(a)lists.wikimedia.org >>> > https://lists.wikimedia.org/mailman/listinfo/wikidata >>> > >>> >>> _______________________________________________ >>> Wikidata mailing list >>> Wikidata(a)lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/wikidata >>> >> >> >> -- >> Guillaume Lederrey >> Engineering Manager, Search Platform >> Wikimedia Foundation >> UTC+1 / CET >> _______________________________________________ >> Wikidata mailing list >> Wikidata(a)lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/wikidata >> > _______________________________________________ > Wikidata mailing list > Wikidata(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikidata > -- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET _______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

Guillaume Lederrey

21 Nov 21 Nov

10:27 a.m.

Hello all! As you probably already know, the lag situation on WDQS is not improving as much as we'd like. Over the past week, we've managed to keep the lag mostly below 3 hours, but at the cost of a lot of manual work. And yes, we know that 3 hours of lag is already too much. Some updates on what we've been doing: * Testing of our new Merging Updater [1]. This did not go as planned. The throughput was worse than expected, and it was deleting more data than expected. We are investigating to see if this new updater has a bug, or if our previous updater was not cleaning up as much as it should (which would be good news)! We are investigating. * WMDE released a patch to expose the lag of WDQS through Wikidata [2]. This should allow edit bots to self throttle in case WDQS lag is climbing. * We are working on adding more parallelism to the updater [3]. Finger's crossed, this might help increase throughput a little bit. * We've moved one server from the internal WDQS cluster to the public one, to provide more resources. This has not had significant impact. We're looking at moving one of our test servers in production. It looks like throwing hardware at the problem isn't really working but we'll know for sure once we try. * Overall, we are still trying to figure out what's going on, adding some metrics, digging through the code and trying to make sense of all that. We are lacking knowledge and understanding, but we're learning. What's coming next: * We have a new team member starting December 9th. He will need to learn a lot before being effective, but a new set of eyes and one more brain on the problem will help in the medium term. * We are looking at ways to reduce / throttle the query load even more aggressively than we do at the moment. That could mean limiting the number of requests per second per user-agent/IP, or limiting the number of parallel requests or something else. * We will be looking at alternatives to Blazegraph. We need some quiet time (which has been seriously lacking lately) to do that. And we need a better understanding of the problem we are trying to solve to be able to make the right decision. In any case, this is something that will take time. What you can do to help: Honestly, not sure. Less query load on WDQS is likely to help, so if you have a bot, make sure the queries it makes are useful and optimized. I'll get back to you when we have something. Thanks all for your patience! Guillaume [1] https://phabricator.wikimedia.org/T231411 [2] https://phabricator.wikimedia.org/T221774 [3] https://phabricator.wikimedia.org/T238045 On Mon, Nov 18, 2019 at 6:54 PM Denny Vrandečić <vrandecic(a)google.com> wrote:

...

On Fri, Nov 15, 2019 at 12:49 AM Denny Vrandečić <vrandecic(a)google.com> wrote:

Just wondering, is there a way to let volunteers look into the issue? (I guess no because it would give potentially access to the query stream, but maybe the answer is more optimistic)

On Thu, Nov 14, 2019 at 2:39 PM Thad Guidry <thadguidry(a)gmail.com> wrote:

In the enterprise, most folks use either Java Mission Control, or just Java VisualVM profiler. Seeing sleeping Threads is often good to start with, and just taking a snapshot or even Heap Dump when things are really grinding slow would be useful, you can later share those snapshots/heap dump with the community or Java profiling experts to analyze later. https://visualvm.github.io/index.html Thad https://www.linkedin.com/in/thadguidry/ On Thu, Nov 14, 2019 at 1:46 PM Guillaume Lederrey < glederrey(a)wikimedia.org> wrote: > Hello! > > Thanks for the suggestions! > > On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry <thadguidry(a)gmail.com> > wrote: > >> Is the Write Retention Queue adequate? >> Is the branching factor for the lexicon indices too large, resulting >> in a non-linear slowdown in the write rate over tim? >> Did you look into Small Slot Optimization? >> Are the Write Cache Buffers adequate? >> Is there a lot of Heap pressure? >> Is the MemoryManager have the maximum amount of RAM it can handle? >> 4TB? >> Is the RWStore handling the recycling well? >> Is the SAIL Buffer Capacity adequate? >> Are you not using exact range counts where you could be using fast >> range counts? >> >> > Start at the Hardware side first however. >> Is the disk activity for writes really low...and CPU is very high? >> You have identified a bottleneck in that case, discover WHY that would be >> the case looking into any of the above. >> > > Sounds like good questions, but outside of my area of expertise. I've > created https://phabricator.wikimedia.org/T238362 to track it, and > I'll see if someone can have a look. I know that we did multiple passes at > tuning Blazegraph properties, with limited success so far. > > >> and a 100+ other things that should be looked at that all affect >> WRITE performance during UPDATES. >> >> https://wiki.blazegraph.com/wiki/index.php/IOOptimization >> https://wiki.blazegraph.com/wiki/index.php/PerformanceOptimization >> >> I would also suggest you start monitoring some of the internals of >> Blazegraph (JAVA) while in production with tools such as XRebel or >> AppDynamics. >> > > Both XRebel and AppDynamics are proprietary, so no way that we'll > deploy them in our environment. We are tracking a few JMX based metrics, > but so far, we don't really know what to look for. > > Thanks! > > Guillaume > > Thad >> https://www.linkedin.com/in/thadguidry/ >> >> >> On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey < >> glederrey(a)wikimedia.org> wrote: >> >>> Thanks for the feedback! >>> >>> On Thu, Nov 14, 2019 at 11:11 AM <fn(a)imm.dtu.dk> wrote: >>> >>>> >>>> Besides waiting for the new updater, it may be useful to tell us, >>>> what >>>> we as users can do too. It is unclear to me what the problem is. >>>> For >>>> instance, at one point I was worried that the many parallel >>>> requests to >>>> the SPARQL endpoint that we make in Scholia is a problem. As far as >>>> I >>>> understand it is not a problem at all. Another issue could be the >>>> way >>>> that we use Magnus Manske's Quickstatements and approve bots for >>>> high >>>> frequency editing. Perhaps a better overview and constraints on >>>> large-scale editing could be discussed? >>>> >>> >>> To be (again) completely honest, we don't entirely understand the >>> issue either. There are clearly multiple related issues. In high level >>> terms, we have at least: >>> >>> * Some part of the update process on Blazegraph is CPU bound and >>> single threaded. Even with low query load, if we have a high edit rate, >>> Blazegraph can't keep up, and saturates a single CPU (with plenty of >>> available resources on other CPUs). This is a hard issue to fix, requiring >>> either splitting the processing over multiple CPU or sharding the data over >>> multiple servers. Neither of which Blazegraph supports (at least not in our >>> current configuration). >>> * There is a race for resources between edits and queries: a high >>> query load will impact the update rate. This could to some extent be >>> mitigated by reducing the query load: if no one is using the service, it >>> works great! Obviously that's not much of a solution. >>> >>> What you can do (short term): >>> >>> * Keep bots usage well behaved (don't do parallel queries, provide a >>> meaningful user agent, smooth the load over time if possible, ...). As far >>> as I can see, most usage are already well behaved. >>> * Optimize your queries: better queries will use less resources, >>> which should help. Time to completion is a good approximation of the >>> resources used. I don't really have any more specific advice, SPARQL is not >>> my area of expertise. >>> >>> What you can do (longer term): >>> >>> * Help us think out of the box. Can we identify higher level use >>> cases? Could we implement some of our workflows on a higher level API than >>> SPARQL, which might allow for more internal optimizations? >>> * Help us better understand the constraints. Document use cases on >>> [1]. >>> >>> Sadly, we don't have the bandwidth right now to engage meaningfully >>> in this conversation. Feel free to send thoughts already, but don't expect >>> any timely response. >>> >>> Yet another thought is the large discrepancy between Virginia and >>>> Texas >>>> data centers as I could see on Grafana [1]. As far as I understand >>>> the >>>> hardware (and software) are the same. So why is there this large >>>> difference? Rather than editing or BlazeGraph, could the issue be >>>> some >>>> form of network issue? >>>> >>> >>> As pointed out by Lucas, this is expected. Due to how our GeoDNS >>> works, we see more traffic on eqiad than on codfw. >>> >>> Thanks for the help! >>> >>> Guillaume >>> >>> [1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Usage >>> >>> >>> >>>> >>>> >>>> [1] >>>> >>>> https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&… >>>> >>>> /Finn >>>> >>>> >>>> >>>> On 14/11/2019 10:50, Guillaume Lederrey wrote: >>>> > Hello all! >>>> > >>>> > As you've probably noticed, the update lag on the public WDQS >>>> endpoint >>>> > [1] is not doing well [2], with lag climbing to > 12h for some >>>> servers. >>>> > We are tracking this on phabricator [3], subscribe to that task >>>> if you >>>> > want to stay informed. >>>> > >>>> > To be perfectly honest, we don't have a good short term solution. >>>> The >>>> > graph database that we are using at the moment (Blazegraph [4]) >>>> does not >>>> > easily support sharding, so even throwing hardware at the problem >>>> isn't >>>> > really an option. >>>> > >>>> > We are working on a few medium term improvements: >>>> > >>>> > * A dedicated updater service in Blazegraph, which should help >>>> increase >>>> > the update throughput [5]. Finger crossed, this should be ready >>>> for >>>> > initial deployment and testing by next week (no promise, we're >>>> doing the >>>> > best we can). >>>> > * Some improvement in the parallelism of the updater [6]. This >>>> has just >>>> > been identified. While it will probably also provide some >>>> improvement in >>>> > throughput, we haven't actually started working on that and we >>>> don't >>>> > have any numbers at this point. >>>> > >>>> > Longer term: >>>> > >>>> > We are hiring a new team member to work on WDQS. It will take >>>> some time >>>> > to get this person up to speed, but we should have more capacity >>>> to >>>> > address the deeper issues of WDQS by January. >>>> > >>>> > The 2 main points we want to address are: >>>> > >>>> > * Finding a triple store that scales better than our current >>>> solution. >>>> > * Better understand what are the use cases on WDQS and see if we >>>> can >>>> > provide a technical solution that is better suited. Our intuition >>>> is >>>> > that some of the use cases that require synchronous (or quasi >>>> > synchronous) updates would be better implemented outside of a >>>> triple >>>> > store. Honestly, we have no idea yet if this makes sense and what >>>> those >>>> > alternate solutions might be. >>>> > >>>> > Thanks a lot for your patience during this tough time! >>>> > >>>> > Guillaume >>>> > >>>> > >>>> > [1] https://query.wikidata.org/ >>>> > [2] >>>> > >>>> https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&am… >>>> > [3] https://phabricator.wikimedia.org/T238229 >>>> > [4] https://blazegraph.com/ >>>> > [5] https://phabricator.wikimedia.org/T212826 >>>> > [6] https://phabricator.wikimedia.org/T238045 >>>> > >>>> > -- >>>> > Guillaume Lederrey >>>> > Engineering Manager, Search Platform >>>> > Wikimedia Foundation >>>> > UTC+1 / CET >>>> > >>>> > _______________________________________________ >>>> > Wikidata mailing list >>>> > Wikidata(a)lists.wikimedia.org >>>> > https://lists.wikimedia.org/mailman/listinfo/wikidata >>>> > >>>> >>>> _______________________________________________ >>>> Wikidata mailing list >>>> Wikidata(a)lists.wikimedia.org >>>> https://lists.wikimedia.org/mailman/listinfo/wikidata >>>> >>> >>> >>> -- >>> Guillaume Lederrey >>> Engineering Manager, Search Platform >>> Wikimedia Foundation >>> UTC+1 / CET >>> _______________________________________________ >>> Wikidata mailing list >>> Wikidata(a)lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/wikidata >>> >> _______________________________________________ >> Wikidata mailing list >> Wikidata(a)lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/wikidata >> > > > -- > Guillaume Lederrey > Engineering Manager, Search Platform > Wikimedia Foundation > UTC+1 / CET > _______________________________________________ > Wikidata mailing list > Wikidata(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikidata > _______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

_______________________________________________ Wikidata mailing list Wikidata(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata

-- Guillaume Lederrey Engineering Manager, Search Platform Wikimedia Foundation UTC+1 / CET

1632

days inactive

1639

days old

wikidata@lists.wikimedia.org

Manage subscription

10 comments

5 participants

tags (0)

participants (5)

Denny Vrandečić
fn＠imm.dtu.dk
Guillaume Lederrey
Lucas Werkmeister
Thad Guidry