Hi!

Now, obviously endpoints referenced in a federated query via a
SERVICE clause have to be open - so any attacker could send his
queries directly instead of squeezing them through some other
endpoint. The only scenario I can think of is that an attacker's IP
is already blocked by the attacked site. If (instead of much more
common ways to fake an IP) the attacker chose to do it by
federated queries through WDQS, this _could_ result in WDQS being
blocked by this endpoint.

This is not what we are concerned with. What we are concerned with is
that federation essentially requires you to run an open proxy - i.e. to
allow anybody to send requests to any URL. This is not acceptable to us
because somebody could abuse it both to try to access our internal
infrastructure and to launch attacks on other sites using our site as a
platform.

If there is enough demand, we could allow access to specific
whitelisted endpoints, but so far we haven't found any way to allow
access to arbitrary SPARQL endpoints without essentially allowing
anybody to open arbitrary network connections from our server.
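
To make the whitelist idea concrete, here is a rough sketch of such a check, assuming it runs before the query reaches the engine. It is not how WDQS is implemented: the host names, ALLOWED_SERVICE_HOSTS and is_query_allowed are made up for illustration, and a regex scan is only a stand-in for a real SPARQL parser. Anything that is not a literal <IRI> after SERVICE (a variable or a prefixed name) is simply refused here.

```python
import re
from urllib.parse import urlsplit

# Hypothetical whitelist; these hosts are placeholders, not a real configuration.
ALLOWED_SERVICE_HOSTS = {"sparql.example.org", "data.example.net"}

# Finds the token that follows SERVICE (optionally SERVICE SILENT).
# The lookbehind skips variables such as ?service and prefixed names.
SERVICE_TARGET = re.compile(
    r"(?<![?$:\w])SERVICE\b\s*(?:SILENT\b\s*)?(<[^>]*>|\S+)",
    re.IGNORECASE,
)


def is_query_allowed(query, allowed_hosts=ALLOWED_SERVICE_HOSTS):
    """Return True only if every SERVICE target is a whitelisted http(s) host."""
    for target in SERVICE_TARGET.findall(query):
        # Variables and prefixed names are not resolved in this sketch,
        # so they are rejected rather than guessed at.
        if not (target.startswith("<") and target.endswith(">")):
            return False
        try:
            parts = urlsplit(target[1:-1])
        except ValueError:
            return False
        if parts.scheme not in ("http", "https"):
            return False
        if parts.hostname not in allowed_hosts:
            return False
    return True
```

With this, a query containing SERVICE <https://sparql.example.org/sparql> passes, while SERVICE <http://10.0.0.5:8080/admin> or SERVICE ?endpoint is refused. A host-name check alone does not cover redirects or a whitelisted name resolving to an internal address, so it is only one piece of the problem.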

provide for the linked data cloud. This must not involve the
highly-protected production environment, but could be solved by an
additional unstable/experimental endpoint under another address.

The problem is that we cannot run a production-quality endpoint in a
non-production environment. We could set up an endpoint on Labs, but
it would be underpowered and we would not be able to guarantee any
quality of service there. To serve the volume of Wikidata data and
updates, the machines need certain hardware capabilities, which Labs
machines currently do not have.

Additionally, I'm not sure running an open proxy even there would be a
good idea. Unfortunately, on today's internet there is no lack of
players who would want to abuse such a thing for nefarious purposes.

We will keep looking for a solution to this, but so far we haven't
found one.

Thanks,
--
Stas Malyshev
smalyshev(a)wikimedia.org