Hi Erik,
Nice to hear from you.
On Tue, Oct 3, 2017 at 11:48 PM, Erik Moeller <eloquence(a)gmail.com> wrote:
> The power of an open, nonprofit approach to "knowledge as a service"
> is precisely to democratize access to knowledge graph information: to
> make it available to nonprofits, public institutions, communities,
> individuals. This includes projects like the "Structured Data for
> Wikimedia Commons" effort, which is a potential game-changer for
> institutions like galleries, libraries, archives and museums.
> Nor is such an approach inherently monopolistic: quite the opposite.
> Wikidata is well-suited for a certain class of data-related problems
> but not so much for others. Everything around Wikidata is evolving in
> the direction of federation: federated queries across multiple open
> datasets, federated installations of the Wikibase software, and so on.
> If anything, it seems likely that a greater emphasis on "knowledge as
> a service" will unavoidably decentralize influence and control, and
> bring knowledge from other knowledge providers into the Wikimedia
> context.
... and it will all become one free mush everyone copies to make a buck. We
are already in a situation today where anyone asking Siri, the Amazon Echo,
Google or Bing about a topic is likely to get the same answer from all of
them, because they all import Wikimedia content, which comes free of
charge. I find that worrying, because as an information delivery system,
it’s not robust. You change one source, and all the other sources change as
well. That's a huge vulnerability. No one looking at the system as a whole
would design it that way.
Internet manipulation is a big topic in the news these days. We have
millions of people in the United States and UK wondering whether
sophisticated, targeted online manipulation put Trump into the White House
and took Britain out of the EU.[1] The same people that once expressed
unadulterated optimism about the Internet’s effect on the world, believing
it would democratise and decentralise everything (a related Berners-Lee
statement is quoted approvingly in the draft Appendix[2]), are now sounding
alarms that the Internet has opened new and far more insidious avenues of
influence, among them targeted ads and viral lies.[3]
If Wikimedia content does come to play the essential role envisaged, anyone
with a vested interest will have a powerful motive to try and subvert this
knowledge base, using the most sophisticated SEO, AI, cyberattack and
socio-political methods known today or yet to be imagined. Do we really
expect that Wikimedia will somehow be immune to such attacks? Do we expect
that volunteers will be able to keep up with this in real time?
The draft Appendix states that "In a world where some try to limit,
control, or manipulate information, we seek to be a beacon of facts,
openness, and good faith". No one can criticise such aspirations. But this
upbeat and self-flattering message ignores that on its present scale,
Wikimedia content has already been demonstrated to be politically
corruptible, serving as a handy and welcome tool in the hands of precisely
those who do seek to "limit, control or manipulate information."[4][5][6]
Even if we agree on nothing else, and you choose to be a blue-eyed optimist
and I a jaundiced pessimist, we should be able to agree that an openly
editable online database underpinning the content delivery of literally
more AI tools and digital assistants than there are people on the planet[7]
will be a sitting duck for bad-faith actors, from conflicted editors,
political factions and SEO experts to government-sponsored hackers, and
that there will be challenges to be faced and prepared for.
Speaking about AI development, Elon Musk warned earlier this year that
people will sometimes "get so engrossed in their work that they don’t
really realize the ramifications of what they’re doing"[8] and that even
with the best intentions, it's perfectly possible to "produce something
evil by accident."[9] He's right.
People get carried away by new technological possibilities, and fail to
look at potential downsides of what they are doing. They’re not always
obvious. I mean, take Facebook. Millions of people flocked to the free
platform, using it as a welcome means to stay in touch with friends and
family. Nobody in their wildest dreams would have thought that their
participation in that trend, just so they could keep up with cousin Pete
and reconnect with old school friends, might one day undermine democracy.
Yet that is exactly what is being investigated now.[1] As we speak,
Congress and the Senate Intelligence Committee are still trying to find out
from Facebook, Twitter and Google exactly what happened.[10] Meanwhile,
Trump is in power. Whatever the eventual findings, these very public
discussions and worries should make clear that successful, well-timed
manipulation of content delivered automatically by AI tools to vast numbers
of people can have staggering global consequences that removal of corrupted
content after the event won't undo.
Life teaches that every action has unforeseen consequences, and that the
path to hell is paved with the best intentions. Free online services seemed
like a wonderful thing. It’s taken us years to figure out that there are
new and unexpected prices to be paid.
I would have loved to have seen a risk assessment attached to this
strategic direction, along with an open discussion of potential negative
impacts on humanity that might result from a system where one knowledge
service provider has such a global impact. Knowing that monocultures are
inherently more unstable and more easily corrupted than pluralist systems,
what are the worst things that could happen? What sort of fail-safes and
redundancies would make the overall system less vulnerable? There is still
time to do that work, I guess, and I’d suggest it would be work worth doing,
and worth consulting a broad range of experts over.
> I had no involvement with this document and don't know what focusing
> on "knowledge as a service" really will mean in practice. But if it
> means things like improving Wikidata, building better APIs and content
> formats, building better Labs^WCloud infrastructure, then the crucial
> point is not that companies may benefit from such work, but that
> _everybody else does, too_. And that is what distinguishes it from the
> prevailing extract-and-monetize paradigm. For-profits exploiting free
> knowledge projects for commercial gain? That's the _current state_. To
> change it, we have to make it easier to replicate what they are doing:
> through open data, open APIs, open code.
You didn't really address my social justice argument. This is a much more
parochial concern, and your perspective is bound to be different, as you
personally have profited handsomely from your involvement in Wikimedia, but
is it just that some of the world's most profitable companies earn billions
from volunteers' work, gaining political power in the process, while
volunteers actually pay to go online and access or purchase the sources
they need to do their work? Yes or no?
As I've mentioned before, Google has a full digital copy somewhere on its
servers of pretty much any source any Wikimedian might ever want to access.
When the WMF talks to Google, I'd really like them to inquire, for once,
what Google could do for volunteers, rather than what volunteers could do
for Google.
Best,
Andreas
[1] https://www.theguardian.com/technology/2017/may/07/the-great-british-brexit…
[2] https://meta.wikimedia.org/w/index.php?title=Strategy/Wikimedia_movement/20…
    citing https://www.independent.co.uk/life-style/gadgets-and-tech/news/25-years-of-…
[3] http://www.bbc.co.uk/news/technology-39246810
[4] https://www.dailydot.com/layer8/croatian-wikipedia-fascist-takeover-controv…
[5] http://www.eurasianet.org/node/72831
[6] https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-10-07/Op-ed
[7] https://www.cnet.com/uk/news/digital-assistants-to-surpass-global-populatio…
[8] http://www.newsweek.com/elon-musk-world-government-summit-556211
[9] https://www.vanityfair.com/news/2017/03/elon-musk-billion-dollar-crusade-to…
[10] http://money.cnn.com/2017/10/01/media/facebook-russia-ads-congress/index.ht…