With my tech evangelist hat on...

Google's philanthropy is nearly boundless when it comes to the promotion of knowledge.  Why? Because it's in their best interest: no one can prosper without knowledge.  They aggregate knowledge for the benefit of mankind, and then make a profit through advertising ... all while making that knowledge extremely easy for the world to find.

Nothing in this world is entirely free (servers must spin, cooling must be provided, bugs squashed...).
Google and others understand this and help defray the substantial costs of providing free knowledge in multiple domains, especially those that contribute to tech and human goodwill (science & medicine).
Sometimes it's with direct cash donations to WMF: just this year, Google employees decided to give $2 million to WMF!

Other times it's with talent from interns they pay for during the summer, or tech knowledge exchanges to help tackle problems we have.
Still other times it's just their 20% employee time helping the world keep Open Source libraries up to date or giving the world Open Source tools that we ourselves use across WMF every minute of the day.
Then there are all the trickle-down benefits (increasing privacy, REALLY?, yes Really!, reducing security risks, better performance, etc.) from those Open Source libraries & tools, with things like ClusterFuzz, TensorFlow, Go, Kubernetes, and thousands of others.

On Fri, Sep 20, 2019 at 5:25 AM Luca Martinelli <martinelliluca@gmail.com> wrote:
Hi Sebastian,

I'll try to take on some of your doubts, hopefully helping you to
solve them, or at least to give you some starting points.

On Fri, Sep 20, 2019 at 10:48 AM Sebastian Hellmann
<hellmann@informatik.uni-leipzig.de> wrote:
> 1. there was a Knowledge Engine Project which failed, but in principle had the right idea: https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)
> This was aimed to "democratize the discovery of media, news and information", in particular counter-moving the traffic sink by Google providing Wikipedia's information in Google Search.

I don't remember/know much about the Knowledge Engine (KE), but to
quote Liam Wyatt/User:Wittylama, "the crime wasn't thinking about it,
it was the cover-up".

In other words, and based on what I remember and know, the Wikipedia
internal search engine always sucked, and KE was a hypothesis for
solving this problem. The main problems were:
1) an overall impression - I repeat: IMPRESSION - that WMF was ready
to compete with Google on the "search engine market", something that
was never discussed within and/or with the community;
2) that this project was pushed in a very "secretive" way, i.e. it was
discovered by chance through an announcement of WMF winning a grant
from [I don't remember which institution, sorry], and the more
questions were raised about it, the fewer answers the then-Executive
Director seemed willing to give.

IMHO, having an internal engine that helps people get what they're
looking for is a great idea, and the way it was conducted was indeed a
crime, because (again IMHO) we lost a good opportunity to start our
work several years in advance. What still makes me angry is the way
the whole thing was conducted: we still lack most pieces of the
story, and this may fuel non-NPOV reconstructions as well as
unnecessary spin-off discussions that take us further away from the
solution we were trying to achieve.

> Now that there is Wikidata, this is much better for Google because they can take the CC-0 data as they wish.

KE and Wikidata are two separate issues. I'm sure Wikidata would have
played a role in KE, given its important role in linking concepts and
items, but they're still two separate things.

As for Google picking data from Wikidata, they do the same from
countless databases (regardless of their license), so all I can say
is that, if I were Google, I'd do the very same thing. The difference
between Google and Wikidata, and the reason why I still think Wikidata
is better, is that the latter releases its data to *everybody*, while
the former keeps it only to itself.

And I want to stress that "everybody" part: when we do synchronisation
with a GLAM database, we give them back extremely valuable feedback,
in terms of links to other databases they can freely access, as well
as in terms of hints for data clean-up - which, again, is something
that Google doesn't provide at all.

> 3. I was under the impression that Google bought Freebase and then started Wikidata as a non-threatening model to the data they have in their Knowledge Graph
>Could someone give me some pointers about the financial connections of Google and Wikimedia (this should be transparent, right?) and also who pushed the Wikidata movement into life in 2012?

Wikidata started as an independent project by some of the people who
worked on Semantic MediaWiki (there are so many of them I fear I might
miss some of them, and that would be embarrassing for me), not as a
Google project.

It was originally financed *also* by Google, yes, but it was a small
part compared to the aid from other institutions, such as the Allen
Institute for Artificial Intelligence, the Gordon and Betty Moore
Foundation, the Wikimedia Foundation itself, and others.

> Google was also mentioned in https://blog.wikimedia.org/2017/10/30/wikidata-fifth-birthday/ but while it reads "Freebase, was discontinued because of the superiority of Wikidata’s approach and active community." I know the story as: Google didn't want its competitors to have the data and the service. Not much of Freebase did end up in Wikidata.

I remember the story as "Google couldn't make any more money out of
Freebase, which was also being superseded by other internal systems
*and* Wikidata, so Denny pushed Google to donate Freebase's triples to

This is basically the same thing (well, keeping things in proportion)
that happened with OpenRefine, which was originally called Google
Refine and was discontinued because Google couldn't make any profit
from it, and is now one of the most valuable tools that we can use to
clean up and reconcile data with Wikidata.

As for the integration of the data, I don't have any precise figures
about it, but I'm sure that a fair part of Freebase did end up in
Wikidata, just as many other big databases did.

> As I said, I don't want to push any opinions in any directions. I am more asking for more information about the connection of Google to Wikidata (financially), then Google to WMF and also I am asking about any strategic advantages for Google in relation to their competition.

I cannot properly answer you about this. WMF and Google are in my view
"frenemies": Google is, and will always be, a Big Tech company and WMF
is, and will always be, a champion of free knowledge. You just can't
do free knowledge while forcing Big Tech companies to NOT pick up your
tools and data, though, just as I think it'd be pointless for us
to NOT take any help from Google, if we can work together on
several objectives. This is OK with me, as long as we keep being
transparent about it - which I recognise to be your point and the
motivation behind your email, so don't worry about it. ;)

I hope I helped you wrap your head around the whole thing. :)


Luca "Sannita" Martinelli

Wikidata mailing list