I'll try to address some of your doubts, hopefully helping you to
resolve them, or at least giving you some starting points.
On Fri, 20 Sep 2019 at 10:48, Sebastian Hellmann
<hellmann(a)informatik.uni-leipzig.de> wrote:
1. there was a Knowledge Engine Project which failed,
but in principle had the right idea:
This was aimed to "democratize the discovery of media, news and information",
in particular counter-moving the traffic sink by Google providing Wikipedia's
information in Google Search.
I don't remember/know much about the Knowledge Engine (KE), but to
quote Liam Wyatt/User:Wittylama, "the crime wasn't thinking about it,
it was the cover-up".
In other words, and based on what I remember and know, the Wikipedia
internal search engine always sucked, and KE was a hypothesis for
solving this problem. The main problems were:
1) an overall sensation - I repeat: SENSATION - that WMF was ready to
compete with Google on the "search engine market", something that was
never discussed within and/or with the community;
2) that this project was pushed in a very "secretive" way, i.e. it was
discovered by chance with an announcement of WMF winning a grant from
[I don't remember which institution, sorry], and the more questions
were raised about it, the fewer answers the then-Executive Director
seemed to be willing to give.
IMHO, having an internal engine that helps people get what they're
looking for is a great idea, and the way it was conducted was indeed a
crime, because (again IMHO) we lost a good opportunity to start our
work several years in advance. What still makes me angry about it is
the way the whole thing was handled: we still lack most pieces of
the whole thing, and this may fuel non-NPOV reconstructions as well as
unnecessary spin-off discussions that bring us further away from the
solution we were trying to achieve.
Now that there is Wikidata, this is much better for
Google because they can take the CC-0 data as they wish.
KE and Wikidata are two separate issues. I'm sure Wikidata would have
played a role in KE, given its important role in linking concepts and
items, but they're still two separate things.
As for Google picking data from Wikidata, they do the same from
countless databases (regardless of their license), so all I can say
is that, if I were Google, I'd do the very same thing. The difference
between Google and Wikidata, and the reason why I still think Wikidata
is better, is that the latter releases its data to *everybody*, while
the former keeps it only to itself.
And I want to stress that "everybody" part: when we do synchronisation
with a GLAM database, we give them back extremely valuable
feedback, in terms of links to other databases they can freely access,
as well as in terms of hints for data clean-up - which, again, is
something that Google doesn't provide at all.
3. I was under the impression that Google bought
Freebase and then started Wikidata as a non-threatening model to the data they have in
their Knowledge Graph
Could someone give me some pointers about the financial connections of Google and
Wikimedia (this should be transparent, right?) and also who pushed the Wikidata movement
into life in 2012?
Wikidata started as an independent project by some of the people who
worked on Semantic MediaWiki (there are so many of them I fear I might
miss some of them, and that would be embarrassing for me), not as a
It was originally financed *also* by Google, yes, but it was a small
part compared to the aid from other institutions, such as the Allen
Institute for Artificial Intelligence, the Gordon and Betty Moore
Foundation, the Wikimedia Foundation itself, and others.
Google was also mentioned in
but while it reads
"Freebase, was discontinued because of the superiority of Wikidata’s approach and
active community." I know the story as: Google didn't want its competitors to
have the data and the service. Not much of Freebase did end up in Wikidata.
I remember the story as "Google couldn't make any more money out of
Freebase, which was also being superseded by other internal systems
*and* Wikidata, so Denny pushed Google to donate Freebase's triples to
This is basically the same thing (well, with due proportions) that
happened with OpenRefine, which was originally called Google Refine and
was discontinued because Google couldn't make any profit from it, and
which is now one of the most valuable tools that we can use to clean up
and reconcile data with Wikidata.
As for the integration of the data, I don't have any precise data
about it, but I'm sure that a fair part of Freebase did end up in
Wikidata, just as much as many other big databases did.
As I said, I don't want to push any opinions in
any directions. I am more asking for more information about the connection of Google to
Wikidata (financially), then Google to WMF and also I am asking about any strategic
advantages for Google in relation to their competition.
I cannot properly answer you about this. WMF and Google are in my view
"frenemies": Google is, and will always be, a Big Tech company and WMF
is, and will always be, a champion of free knowledge. You just can't
do free knowledge by forcing Big Tech companies NOT to pick up your
tools and data, though, just as I think it'd be unnecessary for us
NOT to take any help from Google, if we can work together on
several objectives. This is ok with me, as long as we keep being
transparent about this - which I recognise to be your point and the
motivation behind your email, so don't worry about it. ;)
I hope I helped you wrap your head around the whole thing. :)
Luca "Sannita" Martinelli