I first heard about Wikidata at
SemTech in San Francisco and I was told
very directly that they were not interested in working with anybody who was
experienced with putting data from generic database in front of users
because they had worked so hard to get academic positions and get a grant
from the Allen Institute and it is more cost-effective and more compatible
with academic advancement to hire a bunch of young people who don't know
anything but will follow orders.
I am, frankly, baffled by this story. It very likely was me, presenting
Wikidata at SemTech in SF, so it probably was me you have been talking
with, but I have no recollection of a conversation going the way you
If I remember the timing correctly, I didn't have an academic position at
the time of SemTech. Actually, I gave up my academic position to move to
Berlin and work on Wikidata.
The donors to Wikidata never exercised any influence on the projects,
beyond requiring reports on the progress.
I cannot imagine that I would ever have said that we "were not interested
in working with anybody who was experienced with putting data from generic
database in front of users", because, really, that would make no sense to
say. I also do not remember having gotten an application from you.
Regarding the team that we wanted and eventually did hire, I would sternly
disagree with the description of "a bunch of young people who don't know
anything but will follow orders" - from the applications we got we chose
the most suitable team we could pull together. And considering the
discussions we had in the following months, following orders was neither
their strength nor the qualification they were chosen for. Nor did they
consist only of young people. Instead, it turned out, they were exactly the
kind of independent thinkers with dedication to the goal and quality that
we were aiming for. Fortunately, for the project.
Maybe the conversation went differently than you are remembering it.
E.g. I would have insisted on building Wikidata on top of MediaWiki (for
E.g. I would have insisted that everyone working on Wikidata move to
Berlin (because I thought it would be the only possibility to get the
project to an acceptable state in the original timeframe, so that we can
ensure its future sustainability).
E.g. I would have disagreed that RDF/SPARQL backends could be used out of
the box as Wikidata's backend back then (but I would have been open to
anyone showing me that I was wrong, and indeed very happy because,
seriously, I have an unreasonable fondness for SPARQL and RDF).
E.g. I would have disagreed that our job as Wikimedia is to spend too many
resources on pretty frontends (because that is something the community can
do, and as we see, is doing very well - I think Wikimedia should really
concentrate on those pieces of work that cannot and are not being done by
E.g. I would have insisted on not outsourcing any major part of the
development effort to an external service provider.
E.g. it could be that we already had all positions filled, and simply no
money for more people (really depends on the timing).
So there are plenty of points we might have disagreed on, and which,
maybe misunderstood, maybe subtly altered by the passage of time in a
fallible memory, have led to the recollection of our conversation that you
presented, but, for the reasons mentioned above, I think that your
recollection is incorrect.
On Fri Feb 20 2015 at 12:42:44 PM Daniel Kinzler <
I understand your frustration, but let me put a few things into
For reference: I'm employed by WMDE and work on Wikibase/Wikidata. I have been
working on MediaWiki since 2005, and have been paid for it since 2008.
On 20.02.2015 at 19:14, Paul Houle wrote:
I am not an academic. The people behind Wikidata
To the extent that most of us have some college degree. The only "full"
involved is Markus Krötzsch, who together with Denny Vrandecic developed
the concepts behind Wikidata. He acts as an advisor to the Wikidata
doesn't have any formal position.
Oh, we also have a group of students working on their bachelor project
I first heard about Wikidata at SemTech in San Francisco and I was told
very directly that they were not interested in working with anybody who was
experienced with putting data from generic database in front of users
because they had worked so hard to get academic positions and get a grant
from the Allen Institute and it is more cost-effective and more compatible
with academic advancement to hire a bunch of young people who don't know
anything but will follow orders.
Ouch. Working with such people would be a drag. Luckily, we have an
of full blooded programmers. Not that we get everything right, or done in
RDF* and SPARQL* do not come from an academic
background but from a
organization that expects to make money by
satisfying people's needs
and it is
being supported by a number of other commercial
You'll be happy to hear that we are working with high priority to finally
provide full query functionality. We are still evaluating options (WMF's
Stas have been visiting the WMDE office for this, just this week - have a
trip home, guys!), but the current favorite is, in fact, BlazeGraph,
BigData, by the people who came up with RDF* and RDR. If we end up using
chances are good that we will be exposing a SPARQL endpoint directly.
We may still find a deal breaker though, so no promise. Another option
Neo4J, using a graph oriented mapping. We could still expose SPARQL
upon Gremlin, IIRC), but I suspect that we'd probably rather expose
more domain specific, perhaps based on Magnus' WDQ syntax, that operates
directly on the graph.
This is something that builds on everything
successful about RDF and
adds the "missing links" that it takes
to implement data wikis. If
starting Wikidata today or if the kind of
billionaire who buys sports
way I might buy a game console wanted to fund an
effort to keep
RDF*/SPARQL* is the way to do it.
I still stand by the decision not to use a triple store as the primary
for wikidata, for various reasons (MediaWiki integration, especially
being among the most important ones).
But I'm all for mapping our internal model to RDF, and exposing a SPARQL
endpoint, if we can do that in a reliable manner with the available
I'd rather have limited query functionality with five nines uptime than a
endpoint that is down half the time.
Speaking of mapping to RDF: Have you read
Wikidata is playing to whims of a few rich people and it could disappear
at any time when those people get tired of it or decide they have what they
want and don't want to make it any easier for competitors to follow them.
Wikidata development and hosting is funded by donations to Wikimedia,
Wikimedia projects. The first year of development was indeed funded by
companies and trusts (AI2, Google, and the Moore Foundation), but to my
knowledge they never tried to influence our decisions.
We have never had academic funding. I don't think we are going to say "no"
if we can get any, though.
The trouble is that most people interested in
open data seem to think
is worth nothing and other people's time is
worth nothing and aren't
in paying even a small amount for services so the
producers throw stuff
almost works over the wall. I don't think it
would be all that
difficult for me
to do for Wikidata what I did for Freebase but I
am not doing it
aren't going to pay for it.
If you mail me an application/offer, I'm happy to forward and, depending on
content, champion it. Wikimedia doesn't pay as well as big tech companies
(Wikimedia operates on a shoestring budget, compared to other sites with
upwards of 100k hits per second), but the pay isn't shoddy either. Come
visit! Let's talk!
Senior Software Developer
Gesellschaft zur Förderung Freien Wissens e.V.
Wikidata-l mailing list