Thanks for this breakdown of events/intentions/grant request. I can't help
wondering whether this grant will produce anything at all that we can use.
As I recall, we talked a lot about how bad search was in general on
Wikipedia projects, and the example used to demonstrate this was a
comparison test. Gerard mentioned how badly Wikimedia Commons
responds to the search term "horse" as compared to Google's interpretation
of "horse". I believe the conclusion was that we needed to integrate Google
search into Wikipedia, not try to compete with Google at their game.
Meanwhile, with Wikidata, we are very slowly filling the "depicts" property
with "horse" for artwork items of horses, but it will probably take years
before all images on Commons with horses in them have found their way to
Wikidata, much less been tagged with a depicts property! Looking at "horse"
in Reasonator does indicate some progress; note, however, that not all
images served up by Reasonator actually show a horse.
Why should we try to beat Google at search? These days, if I am looking for
an image of a horse on Wikimedia Commons, I dump this into Google: "horse"
and then I click on images. This is the most effective way for me to find
images on Commons that I know are there (including ones I uploaded myself).
On Fri, Feb 12, 2016 at 11:31 AM, Liam Wyatt <liamwyatt(a)gmail.com> wrote:
This Grant document for a “Knowledge Engine by Wikipedia” is
*specifically and overtly stating* that its purpose is to start work
on a search engine as a rival for Google/Yahoo. That is the end goal
of the project. Near the bottom of page 10 it summarises the
whole project as:
"Knowledge Engine by Wikipedia will be the internet's first
transparent search engine, and the first one originated by the
Wikimedia Foundation". It will, "democratize the discovery of media,
news and information – it will make the Internet's most relevant
information more accessible and openly curated, and it will create an
open data engine that's completely free of commercial interests.
Today, commercial search engines dominate search engine use of the
internet...". A separate summary on page 2 states, "The project will
pave the way for non-commercial information to be found and utilised
by internet users".
At the bottom of page 13, the primary risk identified is "interference
by Google, Yahoo or another big commercial search engine could
suddenly devote resources to a similar project". As SarahSV pointed
out above, if the "Knowledge Engine by Wikipedia" is only about
improving the inter-connectedness of the Wikimedia sister projects by
improving how internal systems work - which no one is disputing is a
very useful goal - then Google/Yahoo releasing a new search engine
product would not be counted as the project's "biggest challenge".
- "Non commercial" -
The document itself refers to "non commercial" several times, and
seems to be using the term loosely. Nevertheless, it seems clear to me
that any reasonable person who is not deeply-immersed in
copyright-debates about the definition of "free" would understand the
words "non-commercial" in the context of *this document* to mean that
the search engine is *operated* non-commercially. Now, I do
acknowledge that a grant-request is by definition a “sales pitch” and
you have to write your request using the terminology and focus areas
of the grant-giver. However, it is my understanding that Lila
specifically wanted to build this - a competitor to Google - and that
this is most clearly expressed in the summary on page 10. It describes
the 6 principles through which the “Knowledge Engine by Wikipedia”
will "upend the commercial structure [of search engines]". These are
Public Curation, Transparency, Open Data, Privacy, No Advertising and
'Internalisation'. Nothing in this document talks about ways to limit
the *content* of the search engine to only "non-commercial" stuff (and
if it did, then we would be talking about this:
https://search.creativecommons.org/ ).
- Lack of Strategy -
Now, maybe an open-source search engine would be a good thing for the
WMF to create! But that would be a major strategic decision. It would
be, in effect, a new sister project to sit alongside (above?)
Wikipedia, Commons, Wikidata etc. However, this concept appears
*nowhere* in the current strategy consultation documents on Meta. As I
wrote on my blog last week: "Of 18 different approaches identified in
the...consultation process only one of them seems directly related to
[search]: 'Explore ways to scale machine-generated, machine-verified
and machine-assisted content'. It is also literally the last of the 18
topics listed".
http://wittylama.com/2016/01/30/strategy-controversy-part-2/
It seems to me extremely damaging for the relationship with the Knight
Foundation if Lila has approached them for funding a search engine,
without first having a strategic plan. Either the Board knew about
this and didn't see a problem, or they were incorrectly informed about
the grant's purpose. Either is very bad. And let me be very clear -
this is not a case of the Grants team going off by themselves. This is
an executive decision, either from the Board to Lila, or by Lila herself.
The latter seems more likely given her own statement on her talkpage:
“I saw the Wikimedia movement as the most motivated and sincere group
of beings, united in their mission to build a rocket to explore
Universal Free Knowledge. The words “search” and “discovery” and
“knowledge” swam around in my mind with some rocket to navigate it.
However, “rocket” didn’t seem to work, but in my mind, the rocket was
really just an engine, or a portal, a TARDIS, that transports people
on their journey through Universal Free Knowledge.”
https://meta.wikimedia.org/wiki/User_talk:LilaTretikov_(WMF)#Knowledge_Engi…
As pointed out by Risker back in May 2015, the Search team had already
been created and seemed *disproportionately* large
https://meta.wikimedia.org/wiki/Talk:Wikimedia_Foundation_Annual_Plan/2015-…
It seems clear to me that this was done in anticipation of the
“Knowledge Engine by Wikipedia” project, as it is described in this
grant document. I also understand that this very high initial target
has since been reduced, a lot. From a fully-fledged competitor to
Google, to a search engine of freely-licensed works, and now to this:
"improving the existing CirrusSearch infrastructure with better
relevance, multi language, multi projects search and incorporating new
[external] data sources for our projects."
https://www.mediawiki.org/wiki/Wikimedia_Discovery/FAQ#Are_you_building_Goo…
However, this change is NOT represented in the actual grant document.
That either means we misled the Knight Foundation at the start, or we
changed our mind since then but didn't tell them. Both of these
options are grounds, as per the contract page 5, for handing back the
money. Much more likely, in my opinion, is that the Knight Foundation
knew that trying to create a non-profit search engine was high-risk or
at the very least extremely ambitious. But, because they like us, they
gave a small amount to explore the concept and to fund the actually
useful stuff (the "outcomes" of this first stage, as listed on page 3
and also page 12).
Let me reiterate - improving the "discoverability" of our own content
across wikis/sister-projects is a very good goal.
Consolidation/Integration of projects' content is a much desired goal
(see also the strong desire for the 'structured data' project on
Commons). However that is NOT what has been "sold" in this grant.
- Cost -
Page 10 of this text specifically says that the cost of the first
stage of "Knowledge Engine by Wikipedia" is $2.5 million, and that the
grant is for 1 year starting in September 2015. Page 2 says that the
whole project is in 4 stages, each lasting approximately 18 months =
6 years. This grant of $250,000 therefore only covers 10% of the cost,
of the first stage, of the total project.
As SarahSV said above:
"The document says the ‘Search Engine by Wikipedia’ budget for
2015–2016 ($2.4 million) was approved by the board [page 9]. Can you
point us to which board meeting approved it and what was discussed
there?" I second this question, because I'm not seeing it in the
annual plan:
https://wikimediafoundation.org/wiki/2015-2016_Annual_Plan
(as also noted by Pine, earlier in this thread)
There is no way that Lila approached the Knight Foundation asking to
fund only 10% of the first stage of a 6 year project. Instead, $250,000
seems to me to be the lowest possible outcome of a grant request
negotiation. Clearly we would have asked for a large proportion of the
total amount and we've been pushed right back down to 10% of
stage 1. Assuming that the first stage is also the cheapest of the 4
stages ("discovery, advisory, community, extension" - as per page 2)
then we can reliably extrapolate that $2.5 million x 4 = $10 million
is the absolute *minimum* amount that was planned to be spent, and we
would have originally asked for at *least* 50% of that. As pointed out
by Doc James on the [public] WikipediaWeekly Facebook group -
estimates presented to the board were in the range of
tens-of-millions.
https://www.facebook.com/groups/wikipediaweekly/permalink/956660811048417/
Unless the WMF expected to fund the difference between the total cost and
their initial request to the Knight Foundation (let alone the much reduced
amount they actually received) with funding from *other* grants, then the
expectation is that a *significant proportion* of the remaining cost
would be borne by the WMF’s “annual fundraiser”. If so, the financial
demands of this project should have been disclosed to the community and
to the donors.
It is, therefore, inconceivable to me that the Executive Director
would privately propose to an external partner that they would
undertake a 6 year project to build a search-engine that will have
massive cost, staffing, strategic and content implications - entirely
without an official WMF strategy covering that period, without any
indication in the current annual plan, without the awareness of the
community, and without clear communication to the Board. I find the fact that this
could have been done to be a deep breach of our values - and not
wholly unrelated to the current sudden exodus of long-serving members
of WMF staff.
Liam / Wittylama