This Grant document for a “Knowledge Engine by Wikipedia” is *specifically and overtly stating* that its purpose is to start work on a search engine as a rival for Google/Yahoo. That is the end goal of the project. Near the bottom of page 10 it summarises the whole project as:
"knowledge Engine by Wikipedia will be the internet's first transparent search engine, and the first one originated by the Wikimedia Foundation". It will, "democratize the discovery of media, news and information – it will make the Internet's most relevant information more accessible and openly curated, and it will create an open data engine that's completely free of commercial interests. Today, commercial search engines dominate search engine use of the internet...". A separate summary on page 2 states, "The project will pave the way for non-commercial information to be found and utilised by internet users".
At the bottom of page 13, the primary risk identified is "interference by Google, Yahoo or another big commercial search engine could suddenly devote resources to a similar project". As SarahSV pointed out above, if the "Knowledge Engine by Wikipedia" is only about improving the inter-connectedness of the Wikimedia sister projects by improving how internal systems work - which no one is disputing is a very useful goal - then Google/Yahoo releasing a new search engine product would not be counted as the project's "biggest challenge".
- "Non commercial" -
The document itself refers to "non commercial" several times, and seems to be using the term loosely. Nevertheless, it seems clear to me that any reasonable person who is not deeply-immersed in copyright-debates about the definition of "free" would understand the words "non-commercial" in the context of *this document* to mean that the search engine is *operated* non-commercially. Now, I do acknowledge that a grant-request is by definition a “sales pitch” and you have to write your request using the terminology and focus areas of the grant-giver. However, it is my understanding that Lila specifically wanted to build this - a competitor to Google - and that this is most clearly expressed in the summary on page 10. It describes the 6 principles through which the “Knowledge Engine by Wikipedia” will "upend the commercial structure [of search engines]". These are Public Curation, Transparency, Open Data, Privacy, No Advertising and 'Internalisation'. Nothing in this document talks about ways to limit the *content* of the search engine to only "non-commercial" stuff (and if it did, then we would be talking about this: https://search.creativecommons.org/ ).
- Lack of Strategy -
Now, maybe an open-source search engine would be a good thing for the WMF to create! But that would be a major strategic decision. It would be, in effect, a new sister project to sit alongside (above?) Wikipedia, Commons, Wikidata etc. However, this concept appears *nowhere* in the current strategy consultation documents on Meta. As I wrote on my blog last week: "Of 18 different approaches identified in the...consultation process only one of them seems directly related to [search]: 'Explore ways to scale machine-generated, machine-verified and machine-assisted content'. It is also literally the last of the 18 topics listed". http://wittylama.com/2016/01/30/strategy-controversy-part-2/
It seems to me extremely damaging for the relationship with the Knight Foundation if Lila has approached them for funding a search engine, without first having a strategic plan. Either the Board knew about this and didn't see a problem, or they were incorrectly informed about the grant's purpose. Either is very bad. And let me be very clear - this is not a case of the Grants team going off by themselves. This is an executive decision by either the Board to Lila, or Lila by herself. The latter seems more likely given her own statement on her talkpage:
“I saw the Wikimedia movement as the most motivated and sincere group of beings, united in their mission to build a rocket to explore Universal Free Knowledge. The words “search” and “discovery” and “knowledge” swam around in my mind with some rocket to navigate it. However, “rocket” didn’t seem to work, but in my mind, the rocket was really just an engine, or a portal, a TARDIS, that transports people on their journey through Universal Free Knowledge.” https://meta.wikimedia.org/wiki/User_talk:LilaTretikov_(WMF)#Knowledge_Engin...
As pointed out by Risker back in May 2015, the Search team had already been created and seemed *disproportionately* large https://meta.wikimedia.org/wiki/Talk:Wikimedia_Foundation_Annual_Plan/2015-1... It seems clear to me that this was done in anticipation of the “Knowledge Engine by Wikipedia” project, as it is described in this grant document. I also understand that this very high initial target has since been reduced, a lot. From a fully-fledged competitor to Google, to a search engine of freely-licensed works, and now to this: "improving the existing CirrusSearch infrastructure with better relevance, multi language, multi projects search and incorporating new [external] data sources for our projects." https://www.mediawiki.org/wiki/Wikimedia_Discovery/FAQ#Are_you_building_Goog...
However, this change is NOT represented in the actual grant document. That either means we misled the Knight Foundation at the start, or we changed our mind since then but didn't tell them. Both of these options are grounds, as per the contract page 5, for handing back the money. Much more likely, in my opinion, is that the Knight Foundation knew that trying to create a non-profit search engine was high-risk or at the very least extremely ambitious. But, because they like us, they gave a small amount to explore the concept and to fund the actually useful stuff (the "outcomes" of this first stage, as listed on page 3 and also page 12).
Let me reiterate - improving the "discoverability" of our own content across wikis/sister-projects is a very good goal. Consolidation/Integration of projects' content is a much desired goal (see also the strong desire for the 'structured data' project on Commons). However that is NOT what has been "sold" in this grant.
- Cost -
Page 10 of this text specifically says that the cost of the first stage of "Knowledge Engine by Wikipedia" is $2.5 million, and that the grant is for 1 year starting in September 2015. Page 2 says that the whole project is in 4 stages, each lasting approximately 18 months = 6 years. This grant of $250,000 therefore only covers 10% of the cost, of the first stage, of the total project.
As SarahSV said above: "The document says the ‘Search Engine by Wikipedia’ budget for 2015–2016 ($2.4 million) was approved by the board [page 9]. Can you point us to which board meeting approved it and what was discussed there?" I second this question, because I'm not seeing it in the annual plan: https://wikimediafoundation.org/wiki/2015-2016_Annual_Plan (as also noted by Pine, earlier in this thread)
There is no way that Lila approached the Knight Foundation asking to fund only 10% of the first year of a 6 year project. Instead, $250,000 seems to me to be the lowest possible outcome of a grant request negotiation. Clearly we would have asked for a large proportion of the total amount and we've been rejected right back down to 10% of stage 1. Assuming that the first stage is also the cheapest of the 4 stages ("discovery, advisory, community, extension" - as per page 2) then we can reliably extrapolate that $2.5 million x 4 = $10 million is the absolute *minimum* amount that was planned to be spent, and we would have originally asked for at *least* 50% of that. As pointed out by Doc James on the [public] WikipediaWeekly Facebook group - estimates presented to the board were in the range of tens-of-millions. https://www.facebook.com/groups/wikipediaweekly/permalink/956660811048417/ Unless the WMF expected to fund the difference between their initial request to the Knight Foundation (let alone the much reduced amount they actually received) with funding from *other* grants, then the expectation is that a *significant proportion* of the remaining cost would be borne by the WMF’s “annual fundraiser”. If so, the financial demands of this project should have been disclosed to the community and to the donors.
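For anyone who wants to check the arithmetic in this section, here is a minimal sketch in Python. It uses only the figures quoted above from the grant document; the variable names are mine, and the "minimum total" rests on the stated assumption that no later stage is cheaper than the first.

```python
# Figures as quoted in this thread from the grant document.
GRANT_USD = 250_000             # Knight Foundation grant
STAGE_ONE_COST_USD = 2_500_000  # cost of the first stage (page 10)
STAGES = 4                      # number of stages (page 2)
MONTHS_PER_STAGE = 18           # approximate length of each stage (page 2)

# Total project length in years: 4 stages x 18 months.
total_years = STAGES * MONTHS_PER_STAGE / 12

# Share of the first stage covered by the grant.
grant_share = GRANT_USD / STAGE_ONE_COST_USD

# Lower bound on total cost, ASSUMING every stage costs at least
# as much as the first one (an assumption, not a documented figure).
minimum_total_usd = STAGES * STAGE_ONE_COST_USD

print(f"Project length: {total_years:g} years")
print(f"Grant covers {grant_share:.0%} of stage 1")
print(f"Minimum total cost: ${minimum_total_usd:,}")
print(f"Half of that minimum: ${minimum_total_usd // 2:,}")
```

This only restates the numbers already given; it says nothing about what was actually requested from the Knight Foundation.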
It is, therefore, inconceivable to me that the Executive Director would privately propose to an external partner that they would undertake a 6 year project to build a search-engine that will have massive cost, staffing, strategic and content implications - entirely without an official WMF strategy covering that period, no indication in the current annual plan, without the awareness of the community, and unclearly communicated to the Board. I find the fact that this could have been done to be a deep breach of our values - and not wholly unrelated to the current sudden exodus of long-serving members of WMF staff.
Liam / Wittylama
On Fri, Feb 12, 2016 at 4:31 PM, Liam Wyatt liamwyatt@gmail.com wrote:
This Grant document for a “Knowledge Engine by Wikipedia” is *specifically and overtly stating* that its purpose is to start work on a search engine as a rival for Google/Yahoo. That is the end goal of the project.
See also:
http://www.theregister.co.uk/2016/02/12/wikipedia_grant_build_search_engine_...
Thanks for this breakdown of events/intentions/grant request. I can't help wondering whether this grant will produce anything at all that we can use. As I recall we talked a lot about how bad search was in general on Wikipedia projects, and the example used to demonstrate how poor this was, was a comparison test. Gerard mentioned how badly Wikimedia Commons responds to the search term "horse" as compared to Google's interpretation of "horse". I believe the conclusion was that we needed to integrate Google search into Wikipedia, not try to compete with Google at their game. Meanwhile, with Wikidata, we are very slowly filling the "depicts" property with "horse" for artwork items of horses, but it will take years probably before all images on Commons with horses in them have found their way to Wikidata, much less get tagged with a depicts property! Looking at "horse" in reasonator does indicate some progress, however, note that not all images served up by Reasonator actually show a horse: https://tools.wmflabs.org/reasonator/?q=Q726&lang=en
Why should we try to beat Google at search? These days, if I am looking for an image of a horse on Wikimedia Commons, I dump this into Google: "site:commons.wikimedia.org horse" and then I click on images. This is the most effective way for me to find images on Commons that I know are there (including ones I uploaded myself).
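As a rough illustration of the workaround above, here is a small Python sketch that builds such a query using Google's `site:` operator. The helper name `commons_search_url` is made up for this example, and the plain `https://www.google.com/search?q=...` URL form is an assumption about how one would open the search in a browser; switching to the Images tab is still done by hand.

```python
from urllib.parse import urlencode

def commons_search_url(term: str) -> str:
    """Build a Google search URL restricted to Wikimedia Commons.

    Hypothetical helper for illustration only; it relies on the
    `site:` operator to limit results to commons.wikimedia.org.
    """
    query = f"site:commons.wikimedia.org {term}"
    return "https://www.google.com/search?" + urlencode({"q": query})

print(commons_search_url("horse"))
```

Opening the printed URL and clicking on images reproduces the manual steps described above.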
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines New messages to: Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
On 12.2.2016, at 18.31, Liam Wyatt liamwyatt@gmail.com wrote:
- Lack of Strategy -
Now, maybe an open-source search engine would be a good thing for the WMF to create! But that would be a major strategic decision.
Search is a critical feature in all online services, especially for a service with a mission to "empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally".
Putting resources into improving search is not a "major strategic decision". It is business-as-usual.
Also, federated / semantic search across all the Wikipedia projects and outside sources of free content is definitely worth exploring. Any strategy should have space to explore things that advance the mission.
- Teemu
-------------------------------------------------- Teemu Leinonen http://teemuleinonen.fi +358 50 351 6796 Media Lab http://mlab.uiah.fi Aalto University School of Arts, Design and Architecture --------------------------------------------------
Hoi, Amen Thanks, GerardM