Dan,
I understand you are currently only working on internal search, representing stage 1 of this project. But does the long-term vision of the subsequent stages still include things like –
1. incorporation of non-Wikimedia sources in search results,
2. an open source knowledge engine like IBM's Watson (i.e. an answer engine based on structured data),
3. an open source search engine,
4. public curation of relevance (i.e. volunteer-based search results ranking)?
1 and 4 remain mentioned in the Discovery FAQ[1] on MediaWiki; 1, 2 and 3 have been mentioned in recent on-wiki discussions by Jimmy Wales.[2] In fact, I see little in the Knowledge Engine grant agreement that is incompatible with the FAQ and those recent discussions.
From a fundraising point of view, I could fully understand why the WMF might consider it a desirable long-term goal to turn wikipedia.org into a high-traffic search and answer engine. I am not sure how successful such an attempt would be – internet users have a marked preference for one-stop shops, and it would take some really nifty features to entice users away from the established search and answer engines – but I do understand why the idea would be attractive.
Andreas
[1] https://www.mediawiki.org/wiki/Wikimedia_Discovery/FAQ
[2] https://en.wikipedia.org/wiki/User_talk:Jimbo_Wales
On Tue, Mar 29, 2016 at 12:14 AM, Dan Garry dgarry@wikimedia.org wrote:
A few of us from Discovery (myself, Tomasz Finc, Erik Bernhardson, and some others too) had the opportunity to meet Sylvain recently when he was in San Francisco. We chatted about touch points between Discovery and Common Search.
One important thing I personally learnt from chatting to Sylvain is that the challenges are *very* different for building an in-site search (like Discovery) and building a web search (like Common Search). Data that's critically important for one may be close to irrelevant for the other. Scaling issues for one may not even exist for the other. We did identify a few areas where Common Search may be creating datasets that would be useful for us in Discovery; Sylvain said he'd be in touch with us if that happens.
We're going to keep in touch and see if we can help each other out in the future.
Thanks, Dan
On 18 March 2016 at 22:15, Erik Moeller eloquence@gmail.com wrote:
2016-03-18 21:44 GMT-07:00 SarahSV sarahsv.wiki@gmail.com:
On Fri, Mar 18, 2016 at 5:17 PM, carl hansen <carlhansen1234@gmail.com> wrote:
https://about.commonsearch.org/
"We are building a nonprofit search engine for the Web"
Sounds a lot like Knowledge Engine, if there were such a thing. Any overlap with Wikimedia projects?
Thanks for the link, Carl. Erik and Lydia are advisors, so perhaps they could say a bit more about it.
Sylvain has been working on this stuff for a while, blissfully ignorant of Wikimedia's discussions of search engines, rocketships and so on. He reached out to me shortly before the public announcement and we've talked a bit about governance, community & funding models. I've agreed to provide some continued advice along the way but have not otherwise been involved.
He recently posted on wikitech-l asking for suggestions on how Wikipedia/Wikidata could be integrated: https://lists.wikimedia.org/pipermail/wikitech-l/2016-March/084984.html
There's still a lot of heavy lifting to do before Common Search can become a viable project, even for narrowly defined purposes, but I think it's a very worthwhile effort. It also is -- I think correctly -- based on the largest pre-existing open effort to index the web, the Common Crawl. This could lead to a mutually beneficial relationship between Common Search and Common Crawl. From a Wikimedia perspective, it might develop into an opportunity to jointly showcase some of the amazing stuff that Wikidata can already do.
Erik
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines New messages to: Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
-- Dan Garry Lead Product Manager, Discovery Wikimedia Foundation