A few of us from Discovery (myself, Tomasz Finc, Erik Bernhardson, and some
others too) had the opportunity to meet Sylvain recently when he was in San
Francisco. We chatted about touch points between Discovery and Common Search.
One important thing I personally learnt from chatting to Sylvain is that
the challenges are *very* different for building an in-site search (like
Discovery) and building a web search (like Common Search). Data that's
critically important for one may be close to irrelevant for the other.
Scaling issues for one may not even exist for the other. We did identify a
few areas where Common Search may be creating datasets that would be useful
for us in Discovery; Sylvain said he'd be in touch with us if that happens.
We're going to keep in touch and see if we can help each other out in the
On 18 March 2016 at 22:15, Erik Moeller <eloquence(a)gmail.com> wrote:
2016-03-18 21:44 GMT-07:00 SarahSV
On Fri, Mar 18, 2016 at 5:17 PM, carl hansen
> "We are building a nonprofit search engine for the Web"
> Sounds a lot like Knowledge Engine, if there were such a thing.
> Any overlap with wikimedia projects?
Thanks for the link, Carl. Erik and Lydia are advisors, so perhaps they
could say a bit more about it.
Sylvain has been working on this stuff for a while, blissfully
ignorant of Wikimedia's discussions of search engines, rocketships and
so on. He reached out to me shortly before the public announcement and
we've talked a bit about governance, community & funding models. I've
agreed to provide some continued advice along the way but have not
otherwise been involved.
He recently posted on wikitech-l asking for suggestions on how
Wikipedia/Wikidata could be integrated:
There's still a lot of heavy lifting to do before Common Search can
become a viable project, even for narrowly defined purposes, but I think
it's a very worthwhile effort. It also is -- I think correctly -- based on
the largest pre-existing open effort to index the web, the Common
Crawl. This could lead to a mutually beneficial relationship between
Common Search and Common Crawl. From a Wikimedia perspective, it might
develop into an opportunity to jointly showcase some of the amazing
stuff that Wikidata can already do.
Wikimedia-l mailing list, guidelines at:
New messages to: Wikimedia-l(a)lists.wikimedia.org