I agree that this sounds like an interesting experiment. I hope that you get good faith editors. I worry that you’ll get COI editors playing with the search rankings. Do you have a way in mind to deal with that issue?
Hi all,
I'm starting a new project, a wiki search engine. It uses MediaWiki, Semantic MediaWiki and other minor extensions, plus some tricky templates and bots.
I remember Wikia Search and how it failed. It had a mini-article for the introduction, then a long list of links compiled by a crawler, plus something similar to a social network.
My project idea (which still needs a cool name) is different. Although it uses an introduction and images copied from Wikipedia, and some links from the "External links" sections, those are only a starting point. The purpose is that the community adds, removes and reorders the results for each term, and creates redirects for similar terms to avoid duplicates.
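Roughly, one of the seeding bots could fetch the introduction of a term from Wikipedia through the MediaWiki API's TextExtracts module and copy it into the term page. A minimal sketch (the target wiki, the example title and the upload step are only placeholders):

    # Sketch of a seeding bot: fetch the plain-text introduction of a
    # Wikipedia article so it can be copied into the matching term page.
    import requests

    WIKIPEDIA_API = "https://es.wikipedia.org/w/api.php"

    def fetch_intro(title):
        """Return the plain-text introduction of a Wikipedia article."""
        params = {
            "action": "query",
            "prop": "extracts",
            "exintro": 1,          # introduction section only
            "explaintext": 1,      # plain text instead of HTML
            "redirects": 1,
            "titles": title,
            "format": "json",
            "formatversion": 2,
        }
        data = requests.get(WIKIPEDIA_API, params=params).json()
        pages = data["query"]["pages"]
        return pages[0].get("extract", "") if pages else ""

    if __name__ == "__main__":
        print(fetch_intro("Shakira")[:300])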
Why this? I think that Google PageRank isn't enough. It is frequently abused by link farms, SEO spammers and other people trying to push their websites to the top.
Search "Shakira" in Google for example. You see 1)
Official site, 2) Wikipedia 3) Twitter 4) Facebook, then some videos, some news,
some images, Myspace. It wastes 3 or more results in obvious nice sites (WP, TW,
FB). The wiki search engine puts these sites in the top, and an introduction and
related terms, leaving all the space below to not so obvious but interesting
websites. Also, if you search for "semantic queries" like "right-wing
newspapers" in Google, you won't find real newspapers but "people and sites
discussing about ring-wing newspapers". Or latex and LaTeX being shown in the
same results pages. These issues can be resolved with disambiguation result
pages.
How do we choose which results go above or below? The rules are not fully designed yet, but we can put official sites first, then .gov or .edu domains, which are important ones, and later unofficial websites and blogs, giving priority to the local language, etc., and reaching consensus along the way.
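As a starting point before anyone reorders by hand, the ordering could look something like this sketch (the categories, their priorities and the .es preference are only examples, not settled rules):

    # Sketch of a baseline ordering for the results of a term page.
    # The categories and their priorities are illustrative assumptions.
    from urllib.parse import urlparse

    def result_rank(url, official_sites=(), local_tld=".es"):
        """Lower rank means higher on the results page."""
        host = urlparse(url).netloc.lower()
        if any(host.endswith(site) for site in official_sites):
            return 0                      # official site of the term
        if host.endswith((".gov", ".edu")):
            return 1                      # institutional domains
        if host.endswith(local_tld):
            return 2                      # prefer the local language / ccTLD
        if "blog" in host:
            return 4                      # unofficial blogs go last
        return 3                          # everything else

    urls = [
        "http://someblog.blogspot.com/shakira-rumors",
        "http://www.shakira.com/",
        "http://music.example.org/shakira",
    ]
    for url in sorted(urls, key=lambda u: result_rank(u, ("shakira.com",))):
        print(url)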
We can control aggressive spam with spam blacklists, by semi-protecting or protecting highly visible pages, and by using bots or tools to check changes.
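On the bot side, the check could be as simple as matching newly added links against a list of URL regexes, in the spirit of MediaWiki's SpamBlacklist extension (the sample patterns below are made up):

    # Sketch of a bot-side spam check: flag external links that match
    # blacklisted URL patterns. The patterns and links are examples only.
    import re

    BLACKLIST = [
        r"cheap-pills\.example",
        r"buy.*online.*\.xyz",
    ]
    BLACKLIST_RE = [re.compile(p, re.IGNORECASE) for p in BLACKLIST]

    def is_spam_link(url):
        """Return True if the URL matches any blacklisted pattern."""
        return any(rx.search(url) for rx in BLACKLIST_RE)

    for link in ["http://cheap-pills.example/shakira",
                 "http://www.shakira.com/"]:
        print(link, "-> SPAM" if is_spam_link(link) else "-> ok")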
It obviously has a CC BY-SA license and the results can be exported. I think that this approach is the opposite of what Google does today.
For weird queries like "Albert Einstein birthplace" we can redirect to the most obvious results page (in this case Albert Einstein), either with a hand-made redirect or programmatically (with a small change in MediaWiki).
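The hand-made variant is just an ordinary MediaWiki redirect; a bot could mass-create them from a list of aliases, roughly like this pywikibot sketch (the alias list is invented and the script assumes a configured user-config.py):

    # Sketch of a bot creating redirects so alias queries land on the
    # main results page. The alias list here is only an example.
    import pywikibot

    ALIASES = {
        "Albert Einstein birthplace": "Albert Einstein",
        "Einstein": "Albert Einstein",
    }

    site = pywikibot.Site()  # wiki taken from user-config.py
    for alias, target in ALIASES.items():
        page = pywikibot.Page(site, alias)
        if not page.exists():
            page.text = "#REDIRECT [[%s]]" % target
            page.save(summary="Bot: redirect alias to the main results page")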
You can check a very early alpha version here: http://www.todogratix.es (Spanish only for now, sorry), which I'm feeding with some bots.
I think that it is an interesting experiment. I'm open to your questions and feedback.
Regards,
emijrp
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l