[WikiEN-l] Long-term searchability of the internet

Carcharoth carcharothwp at googlemail.com
Sat Jan 15 04:41:09 UTC 2011


(Following on from another thread)

I have a theory that Wikipedia makes only *part* of the Internet not
suck. Wikipedians aggregate online knowledge (and offline as well, but
let's stick to online here), making it easier to find information
about a topic, especially when a Google search returns lots of
ambiguous hits and you don't know enough to refine it. But the useful
parts of the internet (i.e. not social media and similarly transient,
information-deficient areas of the internet) didn't stop growing when
Wikipedia came along.

In theory, if the growth of the information-dense parts of the
internet has continued to outstrip the growth of Wikipedia and the
ability of Wikipedians to aggregate that knowledge, then large parts
of the internet should still "suck" (to continue using that
terminology) - i.e. be less amenable to searching, due either to an
absence of information or to poorly organised information. I base this
on many years of searching daily for information on topics ranging
from the well-known to the borderline obscure to the outright obscure.
Over the years since Wikipedia started, the ability to find
information online has changed beyond recognition. Around about 2004-5
(I need to check dates here), Wikipedia was rising rapidly up the
search rankings, and it now comes top or near the top on most
searches. But there are still many, many topics on which no articles
exist, or only redlinks. I come across these daily when searching:
information on these topics is out there, scattered across Google
results, but it hasn't been aggregated yet.

The question I have is whether the growth in the amount of
unaggregated information will always outstrip the ability of various
processes (including the growth of Wikipedia and of other
information-organising sites) to aggregate it into something more
useful. If the long-term answer is yes, then information overload is
inevitable (and search engines will gradually start to suck again). If
the long-term answer is no, then at some point online aggregation (or
the co-ordination of data to form information in the real sense) will
start to overtake the flow of information from offline to online, and
order will continue to emerge from the (relative) chaos.
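
To make that question concrete, here is a toy model of my own (the
starting quantities and yearly growth rates are made-up assumptions,
purely for illustration): treat new online information and aggregation
capacity as two quantities growing at different rates, and watch
whether the unaggregated backlog diverges or shrinks.

def final_backlog(inflow_rate, aggregation_rate, years=20):
    # Arbitrary units; only the relative growth rates matter.
    new_info, capacity, backlog = 100.0, 50.0, 0.0
    for _ in range(years):
        backlog += new_info          # this year's unaggregated inflow
        backlog -= min(backlog, capacity)  # aggregation eats the backlog
        new_info *= inflow_rate      # both quantities compound yearly
        capacity *= aggregation_rate
    return backlog

print(final_backlog(1.30, 1.10))  # inflow wins: backlog runs away
print(final_backlog(1.10, 1.30))  # aggregation wins: backlog clears to 0

With inflow growing faster, the backlog diverges and search gradually
starts to suck again; with aggregation growing faster, the backlog
eventually clears, which is the order-emerging-from-chaos outcome.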

The key seems to be the quality of the information put online.
Well-organised and searchable sites and databases are good. Poorly
organised information sources are less so: while they can in theory be
found by search engines, they are harder to distinguish from the
background noise. Much also depends on how much information you start
with when carrying out a search for more.

To take a specific example, I very occasionally come across names of
people or topics where it is next-to-impossible to find out anything
meaningful about them because the name is identical to someone else's.
Sometimes it is a company that has named itself after something
well-known, so that any search is swamped by hits for the famous
namesake. Other times, it is someone more famous swamping a relatively
obscure person - a recent example I found is the physicist Otto
Klemperer. Despite knowing both the name and the profession, it is
remarkably difficult to find information about the physicist as
opposed to the conductor. If I had a birth year, it would be much
easier, of course.
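
The usual workaround in that kind of case is to hand-craft the query
with search operators. A small sketch of the shape of such a query
(the helper function is hypothetical, just for illustration; the
quoting and minus operators themselves are standard search-engine
syntax):

def disambiguating_query(name, require=(), exclude=()):
    # Quoted phrases force exact matches; a leading minus excludes a
    # term associated with the famous namesake.
    parts = [f'"{name}"']
    parts += [f'"{term}"' for term in require]
    parts += [f'-{term}' for term in exclude]
    return " ".join(parts)

print(disambiguating_query("Otto Klemperer",
                           require=["physicist"],
                           exclude=["conductor"]))
# => "Otto Klemperer" "physicist" -conductor

A birth year, when you have one, slots in as just another required
phrase - which is exactly why having one makes these searches so much
easier.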

Carcharoth


