[Foundation-l] RFC: A Wikipedia/etc.-like Web Directory (e.g: dmoz.org, the old dir.yahoo.com , etc)
Shlomi Fish
shlomif at iglu.org.il
Tue Oct 27 19:04:51 UTC 2009
Hi all!
This is a request-for-comments (RFC) about an idea that had surfaced on
#wikipedia at the time about creating an open web directory similar to
http://www.dmoz.org/ only world-editable and with a more convenient interface.
This was prompted by my being referred to the "Wikipedia is not a web
directory" section of:
http://en.wikipedia.org/wiki/Wikipedia:What_Wikipedia_is_not
History of Web Directories:
---------------------------
I'm not sure how many of the younger folks here are very familiar with the
history and motivation behind web directories, so I'll explain a little to the
best of my knowledge.
Back when the Internet and the World Wide Web started to become popular,
search engines were much less accurate than Google or the search engines that
now compete with it using similar algorithms. As a result, it was often hard
to find things on the Internet using Lycos or other search engines.
Consequently, people actively used web directories, and especially yahoo.com
(which started as a directory hand-maintained by two Stanford students, and
grew into a successful Internet company), as a way to find resources that were
considered high-quality by human editors.
Yahoo and similar directories organised the content in a tree of categories,
with some extras like "symbolic links", etc. Part of the problem with Yahoo was
that it was closed to outside edits and could be changed only by its own human
editors, which caused it to quickly grow out-of-date. As a result, it was
eventually surpassed in comprehensiveness and accuracy by dmoz.org:
http://en.wikipedia.org/wiki/Open_Directory_Project
dmoz.org gained some notoriety after Google periodically mirrored it as the
Google directory (with some enhancements like sort-by-page-rank, a faster
load time, and a better search). Eventually, Google removed it from their
front page and search results in favour of Froogle and other things which were
in my (possibly non-representative) opinion much less useful than their
Directory, and dmoz.org sank into much greater obscurity. Soon afterwards, the
English wikipedia and other wikimedia projects started gaining a lot of
momentum, popularity and page rank, which caused them to rank high in many
search engine results (although in defence of Google and other search
engines, one should note that they do seem to have a diversification
algorithm, which keeps the search results from being dominated by a single
source - whether wikipedia.org or whatever).
Why a Web Directory:
--------------------
While I enjoy the English wikipedia a lot (and have contributed to it - see:
http://en.wikipedia.org/wiki/User:Shlomif ), I still think that web
directories have (or possibly, and unfortunately, "had") their advantages and
appeal. The primary reason is that they list any site of interest, including
many that would not be considered "notable" enough for inclusion under the
relevant "External Links" in the Wikipedia, but may still prove of interest.
They also serve a similar purpose to the wikipedias' category pseudo-trees:
allowing one to find similar articles of interest.
A lot of techno-geeks are now saying "Category trees are dead! Tags are the
future". It is true that traditionally the filesystems of popular operating
systems such as UNIX (e.g: Linux, Mac OS X, etc.), DOS/Windows, etc. are
organised as a directory tree and not by tags, which inspired a lot of
Internet stuff to be similar (as the protocols mirrored the semantics of the
UNIX file system). However, there are many good reasons (besides
ease-of-implementation) why they are organised in a hierarchy instead of in
free-form tags. (See the feeds organised by tags in Google Reader, or the huge
tag-based bookmarks menu of the Flock browser, for how tags sometimes fail.)
Not to mention that, like in wikipedia, a certain resource can be placed in
more than one category: Isaac Newton
( http://en.wikipedia.org/wiki/Isaac_Newton ) belongs to "17th-century English
people", "Fellows of the Royal Society", "English alchemists", etc.
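To make the last point concrete, here is a minimal sketch (in Python) of a
category tree in which a single resource can still be listed under several
categories. It is only an illustration under my own assumptions: the
Directory class, its method names and the parent categories "People" and
"English people" are hypothetical, and are not taken from dmoz.org, Yahoo or
MediaWiki.

```python
from collections import defaultdict


class Directory:
    """Toy category tree; a resource may be listed in several categories."""

    def __init__(self):
        self.parent = {}                 # category -> parent category (None for a root)
        self.members = defaultdict(set)  # category -> URLs listed directly in it

    def add_category(self, name, parent=None):
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent category: {parent}")
        self.parent[name] = parent

    def add_resource(self, url, *categories):
        # The same URL can be filed under any number of existing categories.
        for category in categories:
            if category not in self.parent:
                raise KeyError(f"unknown category: {category}")
            self.members[category].add(url)

    def path(self, category):
        """Return the categories from the root down to `category`."""
        chain = []
        while category is not None:
            chain.append(category)
            category = self.parent[category]
        return list(reversed(chain))

    def categories_of(self, url):
        """Return, sorted by name, every category that lists `url` directly."""
        return sorted(c for c, urls in self.members.items() if url in urls)


# Hypothetical example data, loosely following the Isaac Newton case above.
directory = Directory()
directory.add_category("People")
directory.add_category("English people", parent="People")
directory.add_category("17th-century English people", parent="English people")
directory.add_category("Fellows of the Royal Society", parent="People")

newton = "http://en.wikipedia.org/wiki/Isaac_Newton"
directory.add_resource(
    newton, "17th-century English people", "Fellows of the Royal Society"
)

print(directory.path("17th-century English people"))
print(directory.categories_of(newton))
```

Each category keeps exactly one parent, so browsing stays hierarchical, while
the per-category membership sets give the "more than one category" behaviour
that is usually credited to tags.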
So the idea of a web directory still appeals to me.
The problems with ODP/dmoz.org:
-------------------------------
As someone who used to be a dmoz.org editor, I found two main problems with
it:
1. Too much red-tape: an editor could only edit the categories he was given
permissions for, and not anything above. There were some meta-editors who could
edit anything and could also grant permissions for more categories, which took
time, but I still think that the best thing would be a wikipedia-like
"everyone can edit everything unless explicitly forbidden" policy.
Another thing I didn't like about this red-tape and authority was an incident
in which I edited the Perl "FAQs, Tutorials and Helps" category and added a
sub-category of "Tutorials" where I placed some stuff. Then, when an editor
reviewed my work after I asked for another category, they didn't like the fact
that one of the texts for the mission statement only reflected my thoughts,
and so deleted the category and moved everything I wrote there to the parent
category. This naturally was a destructive change that left me frustrated, as
I would have been happy to change the mission statement or guidelines of the
category after the fact.
2. The UI was lacking: there were many forms required to review, submit and/or
edit a single link, the editing server was kinda slow, there was very little
AJAX, and editing in general was much less convenient than the wikipedia edit
link which gives a gigantic textarea with a convenient and concise syntax.
-------------
For a long time I felt guilty about not dedicating enough time to edit
dmoz.org, and had reminders to edit it occasionally (which I tended to ignore)
but eventually passively stopped editing. I now realise I could not be blamed
for my lack of enthusiasm.
Note that I still feel that dmoz.org is a useful and often fun resource. As
great as the Wikipedia is, I still think there's a place for a high-profile
web-directory. Maybe this is one of the trends that will become retro, much as
push technology, which was considered a fad, was re-incarnated as RSS/Atom
feeds, which seem to have gained a lot of popularity and have even proved to
have some business potential.
The Challenges of a more open / more free web directory:
--------------------------------------------------------
I'm not sure that a wikimedia-sponsored web directory is a good idea yet. But
here are some thoughts about the challenges:
1. The three S's: Spam, spam, spam. A web directory is likely to be a huge
spam target and will need good anti-spam controls. However, I personally think
that while spam should be a factor we take into consideration, it should not
prevent us from creating new and exciting user-contributed web sites.
One of the reasons I hate spam is not so much that I am bothered by it
arriving in my inbox, but rather because it makes some people paranoid. My
personal web-site contains an <a href="mailto:shlomif at iglu.org.il"
rel="webmaster">shlomif at iglu.org.il</a> E-mail at the bottom of each page, but
lately most sites I have visited either had it obscured in various ways, or
even just had a contact form. Some people have even told me that I should hide
my E-mail address to reduce the amount of spam I receive because "prevention
is better than the cure".
I'm sorry, but I'd rather not destroy paradise just so I can save it. I'd
rather see some spam on blogs and in E-mail than destroy their
usability/accessibility, and by corollary think that fear of spam should not
be the main obstacle in the way of a more open web-directory.
2. We may wish to build upon the existing data of the ODP which is syndicated
as machine-readable data under this licence:
http://en.wikipedia.org/wiki/Open_Directory_License
About this licence, that page says:
<<<<<<<<
The Free Software Foundation describes the ODL as a non-free license, citing
the right to redistribute a given version not being permanent, and the
requirement to check for changes to the license.
>>>>>>>>
Whether something is indeed free/open or not is a matter of much debate, as I
mention here in a somewhat different context:
http://www.shlomifish.org/philosophy/computers/open-source/foss-licences-wars/
It is debatable whether the FSF's interpretation of the freeness of the
licence is correct here, and whether it matters much in this case (as RMS
himself was quoted as saying that commercial games can have "non-free" art and
plots as long as their engines are free, and that this was OK ethically and
morally). Still, it may prove to be a problem if we want to gain some public
acceptance for the directory.
3. Shouldn't we try to convince dmoz.org to remedy the two problems I've
mentioned, rather than starting our own competing and diverging effort?
-----------------------
Like I said earlier, I'm still very sceptical about whether this idea will
work and be a good one. At the moment, I'm unemployed by choice, but still
have many other endeavours and different priorities and so cannot commit to
dedicating a lot of time to this wiki-directory. I'm already active in the
English wikipedia and the English wiktionary, used to edit the English
wikiquote and would like to work on it again, and naturally have my own
web-sites and blogs (not really wikis, though I have comments there), which
often take greater precedence and interest. So my expectation is that if such
an effort is started, it will need to grow organically, in a way similar to how
wikinews, wikibooks or some of the popular topical Wikia wikis have gained
public acceptance.
Regards,
Shlomi Fish
--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
"Star Trek: We, the Living Dead" - http://shlom.in/st-wtld
Chuck Norris read the entire English Wikipedia in 24 hours. Twice.