[Foundation-l] RFC: A Wikipedia/etc.-like Web Directory (e.g: dmoz.org, the old dir.yahoo.com , etc)

Shlomi Fish shlomif at iglu.org.il
Tue Oct 27 19:04:51 UTC 2009


Hi all!

This is a request-for-comments (RFC) about an idea that surfaced on 
#wikipedia: creating an open web directory similar to http://www.dmoz.org/ , 
only world-editable and with a more convenient interface. The discussion 
started after I was referred to the "Wikipedia is not a web directory" section 
of:

http://en.wikipedia.org/wiki/Wikipedia:What_Wikipedia_is_not

History of Web Directories:
---------------------------

I'm not sure how many of the younger folks here are very familiar with the 
history and motivation behind web directories, so I'll explain a little to the 
best of my knowledge. 

Back when the Internet and the World Wide Web started to become popular, 
search engines were much less accurate than Google or the search engines that 
now compete with it using similar algorithms. As a result, it was often hard 
to find things on the Internet using Lycos or the other search engines of the 
time, so people actively used web directories, especially yahoo.com (which 
started as a directory hand-maintained by two Stanford students, and grew into 
a successful Internet company), as a way to find resources that human editors 
considered high-quality.

Yahoo and similar directories organised the content in a tree of categories, 
with some extras like "symbolic links", etc. Part of the problem with Yahoo 
was that it was closed: it could be edited only by its own human editors, 
which caused it to quickly grow out-of-date. As a result, it was eventually 
surpassed in comprehensiveness and accuracy by dmoz.org: 

http://en.wikipedia.org/wiki/Open_Directory_Project

dmoz.org gained some prominence after Google started mirroring it periodically 
as the Google Directory (with some enhancements like sort-by-page-rank, a 
faster load time and a better search). Eventually, Google removed it from 
their front page and search results in favour of Froogle and other services 
which were, in my (possibly non-representative) opinion, much less useful than 
their Directory, and dmoz.org sank into much greater obscurity. Soon 
afterwards, the English Wikipedia and other Wikimedia projects started gaining 
a lot of momentum, popularity and page rank, which caused them to rank high in 
many search engine results (although in defence of Google and other search 
engines, one should note that they do seem to have a diversification 
algorithm, which keeps the search results from being dominated by a single 
source - whether wikipedia.org or whatever).

Why a Web Directory:
--------------------

While I enjoy the English Wikipedia a lot (and have contributed to it - see:
http://en.wikipedia.org/wiki/User:Shlomif ), I still think that web 
directories have (or possibly, and unfortunately, "used to have") their 
advantages and appeal. The primary reason is that they list any site of 
interest, including many that would not be considered "notable" enough for 
inclusion under the relevant "External Links" in Wikipedia, but may still 
prove of interest. They also serve a purpose similar to that of the 
Wikipedias' category pseudo-trees: helping one find similar articles of 
interest.

A lot of techno-geeks are now saying "Category trees are dead! Tags are the 
future". It is true that traditionally the filesystems of popular operating 
systems such as UNIX (e.g: Linux, Mac OS X, etc.), DOS/Windows, etc. are 
organised as a directory tree and not by tags, which inspired a lot of 
Internet-stuff to be similar (as the protocols mirrored the semantics of the 
UNIX file system). However, there are many good reasons (besides 
ease-of-implementation) why they are organised in a hierarchy instead of in 
free-form tags. (You can look at the Google Reader feeds-organised-in-tags or 
the huge tag-based bookmarks menu of the Flock browser to see why tags 
sometimes fail.) Not to mention that, like in Wikipedia, a given resource can 
be placed in more than one category: for example, Isaac Newton 
( http://en.wikipedia.org/wiki/Isaac_Newton ) belongs to "17th-century English 
people" , "Fellows of the Royal Society" , "English alchemists", etc.
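
To make the "tree of categories, plus symbolic links, plus resources listed 
in more than one category" idea concrete, here is a minimal Python sketch. The 
Category class, its field names, the category paths and the example.org URL 
are all hypothetical; none of this is taken from how dmoz.org or Yahoo 
actually stored their data:

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node of a directory tree (hypothetical structure)."""
    name: str
    children: dict = field(default_factory=dict)  # name -> child Category
    symlinks: dict = field(default_factory=dict)  # alias -> Category elsewhere
    links: list = field(default_factory=list)     # URLs listed in this category

    def subcategory(self, name):
        """Return the named child category, creating it if needed."""
        if name not in self.children:
            self.children[name] = Category(name)
        return self.children[name]

    def resolve(self, path):
        """Walk a '/'-separated path, following symlinks like real children."""
        node = self
        for part in path.split("/"):
            if part in node.children:
                node = node.children[part]
            else:
                node = node.symlinks[part]
        return node


def categories_of(node, url, prefix=""):
    """List the paths of all categories that contain the given URL.

    Only real children are visited (not symlinks), so every category is
    reported once and symlink cycles cannot cause infinite recursion.
    """
    path = f"{prefix}/{node.name}" if prefix else node.name
    found = [path] if url in node.links else []
    for child in node.children.values():
        found.extend(categories_of(child, url, path))
    return found


root = Category("Top")

# A real sub-tree, plus a symbolic link to it from another branch.
perl = root.subcategory("Computers").subcategory("Programming").subcategory("Perl")
tutorials = perl.subcategory("Tutorials")
tutorials.links.append("http://example.org/perl-tutorial")
root.subcategory("Reference").symlinks["Perl"] = perl

# One resource listed under two categories.
newton = "http://en.wikipedia.org/wiki/Isaac_Newton"
people = root.subcategory("People")
people.subcategory("Fellows of the Royal Society").links.append(newton)
people.subcategory("English alchemists").links.append(newton)

print(categories_of(root, newton))
print(root.resolve("Reference/Perl/Tutorials") is tutorials)
```

The point of the sketch is that a hierarchy does not force a resource to 
live in exactly one place: the same URL appears under two categories, and the 
"Reference/Perl" symlink reaches the same node as the real path.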

So the idea of a web directory still appeals to me.

The problems with ODP/dmoz.org:
-------------------------------

As someone who used to be a dmoz.org editor, I found two main problems with 
it:

1. Too much red-tape: an editor could only edit the categories he was given 
permissions for, and not anything above them. There were some meta-editors who 
could edit anything and could also grant permissions for more categories 
(which took time), but I still think that the best thing would be a 
Wikipedia-like "everyone can edit everything unless explicitly forbidden" 
policy.

Another thing I didn't like about this red-tape and authority was an incident 
in which I edited the Perl "FAQs, Tutorials and Helps" category and added a 
sub-category of "Tutorials" where I placed some links. Later, when an editor 
reviewed my work because I had asked for another category, they didn't like 
the fact that one of the texts for the mission statement only reflected my 
thoughts, and so deleted the category and moved everything I wrote there to 
the parent category. This naturally was a destructive change that frustrated 
me, as I would have been happy to change the mission statement or guidelines 
of the category after the fact.

2. The UI was lacking: there were many forms required to review, submit and/or 
edit a single link, the editing server was kinda slow, there was very little 
AJAX, and editing in general was much less convenient than the wikipedia edit 
link which gives a gigantic textarea with a convenient and concise syntax. 

-------------

For a long time I felt guilty about not dedicating enough time to edit 
dmoz.org, and had reminders to edit it occasionally (which I tended to ignore) 
but eventually passively stopped editing. I now realise I could not be blamed 
for my lack of enthusiasm.

Note that I still feel that dmoz.org is a useful resource which is often fun 
to browse. As great as Wikipedia is, I still think there's a place for a 
high-profile web-directory. Maybe this is one of the trends that will become 
retro, the way push technology, which was considered a fad, was re-incarnated 
as RSS/Atom feeds, which seem to have gained a lot of popularity and have even 
proved to have some business potential. 

The Challenges of a more open / more free web directory:
--------------------------------------------------------

I'm not sure that a wikimedia-sponsored web directory is a good idea yet. But 
here are some thoughts about the challenges:

1. The three S's: Spam, spam, spam. A web directory is likely to be a huge 
spam target and will need good anti-spam controls. However, I personally think 
that while spam should be a factor we take into consideration, it should not 
prevent us from creating new and exciting user-contributed web sites.

One of the reasons I hate spam is not so much that I am bothered by it 
arriving in my inbox, but rather because it makes some people paranoid. My 
personal web-site contains an <a href="mailto:shlomif at iglu.org.il" 
rel="webmaster">shlomif at iglu.org.il</a> E-mail at the bottom of each page, but 
lately most sites I have visited either had it obscured in various ways, or 
just had a contact form instead. Some people have even told me that I should 
hide my E-mail address to reduce the amount of spam I receive because 
"prevention is better than the cure".

I'm sorry, but I'd rather not destroy paradise just so I can save it. I'd 
rather see some spam on blogs and in E-mail than destroy their 
usability/accessibility, and by corollary I think that fear of spam should not 
be the main obstacle in the way of a more open web-directory.

2. We may wish to build upon the existing data of the ODP which is syndicated 
as machine-readable data under this licence:

http://en.wikipedia.org/wiki/Open_Directory_License

According to that page:

<<<<<<<<
The Free Software Foundation describes the ODL as a non-free license, citing 
the right to redistribute a given version not being permanent, and the 
requirement to check for changes to the license.
>>>>>>>>

Whether something is indeed free/open or not is a matter of much debate, as I 
mention here in a somewhat different context:

http://www.shlomifish.org/philosophy/computers/open-source/foss-licences-wars/

It is debatable whether the FSF's interpretation of the freeness of the 
licence is correct here, and whether it matters much in this case (RMS himself 
was quoted as saying that it is ethically and morally OK for commercial games 
to have "non-free" art and plots as long as their engines are free). Still, it 
may prove to be a problem if we want to gain some public acceptance for the 
directory.

3. Shouldn't we try to convince dmoz.org to remedy the two problems I've 
mentioned, rather than starting our own competing and diverging effort?

-----------------------

Like I said earlier, I'm still very sceptical about whether this idea will 
work and be a good one. At the moment, I'm unemployed by choice, but still 
have many other endeavours and different priorities and so cannot commit to 
dedicating a lot of time to this wiki-directory. I'm already active in the 
English Wikipedia and the English Wiktionary, used to edit the English 
Wikiquote and would like to work on it again, and naturally have my own 
web-sites and blogs (not really wikis, though I have comments there), which 
often take greater precedence and interest. So my expectation is that if such 
an effort is started, it will need to grow organically, in a way similar to 
how Wikinews, Wikibooks or some of the popular topical Wikia wikis have gained 
public acceptance.

Regards,

	Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
"Star Trek: We, the Living Dead" - http://shlom.in/st-wtld

Chuck Norris read the entire English Wikipedia in 24 hours. Twice.


