So, it seems (if I interpret Jimbo's mail on wikitech and the discussion
here correctly) that most of us would like *some kind* of category
scheme in wikipedia. I do, too! But we seem to differ on the details.
So far, I have seen three concepts:
1. Simple categories like "Person", "Event", etc.; about a dozen total.
2. Categories and subcategories, like
"Science/Biology/Biochemistry/Proteomics", which can be "scaled down" to
#1 as well ("Humankind/Person" or something)
3. Complex object structures with machine-readable meta-knowledge
encoded into the articles, which would allow for quite complex
queries/summaries, like "biologists born after 1860".
Advantages:
1. Easy to edit (the wiki way!)
2. Still easy to edit, but making wikipedia browseable by category,
fine-tune Recent Changes, etc.
3. Strong improvement in search functions, meta-knowledge available for
Disadvantages:
1. Not much of a help...
2. We'd need to agree on a category scheme, and maintenance might get a
3. Quite complex to edit (e.g., "<category type='person'
occupation='biologist' birth_month='5' birth_day='24' birth_year='1874'
For a wikipedia I'd have to write myself, I'd choose #3, but with
respect to the wiki way, #2 seems more likely to achieve consensus (if
there is such a thing;-)
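
To make the difference between #2 and #3 concrete, here is a minimal
Python sketch. It is not taken from any Wikipedia code: the Article
class, the two helper functions, the attribute names, and the sample
titles are all hypothetical, chosen only to mirror the examples above.
Option #2 is modelled as slash-separated category paths, option #3 as
free-form attributes attached to an article.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Article:
    title: str
    # Option 2: slash-separated category paths (hypothetical representation)
    paths: List[str] = field(default_factory=list)
    # Option 3: machine-readable attributes (hypothetical attribute names)
    meta: Dict[str, object] = field(default_factory=dict)


def in_category(article: Article, prefix: str) -> bool:
    """True if any path equals `prefix` or lies below it in the hierarchy."""
    return any(p == prefix or p.startswith(prefix + "/") for p in article.paths)


def biologists_born_after(articles: List[Article], year: int) -> List[str]:
    """Option 3 style query: titles of biologists with birth_year > year."""
    return [
        a.title
        for a in articles
        if a.meta.get("type") == "person"
        and a.meta.get("occupation") == "biologist"
        and isinstance(a.meta.get("birth_year"), int)
        and a.meta["birth_year"] > year
    ]


# Made-up sample data, for illustration only
articles = [
    Article("Person A", ["Humankind/Person"],
            {"type": "person", "occupation": "biologist", "birth_year": 1874}),
    Article("Person B", ["Humankind/Person"],
            {"type": "person", "occupation": "biologist", "birth_year": 1850}),
    Article("Proteomics", ["Science/Biology/Biochemistry/Proteomics"]),
]

print(biologists_born_after(articles, 1860))
print([a.title for a in articles if in_category(a, "Science/Biology")])
```

The path lookup only needs a string per article, which is why #2 stays
easy to edit; the attribute query needs every person article to carry
consistent fields, which is the editing burden mentioned for #3.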
[Could people please make unofficial translations of this for posting
on *.wikipedia.org wikipedias?]
I'm pleased to announce the existence of the Wikimedia Foundation,
Inc., a nonprofit corporation organized under the laws of Florida,
United States. I am transferring to this new corporation the assets
1. All Nupedia.com/net/org/etc. domain names
2. All Wikipedia.com/net/org/etc. domain names
3. All copyrights in software or articles that were previously owned by
Bomis, Inc. and already placed under a copyleft license. (This
includes work-for-hire by Jason, Tim, Larry, Toan, Liz, and myself, as
well as any other Bomis employees who may have worked on these
projects as a part of their job, but doesn't include any work by those
parties conducted on their own time or while not an employee of
(All that stuff was already under GNU GPL or GNU FDL, so the
contribution of copyrights is basically a formality. Even so, we want
to set a good example.)
4. Additionally, I am contributing all of my personal copyrights to work
already released under GNU GPL or GNU FDL in Wikipedia and Nupedia to
For the time being, the two machines on which Wikipedia runs will
continue to be owned by Bomis, but my intention is to donate those if
the tax implications make sense. I have to consult with an accountant
on that, first.
I do NOT encourage you to make donations to the Wikimedia Foundation
just yet! I am still working on tax exempt status with the IRS, and I
have not yet set up a bank account for the foundation anyway. Those
things will take a couple more weeks.
On my TODO list here are:
1. Complete the IRS process for tax-exempt status
2. Create forms for the transfer of copyrights to the foundation,
if anyone wants to do that. See:
for reasons why this might be a Good Thing.
3. Set up a bank account for the foundation
4. Set up a merchant account for the foundation to make credit-card
donations an easy option
I'll update everyone in a couple of weeks with the status.
On 6/30/03 5:05 PM, "Erik Moeller" <erik_moeller(a)gmx.de> wrote:
> maybe that's a silly question, but why don't we define a policy on
> interpage-anchor-linking? There may be a few instances in which it is
> useful (FAQ? link to "External links" section, which is always titled the
> same anyway?) without any drawbacks.
This whole discussion REALLY should be taking place on the wikipedia-l list.
> Tomasz Wegrzanowski wrote:
> It's wrong to say that most Europeans believe in god or that they don't -
> most don't give a shit either way. They managed to see the real question
> and answered it "no" - they aren't going to give church any powers.
> Having done so, whether they think "god exists" or not is as important as
> believing in fluffy elephants bringing luck or in existence of Neptune -
> it doesn't affect them, so they may answer like others do, or the way
> they feel any given day.
Come on now, guys. Take it to Usenet. :)
Wikipedia: The Free Encyclopedia
>>The strongest adherent of the
>>scientific method admits and accepts that no scientific hypothesis is
>>absolutely true. He does not feel threatened by the emergence of some
>>bizarre theory, and is probably more effective in his refutations by
>>allowing for the possibility of a new and perhaps unlikely hypothesis.
> In defence of the poor scientist-in-real-life,
> he (or she) may not feel /threatened/ by bizarre theories,
> but he often feels /exhausted/ from constantly refuting them,
> and even /annoyed/ when called upon to refute (IHO) really stupid ones.
> But I think that our NPOV method can still deal with this.
> In the cube example, we don't allow anything with no supporters
> (that falls under Wikipedia's ban on original research),
> and we don't feel the need to refute things with no arguments
> (it's enough to state the fact that the position is a fringe one).
> Then once the arguments for the cubical Earth are presented,
> we only have to lay out the counter-arguments once, there in the article.
> If the counter-counter-arguments etc get to be too long,
> then we simply spin things off into [[Cubical Earth]].
> People that don't want to deal with this inane crackpot nonsense
> can rightly point its adherents to that article.
I don't believe in NPOV either. Never did and never will. Whatever
happens, Wikipedia will get the POV of the masses, a conglomerate of all
authors working on the project. My task is therefore to add my POV on
things to increase the sum of all recorded human knowledge that
wikipedia is (coming to be). Over time most of the articles converge to
something that becomes the POV of the masses - aka NPOV. Some won't, like
the articles on drugs, because there seem to be a lot more potheads that
write those than read them.
Like the cubic earth example, writing "the earth is spherical" wouldn't
be correct because 0.001% know it is cubic. Even worse to say that the
holocaust has happened because 0.1% know it hasn't. Or saying that God's
existence hasn't been proven because 10% know it has.
I have just checked out the phase3 CVS repository.
Now I've fixed my first bug. How do I get write access so I can commit
it? Or does somebody want to look over it first?
What I'm wanting to commit fixes the bug I reported at
&group_id=34373&atid=411195. (Sorry I accidentally reported it as a
feature request, but I couldn't find a way of deleting it, nor would the
bug tracker let me submit the same thing again as a bug report.)
(At least, I hope it is a major new thread!)
Well, we've argued for a few days, and a lot of ideas have been thrown
around, and the tension between various competing principles and/or
ideals has been explored fairly effectively.
So I wonder if we could work towards some consensus that we can all
(or nearly all) agree on. What steps might we take that all parties
to the discussion might agree on?
Can everyone chip into this thread with accommodating ideas that you
think everyone might agree with?
By way of example, in the "fair use" debate, the one thing that all
parties could agree on is the importance of prioritizing the _tagging_
of images with their status. Whatever might end up being done, even
if -nothing much- is the answer, that tagging is something that no one
has objected to.
So what might we do in the content metadata arena that no one (or
nearly no one) would object to?
I think that there is broad support for a categorization system which
is broad and flexible, and not necessarily aimed at content
advisories, but which might, in part, be used to address that issue as
What I propose -- and I'm talking at the level of policy, not
at the level of technical implementation, because there's a lot more
to be thought about and said in that regard -- is that we move towards
the implementation of a content metadata or categorization system with
the following features:
1. Categories should be non-normative in nature. That is, "mature
content" is an invalid category, because it suggests a value judgment
that we want to leave to the end user. "Explicit sexual content",
while still perhaps vague in some respects, is at least
non-normative... it might be a good thing or a bad thing, depending on
2. Categories should be infinitely wiki-editable, especially in the
beginning, because "a priori" categorization is impossible and likely
to lead to a lot of problems. Ideas that only sysops can create new
categories should be avoided until we actually determine empirically
that it's necessary. (One thing we know from our wiki experience is
that something as obviously insane as letting anyone in the world edit
works amazingly well!)
3. We should quite possibly accept _as an editing principle_, a sort
of Ockham's razor for categories -- not multiplying them needlessly,
while at the same time, not hesitating to let people experiment with
categories as they see fit. But, we'll try not to fight about it too
much, especially at first.
4. The categorization system should be simple -- i.e., articles can
be tagged with as many categories as we like, and that's that. They
are not required, and if people want to work on them, they can, and if
they don't want to work on them, they don't have to do so.
5. Especially initially, website impacts of categories should be very
minimal... i.e. we don't try anything radical regarding filtered
searches or automatic index pages or anything too exciting like that.
If we did that, we'd be introducing nothing harmful, and doing something
positive for *multiple* purposes, not just the "content advisory" purpose.
And we'd get to find out, in an experimental environment, whether
categorization prompts massive flamewars, etc.
And then in a while, we can revisit the issue, and see what we think
could be done with these categories.
p.s. Examples of useful categories that are non-normative:
graphic sexual content
advanced mathematical content
gay and lesbian studies
All of these might be useful for content re-users who might like to
extract a subset of our data. For example, an "Encyclopedia of Sexual
Practices" might want to extract just those articles flagged with
'graphic sexual content' or 'sexual content'. An "edupedia" project
could automatically extract articles that avoid certain topics, or
focus on certain other topics.
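
As a rough illustration of how a re-user might pull out such a subset,
here is a minimal Python sketch. It assumes nothing about how tags
would really be stored: the extract function, the dictionary layout,
and the article titles are hypothetical. Articles carry a flat set of
tags, and an article with no tags at all is perfectly valid, in line
with point 4 above.

```python
from typing import Dict, Iterable, List, Set


def extract(articles: Dict[str, Set[str]],
            include: Iterable[str] = (),
            exclude: Iterable[str] = ()) -> List[str]:
    """Return sorted titles that carry at least one `include` tag
    (when any are given) and none of the `exclude` tags."""
    include_set, exclude_set = set(include), set(exclude)
    selected = []
    for title, tags in articles.items():
        if include_set and not (tags & include_set):
            continue  # caller asked for specific tags and this article has none
        if tags & exclude_set:
            continue  # article carries a tag the caller wants to avoid
        selected.append(title)
    return sorted(selected)


# Made-up sample data; the tag names are the examples from the list above
articles = {
    "Article A": {"graphic sexual content"},
    "Article B": {"advanced mathematical content"},
    "Article C": set(),  # untagged: tagging is optional
    "Article D": {"sexual content", "gay and lesbian studies"},
}

sexual = {"graphic sexual content", "sexual content"}
print(extract(articles, include=sexual))  # subset for a topical encyclopedia
print(extract(articles, exclude=sexual))  # subset that avoids those topics
```

Note that the filtering decision sits entirely with the re-user: the
tags themselves only describe content, which is the point of keeping
them non-normative.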
----- End forwarded message -----
As I said on wikitech-l, I implemented the Erik-style categories at the test site.
I recommend starting at http://test.wikipedia.org/wiki/Anatomy for test purposes.
> ABOUT USAGE
> I expect
> * lots of arguments about category schemes
> * pragmatic agreement on the most obvious categories
> * much useful entry of data according to various schemes
> * eventual and wonderful automatic indexes, Bayesian
> auto-classification, etc. etc.
Automatic functions will be *way* less efficient (e.g., slow) in the Erik scheme.
> If we do this right, we can probably _eventually_ get rid of most of the
> list articles entirely: ''but'' there is one fly in the ointment with
> doing things this way: we lose the ability to create links in category
> lists to non-existing articles. This would be a great pity, as it is one
> of the most useful ways to create lists of new candidates for articles
> within a topic.
I vaguely remember a page with requested topics (*not* Most Wanted)...
P.S.: As already said on wikitech-l, I'll copy this implementation into the CVS after a few days, *unless* I am told about basic problems with it.
Here is the current draft of the letter I will send to HMSO. I plan
to send it on 3 July (the mail isn't picked up the following day ;->).
Please reply with your comments and suggestions before that date.
Controller and Queen's Printer
Dear Ms. Tullo:
I am an administrator of the Wikipedia, a multilingual project to
create a complete and accurate open content encyclopedia. The
English-language version can be seen on the Web at
http://www.wikipedia.org/. We gather information from many sources,
and government Web sites are often very useful to us. We have
reviewed the terms of the Crown copyright, but are still unsure
whether we can use that material.
We understand that we can use your material "free of charge in any
format or medium provided it is reproduced accurately and not used in
a misleading context [and] the source of the material [is] identified
and the copyright status acknowledged." Our question centers on the
interaction between the Crown copyright and our own. We maintain
copyright over the material we create, but license its use under the
GNU Free Documentation License, which was designed by the Free
Software Foundation (FSF) for free content. The license text can be
found at http://www.gnu.org/copyleft/fdl.html. It stipulates that any
copy of the material, even if modified, must carry the same license.
Wikipedia is the largest documentation project to use this license.
Thus, under the terms of the GFDL, we cannot pass the Crown copyright
restrictions on to third-generation re-copiers. For example, I might
copy material from the Royal Navy's site into an article about HMS
SCEPTRE, doing so accurately, honestly, and with attribution, but a
third party who copied the article onto his Web site, removed the
attribution, and somehow altered the information to be deceptive would
not be in violation of our license.
As participants in a unique and highly visible project who are freely
releasing our work to the public, we are punctilious about respecting
copyright. We would be very grateful if you could provide us with a
statement of HMSO's opinion on our ability to copy material from HM
Government Web sites and relicense it under the GFDL.
I can be contacted by e-mail at sean(a)epoptic.com, by telephone at
01-310-739-xxxx (I am in time zone UTC-7 -- please call in your late
afternoon), and by post at Sean Barrett, xxxx xxxxxx Avenue, Los
Angeles CA 90xxx, USA. Thank you very much for your time and
> Date: Thu, 26 Jun 2003 21:29:53 -0700
> From: Sean Barrett <sean(a)epoptic.org>
> Thus, under the terms of the GFDL, we cannot pass the Crown copyright
> restrictions on to third-generation re-copiers. For example, I might
> copy material from the Royal Navy's site into an article about HMS
> SCEPTRE, doing so accurately, honestly, and with attribution, but a
> third party who copied the article onto his Web site, removed the
> attribution, and somehow altered the information to be deceptive would
> not be in violation of our license.
I recommend using a more positive example of breaking Crown
Copyright, such as creating derivative works. The deceptive editing
example is likely the reason why they don't allow modifications in
the first place. Also, you might want to mention that the GFDL has
no restrictions against commercial use.
- Stephen Gilbert