Most of you would be aware of some of the discussions that have occurred
around Wikipedia in the Norwegian languages. Since the last round of
discussions on this list, there has been a lot of internal debate, as
well as what seems to be a fairly widely accepted agreement following
After a brief recap of Norwegian language and Wikipedia issues, this
e-mail takes those interested through the latest developments and
stakes out the road ahead. It is also intended to inform the
international community about the current agreement on no.wikipedia, so
as to prevent misunderstandings in the future.
Finally, we will mention an unfortunate reaction to the vote by a small
number of users at the Norwegian Bokmål/Riksmål (no:) wikipedia who want
to disregard the result of the voting and are planning to create a
_third_ Norwegian wikipedia with the sole mission of mixing the contents
of the two current Norwegian versions.
== A short language history of Norway ==
Spoken Norwegian ("norsk") (ISO 639-1 code "no") is in a fairly
unique situation compared to most other languages of the world in that
it has two widely accepted written standards, Bokmål (ISO 639-1 code
"nb") and Nynorsk (ISO 639-1 code "nn"). By national
legislation they are both regarded as official written forms of
Norwegian. In addition, many people still make a distinction between
Bokmål and its precursor Riksmål, which is still in use.
Briefly speaking, Bokmål and Riksmål are descendants of the Danish
written language. Until the 1800s, Danish was the only widely used
written language in Norway as a result of four centuries of union with
Denmark. With increasing independence came a wish to norwegianise the
Danish standard, with Knud Knudsen at the forefront of changing parts
of the vocabulary and orthography. Thus, Riksmål, and later Bokmål,
resulted. These forms together are today probably used by about 90% of
Norway's population, or somewhere around 3,500,000 people.
Parallel to this development, a new written standard was created by Ivar
Aasen. He travelled extensively throughout Norway, and based his new
language, landsmål, on the grammar and vocabulary of dialect samples
from around the country. This was later renamed Nynorsk. Modern Nynorsk
differs significantly from modern Bokmål, and linguistically the two may
be regarded as being as different (or as similar, if you like) as Swedish
and Danish. For English or Dutch/German speakers, the differences may be
likened to those between (Lowland) Scots and English or Low German and
Dutch. Today it is estimated that about 500,000-600,000 people have
Nynorsk as their first written language.
More information about the Norwegian language history can be found in
English, German, French, Spanish or Portuguese on the website of the
Norwegian Language Council:
== A short history of Wikipedia in Norwegian ==
The first Norwegian wikipedia started on 26 November 2001 on the subdomain
no.wikipedia.org. Like most wikipedias, its contributor and article counts
started really picking up around the end of 2003. At the time, it
accepted all written standards of Norwegian, although the amount of
Nynorsk was minimal. There were already several debates about the
feasibility and appropriateness of keeping the two languages united on
one Wikipedia. On 31 July 2004 a Wikipedia for Nynorsk was created.
The creation of nn:, however, split the community at no: wikipedia. Many
felt that given that Nynorsk now had its own wikipedia, no: should
become a Bokmål/Riksmål Wikipedia only. Others disapproved and claimed
that there was no need to change and that it should continue its
language policy of accepting all and keep its interwiki link name of
The Nynorsk Wikipedia soon proved a success: within the next few
months it gathered several people who had felt uncomfortable in the
(mainly) Bokmål environment at no:. The name displayed in interwiki
links became "Norsk (nynorsk)" (languages are not spelt with upper case
in Norwegian). To date it continues to be one of the fastest growing
wikipedias, with a steady article increase, now at over 6000 articles
and >50 editors with more than 10 edits since arrival.
== Votes ==
The issue of no:'s language policy has come up time and again, and a
vote was held in March ([[:no:Wikipedia:Målform]]) as to which policy to
adopt. Regardless of how the votes are tallied (whether or not new
contributors are included, etc.), there was a majority for switching to a
Bokmål/Riksmål only language policy (50% for Bokmål/Riksmål, 43.2% for
Bokmål/Riksmål/Nynorsk/Høgnorsk, and 6.8% for the official variants
Following this result, there is now going to be a vote on which
interwiki link name will most appropriately reflect the current language
policy of no:. The result of this vote will most likely be either "Norsk
(bokmål)" or "Norsk (bokmål/riksmål)".
Understandably, there has also been a debate as to whether the subdomain
should change from "no" to "nb", as this is the correct representation
of Bokmål according to ISO 639-1. However, there is some resentment
towards such a move and currently a general acceptance of letting the
Bokmål wikipedia stay at "no". The alternative some have suggested is a
server-side redirect from "no" to "nb", in the same way that "nb" today
is a server-side redirect to the equivalent page on "no".
== Summary of the problem ==
Unfortunately, a small group of users (who all write Bokmål/Riksmål) are
ignoring the results from the vote, and are claiming they want to
re-establish a wikipedia for all written standards of Norwegian. They
claim they have been in touch with people centrally in Wikimedia
(developers? stewards?) and that they have so far received positive
comments. With this email, we would like to make clear that there
have been no official decisions about creating a third Norwegian
wikipedia containing both Bokmål and Nynorsk. It is merely an unofficial
initiative from a small group of users who started a sign-on list at
[[:no:Bruker:Norsk_Wikipedia]]. A spontaneous list with signatures
against this activity was immediately created at
[[:no:Wikipedia-diskusjon:Fellesnorsk]]. The process of creating a third
Norwegian wikipedia has not gone through a voting process in either of
the two existing Norwegian wikipedias (no: and nn:) and cannot be
considered a decision by the Norwegian Wikipedia community.
We believe the creation of a third wikipedia under the Wikimedia
foundation would have a serious and unfortunate impact on the existing
wikipedias in Norwegian, no: and nn:, and would undermine Wikipedia's
reputation in Norway. That said, we are all for extensive cooperation
between the four Scandinavian language wikipedias (including
Swedish and Danish), as evidenced by the recent creation of
[[:meta:Skanwiki]], the Scandinavian meta-pages, and the use of featured
articles from neighbour wikipedias.
== Conclusion ==
Hopefully, this letter will help people better understand the
complicated language situation of the Norwegian Wikipedia community, so
as to give a background on which discussion can take place on this list
in the future, such as the inevitable debate following a possible
request for a re-establishment of the common (and third!) Norwegian
From the community of no.wikipedia.org and nn.wikipedia.org,
Bjarte Sørensen [[:meta:User:BjarteSorensen]] (Administrator/bureaucrat on nn:)
Lars Alvik [[:no:User:Profoss]] (Administrator/bureaucrat on no:)
Øyvind A. Holm [[:no:User:Sunny256]] (Administrator on no:)
Onar Vikingstad [[:no:User:Vikingstad]] (Administrator on no:)
Jon Harald Søby [[:no:User:Jhs]] (Administrator on no:)
Chris Nyborg [[:no:User:Cnyborg]] (Administrator on no:)
Guttorm Flatabø [[:no:User:Dittaeva]] (Administrator on nn:)
Gunleiv Hadland [[:meta:User:Gunnernett]] (Administrator on nn:)
Jarle Fagerheim [[:nn:User:Jarle]] (Administrator on nn:)
Øyvind Jo Heimdal Eik [[:en:User:Pladask]] (Administrator on nn: and no:)
Kristian André Gallis [[:nn:User:Kristaga]]
Vegard Wærp [[:no:User:Vegardw]]
Nina Aldin Thune [[:no:User:Nina]]
Thor-Rune Hansen [[:no:User:ThorRune]]
Claes Tande [[:no:User:Ctande]]
Arnt-Erik Krokaa [[:no:User:AEK]]
Rune Sattler [[:no:User:Shauni]]
I would like to invite you to join a chat about the relationship
between the Wikimedia community and the Open Access movement in
scientific publishing. This will explore issues of licensing, content
sharing, technology, and hopefully result in mutual commitments to
In a nutshell: December 17, 2006; irc.freenode.net; 21:00 UTC; #openaccess
for more (including a link to a web interface for accessing the IRC
channel). I would appreciate it if you would add yourself to the "I
want to attend!" list on the page, so we have an idea how many people
Peace & Love,
DISCLAIMER: This message does not represent an official position of
the Wikimedia Foundation or its Board of Trustees.
I am working on a new anti-vandalism application for Wikipedia and the other
Wikimedia projects. Before I get really deep into coding, I need to make sure
that it will actually be used.
The basic problem that the application addresses is vandalism getting through
Wikipedia's vandalism catching systems. The Wikipedia community does an
excellent job overall, but every once in a while vandalism (subtle or
obvious) gets through. I personally have come across a few pieces of
vandalism that were months old.
The application addresses this by gathering all edits on a central
server. Approved users would connect to the server and examine the edits for
vandalism. If a certain number of users approve an edit, it is removed from
the pool. Edits marked as vandalism ("condemned") would be removed once the
vandalism has been dealt with entirely: revert, warn, speedy delete, etc.
There are various tricks I can put on the central server to reduce the number
of edits that need to be reviewed. The most obvious is a whitelist, but there
are many other techniques such as combining edits made in close succession by
a single editor to a single article.
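
The review pool described above could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the
application itself: the threshold of three approvals, the ten-minute
merge window, and every name used (Edit, build_pool, review, resolve)
are invented for the example.

```python
from dataclasses import dataclass, field

APPROVALS_NEEDED = 3        # assumed; the post only says "a certain number"
MERGE_WINDOW_SECONDS = 600  # assumed meaning of "close succession"


@dataclass(eq=False)  # compare edits by identity, not by field values
class Edit:
    article: str
    editor: str
    timestamp: int  # seconds since some epoch
    approvals: set = field(default_factory=set)
    condemned: bool = False


def build_pool(edits, whitelist):
    """Return the edits that still need review.

    Edits by whitelisted editors are skipped, and consecutive edits by
    one editor to one article within MERGE_WINDOW_SECONDS of each other
    are folded into a single pool entry.
    """
    pool = []
    last = {}  # article -> most recent edit seen on that article
    for e in sorted(edits, key=lambda x: x.timestamp):
        prev = last.get(e.article)
        if (prev is not None and prev.editor == e.editor
                and e.timestamp - prev.timestamp <= MERGE_WINDOW_SECONDS):
            prev.timestamp = e.timestamp  # extend the existing entry
            continue
        last[e.article] = e
        if e.editor not in whitelist:
            pool.append(e)
    return pool


def review(pool, edit, reviewer, is_vandalism):
    """Record one verdict; return True if the edit left the pool."""
    if is_vandalism:
        edit.condemned = True  # stays pooled until cleanup is finished
        return False
    edit.approvals.add(reviewer)
    if not edit.condemned and len(edit.approvals) >= APPROVALS_NEEDED:
        pool.remove(edit)
        return True
    return False


def resolve(pool, edit):
    """Drop a condemned edit once revert/warn/delete has been handled."""
    if edit.condemned:
        pool.remove(edit)
```

Keeping condemned edits in the pool until resolve() is called mirrors
the idea that vandalism is only removed once it has been fully dealt
with.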
Now to my questions.
- Does this sound like a good idea in general?
- Is there already a project similar to mine that I would be unnecessarily
- A significant number of users are needed to make the system work. Is the
system likely to be popular enough to attract this minimum number of editors?
Forwarding the post on my blog.
-------- Original Message --------
Well, there is a nice website that can help us with that question ...
and that is from the institution that cares about this officially - the
Region of Sardinia.
When it comes to the Limba Sarda Comuna used on the actual Sardinian
wikipedia <http://sc.wikipedia.org> there is no doubt that the language
exists, but we must appreciate that it is an artificial language that
was created out of the living languages of Sardinia. The website of the
Region of Sardinia
Limba sarda comuna: una lingua realmente esistente: Sa Limba sarda
comuna è naturale per il 92,8 per cento, è in posizione mediana rispetto
a tutti i dialetti del sardo e può ancora essere migliorata per farla
diventare la lingua ufficiale dei sardi.
Limba sarda comuna: a language that really exists: Sa Limba sarda
comuna is 92,8 per cent natural, it sits in an intermediate position
relative to all Sardinian dialects and can still be improved so that it
becomes the official language of the Sardinian people
So they still want to improve the language ... nice ... 92,8 per cent of
it is natural, which means 7,2 per cent is not natural. If I compare these
percentages with what translators work with every day, that is the
"matches" we get in our CAT tools, then 92,8 per cent is a low percentage
for being "natural". It seems high, but in fact it is not ...
Let's say I translate some text (a sentence, for example) and my
analysis software tells me that it is 93 per cent identical to
another sentence I translated before. This means that I cannot leave the
sentence as is: I will need to change at least one word in the
sentence to make it a proper translation of what is there.
Just to give you an example:
The house on the hill is green - that is what was translated before. Now
I get such a 92,8 per cent match with a sentence like: the tree on the
hill is green. If I left it as is: it would state something completely
You can also look at it like this:
The house on the hill is nice and green. - that is 100% English
The house on the hill is nice and vert. - that is approx. 89 % English +
(it is just a matter of playing with the amount of words to get the 92,8%)
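
The word-counting above can be checked with a small sketch. It uses a
naive position-by-position word comparison, which is an assumption
made for illustration; real CAT tools compute their match percentages
with their own fuzzy-matching methods. The function name word_share is
invented for the example.

```python
def word_share(sentence, reference):
    """Percentage of word positions at which both sentences agree.

    Case and a trailing full stop are ignored; the longer sentence
    determines the number of positions.
    """
    a = sentence.lower().rstrip(".").split()
    b = reference.lower().rstrip(".").split()
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / max(len(a), len(b))


# One foreign word in a nine-word sentence: 8 of 9 words match.
mixed = word_share("The house on the hill is nice and vert.",
                   "The house on the hill is nice and green.")

# One changed word in a seven-word sentence: 6 of 7 words match.
changed = word_share("The tree on the hill is green",
                     "The house on the hill is green")

print(round(mixed, 1), round(changed, 1))
```

Under this measure, a single differing word in a fourteen-word
sentence gives 13/14, or roughly 92,9 per cent, which is how one word
can account for a figure like the one quoted.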
So what these 92,8% tell us: even if a huge part of it is considered to
be built out of the "natural language part" it is still an artificial
But what is a language and what is a dialect? Well: that very much
depends on the POV from which you look at things. But ISO has laid down
some rules for deciding what is a language and what is not. That is, before
you can get an ISO 639 code for a language you need to prove that this
language complies with the standard. Of course there are living languages that
don't have an ISO code, because up to now nobody cared for them - I am
just thinking about Griko Salentino, a language spoken and written in
Italy - but if people care about that language, they will ask for it.
What is a dialect ...
a) a language without an army
b) a spoken variety that developed out of a language and that
differs from it in some respects, for example in pronunciation or in
some expressions, while sharing the same basics when it comes to grammar
(just to mention one example)
Campidanese (ISO 639-3: sro)
Gallurese (ISO 639-3: sdn)
Logudorese (ISO 639-3: src)
Sassarese (ISO 639-3: sdc)
be dialects of the Common Sardinian Language? Well ... only from a
logical POV this is not possible, because they were there long before
the Common Sardinian Language was created ...
By having their ISO 639 codes, they have shown that, when they requested
them, they complied with the requirements of the International Organization
for Standardization <http://en.wikipedia.org/wiki/ISO> and therefore, on an
international level they are considered to be languages even with an ISO
Please let me repeat: there are languages that don't have one, but these
can request a code ...
When it comes to the language committee we had to draw a line somewhere
and this line should not come from us, that is: it is NOT up to the
members of the language committee to decide what a language is or not.
We needed some kind of standard to apply and the clearest one was and
still is the ISO standard. So if somebody wants to complain and say that
the four languages above are in fact dialects of Sardinian and not
languages, we should kindly invite them to create their papers and
contact ISO directly to have the ISO 639-3 language code taken away ...
it is NOT up to the language committee to take such decisions.
Another thing people should then consider: UNESCO also states
that these four are languages, and they are in the Red Book of
Endangered Languages. So anyone who states that they are not languages
and is so sure about it should also contact UNESCO. It is
NOT up to the language committee to take such decisions as deleting
four languages from the endangered languages list ...
Sorry for being so ironic, but: when such discussions about what is
and what is not a language come up ... well: before you come to us,
please go to the INTERNATIONAL bodies that deal with the question.
We are only normal people who base their decisions on standards and can
tell people where to go to request their code, but we can neither create
that code nor influence what is recognised on an international level.
(Nor do we want to do that).
Now to the question of sc.wikipedia ... I remember that, at the
beginning, sc.wikipedia tried to host all of the Sardinian languages,
then someone came along and decided to make sc.wikipedia a Limba Sarda
Comuna wikipedia only. Well: the Limba Sarda Comuna is being used by
Sardinian Authorities to facilitate their work.
In any case the code "sc" stands for the macrolanguage Sardinian and
not for the Limba Sarda Comuna, so there is no reason why it should have
the right to claim that code for itself. That is, the Limba Sarda
Comuna, like any other language in the world that wants recognition by
ISO, must request its own ISO 639 code. It is not an option to simply say:
now let's take that one since it is there ... well the one that is there
stands for something else.
The question of the actual sc.wikipedia came up because of people
telling us that Sassarese is not a language, but a dialect of Sardinian
and that the Limba Sarda Comuna (Common Sardinian Language) is the only
"right language" of Sardinia.
Well again: it is not we who are going to decide whether Sassarese and the
other three are languages or not - we rely on ISO 639-3 codes
since we had to draw a line and avoid simply asserting things. Nor is it
we who are going to decide whether the Limba Sarda Comuna gets an
ISO 639 code. If you, who read this, are interested in this matter, it
is up to you to get things under way.
See: the decision to base whatever we do on ISO 639-3 was one of the
wisest decisions ever taken within the language committee ... imagine
what fights (almost all politically based) we would have if we did not do
Just to make things clear - I repeat it again:
a) we do NOT decide if something is a language or not
b) we base our decisions on ISO 639-3
c) we actually need a solution for various scripts used for one language
d) we would love to see Multilingual Mediawiki there since it could be
used to create easily sustainable communities
e) we are not going to go ahead on discussing if Sassarese is a language
or not (it has a code)
f) we will need to find a solution for Limba Sarda Comuna which does NOT
have an ISO 639 code and is using the sc code in an improper way.
Thank you for your patience and understanding.
Posted By Sabine Cretella to words & more
at 9/11/2007 08:53:00 AM
Every now and again my bot gets blocked because it doesn't have a bot
bit. Please stop that! If you want my bot to have a bot bit, then GIVE
IT ONE!!!! If you need information first, you can ask me and I'd be
happy to help you. But a bot bit is there FOR YOU, not for me. If you
want it, you ask for it. Why force me to ask you for something that
you want? If you want it, then ask for it. It's as simple as that.
Andre Engels, andreengels(a)gmail.com
ICQ: 6260644 -- Skype: a_engels
Tooltip: "Wikipedia's role as brain-extension, while a little troubling,
is also really cool."
I would say: Wikipedia might become to factual knowledge what the pocket
calculator was to mental arithmetic ;-)
My "missing images" tool, a toolserver script to find free images to
use in Wikipedia articles by looking at what other languages have about
the topic, has been running for almost a year now, and was used ~42000
Recently, I published "WatchFlickr", which does a very similar thing,
but searches free Flickr images.
Enter the FIST (Free Image Search Tool), a "unified" version of both
tools, and then some:
It has the known, now "merged" features:
* Follow language links and look if they have free images
* Look on Flickr for free images
I have added:
* Direct Wikimedia Commons search (results from the internal Commons search)
* GIMP-SAVVY (lots of PD images from the US Gov.)
Together with a truckload of options (scan title lists, categories to
depth X, replace placeholder images) and fine-tunings for the
individual searches, it is already likely to be one of the most
comprehensive free image (meta-)search tools on the web.
Please try it out, report bugs and feature requests (link in the
header bar), and throw lots'o'images at wikipedia articles :-)
Both Missing Images and WatchFlickr will eventually be "phased out"
and redirect to FIST.
If you know any other sources of free media, I'll be happy to add them, provided
* they have at least a few thousand images
* they have an API, or are easy to screen-scrape ;-)
P.S.: Sorry to spam wikipedia-l with this, but it seemed significant
to me, and the easiest way to reach wikipedians in all languages.
As I said, I am sending out the single requests for single languages
that still need translating for the Fundraiser 2007.
The above combination is valid for this text:
Please just copy and paste the sourcetext to the page that opens by
clicking on the red language link and overwrite it in your language.
Thanks for helping out.
p.s. The fundraiser of the Wikimedia Foundation will start soon ... stay
Knowledge is a human right. Help to protect it. Donate for Wikipedia.