Most of you would be aware of some of the discussions that have occurred
around Wikipedia in the Norwegian languages. Since the last round of
discussions on this list, there has been a lot of internal debate, as
well as what seems to be a fairly widely accepted agreement following a
vote. This e-mail intends to, after a brief recap of Norwegian language and
wikipedia issues, take those interested through the latest developments
and stake out the road ahead. It is also intended to inform the
international community about the current agreement on no.wikipedia, so
as to prevent misunderstandings in the future.
Finally, we will mention an unfortunate reaction to the vote by a small
number of users at the Norwegian Bokmål/Riksmål (no:) wikipedia who want
to disregard the result of the voting and are planning to create a
_third_ Norwegian wikipedia with the sole mission of mixing the contents
of the two current Norwegian versions.
== A short language history of Norway ==
Spoken Norwegian ("norsk") (ISO 639-1 alpha-2 code "no") is in a fairly
unique situation compared to most other languages of the world in that
it has two widely accepted written standards, Bokmål (ISO 639-1 alpha-2
code "nb") and Nynorsk (ISO 639-1 alpha-2 code "nn"). By national
legislation they are both regarded as official written forms of
Norwegian. In addition, many people still make a distinction between
Bokmål and its precursor Riksmål, which is still in use.
Briefly speaking, Bokmål and Riksmål are descendants of the Danish
written language. Until the 1800s, Danish was the only widely used
written language in Norway as a result of four centuries of union with
Denmark. With increasing independence came a wish to norwegianise the
Danish standard, with Knud Knudsen at the forefront of changing parts
of the vocabulary and orthography. Thus, Riksmål, and later Bokmål,
resulted. These forms together are today probably used by about 90% of
Norway's population, or somewhere around 3,500,000 people.
Parallel to this development, a new written standard was created by Ivar
Aasen. He travelled extensively throughout Norway, and based his new
language, landsmål, on the grammar and vocabulary of dialect samples
from around the country. This was later renamed Nynorsk. Modern Nynorsk
differs significantly from modern Bokmål, and may be regarded linguistically
as being as different from it (or as similar to it, if you like) as Swedish
is from Danish. For English or Dutch/German speakers, the differences may be
likened to those between (Lowland) Scots and English or Low German and
Dutch. Today it is estimated that about 500,000-600,000 people have
Nynorsk as their first written language.
More information about the Norwegian language history can be found in
English, German, French, Spanish or Portuguese on the website of the
Norwegian Language Council:
== A short history of Wikipedia in Norwegian ==
The first Norwegian wikipedia started 26 November 2001 on the subdomain
no.wikipedia.org. Like most wikipedias, its contributor and article count
started really picking up around the end of 2003. At the time, it
accepted all written standards of Norwegian, although the amount of
Nynorsk was minimal. There were already several debates about the
feasibility and appropriateness of keeping the two languages united on
one Wikipedia. On 31 July 2004 a Wikipedia for Nynorsk was created.
The creation of nn:, however, split the community at no: wikipedia. Many
felt that given that Nynorsk now had its own wikipedia, no: should
become a Bokmål/Riksmål Wikipedia only. Others disapproved and claimed
that there was no need to change and that it should continue its
language policy of accepting all and keep its interwiki link name of
Nynorsk Wikipedia soon proved a success, as within the next few
months it gathered several people who had felt uncomfortable in the
(mainly) Bokmål environment at no:. The name displayed in interwiki
links became "Norsk (nynorsk)" (languages are not spelt with upper case
in Norwegian). To date it continues to be one of the fastest growing
wikipedias, with a steady article increase, now at over 6000 articles
and >50 editors with more than 10 edits since arrival.
== Votes ==
The issue of no:'s language policy has come up time and again, and a
vote was held in March ([[:no:Wikipedia:Målform]]) as to which policy to
adopt. Regardless of the tallying method (whether or not to include
new contributors etc.) there was a majority for switching to a
Bokmål/Riksmål only language policy (50% for Bokmål/Riksmål, 43.2% for
Bokmål/Riksmål/Nynorsk/Høgnorsk, and 6.8% for the official variants
Following this result, there is now going to be a vote on which
interwiki link name will most appropriately reflect the current language
policy of no:. The result of this vote will most likely be either "Norsk
(bokmål)" or "Norsk (bokmål/riksmål)".
Understandably, there has also been a debate as to whether the subdomain
should change from "no" to "nb", as this is the correct representation
of Bokmål according to ISO 639-1. However, there is some resentment
towards such a move and currently a general acceptance of letting the
Bokmål wikipedia stay at "no". The alternative some have suggested is a
server-side redirect from "no" to "nb", in the same way that "nb" today
is a server-side redirect to the equivalent page on "no".
== Summary of the problem ==
Unfortunately, a small group of users (who all write Bokmål/Riksmål) are
ignoring the results from the vote, and are claiming they want to
re-establish a wikipedia for all written standards of Norwegian. They
claim they have been in touch with people centrally in Wikimedia
(developers? stewards?) and that they have so far received positive
comments. With this email, we would like to state that there
have been no official decisions about creating a third Norwegian
wikipedia containing both Bokmål and Nynorsk; it is merely an unofficial
initiative from a small group of users who started a sign-on list at
[[:no:Bruker:Norsk_Wikipedia]]. A spontaneous list with signatures
against this activity was immediately created at
[[:no:Wikipedia-diskusjon:Fellesnorsk]]. The process of creating a third
Norwegian wikipedia has not gone through a voting process in either of the
two existing Norwegian wikipedias (no: and nn:) and cannot be
considered a decision by the Norwegian Wikipedia community.
We believe the creation of a third wikipedia under the Wikimedia
foundation would have a serious and unfortunate impact on the existing
wikipedias in Norwegian, no: and nn:, and would undermine Wikipedia's
reputation in Norway. This being said, we are all for extensive co-
operation between the four Scandinavian language wikipedias (including
Swedish and Danish), as evidenced by the recent creation of
[[:meta:Skanwiki]], the Scandinavian meta-pages, and the use of featured
articles from neighbour wikipedias.
== Conclusion ==
Hopefully, this letter will help people better understand the
complicated language situation of the Norwegian Wikipedia community, so
as to give a background on which discussion can take place on this list
in the future, such as the inevitable debate following a possible
request for a re-establishment of the common (and third!) Norwegian
wikipedia.
From the community of no.wikipedia.org and nn.wikipedia.org,
Bjarte Sørensen [[:meta:User:BjarteSorensen]] (Administrator/bureaucrat on nn:)
Lars Alvik [[:no:User:Profoss]] (Administrator/bureaucrat on no:)
Øyvind A. Holm [[:no:User:Sunny256]] (Administrator on no:)
Onar Vikingstad [[:no:User:Vikingstad]] (Administrator on no:)
Jon Harald Søby [[:no:User:Jhs]] (Administrator on no:)
Chris Nyborg [[:no:User:Cnyborg]] (Administrator on no:)
Guttorm Flatabø [[:no:User:Dittaeva]] (Administrator on nn:)
Gunleiv Hadland [[:meta:User:Gunnernett]] (Administrator on nn:)
Jarle Fagerheim [[:nn:User:Jarle]] (Administrator on nn:)
Øyvind Jo Heimdal Eik [[:en:User:Pladask]] (Administrator on nn: and no:)
Kristian André Gallis [[:nn:User:Kristaga]]
Vegard Wærp [[:no:User:Vegardw]]
Nina Aldin Thune [[:no:User:Nina]]
Thor-Rune Hansen [[:no:User:ThorRune]]
Claes Tande [[:no:User:Ctande]]
Arnt-Erik Krokaa [[:no:User:AEK]]
Rune Sattler [[:no:User:Shauni]]
Anthony DiPierro wrote:
> So Larry Sanger is complaining about Wikipedia being too anarchistic
> and Nicholas Carr is complaining about it being too hierarchical. You
> guys must be doing something right :).
> I thought the most insightful part of the article was this:
> "I think that the Wikipedia community made a mistake when it decided
> that it's the wiki part that explained Wikipedia's success. They
> proceeded to apply the same software and content development system,
> which happened to work (more or less) for an encyclopedia, to develop
> very different kinds of projects: a dictionary, news articles, editing
> public domain books, writing new books from scratch, and several more
> things. It seems they found they had a whopping good hammer and
> suddenly everything looked like a nail."
> I think I've fallen into that trap myself a few times.
One of his useful observations, yes. Although it could be pointed out
that plenty of people in the community support the principle that being
a wiki is secondary to producing an encyclopedia (though certainly still
a free and collaborative encyclopedia). That would be why we make
adjustments of the kind that prompted Carr's pronouncement.
And for the other projects, a wiki is sometimes far from ideal, true
enough. That's been recognized for some time, which is why there are
ongoing efforts to adapt or extend the software to better serve those
objectives. In the meantime, the wiki is the tool we've got. Tired
cliches aside, I'm sure better ones would be welcomed if somebody cares
to offer us more tools.
Is there any way we can take action against Baidu Baike? Surely there is
some legal means of recourse, because they apparently are taking things from
the Chinese Wikipedia wholesale, in utter violation of copyright. We have to
deal with copyright issues ourselves, so I don't see why we should stand idly
by and let ourselves be trampled on. Since we have been sending cease and
desist notices to small mirrors now, surely this is a ripe target, as
Baidu Baike is state-sponsored by the People's Republic of China.
As an example of lifting, see their
This occurs for the huge majority of articles (90,000). For context, these
are articles which discuss Sun Yat-Sen. Our version at zh was much earlier,
with last edits in January, while Baidu Baike, set up in April, has only made
minor excisions and modifications. It is clear that the contributors to the
Wikipedia article hold the copyright.
I think we should give up any hope of negotiating with the PRC entirely, as
they obviously have taken a hostile and insulting stance to us by lifting
material wholesale, then saying it is their copyright. This is such a gross
violation of Wikipedia's philosophy, and PRC being a member of the World
Trade Organisation (with compulsory ratification of the WIPO), some form of
action *must* be taken. I do not think this is a time for legal pacifism.
If we could initiate any attempts I think the whole community would cheer them on.
The right to sue is clearly there, perhaps the tediousness is in collecting
copyright holders, but then again I think we can make a point of this case.
The PRC is required to comply, because such a gross violation can mean PRC's
expulsion from the WTO. If anything I think we should treat Baidu Baike as
an enemy, and drop any pretenses or hopes that the PRC will ever willingly
As a matter of fact, there are online stores selling books in the pms
language, but Amazon is obviously not one of those. Is there any way to add a
link to the list? I suppose many editions have this problem.
I have made a proposal at http://meta.wikimedia.org/wiki/Wikimark2 for
a Wikimarkup v2. I was unsure where to mention that I'd made this proposal.
I am also aware that an update to syntax isn't pertinent or desperately
needed; nevertheless I'd like feedback on the ideas from the community. Like
Wikitax, the proposal can become a non-MediaWiki-integrated language on its
own, or, if there is support, it can indeed be the next base of MediaWiki
Disclaimer! I'm not being a "noob" with funny ideas that disrupt
unnecessarily - I would probably have thought that of me if I saw this
post straight-off - I've just had what I believe is a good idea for a
MediaWiki syntax which, regardless of status-quo, I thought might be
interesting to some of you. Let me know what you think at my Wikipedia
I've added those per your suggestion, although I just made the YEH as IE
for now... I really have to look further into digraphs etc.
Do you have any bitext? (e.g. aligned translations -- one side Tajik,
the other side Farsi?) That would make development a hell of a lot
PS. I now have a ~8,000 entry wordlist and growing... I might play
around with that in the future.
On Sun, 2006-05-28 at 18:04 -0700, Mark Williamson wrote:
> Also, you're missing a few letters.
> I'm not 100% certain about the *proper* conversion of the following
> passage, but:
> оаضое доئме шварое омнета созмон млл мтаҳд дрборҳ арضҳ мшвақ ҳо бҳ
> ҷмҳваре осломе бҳ таваофқ нздеки шднд: ҳоваеҳ свалоно [ соата پيш ۴:۵۴
> ] дрҳолекиҳ ҳоваеҳ свалоно мсвал сеоста خорҷе отаҳодеҳ орваپо др ншста
> вазерон омвар خорҷҳ отаҳодеҳ орваپо дрборҳ нздеки шдн оаضое доئме
> шварое омнета созмон млл мтаҳд бҳ таваофқе дрборҳ арضҳ мшвақ ҳо бҳ
> оерон др озое дста брдоштан таҳрон оз брномҳ ҳстаҳ ое خбр ме дҳд،
> мқомҳое оژонс бен олмлле онрژе отаме оз оҳтамол тавақф غне созе
> овароневам др оерон сخн вофтанд ва озҳор доштанд бшрط оенкиҳ омрекио
> таضмен кинд др صдд сқваط ҳкивамта осломе брнеоед، таҳрон брномҳ ҳстаҳ
> ое мтавақф ме созд. бо онкиҳ ҷрҷ бваш рئес ҷмҳваре омрекио оз оҳтамол
> бррсе мшвақ ҳое орваپо бҳ оерон др صварта дста брдоштан ҷмҳваре осломе
> оз брномҳ ҳстаҳ ое сخн вофта، киондвалезо роес вазер омвар خорҷҳ
> омрекио вофта оз омрекио خваостаҳ ншдҳ омнета рژем осломе ро таضмен
> кинд. др ҳмен ҳол рҳбр ҷмҳваре осломе вофта донш ҳстаҳ ое оендҳ
> блндмдта онрژе оерон ро таضмен ме кинд ва оен ро нбоед бҳ ҳеچ бҳоее оз
> дста дод. оз свае девор рвасеҳ др сфр оевовароеваонф дбер шварое
> омнета рвасеҳ бҳ таҳрон бор девор پешнҳод غне созе овароневам др خоки
> рвасеҳ ро бҳ оерон ороئҳ кирд.
> As you can see, there're quite a few letters missing. One thing worth
> noting is that if you have the sequence "ое", it's actually much more
> likely to be "оя".
> But while I'm at it, I'll give you the proper transliterations for the
> missed letters:
> ض > д
> ئ > it means there is another vowel, as in اسرائیل (Esraail; Israel)
> پ > п
> خ > х as in سخن > сухан (speech)
> ژ > ж as in نژادی > нажоди (ethnic)
> غ > ғ as in غول > ғӯл (gigantic)
> ص > с
> ط > т as in شرط > шарт (wager)
> چ > ч as in کوچ > кӯч (migration)
> On 28/05/06, Francis Tyers <spectre(a)ivixor.net> wrote:
> > Actually the opposite ;)
> > The italics were the ones that seemed to me to be discernable.
> > I'm working on messing around with the vowels. It would be helpful if
> > there was an "official" transliteration standard, but I can't seem to
> > find one.
> > Thanks for the link to RFE, I've actually already been trawling it to
> > download as much as I can in Tajik, then I intend to produce a wordlist
> > (possibly by frequency) and experiment with comparing "transliterated"
> > Farsi with the wordlist by edit distance.
> > It would be helpful to have a bilingual dictionary, but I don't think
> > any currently exist in machine tractable form.
> > Regards,
> > Fran
> > On Sun, 2006-05-28 at 14:37 -0700, Mark Williamson wrote:
> > > Also, I noticed that you seemed to use italics for what you took to be
> > > incorrect transliterations.
> > >
> > > In fact:
> > >
> > > "бднео" is an incorrect transliteration of "ба дунё", same with "лҳоз"
> > > and "лиҳози"; "ҳқвақ" and "ҳуқуқ"; "боҳм" and "бо ҳам", "бробрнд" and
> > > "баробаранд"; "ҳмҳ" and "Ҳама"; "ваҷдон" and "виҷдонанд" (except for
> > > the -and suffix); "нсбта" and "нисбат"; "бекидевор" and "ба якдигар";
> > > and possibly even "бо рваҳ бробре" and "бародарвор".
> > >
> > > This is because not all vowels are always explicitly indicated in
> > > Farsi. The only way to know is to be a native speaker or to use a
> > > dictionary.
> > >
> > > Mark
> > >
> > > On 28/05/06, Mark Williamson <node.ue(a)gmail.com> wrote:
> > > > ...having said that, that won't fix the fact that Tajik uses more
> > > > Russian loanwords than Farsi.
> > > >
> > > > Mark
> > > >
> > > > On 28/05/06, Mark Williamson <node.ue(a)gmail.com> wrote:
> > > > > Hi Francis,
> > > > >
> > > > > I have a suggestion to improve this software: build a corpus from the
> > > > > Tajik RFE website http://www.ozodi.org/
> > > > >
> > > > > As you can pretty obviously tell, not all vowels are indicated in
> > > > > Farsi, so in some cases there *should* be multiple candidates for
> > > > > transliteration. For example, Farsi "yeh" can be transliterated in a
> > > > > number of different ways.
> > > > >
> > > > > In these cases, a simple search of the corpus should reveal which
> > > > > alternative is an actual word, or which is most frequent, and select
> > > > > it.
> > > > >
> > > > > Mark
> > > > >
> > > > > On 28/05/06, Francis Tyers <spectre(a)ivixor.net> wrote:
> > > > > > Are there actually any Tajik native speakers working on the Tajik
> > > > > > Wikipedia at the moment?
> > > > > >
> > > > > > I'd like to discuss some software I'm making with them...
> > > > > >
> > > > > > http://188.8.131.52/~spectre/tajik/tajik.php
> > > > > >
> > > > > > I've had a look over at tg. but it seems to be very inactive.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Fran
> > > > > >
> > > > > > _______________________________________________
> > > > > > Wikipedia-l mailing list
> > > > > > Wikipedia-l(a)Wikimedia.org
> > > > > > http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Refije dirije lanmè yo paske nou posede pwòp bato.
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
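The corpus idea from this thread (pick the transliteration candidate that actually occurs in a Tajik wordlist, preferring the most frequent one) can be sketched roughly as below. This is only an illustration under loud assumptions, not the software discussed above: all function names are made up for the example, the wordlist is a toy one built from the example words in the thread, and treating alef, waw and yeh as pure vowel carriers is a crude simplification (waw and yeh can also be consonants). It matches on consonant skeletons rather than on edit distance.

```python
import unicodedata
from collections import Counter, defaultdict

# Consonant mappings (Perso-Arabic -> Tajik Cyrillic). The first group is
# taken from the thread; the second group is added here for the examples.
CONSONANTS = {
    "\u067E": "п",  # peh
    "\u062E": "х",  # khah
    "\u0698": "ж",  # jeh
    "\u063A": "ғ",  # ghain
    "\u0635": "с",  # sad
    "\u0637": "т",  # tah
    "\u0686": "ч",  # tcheh
    "\u0633": "с",  # seen
    "\u0634": "ш",  # sheen
    "\u0646": "н",  # noon
    "\u0631": "р",  # reh
    "\u06A9": "к",  # keheh
    "\u0644": "л",  # lam
    "\u062F": "д",  # dal
}
# Assumption: these letters only ever mark vowels (not true in general).
VOWEL_CARRIERS = {"\u0627", "\u0648", "\u06CC", "\u064A"}
TAJIK_VOWELS = set("аеёио\u04E3у\u04EFэюя")  # includes ӣ and ӯ


def farsi_skeleton(word):
    """Map a Farsi word to its Cyrillic consonant skeleton."""
    return "".join(CONSONANTS.get(ch, ch) for ch in word if ch not in VOWEL_CARRIERS)


def tajik_skeleton(word):
    """Strip vowel letters from a Tajik word."""
    word = unicodedata.normalize("NFC", word.lower())
    return "".join(ch for ch in word if ch not in TAJIK_VOWELS)


def build_index(corpus_words):
    """Index corpus words by skeleton, keeping their frequencies."""
    counts = Counter(unicodedata.normalize("NFC", w.lower()) for w in corpus_words)
    index = defaultdict(list)
    for word, freq in counts.items():
        index[tajik_skeleton(word)].append((freq, word))
    return index


def transliterate(farsi_word, index):
    """Return the most frequent corpus word with a matching skeleton, or None."""
    candidates = index.get(farsi_skeleton(farsi_word))
    if not candidates:
        return None
    return max(candidates)[1]


# Toy wordlist; a real one would come from a large Tajik corpus.
corpus = ["сухан", "сухан", "шарт", "кӯч", "ғӯл", "нажоди"]
index = build_index(corpus)
print(transliterate("\u0633\u062E\u0646", index))  # سخن -> сухан
```

With a real wordlist, many skeletons will map to several Tajik words, which is exactly where the frequency counts (or a fallback such as edit distance) would have to do the work.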
I thought some people might be interested that Larry Sanger's written an
article clarifying what he has in mind for a collaborative model that
improves on what he sees as Wikipedia's shortcomings:
(I don't agree with most of the article, but not all his suggestions are
completely wrong, either.)
The user Apcbg is using Wikimedia's space to violate copyrights and for his
own personal essays on several Wikimedias. He has so far focused on the
Hispanic wikis like es:, gl:, ca:, eu:, pt:, and lad:
This message is to alert the Wikipedian community and to request that they
keep an eye out for this behavior. When seen, the user should be warned and
the undesirable content should be deleted.
It's recently become clear to us, having met with chapter organizers
and talked with various members of the community, that the role of the
newly formed Chapters Committee is still not well-understood (or
well-known). Here's an overview:
The Chapters Committee is a Wikimedia committee created on 15
January 2006 to coordinate Wikimedia Foundation efforts regarding
local chapters. Its duties include:
* Facilitating the creation of Wikimedia chapters, acting in an
advisory capacity and responsible for granting final approval of
* Acting as the Foundation's point of contact for chapters, and the
line of communication with chapters from within the Foundation.
* Negotiating agreements between the Foundation and chapters, as for
fundraising, sponsorship, and trademark use.
* Coordinating interchapter communication.
* Generally acting on behalf of the Foundation to serve its interests
in chapter matters.
A more complete overview of the work of the Chapters committee can be
found here: http://meta.wikimedia.org/wiki/Chapters_committee/Scope_and_area_of_delegat…
The committee consists of five members:
* Nathan Carter (m:User:Cartman02au)
* Łukasz Garczewski (m:User:TOR)
* Austin Hair (m:User:Austin) - vice chair
* Delphine Ménard (m:User:Notafish) - chair
* Hari Prasad Nadig (en:User:HPN)
And two advisers:
* Arne Klempert (m:User:Akl)
* Andrew Lih (en:User:Fuzheado)
In a word or in a hundred, if you don't know what a chapter is, if you
want to create a chapter, if you wonder what a chapter can do, if you
want to get in touch with a chapter, do not hesitate to contact us,
either on IRC in #wikimedia-chapters or by mailing chaptercommittee-l
AT wikimedia PUNTO org. English is preferred, so that all of us
understand, but we get by in Spanish, French, German, Polish, Kannada,
Italian, Hindi, Sanskrit, Chinese...and I think that's it.
for the Chapters Committee
PS. Could the translator-l addressees translate/forward this message
to their language list? Thank you!
I'm wondering where I can find how to fix the sort order in Thai Wikipedia.
The problem is that in Thai, a word that starts with a vowel letter
needs to use its second letter (the consonant) as the sort key, but
MediaWiki always sorts by the first letter's Unicode code point.
I searched some and found out that someone already asked this question in
(#32, Oct 05). A related point is covered in the Unicode
Collation Algorithm for Thai/Lao (
http://www.unicode.org/unicode/reports/tr10/ in #3.1.3 Rearrangement). I
also found someone on English WP talking about LC_COLLATE for sorting in a
specific alphabet order (but I really don't know about this).
Thank you for any answer.
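To make the problem concrete, here is a minimal Python sketch of the rearrangement idea: before comparing, each leading vowel is swapped with the consonant that follows it. This is only an illustration and not how MediaWiki works; the function names are made up for the example, and the key ignores everything else a proper collation would handle (tone mark weighting, for instance).

```python
# Thai vowels that are written before the consonant they belong to:
# SARA E, SARA AE, SARA O, SARA AI MAIMUAN, SARA AI MAIMALAI.
LEADING_VOWELS = set("\u0E40\u0E41\u0E42\u0E43\u0E44")


def is_thai_consonant(ch):
    """True for the Thai consonant block (KO KAI .. HO NOKHUK)."""
    return "\u0E01" <= ch <= "\u0E2E"


def thai_sort_key(text):
    """Swap each leading vowel with the consonant right after it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in LEADING_VOWELS and is_thai_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


words = ["ไก่", "ขาว", "เกม", "กา"]
print(sorted(words))                     # plain code point order
print(sorted(words, key=thai_sort_key))  # leading vowels rearranged first
```

In plain code point order the words starting with เ and ไ end up after ขาว; with the key above they are grouped with the other words under ก.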