There's a list of Wiktionaries by raw size at
Do all Wiktionaries follow the same format, with one wiki article
per word, containing sections for language / part of speech /
aspects and then numbered lists for meanings? E.g.
# The frozen, crystalline state of water
# A shade of white
# Random electrical noise
# Weather when snow is falling
# Bluff draw in poker
Or is there any Wiktionary that breaks this pattern? Does this
pattern have a name? What do you call it when/if some Wiktionary
breaks this pattern?
How did we end up with disambiguation pages on Wikipedia, strictly
keeping one page per meaning of a word, but not on Wiktionary?
Is that because Wiktionary spun off before disambiguation pages
were invented on Wikipedia, and the news never spread to
Wiktionary? Or is it because the Oxford English Dictionary
differs from Encyclopaedia Britannica in this respect, and we want
to keep the best practice? Or why? One could say that all
meanings of "snow" are the same word (by etymology), and should
logically be on one page. But this is not true of "pen"
(etymologies 1--4), nor does it explain keeping foreign words of
similar spelling on the same page (Norwegian "pen" meaning "fine"). Has
there been a discussion about this, and where can that be found? I
found something from December 2002,
But the voice of reason, Imran, left the project a year later.
Another discussion took place in December 2005,
(It appears to be a December issue, so I apologize for bringing it
up a few weeks early this year.)
In the English Wiktionary, what percentage of words are in
English? And is the "long tail" of foreign languages similar over
all Wiktionaries? Is there any major Wiktionary that has a higher
concentration of words in its own language?
If the above pattern holds, a simple count of all level-2 headings
from the database dump could give the answer. For example, in the
dump of the Swedish Wiktionary, having 46500 articles and being
the 13th biggest, these level-2 headings appear most frequently:
  2510  ==Svenska==        Swedish
  1847  ==Tvärspråkligt==  Translingual
   625  ==Engelska==       English
   343  ==Historik==       Etymology
   267  ==Tyska==          German
   245  ==Danska==         Danish
   230  ==Norska==         Norwegian
   217  ==Spanska==        Spanish
   217  ==Franska==        French
   192  ==Italienska==     Italian
   184  ==Nederländska==   Dutch
   169  ==Finska==         Finnish
   152  ==Polska==         Polish
   135  ==Serbiska==       Serbian
   122  ==Rumänska==       Romanian
   116  ==Interlingua==    Interlingua
   109  ==Ungerska==       Hungarian
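
For anyone who wants to repeat the count on another Wiktionary, here is a
minimal Python sketch of the idea. It is not the script that produced the
numbers above. It assumes the article wikitext has already been extracted
from the dump as plain strings; the function name and the sample pages are
made up for illustration.

```python
import re
from collections import Counter

# Matches a level-2 heading such as "==Svenska==" on a line of its own,
# but not deeper headings such as "===Substantiv===".
LEVEL2_HEADING = re.compile(
    r"^==[ \t]*([^=\n](?:[^\n]*?[^=\n])?)[ \t]*==[ \t]*$", re.MULTILINE
)


def count_level2_headings(pages):
    """Count level-2 headings over an iterable of wikitext strings."""
    counts = Counter()
    for wikitext in pages:
        for match in LEVEL2_HEADING.finditer(wikitext):
            counts[match.group(1).strip()] += 1
    return counts


# Two made-up pages, standing in for article text read from a dump.
sample_pages = [
    "==Svenska==\n===Substantiv===\n# definition\n==Engelska==\n===Noun===\n# definition\n",
    "== Svenska ==\n===Verb===\n# definition\n",
]
heading_counts = count_level2_headings(sample_pages)
for heading, count in heading_counts.most_common():
    print(f"{count:6d}  =={heading}==")
```

The count only answers the question if each Wiktionary really uses level-2
headings for languages, which is exactly the pattern being asked about.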
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se
We've set up a blog to accompany our annual fundraiser. The headlines
from the blog will be featured in the sitenotice:
I'd like to invite you to submit posts to the blog. These posts can be
provocative, and should give compelling reasons to support the
Wikimedia Foundation. You can draft posts here:
Posts will be selected by a number of people: Cary Bass (our Volunteer
Coordinator), Sandy Ordonez (our Communications Manager), Sue Gardner
(Special Advisor to the Board), and myself. We'll probably try to have
a new post every 2-3 days at least.
Once again, the point of these posts is first and foremost to invite
the general public to donate. :-) Please submit stories in this
If you are willing to act as a comment moderator to weed out spam
& trolling, please contact Cary Bass at <cbass AT wikimedia DOT org>.
For now, this is an experiment and as such, only in English. We will
set up blogs in other languages if this one has a measurable impact on
Thanks for any and all help!
Member of the Board
I'm writing a new Wiktionary parser and I'm wondering if anybody else
who has made or is making or wants to make a Wiktionary parser would
like to share some thoughts.
My main aim is to mine translation data to use with my other project,
Linguaphile, a language translator.
At the moment I'm parsing the XML dump file but I also want an
interface to fetch wikitext from the live Wiktionary.
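
To make the dump-parsing half concrete, here is a rough sketch in Python
rather than Perl. It is not my actual parser. It assumes the dump is a
standard MediaWiki XML export with <page>, <title> and <text> elements; the
helper names and the tiny sample dump are made up, and the namespace URI in
the sample is only illustrative, since it differs between export versions.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_file):
    """Yield (title, wikitext) pairs from a MediaWiki XML export, one page at a time."""
    title, text = None, ""
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's children to limit memory use


# A made-up two-page dump, small enough to read in full.
SAMPLE_DUMP = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>snow</title>
    <revision><text>==English==\n===Noun===\n# The frozen, crystalline state of water</text></revision>
  </page>
  <page>
    <title>pen</title>
    <revision><text>==English==\n==Norwegian==</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE_DUMP)))
for page_title, wikitext in pages:
    print(page_title, len(wikitext))
```

Reading the file as a stream matters because a full dump is far too large
to load as a single tree. Matching on the local tag name keeps the sketch
independent of the export version.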
I'm focusing on the English Wiktionary first because I know its
format, but I'd also like to target the other bigger Wiktionaries.
Another thing I'm thinking about is a central repository for
Wiktionary parser source code. The code I'm making now is in Perl but
I'm sure others have code in Python.
I know several people have parsed the English Wiktionary - has anybody
made parsers for other Wiktionaries yet?
Let's hear what you are working on.
Andrew Dunbar (hippietrail)
Yes, the situation is confusing, and I guess we would like to have a
uniform logo. But as already stated in many earlier discussions (please see
the January, February and March 2007 archives at
http://lists.wikimedia.org/pipermail/wiktionary-l/), the people of
_Wiktionary_ did not much like the new logo, and therefore most communities
did not change anything.
Furthermore, there is still the copyright problem (Scrabble is a registered trademark); please read
and also the still unresolved question of which characters to choose for the logo.
It is total chaos, imho.
Elisabeth Anderl (aka spacebirdy (de,is,es.wiktionary))
Husky wrote:
> Of course, they can have a different name and in the case of the 'old'
> Wiktionary logo, also different text. But the general style of the
> logo should be the same for all language versions. This is not the
> case with the two Wiktionary logos, they are distinctively different.
> If you have a project that should be recognized all over the world, by
> anyone, without any confusion you shouldn't use two logos.
> Let's assume for example that the German Wikipedia would use a
> different logo than the English Wikipedia. That would be very
> confusing for everyone. A logo is the first thing people remember and
> notice about a website, and they use it to recognize it from other
> -- Hay / Husky
> On Nov 14, 2007 1:53 PM, Thomas Goldammer <thogol(a)googlemail.com> wrote:
>> 2007/11/14, Husky <huskyr(a)gmail.com>:
>>> I think this should be an important issue. It can be confusing for
>>> people visiting the website in multiple languages. All projects should
>>> have a uniform logo.
>> Why? They have different layouts of the entries, different
>> conventions, and of course they are in different languages. So why
>> shouldn't they have different logos? :o)
>> Thogo. (de.wiktionary)
>> foundation-l mailing list
>> Unsubscribe: http://lists.wikimedia.org/mailman/listinfo/foundation-l
We currently seem to be using at least two different logos for Wiktionary:
Has there been a community decision on which one is to be preferred?
Toward Peace, Love & Progress:
DISCLAIMER: This message does not represent an official position of
the Wikimedia Foundation or its Board of Trustees.