Hi,
[This may have been asked before, but I couldn't find it in the archives].
I'm looking at trying to use Wiktionary, and other applications of the
MediaWiki software, at the enterprise level.
I'd like to be able to use Wiktionary to handle our multilingual glossaries,
help us with multilingual documentation creation [WikiBook] etc.
However a major drawback I see is that, for example, in Wiktionary [and
probably all the other mediawiki apps], the entire body contents of a page
are stored in one database field.
So, if I've got the word "internet" defined and translated into 10
languages, there's no way that I can programmatically extract the
translations from the database.
In terms of translation portability to other apps etc. that poses a big
problem. Has anyone solved it?
Similarly, in say WikiBooks, the main body of the page is all stored in one
database field. Users can enter HTML, wiki shorthand, regular text etc -
there's no enforced structure - an XML DTD etc - to the content. This again
causes portability problems to other apps etc.
Has anyone got any thoughts/solutions for this?
thanks,
-mm
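One possible workaround for the extraction problem described above is to parse the page text after reading it from the single database field. The sketch below is only an illustration: it assumes translations are written as bullet lines such as `*French: [[internet]]` under a "Translations" heading, which is a common convention but is not enforced by the software. The sample page, the function name and the regular expressions are all hypothetical.

```python
import re

# Hypothetical sample of a page body as stored in one text field.
# The layout (a "Translations" heading followed by "*Language: [[word]]"
# bullets) is an assumption, not something MediaWiki enforces.
SAMPLE_PAGE = """\
==English==
===Noun===
'''internet'''
# A worldwide network of computers.

====Translations====
*Dutch: [[internet]]
*French: [[internet]]
*German: [[Internet]]
*Italian: [[internet]], [[rete]]

===See also===
*[[intranet]]
"""

HEADING = re.compile(r"^(=+)\s*(.*?)\s*=+\s*$")
ENTRY = re.compile(r"^\*\s*([^:]+):\s*(.*)$")
LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_translations(wikitext):
    """Return {language: [linked words]} for the Translations section."""
    translations = {}
    in_section = False
    for line in wikitext.splitlines():
        heading = HEADING.match(line)
        if heading:
            # Any heading switches the section on or off.
            in_section = heading.group(2).lower() == "translations"
            continue
        if not in_section:
            continue
        entry = ENTRY.match(line)
        if entry:
            words = LINK.findall(entry.group(2))
            if words:
                translations[entry.group(1).strip()] = words
    return translations


if __name__ == "__main__":
    for language, words in extract_translations(SAMPLE_PAGE).items():
        print(language, words)
```

This only works as long as contributors follow the assumed layout, which is exactly the weakness of free-form page text that the message points out.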
I think many questions are still open, which is why I am sending this
mail to the lists as well rather than only adding it to the page on meta:
****************
Languages and Communities
Now I read a consideration about languages and communities in Ultimate
Wiktionary that states that people who cannot speak English can't be
part of the community.
My answer to that is: no, that's not true and I will explain why below.
Then there is the other statement that it is better to have separate
language communities so that they can speak in their own language and
everyone can participate, and of course that the aim is to have a lot of
international co-operation across languages, which for practical reasons
normally happens in English.
Well: this would be desirable, but it is not happening within the
Wiktionary projects – most wiktionaries really don't know what the
others do, and apart from a very few of them not much information even
comes through on wiktionary-l.
Well, let's see how communication and communities work within UW.
There will be beer parlours for every language people want to have one
in. There will be people who care about communication issues (mainly
Sysops I suppose, but also other people). Now there are those local
communities talking about everyday issues and then a very important
theme comes along and it is really important to have it in all the other
beer parlours as well. So what does a person do who writes, for example,
only in Italian? He/she writes a message and puts the note "please
distribute to the other beer parlours" below the title. Normally people
caring about communications read everything in "their beer parlour", and so
the one who knows both languages takes the message and transfers it, for
example, to the English beer parlour and adds a link to the English
message (this can really be done with the help of a template), and this
way all the others will know about it as well. Now what is different from
the current projects: there these communications simply do not exist (or
hardly exist) – there is no automatic procedure that involves all the
Wiktionary communities; the news reaches only those who by chance read this
or that post. As in all huge networks, in Wiktionary too we will have local
groups and I suppose that there will be also groups that for example
focus on certain themes (like etymology, pronunciation and soundfiles,
pictures or whatever). And there will very likely be "translator beer
parlours": since translators use UW for work, they have other requirements
than most users – there will be discussions on terminology, on which term
suits best in which context etc., just as now happens on so many
mailing lists. Many of these people are not too computer literate, so
they will come over step by step and will need quite a lot more
technical information. Of course anyone can contribute to any community
in whatever language he/she likes.
So it cannot be said that people who do not speak English cannot have
their community – they will have it for sure, as it is the only way to
go. English will, as so often, be a kind of interface language, as I
suppose it is hard to find someone who can translate from Italian to
Chinese directly or the other way round. So the Chinese beer parlour
will very likely receive its news from the English one.
Many minor languages, for which it does not actually make sense to open a
whole wiktionary since the speakers are just a few – maybe only
500 or even fewer – can have their place as well. This is not even an
option now – but within UW even this will be possible.
Think about all those African languages, where universities will probably
be the first to co-operate. Also consider the Logos contents: we
can have material in very rare languages from the very beginning,
and so these very small groups of native speakers in particular will have
something they can start to work on, and about which they can
communicate and discuss – hopefully in their own language. It is much
easier to bring in people for new languages if there is already
something there for them to work on than if they have to start from scratch.
So much for communities: there is no need to fear that there won't
be local communities – they will surely be there, because people need them,
and interlanguage communication will certainly be better.
I would even love to see local offline meetings where people that do not
actually contribute to the Wikimedia projects are invited.
************
Ciao, Sabine
Mark,
Do not treat me as if I know everything about all languages, because I do
not. Having said that, your attitude towards languages is skewed towards
a particular point of view: you approach it from a speaking person's point
of view. That is the traditional way that linguists, people who profess
some expertise about languages, do this. I am not a linguist but I have
had the benefit of some intense information by Wolfgang Georgsdorf about
speaking and signing languages. Now from my point of view, a dictionary
has always been about written languages. Traditionally when the spoken
language was considered, there were phonetic scripts and these are only
usable to people who studied this. To the vast majority of the human
population they are worthless.
The scripts that you sum up so well, are from a practical point of view
in the same category as the phonetic scripts, they are useless to the
majority of the people that may have a need for them. Your assertion is
correct, "written English is a transcription of spoken English" but
there is a limit to this; you have to speak English to be able to
transcribe it. So in this assertion you show that you can speak English.
I approach languages from a completely different angle than you do. I am
working on a dictionary, I am working on a dictionary on the Internet and
your point of view, while traditional, does not help me a bit in
understanding how this dictionary is going to work for deaf people. To
me it is quite simple; for a deaf person, it is necessary to know a
written language in order to use a computer. Given this skill set you
can present a user interface. It is easy to translate from that written
language to a sign language this way. However, this approach has one
important drawback: it negates the reality of sign languages being
languages in their own right with their own vocabulary with meanings
that do not translate well.
You say that it is absurd to make a distinction between three language
types: spoken, written and sign languages. I can tell you that my
written expressions are different from my spoken expressions. When I
speak, tone of voice is important; when I write, grammar is more
relevant. I use different words depending on whether I am speaking or
writing. I think this is true for everyone, yet you make it an absurdity.
I am glad that I do not have to prove that I know anything about
languages, I am happy that I know enough to create a data design for an
electronic all inclusive dictionary. I still need all the help that I
can get and I still need all comments that people can give because hey,
what do I know.
Thanks,
GerardM
Mark Williamson wrote:
>Gerard, you can find the message to which I was referring at the end
>of the present message.
>
>Please do not treat me as if I know nothing about the subject, because
>I certainly do.
>
>Many deaf people learn to speak and to read lips. I don't really think
>whether or not deaf people learn a spoken language is an issue
>regarding Wiktionary. But certainly, many deaf people DO learn spoken
>languages, even if you think (as you appear to) that spoken and
>written languages are completely distinct entities.
>
>Besides, it's absurd to state that written languages are not related
>to spoken languages. Written English is nothing more than a
>transcription of spoken English (in a standard dialect).
>
>As of yet there is no widely-used written language which does not
>correspond directly or nearly directly to a spoken or signed language.
>
>You say there are three different kinds of languages, spoken, written,
>and signed. This, too, is absurd!
>
>Written languages are always associated with a spoken language (or in
>some cases, a signed language). Thus, when we say "Finnish", we're
>referring to the language spoken by the majority of Finns, as well as
>the language written by the majority of Finns, because in the minds of
>all reasonable people they are a single entity.
>
>Do you think that if Korean kids had to learn to write
>Dutch-written-language in school rather than Korean-written-language,
>it would be just as easy? Obviously, it would not. This is because
>knowing spoken Korean makes it much easier to learn written Korean,
>and knowing spoken Dutch makes it easier to learn written Dutch,
>because there are many simple correspondences. Learning a different
>written language, however, is just as difficult as learning a
>different spoken language.
>
>But again, this isn't the main point. Please read the e-mail below:
>
>---------- Forwarded message ----------
>From: Mark Williamson <node.ue(a)gmail.com>
>Date: 24-Aug-2005 03:06
>Subject: Re: [Wikitech-l] Re: Spell checking in MediaWiki
>To: Wikimedia developers <wikitech-l(a)wikimedia.org>
>
>
>Hi,
>
>That page states that the ways for writing signed languages are closer
>to Chinese characters than Latin script.
>
>This is completely incorrect.
>
>Please see below for my suggestion on signed languages.
>
>There are 4 main ways of writing signed languages:
>
>1) With word-for-word glosses in a spoken language. For ASL or BSL
>this is usually English; for InSL it may be Hindi or another Indian
>language or English; for Chinese SL it will probably be Chinese. While
>this is suitable in most cases for writing whole sentences and
>recording syntax and grammar, it gives no specific information about
>what a sign looks like and thus is completely unsuitable.
>
>2) Sutton SignWriting. This writing system is copyrighted and use of
>it is not free. However, it is currently the most widely used of any
>of the 3 main sign-writing systems today, at least by deaf people
>(researchers are more likely to use HamNoSys or Stokoe). It is more
>like Korean letters: each part of any given symbol says something
>specific about how to form the sign, but they are combined to form
>what may appear to the uninitiated to be a logographic sign, when in
>fact it is most certainly not. More information at
>http://www.omniglot.com/writing/signwriting.htm
>
>3) HamNoSys. A very complex system that can be best compared with a
>"Narrow transcription" of a spoken language using the IPA
>http://en.wikipedia.org/wiki/Phonetic_transcription#Narrow_and_broad_transc…
>, it is used mostly by researchers. However, it's much easier to
>represent on computers than Sutton SignWriting. More information at
>http://www.sign-lang.uni-hamburg.de/Projects/HamNoSys.html
>
>4) Stokoe. Stokoe is actually, in a sense, the basis of HamNoSys. It
>is more equivalent to the Latin script than any of the other systems,
>in fact it borrows many letters from it. Its use is restricted mostly
>to researchers today. Some people accept the minor changes made to it
>by a BSL researcher, however any more drastic changes are usually
>considered to be separate systems.
>
>Suggestion: Use http://www.unm.edu/~grvsmth/signsynth/ -- data will be
>stored as a computer representation of Stokoe, but can be played back.
>Demo available at http://www.panix.com/~grvsmth/signsynth/ ...
>
>Although the native rendering for SignSynth is VRML (Virtual Reality
>Markup Language), I imagine it would be quite easy to convert
>automatically or even to make it render initially in a more
>widely-used format, such as some sort of video or animation format.
>
>This would limit the space that would be taken up by the storage of so
>many videos -- there are hundreds of signed languages on the planet
>today, and to store videos for each word in all of them (or even just
>the major ones and the 1000 most frequent words) would take up a lot
>of space.
>
>While it is obviously not perfect (look at the forehead... oh my is
>she ugly!), it's definitely good enough that someone could imitate it
>and their imitation would be the proper sign very accurately, and it's
>also good enough that anybody who can speak sign language (there are
>better terms than "speak", but they seem awkward to me) should be able
>to understand it well.
>
>Presumably, the developer of the software could be solicited for
>further cooperation.
>
>Mark
>
>On 24/08/05, Gerard Meijssen <gerard.meijssen(a)gmail.com> wrote:
>
>
>>Lars Aronsson wrote:
>>
>>
>>
>>>Sabine Cretella wrote:
>>>
>>>
>>>
>>>
>>>>Well Lars, we are not so far away from making a different point
>>>>in that. It is one of the usages we have in mind with Ultimate
>>>>Wiktionary. Since there we will have words in all languages and
>>>>have these words in a relational database it is easy to "extract
>>>>an actual spellchecker" every now and then.
>>>>
>>>>
>>>>
>>>>
>>>I keep hearing these promises, but "seeing is believing"! Have
>>>you started actual work on UW yet, or are you sitting idle while
>>>waiting for Wikidata to be released? Will there be an English
>>>free dictionary that can compete in size and quality with Aspell's
>>>current dictionary by the end of 2005? Or by the end of 2006?
>>>
>>>
>>>
>>Hoi,
>>Actual work on UW itself is underway. Here you can find the data design
>>http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_data_design This
>>design is very much open for comments and I am happy to say that many
>>comments that were given have led to changes. I name but a few changes
>>that came about this way; Can sign languages be included - now they can,
>>Can attestations be included - now they can.
>>
>>As Ultimate Wiktionary is dependent on Wikidata, there is little option
>>for us but to wait until it is ready. It is really important that
>>Wikidata is done well because it will not serve only Ultimate Wiktionary
>>but also Ultimate Wiktionary.
>>
>>When both Aspell and Ultimate Wiktionary are considered Free, it should
>>be possible for us to work together. Once we find this cooperation
>>possible, we could host the data currently included in Aspell in UW. In
>>return we would provide a publicly accessible website where it is easy
>>to add new words that will end up in Aspell. Even when we do not
>>cooperate, there will be languages that currently do not have a
>>spellchecker. These spellcheckers I am particularly excited about because
>>this is where we will be able to add value.
>>
>>Without a massive infusion of data, it will be hard to predict when we
>>have as many words as Aspell does for languages where Aspell has a
>>dictionary.
>>
>>Then again, if we create a wordcount on the Wikipedia content, run it
>>against a spellchecker, the resulting list should be spelled correctly
>>and could be included in UW. Particularly for our biggest wikipedias and
>>the amount of topics covered, it should be a list that might be close to
>>the size of what Aspell has. We will also have a long list of words
>>missing in Aspell. We will however not get a spellchecker for British or
>>American in this way.
>>
>>Thanks,
>> GerardM.
>>
>>
>>
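The word-count idea in the quoted message above (count the words in Wikipedia content, run them against a spellchecker, keep the accepted words and list the rest) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `is_known` callback stands in for whatever spellchecker is used (Aspell is not actually called here), and the function names and sample text are hypothetical.

```python
import re
from collections import Counter

# Runs of Unicode letters; digits and underscores are excluded.
WORD = re.compile(r"[^\W\d_]+")


def word_counts(text):
    """Count lower-cased words in a piece of article text."""
    return Counter(word.lower() for word in WORD.findall(text))


def split_by_spellchecker(counts, is_known):
    """Split counted words into those the checker accepts and the rest.

    `is_known` is a stand-in for a real spellchecker lookup.
    """
    known = {w: n for w, n in counts.items() if is_known(w)}
    unknown = {w: n for w, n in counts.items() if not is_known(w)}
    return known, unknown


if __name__ == "__main__":
    sample_text = "The wiki is a wiki. The Wiktionary grows."
    checker_words = {"the", "is", "a", "grows"}  # toy word list
    known, unknown = split_by_spellchecker(
        word_counts(sample_text), checker_words.__contains__
    )
    print("accepted:", known)
    print("missing from checker:", unknown)
```

The accepted words correspond to the list that "should be spelled correctly", while the rejected words are only candidates: they mix genuine words missing from the spellchecker with real misspellings, so they would still need review.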
Mark Williamson wrote:
>I meant that it's unsuitable for a dictionary.
>
>Any good dictionary will tell you how a Chinese character is
>pronounced in any given variety.
>
>Chinese characters wouldn't be so difficult to learn for somebody who
>only knew a signed language (in much of the developing world, deaf
>people don't learn to read and write spoken languages), because they
>can be associated with signs on a morphemic basis.
>
>However, a spoken language written alphabetically used for glosses
>will require that any given deaf person learn said spoken language
>before being able to read a text.
>
>This also means that a deaf man in Boston who wants to know the ASL
>equivalent of the English word "boarish" will not get an answer by
>looking in a dictionary and finding the gloss "boarish" in the field
>for an ASL translation.
>
>Mark
>
>
Hoi
First of all, there are three different types of languages. There are
the written languages, the spoken languages and the signed languages.
They are quite different. A deaf person does not learn a spoken
language, he learns a written language. The difference is quite crucial.
How a written language relates to a spoken language is something that a
deaf person does not appreciate. This relation is often tenuous at best.
It makes our written language as abstract as Chinese characters for deaf
people.
Your assumption that deaf people have to learn written languages is
correct. In many ways it is an essential skill. It is awkward that a
dictionary needs to have another language to make it accessible. To a
large extent this is just that, awkward. It is not a reason not to
include sign languages in Ultimate Wiktionary as UW intends to have all
words in all languages. When you have a look at a recent version of the
data design, you will find that I included Wolfgang Georgsdorf's
methodology of providing metadata about signs in there as well.
So when someone wants to find ASL for boarish, and he is literate, he
will be able to find it. Being illiterate makes using a computer
practically impossible. The idea of having a user interface for ASL will
be a dream, it will not be feasible in UW mark I. The best way it might
work is that you have the words in the UI and when you click it, you get
a signed instruction.
This discussion is not really relevant to the WIKITECH mailinglist so I
crosspost it to the WIKTIONARY mailing list where it is of interest.
Thanks,
GerardM
>On 27/08/05, Delirium <delirium(a)hackish.org> wrote:
>
>
>>Mark Williamson wrote:
>>
>>
>>
>>>1) With word-for-word glosses in a spoken language. For ASL or BSL
>>>this is usually English; for InSL it may be Hindi or another Indian
>>>language or English; for Chinese SL it will probably be Chinese. While
>>>this is suitable in most cases for writing whole sentences and
>>>recording syntax and grammar, it gives no specific information about
>>>what a sign looks like and thus is completely unsuitable.
>>>
>>>
>>>
>>>
>>Why does this make it completely unsuitable? A large proportion of
>>Chinese characters give no specific information about how they are
>>pronounced---and indeed are pronounced radically different by Mandarin,
>>Cantonese, and Japanese speakers---but that doesn't seem to have led to
>>them being deemed unsuitable for use in a written language. At the very
>>least, using Chinese characters to write Chinese SL is no worse than
>>using Kanji to write Japanese.
>>
>>-Mark
>>
I take this theme up since it is not only an Ultimate Wiktionary problem
... it is an actual problem for many minor languages.
When there are several possibilities to write a word within a language
these should be treated on an equal level. No-one may ever discriminate
against one of the possible spellings. The important thing is that if these
different spellings can be attributed to a certain "branch" it should
be done. Those that cannot be classified just receive the general
language classification. Gerard will be able to explain this better I
suppose, but being on his way to Wikimania I am not so sure if he has
time to explain.
Anyway: I am a translator for EN-DE and IT-DE, and right now we have a
pretty weird situation with German. There was a spelling reform that was
at first adopted by all federal states, and in autumn this year the new
spelling should have become the only valid one... now in Germany these
decisions are not taken centrally by the federal government in Berlin,
but by the ministries for education and culture of the individual federal
states. Some of these federal states, among them Bavaria, will not
accept the deadline of autumn this year, but will accept both old and new
writing. It is a funny situation for my job, since it could mean that a
customer tells you that he wants the old spelling or the new spelling
depending on where he lives and what he prefers, or in most cases they
will not even care how things are written, since many don't even know
the new spelling rules ... it is really a strange situation: imagine
someone in Bavaria translates a text - legally he/she may use whatever
orthography he/she likes - if the customer lives in a state where the old
orthography is not valid anymore this can lead to a bad surprise.
But let's talk about minor languages. We have some difficulties on the
nds wiktionary - people think that the only way to write correctly is
to follow the Sass orthography. Some days ago I had a longer telephone
conversation with one of the directors of the Institut für
niederdeutsche Sprache (institute for nds) - he explained that there are
at least six different acknowledged ways of writing nds and if we go into
detail 200 to 400 ways of writing (including also dictionaries from
around 1920 etc.) can be defined - so accepting only one way of writing
is discrimination in my opinion. They all need to be accepted - the
important thing is that there is a distinction from one to the other.
How could we achieve this? In the current wiktionary, by marking all
unclassified words just with nds. Words that can be classified receive
nds-ABC, nds-DEF, nds-SASS, nds-xyz. So not only the single term is to
be classified, but also the definition (if possible) - if it is not
classified there's simply no reference to a certain class.
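The labelling scheme just described (plain "nds" for unclassified words, "nds-SASS" and similar for classified ones) could be modelled along these lines. This is only a sketch: the class name, field names and placeholder spellings are hypothetical, and the example says nothing about how the wiktionary or UW actually stores such labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Spelling:
    """One written form of a word, optionally tied to an orthography."""

    word: str
    language: str = "nds"
    orthography: Optional[str] = None  # e.g. "SASS"; None = not classified

    @property
    def label(self):
        # Unclassified spellings carry only the general language code.
        if self.orthography is None:
            return self.language
        return f"{self.language}-{self.orthography}"


def group_by_label(spellings):
    """Collect the written forms under their label, keeping all of them."""
    groups = {}
    for spelling in spellings:
        groups.setdefault(spelling.label, []).append(spelling.word)
    return groups


if __name__ == "__main__":
    # Placeholder forms, not real Low German spellings.
    forms = [
        Spelling("form-1", orthography="SASS"),
        Spelling("form-2", orthography="ABC"),
        Spelling("form-3"),
    ]
    print(group_by_label(forms))
```

Every spelling stays in the data; the label only records which orthography, if any, it has been attributed to.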
We have a similar situation for Sicilian - it can be subdivided at least
into the different provinces, but even more detailed if one wants. The
strange thing is: there never was any discrimination, never any struggle,
some discussions, yes, but no-one ever tried to impose "one possibility
of writing a word" as the ultimate one.
Now the next problem is: if we follow the wiki-way and it comes to a
vote: if people who are convinced that ABC is the only possible way of
writing start up a project and after some time hold the vote, they
will all normally vote for that writing, but is there a real majority? What
if there are institutions that know more about that and these are just
not being considered since people are tooooo fond of one way of writing,
don't think about democracy and don't think about the damage they would
do to their own language and culture by excluding all other possibilities?
What happens to people who would like to contribute if two huge
resources are starting to be non-democratic?
What happens to the language of a country/region if only one way of
spelling is considered and all the others are discriminated against?
What happens to the culture in this case?
Even if many will not agree, I am convinced that this situation can be
compared to some African languages where culture is being lost since
people don't study in their own language, but only in a foreign
language. Language transmits culture, language transmits feelings and so
much more ... it is not possible to exclude existing language (or better
spelling) to favour a particular one only because that is the beloved
one for some people.
Going back to UW: for us every spelling that is correct has its right
"to live" - and so it will be there. If there are more possibilities:
you will find them - no discrimination. UW is there not only for the big
languages - I see it as a very particular chance for minority languages
since using it at a certain stage we will be able to create even a
Papiamento-Zulu glossary - something no editor would ever think about.
Something that will make the culture of many of those small cultural
groups live on.
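To illustrate why a shared store makes unusual language pairs possible, here is a small sketch that derives a glossary for any pair from words linked to a common meaning. The row layout, the concept identifiers and the function name are assumptions made for this example only; it is not the UW data design.

```python
from collections import defaultdict

# Hypothetical rows: (concept identifier, language code, word).
ROWS = [
    ("house", "de", "Haus"),
    ("house", "fr", "maison"),
    ("house", "it", "casa"),
    ("water", "de", "Wasser"),
    ("water", "fr", "eau"),
    ("water", "it", "acqua"),
    ("bread", "it", "pane"),
]


def build_glossary(rows, source, target):
    """Pair source-language words with target-language words per concept."""
    by_concept = defaultdict(lambda: defaultdict(list))
    for concept, language, word in rows:
        by_concept[concept][language].append(word)

    glossary = {}
    for words in by_concept.values():
        for source_word in words.get(source, []):
            glossary.setdefault(source_word, []).extend(words.get(target, []))
    # Drop source words that have no counterpart in the target language yet.
    return {word: targets for word, targets in glossary.items() if targets}


if __name__ == "__main__":
    print(build_glossary(ROWS, "de", "fr"))
    print(build_glossary(ROWS, "it", "de"))
```

Nobody has to compile the pair by hand: once two languages each have words attached to the same meanings, the combination falls out of the data, and the same would hold for a rare pair.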
Ultimate Wiktionary is maybe a peculiar name - maybe with another name
there would have been fewer problems ;-) but be sure: it is only a name -
it came out in a conversation, since for some language combinations it
will really be the ultimate solution since otherwise it would have been
very likely that some languages where we now only have approx. 100 terms
would never have had the possibility to have an own place to grow and
cultivate their glossaries.
Often there were loads of discussions with colleagues, people involved
in the localisation and language industry, institutions and whatever ...
we did not just say: let's do it in some way - this "let's do it in some
way that we do not need to do the same work over and over again" for me
started on 31 August 2004 - almost a year ago now. Before that I had already
worked on finding a solution for a glossary repository in many languages
- when I found wiktionary I thought: this is near to what I want, but I
missed the interchangeability - I could not create a DE-FR combination
on the Italian wiktionary ... interwiki-links then were a step ahead,
but it still was not what I imagined. And then came that day when I
noted the first templates on it.wiktionary.org ... it was the beginning
of an amazing adventure - maybe the most incredible one I can imagine
for people working with languages. Things went ahead - hours and hours
of discussion, of sorting out, of talking with people ... I suppose many
of you cannot even imagine what it means to almost reach a dream dreamt
by many linguists over and over again - for years and years ... or maybe
it would be better to say decades.
We are almost there: this community is part of it - isn't that great?
Isn't only the thought to be part of a language and culture preserving
project that allows completely new results exciting? Many of us (and I
am not talking about people working on wiktionary) think so - why do you
think so many colleagues answer to a simple mail to a mailing list of
translators where I asked for co-operation for Wikimania? Those people
know my postings (I often write about wiktionary and ultimate wiktionary
- to the lists and on my blog). We are on the right track - it is not an
easy way - of course there will be problems to solve, but it will be fun
to solve them.
Hmmm ... now this mail became much longer than I really wanted ... it is
time to go to bed ... tomorrow is the last day to finish my
translation work and the day after Wikimania starts ... and I don't
know what awaits our team of translators.
Well, I don't write often, but when I start it is hard to stop.
Ciao and have a great time!
Sabine
*****
Sabine Cretella
Translations IT-DE, EN-DE
s.cretella(a)wordsandmore.it
skype:sabinecretella
Meetingplace for translators.
http://www.wesolveitnet.com
Hi,
The uploading of the dictionary of the French Academy (edition 1932, 33,000
articles) is again postponed due to legal reasons. We
Also some work is postponed waiting for the Ultimate Wiktionary, mainly words
which already exist in other Wiktionaries. We would like to know what the
schedule for implementing the UW is, so we can plan some of the work.
Some information about transferring from the current Wiktionary to the UW would
be useful.
Best regards,
Yann
--
http://www.non-violence.org/ | Site collaboratif sur la non-violence
http://www.forget-me.net/ | Alternatives sur le Net
http://fr.wikipedia.org/ | Encyclopédie libre
http://www.forget-me.net/pro/ | Formations et services Linux
The Verba Volant mailing list <http://www.verba-volant.net/> sent a
mail to their subscribers today stating the service (which seems to be
sending out multilingual quotations) was "offered thanks to the
cooperation between Wikipedia <http://en.wikipedia.org/> and Logos
Group <http://www.logos.it/>."
To avoid any further confusion over this, there have been discussions
between Jimmy Wales, Logos, and other interested parties (namely,
Gerard and Sj), but any discussions have simply resulted, so far, in
some proposed ideas about how Logos might work with Wiktionary, not
Wikipedia as the mail suggested. Gerard has just put together a page
about this proposal at
<http://meta.wikimedia.org/wiki/Logos.Wiktionary>. For background
information, please see
<http://meta.wikimedia.org/wiki/Ultimate_Wiktionary>.
Discussion about any aspect of this proposal is still very welcome
on those meta pages, or on the Wiktionary mailing list, and the fact
that discussions regarding the proposed project are taking place with
external parties should not be seen as a reflection of any decision
already having been made. Comments by the existing Wiktionary
community, and anyone else interested, would be very valuable as we
move towards various decisions about the project.
Angela
Heiko Evermann wrote:
>Hi Gerard,
>
>slowly I am beginning to be fed up with this discussion. I ask you
>not to misrepresent my position any longer.
>
>
>
>>>>On the one word you told me by e-mail that was a hundred percent
>>>>error turned out to be used 186 times on the internet (I have a
>>>>screenshot of google if you don't believe it) and I found the writer
>>>>who used it in her texts - she's a reporter for a newspaper in North
>>>>Germany and has been writing articles in Plattdüütsch for over 13
>>>>years for them now. I also contacted her to ask her how to categorise
>>>>her writings.
>>>>
>>>>
>>>Another mysterious supporting source.
>>>
>>>
>>You can ask for the name of this writer; it can be given.
>>
>>
>Please list this word, please name this writer, so that we can discuss this
>one item in detail. I had listed several gravely misspelled words (according
>to all spellings that I know of), so please tell us which one of them you are
>referring to here. Please name your sources. You could e.g. cite any dictionary
>whatsoever to prove your point. I have given details. I can show you several
>dictionaries that support my spelling. You so far have given 0. In words:
>*ZERO*. That is not enough.
>Besides, you have not even given any information about the origin of the
>list. We know which web site it is from, but from the misspellings in
>capitalization it is clear that it is derived from a text. I would really
>like to know which text that is. The problem is that you so far have not done
>anything to back up your claims with any credible source. Again: citing one
>website that is mirrored to some other places is circular reasoning, but no
>proof. Again: wiktionary is not a dump for all the misspellings in the world.
>You would not get away with that in any other wiktionary.
>
>
We have a letter by a university professor informing us that bitter
fights are waged over what is "correct" spelling in nds. Anyway the fact
that you have your sources in itself only proves that you can attribute
the information that you provide to an orthography. "Werner Eichelberg
sien dollet Wöörbook" is the source of Sabine's list; Werner indicated
that the sources were articles that were translated from deutschplatt.
Your assertion that this must be incorrect is based on the availability
of the resources that you have. There are some 200 valid orthographies,
and your assertion that some words *must* be wrong can only be substantiated
once you have considered them all. The sheer fact that this source has
been indicated for several years as a good resource on the nds.wikipedia
must count for something (it is not just any old website :)
You again assert that they are misspellings. Given that Sabine is in the
process of getting more resources for this discussion, it would be
prudent to give it some time and not insist on instant resolution
because this is not feasible.
>
>
>>>>So: anything is out of discussion here. I am not going to let anyone
>>>>impose things on me; I prefer to research and adapt the contents we
>>>>have to that.
>>>>
>>>>
>Now the thing is that you wanted to impose things on us. Besides, you
>created facts by importing this list into it.wiktionary.org. A list that
>is highly suspicious.
>
>
>
First of all, I have done nothing here; I have not imported the list. But
even given your point of view that only what you know to be correct should
be inserted, I do think that what Sabine did is in line with the Wiki
tradition. She provides information that people can comment on. Again, I
urge you to identify the words that you know to be correct for the
orthography that they represent. This will ultimately give us a list of
words that cannot be attributed to any orthography because they are
wrong; they will then be marked according to what we know at that time.
>>If Heiko ignores substantiated facts, he will make what he does
>>in the Wiktionary world irrelevant. This would not be about edit wars. I
>>do not expect that Heiko will get into an edit war, as there is a perfect
>>solution, and that is using proper labelling. When Heiko indicates words
>>to be according to the Sass orthography or whatever orthography Heiko
>>knows, his work will be most relevant. There are plenty of possible
>>solutions here.
>>
>>
>No Gerard, *you* have not delivered substantiated facts. Why haven't you
>done that all along? This discussion has been going on for several weeks
>now. And your only argument is that you found this spelling somewhere on
>the internet and therefore it is a valid spelling. You could of course try
>to import all this data with the tag "very private spelling of xy", but
>then I really have to ask: who would profit from that?
>Low Saxon is in bad shape nowadays. And an nds.wiktionary.org needs to
>present data that reflect actual current usage of words and not private
>spellings. If there should be a place for very private spellings in
>wiktionary or UW, then certainly only *after* inserting the real, current
>usage as substantiated by dictionaries etc. of whatever spelling. What I
>have been doing is cleaning up (as can be seen in
>nds.wiktionary.org/wiki/Special:Recentchanges). And I do think that this
>has helped to make the data a lot more relevant.
>
>
>
I am not really a party to this, because I am not gaining access to new
resources. What I am doing is showing that what is being done is
relevant and an acceptable way of going forward.
To make it relevant you have to state what orthography a word is in.
http://nds.wiktionary.org/wiki/ankieken (a recently changed article)
does not indicate an orthography and is, from my point of view, as
relevant as any of the stuff Sabine uploaded. Low Saxon may be in a bad
way; it is helped by working together and doing a good job of
categorizing words by the orthographies that are in use. It is not
helped by us endlessly exchanging mail. I will take you more seriously
when you start indicating orthographies. You can trust Sabine that she
is writing to all these resources, and I expect that it will lead to
word lists that indicate what orthography they belong to. This in
turn will make what we both aim for more objective: a good resource for nds.
>>>>I did not once mention your name, and what I had/still have is a
>>>>general question - it is not the first time that people know
>>>>how to improve things, but just complain about others not doing as
>>>>the writer supposes - and talking about wikis: this is not the way to
>>>>go and that's it. It is a very general question. If you need proof
>>>>of that: it is there in the histories.
>>>>
>>>>
>The same accusations that Sabine put in her anonymous mail to this list
>were also found in her answer to my post to it.wiktionary.org, at least to
>my (albeit limited) understanding of Italian. And therefore this forum is
>indeed a place to discuss these things.
>
>
>
>>>If someone contests something you should listen to them and enter into a
>>>discussion. Not just boldly go on and add the stuff elsewhere!
>>>
>>>
>>Here you show that you do not appreciate what Wiktionary is about; every
>>wiktionary is about all words in all languages. Therefore it is
>>completely acceptable to add this content in the Italian Wiktionary.
>>
>>
>Then please mark these entries as what they are: Low Saxon of awful
>quality, full of errors and following an unsubstantiated, very private
>spelling of one individual.
>Or prove otherwise, which so far (even after lots of mails and
>discussions) you have not done.
>
>
>
As mentioned before, this is the pot calling the kettle black. Start
indicating orthographies and you prove the quality of your contributions.
>Please do at least try, so that we (you and I) can discuss the real issue:
>the quality of your data.
>Again: this is not Sass-spelling vs. the world, but one very private,
>inconsistent, doubtful spelling against the rest of Low Saxon, and we will
>really clean this mess up in nds.wiktionary.org.
>At least unless you back up your position with real facts. Now this should
>not be too difficult, should it?
>
>And perhaps we can then go back to work, because there really is a lot of
>work to do and the list could already have been cleaned up if we didn't
>have this discussion and if you then imported the corrected data for us,
>or if you gave us access to the import script. (Which, by the way, I have
>asked for several times so far. I also need it because I have quite a long
>list of words (apart from your list) derived from the Low Saxon
>translation of KDE that I would like to import.)
>
>
>
The software we are using is known to you; we use the pywikipedia bot
software. No problems there. Generating the source for the bot is
something that often differs depending on what we have for input.
It is typically manual work. If you have a list of Sass-compliant words or
a list of words in another orthography (preferably with at least one
translation), I am quite happy to make you a source so that you can
upload this. Are these KDE files .po files?
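If they are .po files, getting the word pairs out of them could be
sketched roughly as below. This is only a minimal sketch: the function
name po_pairs and the sample text are made up for illustration, it is not
part of the pywikipedia software, and it assumes simple single-line
msgid/msgstr entries (multi-line strings, plural forms and msgctxt are
not handled).

```python
import re

# Hypothetical helper, not part of pywikipedia: collect (msgid, msgstr)
# pairs from the text of a gettext .po file. Only simple single-line
# entries are matched; this is an assumption of the sketch.
ENTRY = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)


def po_pairs(po_text):
    """Return a list of (source, translation) tuples."""
    pairs = []
    for source, translation in ENTRY.findall(po_text):
        # Skip the header entry (empty msgid) and untranslated strings.
        if source and translation:
            pairs.append((source, translation))
    return pairs


# Made-up sample, for illustration only.
sample = '''msgid ""
msgstr ""

msgid "File"
msgstr "placeholder translation"

msgid "Untranslated"
msgstr ""
'''

print(po_pairs(sample))
```

Each resulting pair would still need to be labelled with its orthography
before a source for the bot is generated from it.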
>Is there really no way for us to cooperate? Does anyone else here
>understand what I have been talking about all this time?
>
>
>
There are many ways in which we can cooperate, but the bottom line is: to
improve nds content you have to indicate the orthography, because without
it the quality of the information is debatable. So again, let us work
together and agree that knowing the orthography is key to proving the
worth of individual lemmas.
NB: this whole exchange of e-mails is not really relevant to
Wikipedia-l, so from now on I will only answer on Wiktionary-l.
Thanks,
GerardM