Heiko Evermann wrote:
Hi Gerard, ok, let us do this in private mail:
We have a letter by a university professor informing us that bitter fights are waged over what is "correct" spelling in nds. Anyway, the fact that you have your sources in itself only proves that you can attribute the information that you provide to an orthography. "Werner Eichelberg sien dollet Wöörbook" is the source of Sabine's list; Werner indicated that the sources were articles that were translated from deutschplatt.
So do you or does Sabine have the original articles that led to the list by Werner Eichelberg?
No, Werner has these original articles.
Your assertion that this must be incorrect is based on the availability of the resources that you have. There are some 200 valid orthographies and your assertion that some words *must* be wrong can be substantiated only when you have considered them all. The sheer fact that this source has been indicated for several years as a good resource on the nds.wikipedia must count for something (it is not just any old website :)
No, this is not the way to do it. I have shown examples that were wrong. In all there have been three people in nds.wiktionary.org, all familiar with the language, who state that this list is of low quality. We have given examples. In such a case it is then up to those who advocate *for* the list to show that this list conforms to some standard. We have given enough evidence. The main example is "abschreim". This is High German slang. The evidence for this is
1) the word exists as such in High German
2) it uses the prefix "ab". Low Saxon uses "af-" here. I can document that for North Low Saxon in general (Sass), Schleswig-Holstein Westcoast (Neuber), Mecklenburg (Hermann-Winter).
3) it uses "-ei-" (German) instead of "-i-" (Low Saxon)
4) a Google search lists 249 instances, most of them written in south German slang. Search for "abschreim dat för" to filter out High German by also searching for two very common LS words (neuter article and a common preposition) and you get deutschplatt and a mirror of it.
So this shows that this entry has nothing to do with being one of 200 valid orthographies. And now it is really up to you to show which LS orthography this word conforms to.
You are missing the point in that you want to make a rule from an example.
You again assert that they are misspellings. Given that Sabine is in the process of getting more resources for this discussion, it would be prudent to give it some time and not insist on instant resolution because this is not feasible.
To me that sounds like you are evading a discussion.
At this moment there is not much point to the discussion. It can wait until we have better resources.
First of all I have done nothing here; I have not imported the list, but given your point of view that only what you know to be correct should be inserted I do agree that what Sabine did is in line with the Wiki tradition. She provides information that people can comment on.
Do you really think that people in it.wiktionary.org will comment on that? If you want comments, you can get them in nds.wiktionary.org, where we are doing exactly that. And therefore I really cannot understand why she uploaded this data into it.wiktionary.org. This just means that when UW will be available, we will get all that data again. We can clean things up in nds.wiktionary.org, but all the problems will reappear then.
We will get comments from the people that Sabine is contacting.
Again, I urge you to identify the words that you know to be correct for the orthography that they represent. This will ultimately give us a list of words that cannot be attributed to any orthography because they are wrong; they will then be indicated for what we will know at that time.
I think it is not necessary to flag all words as nds-sass. The data that I have entered/corrected is generally correct. If that is not sufficient for you, I would still propose to have the main heading flagged with -nds- and to have subheadings indicating the spellings and areas where this is correct. To me it does not make so much sense to have all entries flagged as "Plattdüütsch (Sass, Noordneddersassisch)". Or is that really what you want to have? For me (and I think that the others from nds.wikipedia.org think the same) it would be sufficient to list deviant forms as such and I would also like to place notes in the articles linking to other forms. But that seems to be impossible? My suggestion is to work through the list to clean up what can be easily cleaned up and then to have a look at what remains.
You are wrong. Without flagging words as nds-sass, what you do is not relevant. There is no such thing as "generally correct"; if it is correct, it can be attributed to one or more orthographies. If this cannot be done, the quality is as debatable as the stuff you object to. By indicating that something is Sass, there is a black and white situation; by saying that something is nds, it can be correct for any of the 200 orthographies, and at this moment in time I do not take your word for it being non-nds.
No Gerard, *you* have not delivered substantiated facts. Why haven't you done that all along? This discussion has been going on for several weeks now. And your only argument is that you found this spelling somewhere on the internet and therefore it is a valid spelling. You could of course try to import all this data with the tag "very private spelling of xy", but then I really have to ask who should profit from that? Low Saxon is in a bad shape nowadays. And an nds.wiktionary.org needs to present data that reflect actual current usage of words and not private spellings. If there should be a place for very private spellings in wiktionary or UW, then certainly *after* inserting the real, current use as substantiated by dictionaries etc. of whatever spelling. What I have been doing is cleaning up (as can be seen in nds.wiktionary.org/wiki/Special:Recentchanges). And I do think that this has helped to make the data a lot more relevant.
I am not party in this really because I am not gaining access to new resources. What I am doing is showing that what is being done is relevant and an acceptable way of going forward.
Sorry, I do not see that you have shown that the list is relevant.
I am talking about how Sabine is working towards a resolution. I am not talking about this list here.
To make it relevant you have to state what orthography a word is in. http://nds.wiktionary.org/wiki/ankieken (a recently changed article) does not indicate an orthography and is, from my point of view, as relevant as any of the stuff Sabine uploaded.
I moved it to the basic form. The word is common Low Saxon and does not require any further flagging.
I disagree. If a word is to be correct, we need to know for what orthographies it is correct. If this cannot be done or if you do not want to specify this, you devalue your work.
As mentioned before, this is the pot calling the kettle black. Start indicating orthographies and you prove the quality of your contributions.
Words that are common Low Saxon do not need further flagging. Besides I have not seen such a thing in any other wiktionary.
We do a similar thing with Chinese, where we indicate if it is Traditional or Simplified Chinese. We do need flagging; without it, it is not clear that it is correct.
The software we are using is known to you; we use the pywikipedia bot software. No problems there. Generating the source for the bot is something that is often different depending on what we have for input. It is typically manual work. If you have a list with Sass compliant words or a list with words in another orthography (preferably with at least one translation) I am quite happy to make you a source so that you can upload this. Are these KDE files .po files?
I have had a look at the bot and I have had a look at the import file that Sabine sent me by Email. (Which uses a different markup for start/end than in the word file.) I have tried to understand the way pywikipedia bot works for wiktionary, but I have not understood it. I would really be grateful for a short example consisting of
- a short import file with 2-5 entries
- the command line to use.
I would be able to work from there. The data is in http://sourceforge.net/projects/aspell-nds. It is a word list that we derived from the KDE po-files. We filtered out the Low Saxon parts (the po file maps English to Low Saxon), broke it into words, sorted it, counted it (with a short shell script) and used that to find inconsistencies in our KDE op Platt. We then transformed this list into an input file for the aspell spell checker. This list uses some words with a spelling that deviates from Sass: Latin-based words are subjected to the double vowel for long vowels. I would correct that and add the German translation plus the grammar information (noun, verb etc). The list is about 1500 words, and I think that it should be relatively easy to create a list of several hundred words for importing quite quickly.
Send me your Sass list and I will send you back an import list with a command how you can upload it. This will include the Sass indications.
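The steps described above (take the translated side of the po files, break it into words, sort and count) can be sketched roughly as follows. This is not the shell script that was actually used; it is a minimal Python illustration that assumes standard gettext po syntax. The function names and the sample strings are hypothetical placeholders, not checked Low Saxon data.

```python
import re
from collections import Counter

# Runs of Unicode letters (no digits, no underscore), so that "ö" and "ü" count.
WORD = re.compile(r"[^\W\d_]+")


def _quoted(line):
    """Return the text between the first and the last double quote of a line."""
    start = line.find('"')
    end = line.rfind('"')
    return line[start + 1:end] if end > start else ""


def msgstr_texts(po_text):
    """Yield the translated (msgstr) strings found in po-format text."""
    current = None  # None while we are not inside a msgstr entry
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgstr"):
            if current is not None:
                yield current
            current = _quoted(line)
        elif line.startswith('"') and current is not None:
            current += _quoted(line)  # continuation line of a msgstr
        else:
            if current is not None:
                yield current
            current = None
    if current is not None:
        yield current


def word_counts(po_text):
    """Count lower-cased words over all msgstr strings."""
    counts = Counter()
    for text in msgstr_texts(po_text):
        text = text.replace("\\n", " ")  # escaped newlines inside po strings
        counts.update(w.lower() for w in WORD.findall(text))
    return counts


# Hypothetical sample input, for illustration only.
SAMPLE = '''
msgid "first"
msgstr "ankieken dat"

msgid "second"
msgstr ""
"för dat "
"ankieken"
'''

if __name__ == "__main__":
    for word, n in sorted(word_counts(SAMPLE).items()):
        print(n, word)
```

Note that the header entry of a real po file (the one with an empty msgid) would also be counted by this sketch and would need to be skipped. Spelling variants of the same word then show up next to each other in the sorted output, which is one way inconsistencies could be spotted.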
Is there really no way for us to cooperate? Does anyone else here understand what I am talking about for all this time?
There are many ways in which we can cooperate but the bottom line is; to improve nds content you have to indicate the orthography because without it, the quality of the information is debatable. So again let us work together and agree that knowing the orthography is key to proving the worth of individual lemmas.
Well, that would mean flagging all my entries as nds-sass? Is that what you want? That still leaves the question what to do with the current list.
1) I would like to edit it (preferably as text file, because that is so much faster) and then get it imported. That would have to be done by Sabine, as she wants to see the original author attributed, or would it be sufficient to indicate that in the checkin comment? (I really do need a short example to see how the robot works.)
2) Concerning the data of deutschplatt: I would really like to see that data *not* included unless the underlying spelling system can be confirmed in some way. I think it does not make much sense to import this data without any flagging, as there are grave doubts about lots of words (as I have been saying for quite a long time). I would not object to inserting that data if
2.1) the spelling system and the content can be substantiated. From what I had started to work through, about half the entries cannot be substantiated, and that is far too much.
2.2) the entries get flagged accordingly.
3) I would really like to find a way that can reuse the nds entries. I would not like to see vain repetitions in "is" with a heading {{-nds-sass}}, a heading {{nds-harte}} etc all with a complete set of translations.
*The sheer fact that you do not indicate an orthography makes what you do as valuable as the work that you criticize. Until you start indicating orthographies there is no way forward; it is essential in making plain what is correct and what is not.
*Without explicit repetitions, how would you indicate if something is correct in a certain orthography ?
*From my point of view, it is as with Chinese: words that are only zh can be either orthography. It is only when we know that a word is Simplified or Traditional that we can use it for things like error-checking. The same is true for nds; without an indication of an orthography it is nothing more than an indication that it is probably correct nds.
Would that be a proposal that you can live with?
One other thing: is there a way to invert entries to get the German=>LS entries prepared, so that they only need little rework, or is it necessary to do that by hand?
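Purely as an illustration of what "inverting" entries could mean: if the LS=>German entries were available as a simple mapping from a Low Saxon word to its German translations, the reverse direction could be generated mechanically. The sketch below assumes such a mapping exists; it says nothing about how the wiktionary bot stores or imports entries, and the function name and data layout are hypothetical.

```python
from collections import defaultdict


def invert_entries(ls_to_german):
    """Turn {LS word: [German translations]} into {German word: [LS words]}."""
    inverted = defaultdict(list)
    for ls_word, german_words in ls_to_german.items():
        for german in german_words:
            inverted[german].append(ls_word)
    # Sort keys and values so that the result is stable and easy to review by hand.
    return {german: sorted(words) for german, words in sorted(inverted.items())}
```

The generated German=>LS entries would still need a manual check, since one German word can collect several Low Saxon words from different orthographies.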
NB this whole exchange of e-mails is not really relevant to the Wikipedia-l so I will only answer from now on at the Wiktionary-l
I have not subscribed there so far. Hence the private mail.
Kind regards,
Heiko Evermann
It is high time for you to subscribe to the wiktionary mailing list because only then can you discuss things that have to do with lexicological information. The people on the Wikipedia list do not really appreciate what this is about.
Thanks, GerardM