Hi Gerard, ok, let us do this in private mail:
We have a letter by a university professor informing us that bitter fights are waged over what is "correct" spelling in nds. Anyway, the fact that you have your sources in itself only proves that you can attribute the information that you provide to an orthography. "Werner Eichelberg sien dollet Wöörbook" is the source of Sabine's list; Werner indicated that the sources were articles that were translated from deutschplatt.
So do you or does Sabine have the original articles that led to the list by Werner Eichelberg?
Your assertion that this must be incorrect is based on the availability of the resources that you have. There are some 200 valid orthographies, and your assertion that some words *must* be wrong can be substantiated only when you have considered them all. The sheer fact that this source has been indicated for several years as a good resource on the nds.wikipedia must count for something (it is not just any old website :)
No, this is not the way to do it. I have shown examples that were wrong. In all there have been three people in nds.wiktionary.org, all familiar with the language, who state that this list is of low quality. We have given examples. In such a case it is then up to those who advocate *for* the list to show that this list conforms to some standard. We have given enough evidence. The main example is "abschreim". This is High German slang. The evidence for this is 1) the word exists as such in High German 2) it uses the prefix "ab-". Low Saxon uses "af-" here. I can document that for North Low Saxon in general (Sass), Schleswig-Holstein Westcoast (Neuber), Mecklenburg (Hermann-Winter). 3) it uses "-ei-" (German) instead of "-i-" (Low Saxon) 4) a Google search lists 249 instances, most of them written in south German slang. Search for "abschreim dat för" to filter out High German by also searching for two very common LS words (neuter article and a common preposition) and you get deutschplatt and a mirror of it.
So this shows that this entry has nothing to do with being one of 200 valid orthographies. And now it is really up to you to show which LS orthography this word conforms to.
You again assert that they are misspellings. Given that Sabine is in the process of getting more resources for this discussion, it would be prudent to give it some time and not insist on instant resolution because this is not feasible.
To me that sounds like you are evading a discussion.
First of all I have done nothing here; I have not imported the list, but given your point of view that only what you know to be correct should be inserted I do agree that what Sabine did is in line with the Wiki tradition. She provides information that people can comment on.
Do you really think that people in it.wiktionary.org will comment on that? If you want comments, you can get them in nds.wiktionary.org, where we are doing exactly that. And therefore I really cannot understand why she uploaded this data into it.wiktionary.org. This just means that when UW becomes available, we will get all that data again. We can clean things up in nds.wiktionary.org, but all the problems will reappear then.
Again, I urge you to identify the words that you know to be correct for the orthography that they represent. This will ultimately give us a list of words that cannot be attributed to any orthography because they are wrong; they will then be indicated for what we will know at that time.
I think it is not necessary to flag all words as nds-sass. The data that I have entered/corrected is generally correct. If that is not sufficient for you, I would still propose to have the main heading flagged with -nds- and to have subheadings indicating the spellings and areas where this is correct. To me it does not make much sense to have all entries flagged as "Plattdüütsch (Sass, Noordneddersassisch)". Or is that really what you want to have? For me (and I think that the others from nds.wikipedia.org think the same) it would be sufficient to list deviant forms as such, and I would also like to place notes in the articles linking to other forms. But that seems to be impossible? My suggestion is to work through the list to clean up what can be easily cleaned up and then to have a look at what remains.
No Gerard, *you* have not delivered substantiated facts. Why haven't you done that all along? This discussion has been going on for several weeks now. And your only argument is that you found this spelling somewhere on the internet and therefore it is a valid spelling. You could of course try to import all this data with the tag "very private spelling of xy", but then I really have to ask who should profit from that. Low Saxon is in bad shape nowadays. And an nds.wiktionary.org needs to present data that reflects actual current usage of words and not private spellings. If there should be a place for very private spellings in wiktionary or UW, then certainly *after* inserting the real, current use as substantiated by dictionaries etc. of whatever spelling. What I have been doing is cleaning up (as can be seen in nds.wiktionary.org/wiki/Special:Recentchanges). And I do think that this has helped to make the data a lot more relevant.
I am not party in this really because I am not gaining access to new resources. What I am doing is showing that what is being done is relevant and an acceptable way of going forward.
Sorry, I do not see that you have shown that the list is relevant.
To make it relevant you have to state what orthography a word is in. http://nds.wiktionary.org/wiki/ankieken (a recently changed article) does not indicate an orthography and is from my point of view as relevant as any of the stuff Sabine uploaded.
I moved it to the basic form. The word is common Low Saxon and does not require any further flagging.
As mentioned before, this is the pot calling the kettle black. Start indicating orthographies and you prove the quality of your contributions.
Words that are common Low Saxon do not need further flagging. Besides I have not seen such a thing in any other wiktionary.
The software we are using is known to you; we use the pywikipedia bot software. No problems there. Generating the source for the bot is something that often differs depending on what we have for input. It is typically a manual job. If you have a list with Sass compliant words or a list with words in another orthography (preferably with at least one translation) I am quite happy to make you a source so that you can upload this. Are these KDE files .po files?
I have had a look at the bot and I have had a look at the import file that Sabine sent me by email. (Which uses a different markup for start/end than in the word file.) I have tried to understand the way pywikipedia bot works for wiktionary, but I have not understood it. I would really be grateful for a short example consisting of 1) a short import file with 2-5 entries 2) the command line to use. I would be able to work from there. The data is in http://sourceforge.net/projects/aspell-nds. It is a word list that we derived from the KDE po-files. We filtered out the Low Saxon parts (the po file maps English to Low Saxon), broke them into words, sorted them, counted them (with a short shell script) and used that to find inconsistencies in our KDE op Platt. We then transformed this list into an input file for the aspell spell checker. This list uses some words with a spelling that deviates from Sass: Latin-based words are written with double vowels for long vowels. I would correct that and add the German translation plus the grammar information (noun, verb etc). The list is about 1500 words, and I think that it should be relatively easy to create a list of several hundred words for importing quite quickly.
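The word-list steps described above (take the Low Saxon side of the po files, break it into words, count them) can be sketched roughly as follows. This is only an illustration in Python, not the shell script we actually used. The function names and the sample entries are made up for the example, and the parser is a deliberate simplification: it ignores the po header entry, escape sequences other than "\n", and KDE accelerator marks such as "&".

```python
import re
from collections import Counter


def extract_translations(po_text):
    """Collect msgstr values (the translated side) from PO-format text.

    Simplified on purpose: handles single-line msgstr entries and
    quoted continuation lines, nothing more.
    """
    translations = []
    current = None  # parts of the msgstr being read, or None outside one
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgstr"):
            if current is not None:
                translations.append("".join(current))
            match = re.search(r'"(.*)"$', line)
            current = [match.group(1) if match else ""]
        elif line.startswith('"') and current is not None:
            # continuation line of a multi-line msgstr
            current.append(line[1:-1])
        else:
            if current is not None:
                translations.append("".join(current))
                current = None
    if current is not None:
        translations.append("".join(current))
    return [t.replace("\\n", " ") for t in translations]


def word_counts(translations):
    """Break translations into lowercase words and count each word."""
    counts = Counter()
    for text in translations:
        # [^\W\d_]+ matches runs of letters, including umlauts
        counts.update(w.lower() for w in re.findall(r"[^\W\d_]+", text))
    return counts


# Made-up sample entries, not taken from the real KDE files.
SAMPLE_PO = '''
msgid "look at that"
msgstr "dat ankieken"

msgid "that for"
msgstr ""
"dat "
"för"
'''

counts = word_counts(extract_translations(SAMPLE_PO))
for word, n in sorted(counts.items()):
    print(n, word)
```

Sorting the counts by word, as in the last lines, puts near-identical spellings next to each other, which is how inconsistencies tend to show up.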
Is there really no way for us to cooperate? Does anyone else here understand what I am talking about for all this time?
There are many ways in which we can cooperate, but the bottom line is: to improve nds content you have to indicate the orthography, because without it the quality of the information is debatable. So again, let us work together and agree that knowing the orthography is key to proving the worth of individual lemmas.
Well, that would mean flagging all my entries as nds-sass? Is that what you want? That still leaves the question of what to do with the current list.
1) I would like to edit it (preferably as a text file, because that is so much faster) and then get it imported. That would have to be done by Sabine, as she wants to see the original author attributed, or would it be sufficient to indicate that in the checkin comment? (I really do need a short example to see how the robot works.)
2) Concerning the data of deutschplatt: I would really like to see that data *not* included unless the underlying spelling system can be confirmed in some way. I think it does not make much sense to import this data without any flagging, as there are grave doubts about lots of words (as I have been saying for quite a long time). I would not object to inserting that data if
2.1) the spelling system and the content can be substantiated. From what I had started to work through, about half the entries cannot be substantiated, and that is far too much.
2.2) the entries get flagged accordingly.
3) I would really like to find a way that can reuse the nds entries. I would not like to see needless repetitions in "is" with a heading {{-nds-sass}}, a heading {{nds-harte}} etc., all with a complete set of translations.
Would that be a proposal that you can live with?
One other thing: is there a way to invert entries to get the German=>LS entries prepared, so that they need only a little rework, or is it necessary to do that by hand?
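To make clear what I mean by inverting: assuming the LS=>German entries were available as a simple mapping from each Low Saxon word to its German translations, the raw German=>LS side could be generated mechanically, roughly as in the Python sketch below. The function name, the data layout and the sample glosses are assumptions for illustration only; this is not how the bot or the wiktionary stores entries, and the glosses have not been checked against a dictionary.

```python
from collections import defaultdict


def invert_entries(ls_to_german):
    """Turn {LS word: [German translations]} into {German word: [LS words]}."""
    german_to_ls = defaultdict(list)
    for ls_word, german_words in ls_to_german.items():
        for german in german_words:
            german_to_ls[german].append(ls_word)
    # Sort keys and values so the output is stable and easy to review by hand.
    return {g: sorted(words) for g, words in sorted(german_to_ls.items())}


# Illustrative sample only; the glosses are placeholders.
sample = {
    "ankieken": ["ansehen"],
    "bekieken": ["ansehen", "betrachten"],
}

inverted = invert_entries(sample)
for german, ls_words in inverted.items():
    print(german, "=>", ", ".join(ls_words))
```

The generated entries would still need a manual pass, since one German word can collect several Low Saxon words that are not interchangeable.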
NB this whole exchange of e-mails is not really relevant to the Wikipedia-l so I will only answer from now on at the Wiktionary-l
I have not subscribed there so far. Hence the private mail.
Kind regards,
Heiko Evermann