I've been pondering. There are some typos and common spelling mistakes which are no-brainers to catch and fix, but require a lot of grunt work. Would it be possible, or desirable, to have a system which does a search for one particular error at a time, eg "recieve", and corrects all the pages it finds in batches, of say 20, to avoid loading the server? Obviously, this will deal with only a fraction of the overall typo checking, since many things depend on context (eg "its" / "it's"). Would this be best implemented as a system tool, or as a separate script (like the Easton's Bible Dictionary import)? -- tarquin
I think it should be a general policy that we should not allow any automatic process to alter content. The value of Wikipedia is that real human beings with understanding and judgment have edited the pages.
What if there were a page about common English misspellings that had "recieve" there intentionally? Or perhaps it could be someone's name, or a foreign word used in context. I remember having a Vietnamese co-worker named "Teh" who was constantly running into the problem of spell checkers changing his name to "The".
lcrocker@nupedia.com writes:
Would it be possible, or desirable, to have a system which does a search for one particular error at a time, eg "recieve", and corrects all the pages it finds in batches, of say 20, to avoid loading the server? Obviously, this will deal with only a fraction of the overall typo checking, since many things depend on context (eg "its" / "it's").
...
I think it should be a general policy that we should not allow any automatic process to alter content. The value of Wikipedia is that real human beings with understanding and judgment have edited the pages.
ACK.
What if there were a page about common English misspellings that had "recieve" there intentionally? Or perhaps it could be someone's name, or a foreign word used in context. I remember having a Vietnamese co-worker named "Teh" who was constantly running into the problem of spell checkers changing his name to "The".
I see the problem, but imagine this like a search and replace function in a text editor: I usually don't hit "replace all", but look at every particular instance and decide whether it should be replaced.
And something like this, if used in a responsible way (maybe only by admins), would be a very valuable tool for Wikipedia.
So imagine a list of search results, each showing a few lines of surrounding content so you can decide whether it's a misspelling (and a link to the article if this is not enough information for you), and a checkbox beside each result to include it in the replace or not. Everything still depends on human judgement, but it makes the editing a lot easier.
greetings, sorry for the bad english elian
elian wrote:
I see the problem, but imagine this like a search and replace function in a text editor: I usually don't hit "replace all", but look at every particular instance and decide whether it should be replaced.
And something like this, if used in a responsible way (maybe only by admins), would be a very valuable tool for Wikipedia.
So imagine a list of search results, each showing a few lines of surrounding content so you can decide whether it's a misspelling (and a link to the article if this is not enough information for you), and a checkbox beside each result to include it in the replace or not. Everything still depends on human judgement, but it makes the editing a lot easier.
You mean a spellchecker :)
One solution is to take an article that you think has a LOT of typos in it and run it through the built-in spellchecker in Word or another word processor. But then you still have to edit the page...
On Friday 13 September 2002 19:45, lcrocker@nupedia.com wrote: li'o
What if there were a page about common English misspellings that had "recieve" there intentionally? Or perhaps it could be someone's name, or a foreign word used in context. I remember having a Vietnamese co-worker named "Teh" who was constantly running into the problem of spell checkers changing his name to "The".
Well, I don't know anyone whose real name is "The", but we both know someone whose real name is "And". Do any of you know anyone else whose name is a cmavo in English or some other widely spoken language?
phma
lcrocker@nupedia.com wrote:
Would it be possible, or desirable, to have a system which does a search for one particular error at a time, eg "recieve", and corrects all the pages it finds in batches, of say 20, to avoid loading the server?
I think it should be a general policy that we should not allow any automatic process to alter content. The value of Wikipedia is that real human beings with understanding and judgment have edited the pages.
What if there were a page about common English misspellings that had "recieve" there intentionally? Or perhaps it could be someone's name, or a foreign word used in context. I remember having a Vietnamese co-worker named "Teh" who was constantly running into the problem of spell checkers changing his name to "The".
I see your point. I was thinking of something that tackles only one particular mis-spelling at a time, and not a full spell-checker, for that reason; but I see that even a batch replace of all "recieve" could break things like 'a common misspelling of "receive" is "recieve"' or 'Recieve is a city in Brazil' (a feasible typo for Recife), and so on.
Something that gives a list of, say, 10 occurrences, with context, and asks for check-box confirmation of each one might be okay -- but I think it should search for only one particular mis-spelt word. Overall, the task of checking spelling and grammar has to be a human one.
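For illustration only, here is a minimal Python sketch of the confirm-each-occurrence idea described above. Nothing here is part of the Wikipedia software: the `pages` dictionary, the `Match` record and both function names are hypothetical, and the reviewer's ticked check-boxes are modelled as a set of match indexes. The search is case-sensitive and covers a single misspelling.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    page: str      # page title (hypothetical identifier)
    start: int     # offset where the match begins in the page text
    end: int       # offset just past the match
    context: str   # a few characters either side, shown to the reviewer


def find_matches(pages, wrong, width=30, limit=10):
    """Collect up to `limit` whole-word occurrences of `wrong`, with context."""
    pattern = re.compile(r"\b" + re.escape(wrong) + r"\b")
    found = []
    for title, text in pages.items():
        for m in pattern.finditer(text):
            lo = max(0, m.start() - width)
            hi = min(len(text), m.end() + width)
            found.append(Match(title, m.start(), m.end(), text[lo:hi]))
            if len(found) == limit:
                return found
    return found


def apply_confirmed(pages, matches, confirmed, right):
    """Return a copy of `pages` with only the confirmed matches replaced."""
    result = dict(pages)
    # Work from the end of each page backwards so earlier offsets stay valid.
    chosen = sorted((matches[i] for i in confirmed),
                    key=lambda m: (m.page, m.start), reverse=True)
    for m in chosen:
        text = result[m.page]
        result[m.page] = text[:m.start] + right + text[m.end:]
    return result


# Made-up sample pages: one genuine typo, one intentional use.
pages = {
    "Mail": "You will recieve a letter.",
    "Spelling": 'A common misspelling of "receive" is "recieve".',
}
matches = find_matches(pages, "recieve")
# The reviewer ticks only the first result and leaves the intentional one alone.
fixed = apply_confirmed(pages, matches, {0}, "receive")
```

Because nothing is changed unless its index is in the confirmed set, the page that mentions "recieve" on purpose stays as it was.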
On Fri, Sep 13, 2002 at 04:45:48PM -0700, lcrocker@nupedia.com wrote:
There are some typos and common spelling mistakes which are no-brainers to catch and fix, but require a lot of grunt work. Would it be possible, or desirable, to have a system which does a search for one particular error at a time, eg "recieve", and corrects all the pages it finds in batches, of say 20, to avoid loading the server?
Why doesn't someone just add a function that, before every article or change is submitted, runs a spell checker and writes all the ``mistakes'' and possible ``corrections'' alongside an editable field of the article below? This way the editor can see his/her mistakes and correct them very easily.
Note that I don't propose anything like ``You have written 'recieve', would you like to change it [Y/N]'', but rather a _list_ of all the mistakes displayed at once. Thus you don't _have_to_ make any extra effort; it just helps you realize that this or that word may be bogus.
Something like:
recieve => receive, ...
interestig => interesting, ...
etc.
<field for editing article>
[preview] [submit]
I think it could also correct the mistakes on the majority of pages very quickly, since with _every_ edit the person editing is notified of these possible problems. (But I'm sure it wouldn't work if you only added a ``spell-checking'' button; you have to display it _every_ time!) The editor can use his own judgement to decide whether to fix it or not. The server load is not so heavy, because there are not that many submissions and they are distributed throughout the day. For me, that seems better than some one-time massive corrections. Use ispell/aspell or something.
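As a rough sketch of the report-only check proposed above, the following Python uses the standard library's `difflib` as a stand-in for ispell/aspell, which it does not call. The tiny `KNOWN_WORDS` set and the function names are assumptions made for the example; a real word list would be far larger. The function only reports suspect words with suggestions and never alters the text.

```python
import difflib
import re

# Tiny stand-in dictionary (an assumption for this example only).
KNOWN_WORDS = {"you", "will", "receive", "an", "interesting", "letter"}


def suspect_words(text, known=KNOWN_WORDS, max_suggestions=3):
    """Return {word: [suggestions]} for words not in `known`; changes nothing."""
    report = {}
    candidates = sorted(known)
    for word in re.findall(r"[A-Za-z']+", text):
        lower = word.lower()
        if lower in known or lower in report:
            continue
        report[lower] = difflib.get_close_matches(lower, candidates,
                                                  n=max_suggestions)
    return report


def format_report(report):
    """Render the report in the 'wrong => right, ...' style for display."""
    lines = []
    for word, suggestions in report.items():
        shown = ", ".join(suggestions) if suggestions else "?"
        lines.append(word + " => " + shown)
    return "\n".join(lines)


report = suspect_words("You will recieve an interestig letter")
print(format_report(report))
```

The editor would see this list next to the edit field on every submit and decide, word by word, whether anything needs fixing.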
Btw., thank you all for Wikipedia! Hynek
Welcome to the list Hynek.
Hynek Hanke wrote:
I think it could also correct the mistakes on the majority of pages very quickly, since with _every_ edit the person editing is notified of these possible problems. ... The editor can use his own judgement to decide whether to fix it or not. For me, that seems better than some one-time massive corrections. Use ispell/aspell or something.
In Czech this would not be a problem, because that language is official in only one country. This makes it much easier to establish an academy to set standards for the language.
English is spoken in many countries, but most notably in the United Kingdom and the United States. Over time these two have developed different standards for language use, including spelling. Wikipedia has had to develop guidelines so that these two cultures can live side by side in articles. The proposal to have this spell check every time an article is edited could easily lead to a whole new round of edit wars.
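One way a checker might avoid taking sides, sketched here in Python under stated assumptions: treat both members of a known British/American pair as acceptable, so that neither spelling is ever flagged. The pairs below are a small hand-picked illustration, not an authoritative list, and `flag_words` is a hypothetical helper rather than anything in the Wikipedia software.

```python
# Illustrative, hand-picked pairs; not an exhaustive or authoritative list.
VARIANT_PAIRS = [
    ("colour", "color"),
    ("centre", "center"),
    ("organise", "organize"),
]

# Either spelling from any pair counts as correct.
ACCEPTED_VARIANTS = {word for pair in VARIANT_PAIRS for word in pair}


def flag_words(words, dictionary):
    """Return words found in neither the dictionary nor the variant list."""
    flagged = []
    for word in words:
        lower = word.lower()
        if lower not in dictionary and lower not in ACCEPTED_VARIANTS:
            flagged.append(word)
    return flagged


flagged = flag_words(["colour", "color", "recieve"], {"receive"})
```

This only removes one trigger for disputes; which spelling an article should use remains a matter for the existing guidelines and for human editors.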
Eclecticology
At 06:57 PM 9/14/02 +0200, Hynek Hanke wrote:
On Fri, Sep 13, 2002 at 04:45:48PM -0700, lcrocker@nupedia.com wrote:
There are some typos and common spelling mistakes which are no-brainers to catch and fix, but require a lot of grunt work. Would it be possible, or desirable, to have a system which does a search for one particular error at a time, eg "recieve", and corrects all the pages it finds in batches, of say 20, to avoid loading the server?
Why doesn't someone just add a function that, before every article or change is submitted, runs a spell checker and writes all the ``mistakes'' and possible ``corrections'' alongside an editable field of the article below? This way the editor can see his/her mistakes and correct them very easily.
Because some of us find these things very irritating, and rarely run them on software that comes with them. Conversely, those who want such a function can copy the text into a word processor that has it, run the spell-check, and make the changes they want. We don't need to burden our programming team, or our software.
On Sat, Sep 14, 2002 at 03:32:06PM -0400, Vicki Rosenzweig wrote:
Why doesn't someone just add a function that, before every article or change is submitted, runs a spell checker and writes all the ``mistakes'' and possible ``corrections'' alongside an editable field of the article below? This way the editor can see his/her mistakes and correct them very easily.
Because some of us find these things very irritating, and rarely run them on software that comes with them. Conversely, those who want such a function can copy the text into a word processor that has it, run the spell-check, and make the changes they want. We don't need to burden our programming team, or our software.
But you can't deny that the problem is there. I think most people wouldn't cut and paste it into their spellchecker, but they would correct it if they saw their mistakes directly on the screen.
Hynek
--- Hynek Hanke hanke@volny.cz wrote:
On Sat, Sep 14, 2002 at 03:32:06PM -0400, Vicki Rosenzweig wrote:
Why doesn't someone just add a function that, before every article or change is submitted, runs a spell checker and writes all the ``mistakes'' and possible ``corrections'' alongside an editable field of the article below? This way the editor can see his/her mistakes and correct them very easily.
Because some of us find these things very irritating, and rarely run them on software that comes with them. Conversely, those who want such a function can copy the text into a word processor that has it, run the spell-check, and make the changes they want. We don't need to burden our programming team, or our software.
But you can't deny that the problem is there. I think most people wouldn't cut and paste it into their spellchecker, but they would correct it if they saw their mistakes directly on the screen.
Perhaps such a feature could be controlled via User Preferences...
Stephen Gilbert
--- Hynek Hanke hanke@volny.cz wrote:
Why doesn't someone just add a function that, before every article or change is submitted, runs a spell checker and writes all the ``mistakes'' and possible ``corrections'' alongside an editable field of the article below? This way the editor can see his/her mistakes and correct them very easily.
The original WikiWikiWeb has a similar spell check function.
Stephen Gilbert
wikipedia-l@lists.wikimedia.org