[Wikipedia-l] Box characters and question marks

Matthew Woodcraft mattheww at chiark.greenend.org.uk
Sun Sep 1 18:14:26 UTC 2002


On Sun, Sep 01, 2002 at 09:48:03AM -0700, lcrocker at nupedia.com wrote:
> The problem is that /some/ character sets, notably Microsoft
> Windows code page 1252, /do/ use those character codes for things
> like curly quotes and em dashes.  To be correctly encoded for
> Wikipedia, they should be changed to HTML entities referencing
> either the character name (e.g., "lsquo"), or the correct Unicode
> value.  Copying Windows text with those things directly into
> Wikipedia creates the illegal characters.  When we see that
> happen, we should try to figure out what they're supposed to be
> and replace them with correct ones.

You might consider adding an
  accept-charset="ISO-8859-1"
attribute to the main content form. It won't do any good, but at least
you'll have the satisfaction of knowing that the characters are
officially someone else's bug.
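
For what it's worth, the cleanup described in the quoted text could be
sketched roughly as below. This is only an illustration, not anything in
the Wikipedia software: the function name fix_cp1252_controls is made up,
and it assumes the stray characters arrived as code points 0x80-0x9F
because Windows-1252 bytes were read as ISO-8859-1. It emits numeric
entities rather than named ones like "lsquo".

```python
def fix_cp1252_controls(text: str) -> str:
    """Replace characters in the 0x80-0x9F range with numeric HTML entities.

    Assumption: those code points are Windows-1252 bytes that were
    mislabelled as ISO-8859-1, so each one is reinterpreted via cp1252.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 0x80 <= code <= 0x9F:
            try:
                # Reinterpret the single byte as Windows-1252.
                real = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few byte values (e.g. 0x81) are unassigned in cp1252;
                # leave those alone for a human to look at.
                out.append(ch)
                continue
            out.append(f"&#{ord(real)};")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # 0x93 and 0x94 are the cp1252 curly double quotes.
    print(fix_cp1252_controls("\x93quoted\x94"))
```

Guessing "what they're supposed to be" still needs a person when the
source was not actually Windows-1252, so treat the output as a suggestion.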

-M-


