[Wikipedia-l] Box characters and question marks

lcrocker at nupedia.com lcrocker at nupedia.com
Sun Sep 1 16:48:03 UTC 2002


Character codes 128..159 aren't assigned to any printable
character in ISO-8859-1 or in Unicode; that range is reserved
for control codes.  As text, they're illegal codes that
represent nothing.  (Code 160 is the non-breaking space, which
is a real character.)  How a given browser, OS, or font chooses
to display them is entirely a matter of taste--some display
boxes, some display question marks, some display nothing at
all.  It doesn't matter, because there isn't any "correct" way
to display codes that don't represent a character.

The problem is that /some/ character sets, notably Microsoft
Windows code page 1252, /do/ use those character codes for things
like curly quotes and em dashes.  To be correctly encoded for
Wikipedia, they should be changed to HTML entities referencing
either the character name (e.g., "lsquo"), or the correct Unicode
value.  Copying Windows text with those things directly into
Wikipedia creates the illegal characters.  When we see that
happen, we should try to figure out what they're supposed to be
and replace them with correct ones.
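
For anyone who wants to do that replacement mechanically, here is
a rough sketch in Python.  It assumes the stray codes really did
come from Windows code page 1252, which is a guess that has to be
checked by eye for each article.  The function name fix_cp1252 is
made up for this example; it is not part of the Wikipedia software.

```python
from html.entities import codepoint2name

def fix_cp1252(text: str) -> str:
    """Replace code points 128..159 with the HTML entity for the
    character that Windows code page 1252 puts at that position."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x80 <= code <= 0x9F:
            try:
                # Reinterpret the single byte as cp1252 text.
                real = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few positions (e.g. 0x81) are unassigned in
                # cp1252 too; leave them for a human to look at.
                out.append(ch)
                continue
            name = codepoint2name.get(ord(real))
            # Prefer the character name, else the Unicode value.
            out.append(f"&{name};" if name else f"&#{ord(real)};")
        else:
            out.append(ch)
    return "".join(out)

print(fix_cp1252("It\x92s a \x93test\x94 \x97 really\x85"))
```

Named entities are used where HTML defines one (so code 145
becomes "&lsquo;"), and the numeric Unicode value otherwise.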


You Wrote:
>The boxes/question marks are almost always a result of pasting in
>text from Microsoft Word or some other program that uses "fancy"
>quotation marks etc.  I always assume it to be the result of a
>copyright violation and google-test it (which I'll go do now).  That
>would explain all three of your questions below--where they come
>from, why the contributor doesn't see them, and why they don't
>display.
>
>There may be other explanations, which I'd like to hear....
>(honestly)
>
>kq
>
>You Wrote:
>>I noticed that Quercusrobur was writing articles with lots of boxes
>>in them, and changed them to apostrophes. So I wrote him a note on
>>his talk page including a row of box characters (which I generated
>>with "160 128 do i emit loop"; I have no way to type them). I just
>>checked the page and all the boxes have been turned into question
>>marks! So I have three questions:
>>
>>1. How do people write articles with boxes?
>>
>>2. The people who write articles with boxes don't see them as boxes.
>>They see them as apostrophes, dashes, ellipses, etc. How do we
>>explain to them how to avoid writing boxes?
>>
>>3. What changed the boxes to question marks?
>>
>>phma
>>[Wikipedia-l]
>>To manage your subscription to this list, please go here:
>>http://www.nupedia.com/mailman/listinfo/wikipedia-l






