[Mediawiki-l] users copy and paste "bad" special characters from MS Word

Monahon, Peter B. Peter.Monahon at USPTO.GOV
Wed Sep 26 11:04:17 UTC 2007


>> Earlier: "...Looks like the method 
>> described here is probably what 
>> you want:.."

http://shiflett.org/blog/2005/oct/convert-smart-quotes-with-php

> Earlier: "...That looks like exactly 
> what I need...I'd never heard them
> called "smart quotes"..."

Microsoft Word "reveals*" its auto-replacement policy in several places.
As the user types, Word swaps plain 7-bit ASCII characters for
high-order typographic characters, and those show up as "?" and so on
after a Word-to-web cut-and-paste. Look in the following Microsoft Word
menus (note the (c), 1/2 and other replacements):

Tools > Auto Correct Options...

[AutoCorrect]
 Show AutoCorrect Options buttons
 Correct TWo INitial CApitals
 Capitalize first letter of sentences
 Capitalize first letter of table cells
 Capitalize names of days
 Correct accidental usage of cAPS LOCK key
 Replace text as you type

[AutoFormat As You Type]

Replace as you type
 "Straight quotes" with smart quotes
 Ordinals (1st) with superscript
 Fractions (1/2) with fraction character (½)
 Hyphens (--) with dash (—)
 *Bold* and _italic_ with real formatting
 Internet and network paths with hyperlinks

Apply as you type
 Automatic bulleted lists
 Automatic numbered lists
 Border lines
 Tables
 Built-in Heading styles
Automatically as you type
 Format beginning of list item like the one before it
 Set left- and first-indent with tabs and backspaces
 Define styles based on your formatting

[AutoFormat]

Apply
 Built-in Heading styles
 Automatic bulleted lists
 List styles
 Other paragraph styles
Replace
 "Straight quotes" with smart quotes
 Ordinals (1st) with superscript
 Fractions (1/2) with fraction character (½)
 Hyphens (--) with dash (—)
 *Bold* and _italic_ with real formatting
 Internet and network paths with hyperlinks
Preserve
 Styles
Always AutoFormat
 Plain text WordMail documents
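
To make the "Replace as you type" substitutions concrete, here is a
minimal Python sketch that maps the typographic characters back to plain
ASCII. The table and the name plain_text are my own assumptions for
illustration; this is not the PHP method from the shiflett.org link, and
the list of characters is certainly incomplete.

```python
# Hypothetical table: characters Word substitutes as you type, mapped
# back to plain ASCII. Extend it as new characters turn up.
WORD_TO_ASCII = {
    "\u2018": "'",    # left single smart quote
    "\u2019": "'",    # right single smart quote / apostrophe
    "\u201c": '"',    # left double smart quote
    "\u201d": '"',    # right double smart quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
    "\u00bd": "1/2",  # fraction one half
    "\u00bc": "1/4",  # fraction one quarter
    "\u00be": "3/4",  # fraction three quarters
    "\u00a9": "(c)",  # copyright sign
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(WORD_TO_ASCII)


def plain_text(text: str) -> str:
    """Return text with Word's typographic characters replaced by ASCII."""
    return text.translate(_TABLE)
```

Running pasted text through plain_text before saving keeps anything not
in the table untouched, so it is safe to apply to text that never came
from Word.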

==

You're dealing with MS Word as the live input editor for MediaWiki.
That is similar to re-purposing static MS Word DOCs as MediaWiki pages,
so the two tasks overlap heavily in both work and tools. Importing
existing documents is a significant need, and so is cutting and pasting
"live/unsaved/in-process" documents, as in your case. Much of what has
been learned from document converters and importers applies equally to
cut-and-paste (where "?" often displays on the web instead of
Microsoft's choices for typographical characters intended for printing).
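
One plausible way the "?" arises, sketched in Python: somewhere between
the paste and the page, the text is forced into a character set that
has no slot for the typographic character, and the converter
substitutes "?". The function name paste_as is hypothetical, and your
wiki's actual failure point (browser, PHP, database) may differ.

```python
def paste_as(text: str, charset: str) -> str:
    """Simulate storing text in a charset, replacing what does not fit."""
    return text.encode(charset, errors="replace").decode(charset)


# Smart quotes, an en dash and a one-half fraction, as Word would paste them.
pasted = "\u201cquoted\u201d \u2013 \u00bd"

as_ascii = paste_as(pasted, "ascii")      # every typographic character lost
as_latin1 = paste_as(pasted, "latin-1")   # only the fraction survives
as_cp1252 = paste_as(pasted, "cp1252")    # Windows-1252 holds all of them
```

Here as_ascii comes out as '?quoted? ? ?', while as_cp1252 is unchanged,
which is why the same paste can look fine on one system and broken on
another.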

Note that the following characters are anticipated and dealt with by
Gunter Schmidt (http://www.beadsoft.de) in
http://www.mediawiki.org/wiki/Extension_talk:Word2MediaWikiPlus

' Replace all smart quotes with their dumb equivalents
... and:
"_"
"-"
"+"
"/"
"{"
"}"
"["
"]"
"~"
"^^"
"|"
"'"
"<"
">"
... and adding as needed:
"" 'save some space if several characters in a row
"<nowiki>"
"</nowiki>"
'remove empty paragraphs at begin of document 
'add <br> to all manual line breaks
'add <br> to all paragraphs
'Add empty line at document end to prevent error
(address lines beginning with tab or space)
'replace "forced blanks"
'do not allow more than two empty paragraphs in a row
'Clean up Tab Indention
'Header, Footer and Footnote reference
... and much, much more.
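
A rough Python sketch of three of the steps listed above (forced
blanks, empty paragraphs, wiki-special characters). This is not Gunter
Schmidt's macro, which is written in VBA and does far more; the function
names and the exact set of protected characters are my assumptions.

```python
import re

# Assumed subset of characters that the wiki parser would treat as markup.
WIKI_SPECIAL = re.compile(r"[\[\]{}|<>~]+")


def escape_wiki_markup(text: str) -> str:
    """Wrap each run of special characters in a single <nowiki> pair."""
    return WIKI_SPECIAL.sub(
        lambda m: "<nowiki>" + m.group(0) + "</nowiki>", text
    )


def limit_empty_paragraphs(text: str, max_empty: int = 2) -> str:
    """Allow at most max_empty empty lines in a row."""
    # N empty lines between two paragraphs are N + 1 consecutive newlines.
    pattern = r"\n{%d,}" % (max_empty + 2)
    return re.sub(pattern, "\n" * (max_empty + 1), text)


def clean(text: str) -> str:
    """Apply the steps in an order that keeps the added tags intact."""
    text = text.replace("\u00a0", " ")  # "forced blanks" (non-breaking spaces)
    text = limit_empty_paragraphs(text)
    return escape_wiki_markup(text)     # last, so its own < > are not re-escaped
```

Wrapping a whole run of characters in one tag pair is the same
space-saving idea as the "several characters in a row" note in the list.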

==

I do extensive document cleanup (macros) before converting from MS Word
to Wiki, and your users may be able to load the macro at
http://www.mediawiki.org/wiki/Extension_talk:Word2MediaWikiPlus which
will assist them in also cutting-and-pasting tables and images, and
preserving bold, italic and other formatting!  I love it.  

Good luck.  Keep us informed on how it works for you, and please share
your results on http://www.mediawiki.org/

==

* For your entertainment: Microsoft's non-"reveal codes" white paper:
http://www.google.com/search?hl=en&q=microsoft+word+wordperfect+reveal+codes+white+paper&btnG=Google+Search
Oh, how the mighty have fallen. Too bad: WordPerfect and "reveal codes"
could have evolved into the perfect web editor. Macromedia Dreamweaver
came close.



