Any hints on converting Word docs to wiki markup? I'm trying to build a corporate intranet site and (surprise) most of my source material is in Word.
On 5/29/06 2:53 PM, "Roy Smith" roy@panix.com wrote:
Any hints on converting Word docs to wiki markup? I'm trying to build a corporate intranet site and (surprise) most of my source material is in Word.
I haven't tried any of these Word to Wiki scripts, but Google "word2wiki.bas" and check out the results.
I originally found this following a post from...
http://lists.edgewall.com/archive/trac/2005-October/004998.html
cheers, dEVoN
Is antiword helpful?
My favorite command when I receive a word doc is:
antiword -f -w0 foo.doc | less
Perhaps you can write a script to call this and convert, for example, *bold* to '''bold''', /italics/ to ''italics'', and _underline_ to whatever underline should be. I think bullet lists work out easily too.
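A rough sketch of such a script in Python, offered as one possible approach rather than a tested tool. It assumes Python 3 with antiword on the PATH; the function names are made up for illustration, and mapping underline to <u>...</u> is a guess at "whatever underline should be". The regular expressions are deliberately conservative, but file paths such as /usr/bin/ can still be mistaken for italics.

```python
import re
import subprocess


def antiword_to_wiki(text: str) -> str:
    """Map the emphasis markers printed by `antiword -f` to wiki markup."""
    # *bold* -> '''bold'''
    text = re.sub(r"(?<!\w)\*(\S(?:.*?\S)?)\*(?!\w)", r"'''\1'''", text)
    # /italics/ -> ''italics'' (slashes inside URLs are skipped)
    text = re.sub(r"(?<![\w/])/([^\s/](?:.*?[^\s/])?)/(?![\w/])", r"''\1''", text)
    # _underline_ -> <u>underline</u> (assumption: HTML-style underline tags).
    # Done last so the "/" in "</u>" is never seen by the italics rule.
    text = re.sub(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)", r"<u>\1</u>", text)
    return text


def convert_doc(path: str) -> str:
    """Run antiword with the same options as the command above and convert its output."""
    result = subprocess.run(
        ["antiword", "-f", "-w0", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return antiword_to_wiki(result.stdout)
```

Calling convert_doc("foo.doc") would return the converted text for pasting into a wiki page. Bullet lists, tables and images are not handled here.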
On 5/31/06, Leif Pedersen pedersen@meridian-enviro.com wrote:
Perhaps you can write a script to call this and convert, for example, *bold* and /italics/ and _underline_ to ''italics'' and '''bold''' and whatever underline should be. I think bullet lists work out easily too.
Someone out there made a macro of some kind to do this sort of thing. I didn't write down any references and I'm not sure how up-to-date it would be though.
go to hell with your shit mess
From: "Sy Ali" sy1234@gmail.com
Reply-To: MediaWiki announcements and site admin list mediawiki-l@Wikimedia.org
To: "MediaWiki announcements and site admin list" mediawiki-l@wikimedia.org
Subject: Re: [Mediawiki-l] Converting MS Word to Wiki?
Date: Fri, 2 Jun 2006 16:05:36 -0400
For this I found word2wiki_with_images.bas, but it was not working properly, so I adapted it for MediaWiki and it is working satisfactorily now.
It is quite easy to use. Just import it in Word via Tools > Macro > Visual Basic Editor, then File > Import. If you now run the macro on your document, it will be converted, the images will be stored in a separate directory, and the converted document is copied to your clipboard. Then just paste it into your page.
It is not perfect, but it is a fast way to import Word documents. I hope the attachment comes through with this posting; otherwise just ask.
-----Original Message-----
From: mediawiki-l-bounces@Wikimedia.org [mailto:mediawiki-l-bounces@Wikimedia.org] On Behalf Of Sy Ali
Sent: Friday, June 02, 2006 22:32
To: MediaWiki announcements and site admin list
Subject: Re: [Mediawiki-l] Converting MS Word to Wiki?
On 6/2/06, Magnus Nordmark main_67@msn.com wrote:
go to hell with your shit mess
*plonk*
Um, wha? Who is this idiot?
OK, the attachment is not working. Sorry, I'm new here.
Then just copy the text below. In the VB editor in Word, right-click on the module, add a new module, and paste the text. Then export it.
======
Sub Word2Wiki()

    Application.ScreenUpdating = False

    ReplaceQuotes
    WikiEscapeChars
    WikiConvertHyperlinks

    WikiConvertH1
    WikiConvertH2
    WikiConvertH3
    WikiConvertH4
    WikiConvertH5

    WikiConvertItalic
    WikiConvertBold
    WikiConvertUnderline
    WikiConvertStrikeThrough
    WikiConvertSuperscript
    WikiConvertSubscript

    WikiConvertLists
    WikiConvertTables

    WikiSaveAsHTMLAndConvertImages

    ' Copy to clipboard
    ActiveDocument.Content.Copy

    Application.ScreenUpdating = True
End Sub

Private Sub WikiConvertH1()
    ReplaceHeading wdStyleHeading1, "=="
End Sub

Private Sub WikiConvertH2()
    ReplaceHeading wdStyleHeading2, "==="
End Sub

Private Sub WikiConvertH3()
    ReplaceHeading wdStyleHeading3, "===="
End Sub

Private Sub WikiConvertH4()
    ReplaceHeading wdStyleHeading4, "====="
End Sub

Private Sub WikiConvertH5()
    ReplaceHeading wdStyleHeading5, "======"
End Sub

Private Sub WikiConvertBold()
    ActiveDocument.Select

    With Selection.Find

        .ClearFormatting
        .Font.Bold = True
        .Text = ""

        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False

        .Forward = True
        .Wrap = wdFindContinue

        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If

                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    .InsertBefore "'''"
                    .InsertAfter "'''"
                End If

                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Bold = False
            End With
        Loop
    End With
End Sub

Private Sub WikiConvertItalic()
    ActiveDocument.Select

    With Selection.Find

        .ClearFormatting
        .Font.Italic = True
        .Text = ""

        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False

        .Forward = True
        .Wrap = wdFindContinue

        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If

                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    .InsertBefore "''"
                    .InsertAfter "''"
                End If

                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Italic = False
            End With
        Loop
    End With
End Sub

Private Sub WikiConvertUnderline()
    ActiveDocument.Select

    With Selection.Find

        .ClearFormatting
        .Font.Underline = True
        .Text = ""

        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False

        .Forward = True
        .Wrap = wdFindContinue

        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If

                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    .InsertBefore "<u>"
                    .InsertAfter "</u>"
                End If

                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Underline = False
            End With
        Loop
    End With
End Sub

Private Sub WikiConvertStrikeThrough()
    ActiveDocument.Select

    With Selection.Find

        .ClearFormatting
        .Font.StrikeThrough = True
        .Text = ""

        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False

        .Forward = True
        .Wrap = wdFindContinue

        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If

                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    .InsertBefore "<strike>"
                    .InsertAfter "</strike>"
                End If

                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.StrikeThrough = False
            End With
        Loop
    End With
End Sub

Private Sub WikiConvertSuperscript()
    ActiveDocument.Select

    With Selection.Find

        .ClearFormatting
        .Font.Superscript = True
        .Text = ""

        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False

        .Forward = True
        .Wrap = wdFindContinue

        Do While .Execute
            With Selection
                .Text = Trim(.Text)
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If

                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    .InsertBefore "<sup>"
                    .InsertAfter "</sup>"
                End If

                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Superscript = False
            End With
        Loop
    End With
End Sub

Private Sub WikiConvertSubscript()
    ActiveDocument.Select

    With Selection.Find

        .ClearFormatting
        .Font.Subscript = True
        .Text = ""

        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False

        .Forward = True
        .Wrap = wdFindContinue

        Do While .Execute
            With Selection
                .Text = Trim(.Text)
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If

                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    .InsertBefore "<sub>"
                    .InsertAfter "</sub>"
                End If

                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Subscript = False
            End With
        Loop
    End With
End Sub

Private Sub WikiConvertLists()
    Dim para As Paragraph
    For Each para In ActiveDocument.ListParagraphs
        With para.Range
            .InsertBefore " "
            For i = 1 To .ListFormat.ListLevelNumber
                If .ListFormat.ListType = wdListBullet Then
                    .InsertBefore "*"
                Else
                    .InsertBefore "#"
                End If
            Next i
            .ListFormat.RemoveNumbers
        End With
    Next para
End Sub

Private Sub WikiConvertTables()
    Dim thisTable As Table
    Dim thisRow As Row
    Dim thisCell As Cell
    Dim ElRango As Object
    For Each thisTable In ActiveDocument.Tables
        With thisTable
            For Each thisCell In thisTable.Range.Cells
                thisCell.Range.InsertBefore "|"
            Next thisCell
            For Each thisRow In .Rows
                thisRow.Range.InsertBefore Chr(11) & "|-" & Chr(11)
                If thisRow.Index = .Rows.Count Then
                    ' Close the table at the end
                    thisRow.Range.InsertAfter Chr(11) & "|}" & Chr(11)
                End If
                If thisRow.Index = 1 Then
                    thisRow.Range.InsertBefore "{| border='1'"
                End If
            Next thisRow

            Set ElRango = .ConvertToText(Separator:="|")
            With ElRango.Find
                .ClearFormatting
                .Text = "^p"
                With .Replacement
                    .ClearFormatting
                    .Text = ""
                End With
                .Execute Replace:=wdReplaceAll
            End With
        End With
    Next thisTable
End Sub

Private Sub WikiConvertHyperlinks()
    Dim hyperCount As Integer

    hyperCount = ActiveDocument.Hyperlinks.Count

    For i = 1 To hyperCount
        With ActiveDocument.Hyperlinks(1)
            Dim addr As String
            addr = .Address
            .Delete
            .Range.InsertBefore "["
            .Range.InsertAfter "|" & addr & "]"
        End With
    Next i
End Sub

' Replace all smart quotes with their dumb equivalents
Private Sub ReplaceQuotes()
    Dim quotes As Boolean
    quotes = Options.AutoFormatAsYouTypeReplaceQuotes
    Options.AutoFormatAsYouTypeReplaceQuotes = False
    ReplaceString ChrW(8220), """"
    ReplaceString ChrW(8221), """"
    ReplaceString ChrW(8216), "'"
    ReplaceString ChrW(8217), "'"
    Options.AutoFormatAsYouTypeReplaceQuotes = quotes
End Sub

Private Sub WikiEscapeChars()
    EscapeCharacter "*"
    EscapeCharacter ""
    EscapeCharacter ""
    EscapeCharacter ""
    EscapeCharacter "{"
    EscapeCharacter "}"
    EscapeCharacter "["
    EscapeCharacter "]"
    EscapeCharacter "~"
    EscapeCharacter "^^"
    EscapeCharacter "|"
End Sub

Private Function ReplaceHeading(styleHeading As String, headerPrefix As String)
    Dim normalStyle As Style
    Set normalStyle = ActiveDocument.Styles(wdStyleNormal)

    ActiveDocument.Select

    With Selection.Find

        .ClearFormatting
        .Style = ActiveDocument.Styles(styleHeading)
        .Text = ""

        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False

        .Forward = True
        .Wrap = wdFindContinue

        Do While .Execute
            With Selection
                If InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If

                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    .InsertBefore headerPrefix
                    .InsertBefore vbCr
                    .InsertAfter headerPrefix
                End If

                .Style = normalStyle
            End With
        Loop
    End With
End Function

Private Function EscapeCharacter(char As String)
    ReplaceString char, "" & char
End Function

Private Function ReplaceString(findStr As String, replacementStr As String)
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = findStr
        .Replacement.Text = replacementStr
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Function

Private Sub WikiSaveAsHTMLAndConvertImages()
    Dim s As shape
    For Each s In ActiveDocument.Shapes
        If s.Type = msoPicture Then
            s.ConvertToInlineShape
        End If
    Next
    FileName = ActiveDocument.Path + "\" + ActiveDocument.Name
    FolderName = FileName + "_files"

    ActiveDocument.SaveAs FileName:=FileName + ".htm", _
        FileFormat:=wdFormatFilteredHTML, LockComments:=False, Password:="", _
        AddToRecentFiles:=True, WritePassword:="", ReadOnlyRecommended:=False, _
        EmbedTrueTypeFonts:=False, SaveNativePictureFormat:=False, SaveFormsData _
        :=False, SaveAsAOCELetter:=False

    Set fs = CreateObject("Scripting.FileSystemObject")
    If fs.FolderExists(FolderName) Then
        Set f = fs.GetFolder(FolderName)

        Dim iShape As InlineShape
        Set fc = f.Files
        i = 1
        For Each f In fc
            If i <= ActiveDocument.InlineShapes.Count Then
                Set iShape = ActiveDocument.InlineShapes.Item(i)
                iShape.Range.InsertBefore "[[Afbeelding:" + f.Name & "]]"
                i = i + 1
            End If
        Next

        Shell "explorer.exe " + FileName + "_files", vbNormalFocus
    End If
End Sub
=======
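For anyone reading the macro without Word at hand, here is a small Python sketch of the markup that WikiConvertH1 to H5, WikiConvertLists and WikiConvertTables appear to aim for. It is only an illustration of the target output, not a port of the macro. The function names are invented here, and rows are assumed to arrive as plain lists of cell strings.

```python
from typing import Sequence


def heading(text: str, level: int) -> str:
    """Word 'Heading N' becomes N+1 equals signs on each side, as in the macro."""
    marks = "=" * (level + 1)
    return marks + text + marks


def list_item(text: str, level: int, numbered: bool = False) -> str:
    """One marker per nesting level ('*' for bullets, '#' otherwise), then a space."""
    marker = "#" if numbered else "*"
    return marker * level + " " + text


def table_to_wiki(rows: Sequence[Sequence[str]]) -> str:
    """Build a wiki table: '{|' opens it, '|-' starts a row, '||' separates cells."""
    lines = ["{| border='1'"]
    for row in rows:
        lines.append("|-")
        lines.append("|" + "||".join(row))
    lines.append("|}")
    return "\n".join(lines)
```

For example, table_to_wiki([["a", "b"], ["c", "d"]]) yields a two-row table whose first row reads |a||b, the same shape the macro produces by putting "|" before every cell and then converting the table to text with "|" as the separator.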
Here is a tool (GPL, haven't tested it) that claims to be able to convert MS Word -> Mediawiki.
http://www.pcwelt.de/downloads/heft-cd/12-05/123808/
HTH.
Mathias