Hello,
I have a Word file (1200 pages, written within the last 10 years). It is written in a consistent format (like a lexicon), and I would like to import it into a MediaWiki installation. How can I do that?
Any help is highly appreciated
The first step would be to convert it into the plain text you want in the wiki. You could, for instance, dump it all into one wiki page, or split off a new page when reaching some marker, etc. Then you would transform it into wikitext where needed, and finally move it into the wiki with, e.g., maintenance/importTextFile.php.
I know this is generic advice, but it very much depends on what you want to do with your file in the wiki. I hope it enlightens you a bit.
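To make the splitting step concrete, here is a minimal sketch in Python. It assumes the Word file has already been saved as plain text and that every entry starts with a recognizable bold-header marker; the file name and the marker pattern are illustrative only:

    import re

    with open("lexicon.txt", encoding="utf-8") as f:
        text = f.read()

    # Start a new chunk wherever a line begins with a "*Title*:" style header.
    chunks = re.split(r"\n(?=\*[^*\n]+\*:)", text)

    for i, chunk in enumerate(chunks):
        with open(f"entry_{i:04d}.txt", "w", encoding="utf-8") as out:
            out.write(chunk.strip() + "\n")

Each resulting file could then be fed to the maintenance script, e.g. php maintenance/importTextFile.php --title "Some title" entry_0000.txt.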
OpenOffice can open Word files and lets you export text as MediaWiki markup. You have to make sure you are using correct styles in Word, though (headings etc.).
And with the OpenOffice extension http://extensions.services.openoffice.org/project/wikipublisher you can even publish directly from OpenOffice to a live online wiki.
hth Frank
Thank you so much, your emails are a big help to me :)
The problem is that I have written a lexicon, so I have a header and then somewhere from 3 to 20 lines of text. In total I have about 3000 short texts / notions, each of which needs its own wiki page. So I don't want to copy & paste every notion from the Word file into the wiki by hand. Is there a way to automate this, so that a program detects the (bold) header and begins a new wiki page? (A rough sketch of what I mean is below, after the examples.)
Here is an example of what my Word file looks like:
*Lorem ipsum dolor sit amet*: consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
*Lorem ipsum dolor sit amet*: consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
*Lorem ipsum dolor sit amet*: consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
*Lorem ipsum dolor sit amet*: consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
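For illustration, a pattern along these lines would recognize the entries once the file is plain text (a minimal Python sketch; the header pattern is an assumption based on the examples above and would need adjusting to whatever the export actually produces):

    import re

    # Matches "*Header*: body"; the body runs until the next header or EOF.
    ENTRY = re.compile(r"\*\s*([^*\n]+?)\s*\*:\s*(.*?)(?=\n\*[^*\n]+\*:|\Z)", re.S)

    with open("lexicon.txt", encoding="utf-8") as f:
        entries = [(m.group(1), m.group(2).strip())
                   for m in ENTRY.finditer(f.read())]

    print(len(entries))  # should be around 3000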
Here is the published file:
http://www2.uni-siegen.de/~merk/downloads/finanzbegriffe.pdf
AFAIK most import scripts map one file to one wiki page.
But you could use MediaWiki's XML import/export format for multipage imports from one file (http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps). Converting your Word content to XML might be a bit tougher but should be manageable. http://slash4.de/tutorials/Automatic_mediawiki_page_import_powershell_script might give you some pointers.
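For what it's worth, a minimal sketch of that generation step in Python, using ElementTree so the character escaping is handled automatically (this is a simplified skeleton; compare it against a real export from your own wiki, which carries more attributes and metadata):

    import xml.etree.ElementTree as ET

    def build_import_xml(entries, filename="import.xml"):
        # entries: list of (title, wikitext) pairs
        root = ET.Element("mediawiki")
        for title, wikitext in entries:
            page = ET.SubElement(root, "page")
            ET.SubElement(page, "title").text = title
            revision = ET.SubElement(page, "revision")
            ET.SubElement(revision, "text").text = wikitext
        ET.ElementTree(root).write(filename, encoding="utf-8",
                                   xml_declaration=True)

    build_import_xml([("Lorem ipsum", "consetetur sadipscing elitr ...")])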
hth Frank
Another option for such structured texts might be http://semantic-mediawiki.org
hth Frank
Can you remove me from this mailing list please?
Thanks.
Hello! If your delimiter is a double newline, I would suggest using awk or Google Refine to convert the data to XML. With Google Refine it should be easier, since that tool provides an instant preview while configuring the output.
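(The same split is also a one-liner in Python, if that is more convenient:)

    import re

    with open("lexicon.txt", encoding="utf-8") as f:
        # One record per blank-line-separated block (the double-newline delimiter).
        records = re.split(r"\n{2,}", f.read().strip())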
Try this: get to know the XML export available under the special pages (Special:Export). Export at least two pages you have created.
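Stripped down, such an export looks roughly like this (the real file contains more metadata, e.g. a <siteinfo> section, namespaces, revision ids and timestamps):

    <mediawiki xml:lang="en">
      <page>
        <title>Lorem ipsum</title>
        <revision>
          <text>consetetur sadipscing elitr ...</text>
        </revision>
      </page>
      <page>
        ...
      </page>
    </mediawiki>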
First follow the advice to use the OpenOffice MediaWiki export to preserve as much formatting as possible.
This may help, or not: http://biowikifarm.net/meta/Converting_Word_to_Mediawiki_text
Then prepare your word-processor file: replace & with &amp;, > with &gt; and < with &lt;.
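(In Python, for instance, the standard library does this replacement in one go:)

    from xml.sax.saxutils import escape

    escaped = escape(entry_text)  # handles &, < and > at once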
Now replace the "article" boundaries with the stuff you find between the pages in your XML export.
The catch is to give each page the right title. If your base structure is well defined, and you can reliably detect the end of one entry and the start of the next title, and the end of a title and the start of the page content, you should be able to do it.
Add the start and end of the export XML.
Save as plain text. Use an XML validator to check that you did it right. This is crucial...
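That check can be as simple as parsing the file once, e.g. in Python (this only proves the XML is well-formed, not that MediaWiki will accept everything in it):

    import xml.etree.ElementTree as ET

    try:
        ET.parse("import.xml")  # raises ParseError on malformed XML
        print("import.xml is well-formed")
    except ET.ParseError as err:
        print("XML error:", err)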
Then try to import a small subset first.
This may help - or not: http://biowikifarm.net/meta/Mediawiki_XML_page_importing
good luck!
Gregor
Hi
Maybe this extension is useful to you.