Hi Jim,
I know I'm getting close on this; just a few more questions:
Like you said: I used the Special:Export page at http://meta.wikimedia.org/wiki/Special:Export and requested some random page name... and presto: XML. Cool!
However, I got a lot of "<namespace key=...>" tags. What are they?
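For reference, here's roughly what I'm seeing inside the <siteinfo> block (abridged):

    <namespaces>
      <namespace key="-1">Special</namespace>
      <namespace key="0" />
      <namespace key="1">Talk</namespace>
      ...
    </namespaces>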
Otherwise, I think I'm ready to do this :-)
And here is another problem along the same lines. Assuming I have a "golden" XML version of a wiki page, is there a maintenance script that will update an existing wiki page based on an externally provided XML file?
continued thanks, - rich
---- Jim Hu jimhu@tamu.edu wrote:
At some point I'm going to clean up some code I've written to make these imports easier...
Here's what I did (caveat - it _seems_ to work for me, but I don't know if I've introduced creeping crud into my wiki):
I used Special:Export to generate a sample XML file. I just lifted the XML headers and basic wiki info (<siteinfo>) from that. Each article is then generated by filling in the information between
<page></page> tags. For a bunch of gene pages that I generate, I have this function:
    function gene_page_xml($gene_page_title, $gene_page_content) {
        global $gene_count;
        // The timestamp must be ISO 8601 in UTC ("Zulu"); gmdate() avoids
        // local-time offsets, and H (not h) gives 24-hour time.
        $time_stamp = gmdate('Y-m-d\TH:i:s\Z');
        // NB: $gene_page_content is assumed to be XML-safe already; run it
        // through htmlspecialchars() first if it may contain & or <.
        $gene_page  = "<page>\n";
        $gene_page .= "<title>$gene_page_title</title>\n";
        $gene_page .= "<id>0</id>\n";
        $gene_page .= "<revision>\n";
        $gene_page .= "<id>$gene_count</id>\n";
        $gene_page .= "<timestamp>$time_stamp</timestamp>\n";
        $gene_page .= "<contributor><username>Wikientrybot</username><id>2</id></contributor>\n";
        $gene_page .= "<comment>Automated import of articles</comment>\n";
        $gene_page .= "<text xml:space=\"preserve\">";
        $gene_page .= $gene_page_content;
        $gene_page .= "</text>\n</revision>\n</page>\n";
        return $gene_page;
    }
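For what it's worth, a call looks something like this (the title and content here are made up):

    $gene_count = 1;   // used by gene_page_xml() via global
    $xml_body   = gene_page_xml('MyGene', "''MyGene'' encodes a hypothetical widget.");
    $gene_count++;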
I've found that the revision id is irrelevant for importing, but the timestamp has to be after the last revision of an existing page if you're overwriting one that's already there (not an issue for the initial import, but if you're like me you'll import them, realize you want to change the layout, and reimport them). Note that the time is in Zulu (UTC).
I've also done this by making a string for an article XML item template, and then doing str_replace to fill in the variable fields.
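A minimal sketch of that template approach (the placeholder names are just my own convention):

    $template = "<page>\n<title>@TITLE@</title>\n<id>0</id>\n<revision>\n"
              . "<id>@REVID@</id>\n<timestamp>@TIMESTAMP@</timestamp>\n"
              . "<contributor><username>Wikientrybot</username><id>2</id></contributor>\n"
              . "<comment>Automated import of articles</comment>\n"
              . "<text xml:space=\"preserve\">@CONTENT@</text>\n</revision>\n</page>\n";

    // Fill in the variable fields in one pass.
    $page_xml = str_replace(
        array('@TITLE@', '@REVID@', '@TIMESTAMP@', '@CONTENT@'),
        array($title, $rev_id, gmdate('Y-m-d\TH:i:s\Z'), $content),
        $template
    );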
I write the XML header, the gene pages, and the closing tag out to a file, and then call importDump.php.
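In outline, that assembly step looks something like this ($header and $genes are placeholders of my own; adjust to taste):

    // $header holds the <mediawiki ...> opening tag and the <siteinfo>
    // block lifted verbatim from a Special:Export dump; $genes is a
    // hypothetical title => wikitext array.
    $xml = $header;
    foreach ($genes as $title => $content) {
        $xml .= gene_page_xml($title, $content);
        $gene_count++;
    }
    $xml .= "</mediawiki>\n";
    file_put_contents('/tmp/gene_pages.xml', $xml);

    // Then, from the shell:
    //   php maintenance/importDump.php < /tmp/gene_pages.xml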
Hope this helps...and I hope the real experts can correct me if I've misrepresented anything about the xml spec.
Best,
Jim
On Mar 14, 2007, at 9:32 AM, revansx@cox.net revansx@cox.net wrote:
Hmm... I found importDump.php (thanks, Jim) in the maintenance folder, but importTextFile.php is nowhere to be seen (thanks, Rob). I'm working with MediaWiki version 1.6.10; could that be the problem? In the meantime, could someone point me to a document that describes the Article class, and to an example of a basic XML or text file that would be imported, so I know what format my file should take? Sorry to be a high-maintenance noob ;-) I'm sure I'll have more questions before this is over. Thanks.
Hi folks, I'm new to the forum. I'm looking to write a PHP script that generates (from the same server that's running MediaWiki) a new, orphaned wiki page, and also creates the initial page content, as an automated server action.
Take a look at maintenance/importTextFile.php. Rob Church
=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU, Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/mediawiki-l