I'd like to import a public database into my MediaWiki, where each *row* of the database corresponds to a new article.
Here's a simple example to clarify:
ID    Name  Height  Weight
Row1  fred  150     100
Row2  bill  210     55
After importing this DB I'd have two new articles (titled "fred" and "bill"):
- the fred article has an infobox with parameters 150 and 100
- the bill article has an infobox with parameters 210 and 55
My databases have thousands of entries, so an automated approach is necessary. Has this been done before? Does anyone have any ideas?
Thanks.
Hi,
I didn't see an easy way to do this in the FAQ, but here is my situation:
I would like to import users from another SQL database into the MySQL database for one of my wiki sites.
I already have a query that pulls the usernames and passwords from the SQL database into a text file, but I'm not sure how to properly insert them into the wiki DB as normal users or sysops. Ideally only a few of these users will have sysop permissions and the rest will be regular users.
Also, when a password is changed in the SQL DB, I will need to change it in the wiki DB (I will have to code the condition that compares passwords, of course). When a user is deactivated in the SQL DB (another condition I will have to code, checking whether that user exists in the wiki user table), I would like to scramble that user's password, since you can't simply delete a user. I would also like to automate this process on a nightly basis.
Is this at all possible? We can definitely code this if it is, since we have a couple of developers in house.
Thanks in advance for any assistance.
-Isaac
It's all possible. Users and user rights are in the user and user_groups tables. Google for "MediaWiki md5 password" to find out the correct means of hashing a password, and remember that group names in the user_groups table are all lowercase.
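As a rough sketch of what that involves, assuming a MediaWiki 1.5-era schema, no table prefix, and the salted form md5("<user_id>-<md5(password)>") that $wgPasswordSalt enables; verify the hash format against User.php on your version, and the table and column names against your own install, before trusting it with real accounts:

  # Rough sketch only: insert a user (optionally as sysop) into MediaWiki's tables.
  # Hash scheme and schema details are assumptions; check them against your wiki.
  import hashlib
  import MySQLdb  # assumes the MySQLdb driver is available

  def wiki_password_hash(user_id, password):
      # assumed MediaWiki 1.5-era salted form: md5("<user_id>-<md5(password)>")
      inner = hashlib.md5(password.encode("utf-8")).hexdigest()
      return hashlib.md5(("%d-%s" % (user_id, inner)).encode("utf-8")).hexdigest()

  def add_wiki_user(db, name, password, sysop=False):
      cur = db.cursor()
      # MediaWiki stores usernames with an uppercase first letter
      cur.execute("INSERT INTO user (user_name, user_password) VALUES (%s, '')",
                  (name[:1].upper() + name[1:],))
      user_id = cur.lastrowid
      cur.execute("UPDATE user SET user_password = %s WHERE user_id = %s",
                  (wiki_password_hash(user_id, password), user_id))
      if sysop:
          # group names in user_groups are all lowercase, as noted above
          cur.execute("INSERT INTO user_groups (ug_user, ug_group) VALUES (%s, 'sysop')",
                      (user_id,))
      db.commit()

The nightly sync and password-scrambling logic would then be a matter of comparing your exported text file against the user table and calling an UPDATE in the same style.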
Rob Church
On 08/12/05, HumanCell .org humancell@gmail.com wrote:
> I'd like to import a public database into my MediaWiki, where each *row* of the database corresponds to a new article.
Well, since I just mentioned the XML import format in another posting (there's a schema and an example in the 'docs/' directory of the source), here's a proposal using that:
1) write a script in your language of choice that queries the database and formats each row into something along the lines of:

   <page><title>$name</title><revision><text>{{my infobox|height=$height|weight=$weight}}</text></revision></page>

   [this isn't actually valid as it stands, e.g. you need a timestamp, contributor etc. in the revision, but these could be constants in your program; a rough sketch of such a script follows after this list]
2) create the page [[Template:My infobox]] (or whatever you actually want to call it) with appropriate stuff
3) go to Special:Import while logged in as a sysop, and import your new XML file. Ta-da! A whole lotta new articles!
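As a concrete version of step 1, here is a minimal Python sketch. The source table and column names (people, name, height, weight), the connection details, and the fixed timestamp and contributor are placeholder assumptions, and the exact set of elements a <revision> needs should be checked against the schema in docs/:

  # Minimal sketch: dump each row of a hypothetical "people" table as a <page>
  # in MediaWiki's XML import format. Adjust columns, credentials and the
  # required <revision> fields to match your data and the export-0.3 schema.
  from xml.sax.saxutils import escape
  import MySQLdb  # assumes the source database is MySQL and MySQLdb is installed

  db = MySQLdb.connect(host="localhost", user="me", passwd="secret", db="mydata")
  cur = db.cursor()
  cur.execute("SELECT name, height, weight FROM people")

  with open("import.xml", "w") as out:
      out.write('<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/"'
                ' version="0.3" xml:lang="en">\n')
      for name, height, weight in cur.fetchall():
          out.write("  <page>\n")
          out.write("    <title>%s</title>\n" % escape(name))
          out.write("    <revision>\n")
          out.write("      <timestamp>2005-12-09T00:00:00Z</timestamp>\n")
          out.write("      <contributor><username>Importer</username></contributor>\n")
          out.write('      <text xml:space="preserve">{{my infobox|height=%s|weight=%s}}</text>\n'
                    % (escape(str(height)), escape(str(weight))))
          out.write("    </revision>\n")
          out.write("  </page>\n")
      out.write("</mediawiki>\n")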
Alternatively, of course, you could write something that read from one database and wrote to the other, but this way has the advantage that most of the leg-work is done by existing MediaWiki code, and you don't have to work out how. Except, unfortunately, it won't update the tables used by "whatlinkshere" and co - see http://bugzilla.wikimedia.org/show_bug.cgi?id=2483 (they'll be updated as soon as you edit each page, probably even if you do a "null edit" that doesn't actually save any changes)
-- Rowan Collins BSc [IMSoP]
If you happen to pick Python, you could grab the Python robot framework from http://pywikipediabot.sourceforge.net, tweak it, and then run touch.py -file:file, where the file contains a list of the imported article names - this too could be generated by your script.
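For instance, a small standalone helper could produce that file from the same source database; the [[Title]]-per-line format shown is an assumption, so check what -file: actually expects in your copy of the bot:

  # Hypothetical helper: write one wiki-linked title per line for touch.py's
  # -file: option (verify the expected format against the bot's documentation).
  import MySQLdb

  db = MySQLdb.connect(host="localhost", user="me", passwd="secret", db="mydata")
  cur = db.cursor()
  cur.execute("SELECT name FROM people")
  with open("titles.txt", "w") as titles:
      for (name,) in cur.fetchall():
          titles.write("[[%s]]\n" % name)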
There's also a bot for Perl, but you may just want to take the lazier option and run the maintenance script for rebuilding link tables.
Rob Church
> 3) go to Special:Import while logged in as a sysop, and import your
> new XML file. Ta-da! A whole lotta new articles!
That sounds pretty interesting, Rowan. So there only needs to be one XML file containing all the rows, and with one call to Special:Import multiple articles are generated. Neat.
> Alternatively, of course, you could write something that read from one database and wrote to the other,
Want to avoid that if possible. So your first suggestion sounds better.
Rob's suggestion looks quite interesting too - I'm also looking for a tool for data validation (to ensure the infobox fields contain valid data), so the bot might be quite useful for that as well.
Thanks for the suggestions...am going to try them out...
Hi,
I am new here and just tried importing a page (following Rob's suggestions):
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.3/ http://www.mediawiki.org/xml/export-0.3.xsd"
           version="0.3" xml:lang="en">
  <page>
    <title>JUST A TEST</title>
    <id>1287</id>
    <revision>
      <id>1288</id>
      <timestamp>2005-12-09T12:42:38Z</timestamp>
      <contributor><ip>172.22.68.72</ip></contributor>
      <text xml:space="preserve">testing1 testing2 testing3 testing4 testing5 testing6 testing7</text>
    </revision>
  </page>
</mediawiki>
I guess we need to take care of the IDs...
It works very well using the importDump.php script.
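For reference, a typical invocation, assuming the file above is saved as test.xml and the command is run from the wiki's maintenance/ directory:

  php importDump.php < test.xml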
You can search for the title in the search box but not for the text; is it possible to add the text to the search capabilities?
Thanks Leon
On 09/12/05, Leon Goldovsky leongo@ebi.ac.uk wrote:
> You can search for the title in the search box but not for the text; is it possible to add the text to the search capabilities?
Again, this needs the article to be "touched" in some way, probably by making a null edit, and probably using a bot, as discussed above.
The importDump.php and Special:Import really *ought* to have a way of dealing with such things, but they don't - partly, I think, because unless there's a way of *deferring* such touches, it could amount to a pretty major chunk of processing (for, say, an entire DB dump).
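If touch.py doesn't fit, a minimal null-edit pass with the same pywikipedia framework might look like the sketch below; the module and method names are from the classic bot framework and could differ between versions, and titles.txt is assumed to hold one page title per line:

  # Sketch of a "null edit" pass with the classic pywikipedia framework:
  # re-saving each page unchanged prompts MediaWiki to rebuild its link and
  # search entries for that page.
  import wikipedia  # pywikipediabot's core module (assumed to be on the path)

  site = wikipedia.getSite()
  for line in open("titles.txt"):
      title = line.strip()
      if not title:
          continue
      page = wikipedia.Page(site, title)
      try:
          page.put(page.get(), comment="null edit after XML import")
      except wikipedia.Error:
          pass  # skip pages that could not be fetched or saved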
-- Rowan Collins BSc [IMSoP]
rebuildAll? rebuildXX?
Rob Church
Thanks for the help!!
It works really well.
Thanks Leon