Forwarded from Foundation-l:
--- jellings jellings@gmail.com wrote:
Date: Wed, 16 Feb 2005 10:02:24 -0500 From: jellings jellings@gmail.com To: foundation-l@wikimedia.org Subject: [Foundation-l] Import question
Hi
Sorry in advance if this is the wrong list to post this to.
I would like to know about importing csv files into an existing wiki.
I have tried to find the documentation for doing so but have not had much luck.
If anyone could point me in the right direction I would appreciate it.
Thanks in advance.
-- john
To clarify on my original post:
I have a text file of dictionary terms - word, definition plus another file of word, synonyms - since I already have it in this format it would be a lot easier to import it.
I CAN reformat the file, but to enter it manually would take too long (there are over 20,000 entries in the dictionary file alone, more in the synonyms).
I do not have a complete understanding of the database schema so the answer may be obvious to others; any insight would be helpful.
jellings wrote:
To clarify on my original post:
I have a text file of dictionary terms - word, definition plus another file of word, synonyms - since I already have it in this format it would be a lot easier to import it.
Take a look at maintenance/importUseModWiki.php and change the importer part to read from your format.
-- brion vibber (brion @ pobox.com)
jellings wrote:
To clarify on my original post:
I have a text file of dictionary terms - word, definition plus another file of word, synonyms - since I already have it in this format it would be a lot easier to import it.
I CAN reformat the file, but to enter it manually would take too long (there are over 20,000 entries in the dictionary file alone, more in the synonyms).
I do not have a complete understanding of the database schema so the answer may be obvious to others; any insight would be helpful.
Now that you have clarified your intention it is evident that this is a Wiktionary matter.
This might be raised in [[Wiktionary:Beer parlour]], but I suspect that it would not receive a warm reception. While the words which are not yet in Wiktionary might be welcomed, there would remain the question of how you propose to reconcile your contributions with the words that we already have. To the extent that any upload may be possible, it should also be done at a rate that is slow enough to allow others to ask questions about each of your contributions. Perhaps by first entering a small sample of your files you could develop the confidence of the Wiktionary community that your contributions are acceptable.
Ec (en:Wiktionary sysop)
Ray Saintonge wrote:
Now that you have clarified your intention it is evident that this is a Wiktionary matter.
This might be raised in [[Wiktionary:Beer parlour]], but I suspect that it would not receive a warm reception. While the words which are not yet in Wiktionary might be welcomed, there would remain the question of how you propose to reconcile your contributions with the words that we already have. To the extent that any upload may be possible, it should also be done at a rate that is slow enough to allow others to ask questions about each of your contributions. Perhaps by first entering a small sample of your files you could develop the confidence of the Wiktionary community that your contributions are acceptable.
Here is a sample (pretty straightforward):
Definition table:
D000092|Acetohexamide|A sulfonylurea hypoglycemic agent that is metabolized in the liver to 1-hydrohexamide.
Synonym table:
D000092|D000092
D000092|D000093
D000092|D000094
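A minimal sketch of how the two sample tables could be read and joined before any import. It assumes one pipe-delimited record per line and that the second column of the synonym table is the ID of another entry in the definition table. The function names are illustrative only and are not part of MediaWiki or any bot.

```python
from collections import defaultdict


def parse_definitions(lines):
    """Map entry ID -> (term, definition) from 'ID|term|definition' lines."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first two pipes only, so a '|' inside the
        # definition text does not break the record.
        entry_id, term, definition = line.split("|", 2)
        entries[entry_id] = (term, definition)
    return entries


def parse_synonyms(lines):
    """Map entry ID -> list of synonym IDs from 'ID|synonym ID' lines."""
    synonyms = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry_id, synonym_id = line.split("|", 1)
        if synonym_id != entry_id:  # skip rows that point an entry at itself
            synonyms[entry_id].append(synonym_id)
    return dict(synonyms)


definitions = parse_definitions([
    "D000092|Acetohexamide|A sulfonylurea hypoglycemic agent that is "
    "metabolized in the liver to 1-hydrohexamide.",
])
synonyms = parse_synonyms([
    "D000092|D000092",
    "D000092|D000093",
    "D000092|D000094",
])

term, definition = definitions["D000092"]
print(term, "->", synonyms["D000092"])
```

Keeping the parsing separate from the upload step means the same records could feed either an adapted maintenance script or a bot input file.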
Ray Saintonge wrote:
Now that you have clarified your intention it is evident that this is a Wiktionary matter.
This might be raised in [[Wiktionary:Beer parlour]], but I suspect that it would not receive a warm reception. While the words which are not yet in Wiktionary might be welcomed, there would remain the question of how you propose to reconcile your contributions with the words that we already have. To the extent that any upload may be possible, it should also be done at a rate that is slow enough to allow others to ask questions about each of your contributions. Perhaps by first entering a small sample of your files you could develop the confidence of the Wiktionary community that your contributions are acceptable.
Ec (en:Wiktionary sysop)
Hoi, This is a great example of en:wiktionary assuming that it alone is being addressed whenever dictionary terms are mentioned. It also assumes that its practices are the same on all wiktionaries. It is only one wiktionary, and they are not.
There are several wiktionaries that welcome quality content and load it in bulk onto their pages with a bot. We even apply for temporary bot status so that the content can be uploaded with a personalised bot (good for the GFDL requirements). This is a great way of beefing up the content of a project. Certainly when, as often happens, the content is either specialised or comes with translations to a specific language, it can be an extremely worthwhile addition to the wiktionary.
The argument that the upload should be slow enough for the community to ask questions is based on what, fear? There is already plenty of questionable content on en:, such as misspelled words; this is spelled out, but hey, it is a dictionary and content SHOULD be correct. So when you argue "let us know what the content is about", I could agree. When you say "do it slowly, as we want to look into every addition", I would say that this is not feasible using a bot.
Thanks, GerardM
Hoi, There is a bot that will try to import the data into wiktionary. It is part of the pywikipedia bot. Under optimal conditions it works fine; if you run a Windows variety, it must be Windows XP, as this has the best handling of "funny" characters. The bot tries to cater for UTF-8 but it does not work perfectly. Andre Engels, the writer of the bot, cannot find where the current problems are, so we are rather desperate for a fix :(.
To get it to work, your file must contain a start and stop string for each article. The data will be imported as is into wiktionary. The bot will only import an article if nothing exists under that title yet, so each entry must be completely new. To be "nice" to the other users, it is best to run this under a profile with bot status. :)
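The file layout described above could be generated along these lines. This is only a sketch: the marker strings `{{-start-}}` and `{{-stop-}}`, the bold first-line title, and the function name `to_upload_text` are assumptions for illustration. Use whatever markers the version of the bot you run actually expects.

```python
START_MARKER = "{{-start-}}"  # assumed marker; check your bot's settings
STOP_MARKER = "{{-stop-}}"    # assumed marker; check your bot's settings


def to_upload_text(entries):
    """Wrap each (title, body) pair in start/stop markers, one block per article."""
    blocks = []
    for title, body in entries:
        # Assumed convention: the article title is the bold first line.
        blocks.append(f"{START_MARKER}\n'''{title}'''\n{body}\n{STOP_MARKER}")
    return "\n".join(blocks) + "\n"


text = to_upload_text([
    ("Acetohexamide",
     "A sulfonylurea hypoglycemic agent that is metabolized in the liver "
     "to 1-hydrohexamide."),
])
print(text)
```

Given the UTF-8 problems mentioned above, it is probably safest to save the resulting text with an explicit UTF-8 encoding and to try a handful of entries with non-ASCII characters before attempting the full set.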
I am happy to produce a file that is ready for upload for you from what you have. I would also do the upload itself once the bot works 100%. I upload under different user names in order to state where the content originated (and comply with the GFDL in the process).
Thanks, GerardM
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
wikitech-l@lists.wikimedia.org