Hello. There are many different archives available for download from Wikipedia, so I am not sure which one to get.
Please tell me which one has a list of page titles and their categories.
For example:
Sit-Verb Cat-Noun-Animal Him-Pronoun Dog-Noun-Animal Happy-Adjective Washington-Noun-Place Metallica-Noun-Band
What is the URL to the archive that I need to download to get this kind of information?
I don't even care what format it is in. XML, CSV, SQL, I don't care which as long as it's parse-able.
I don't know the URL off the top of my head, but you'll probably want to download the database dump (which is in SQL format). In there is a 'categorylinks' table which lists all category associations.
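If the goal is only to get page/category pairs out of the dump, the SQL file can also be read directly without a MySQL server. The following is a rough Python sketch, not a full SQL parser. It assumes the dump stores its rows as `INSERT INTO ... VALUES (...),(...);` lines and that the first two columns of each row are `cl_from` (page ID) and `cl_to` (category name); that column order is an assumption. The names `parse_rows` and `category_pairs` are made up for this example.

```python
def parse_rows(line):
    """Yield each parenthesised row of an INSERT ... VALUES line as a list of strings."""
    start = line.find(" VALUES ")
    if not line.startswith("INSERT INTO") or start == -1:
        return
    i = start + len(" VALUES ")
    n = len(line)
    row, field, in_str, in_row = [], [], False, False
    while i < n:
        ch = line[i]
        if in_str:
            if ch == "\\" and i + 1 < n:
                # Simplification: keep the escaped character literally (\' becomes ').
                field.append(line[i + 1])
                i += 1
            elif ch == "'":
                in_str = False
            else:
                field.append(ch)
        elif ch == "'":
            in_str = True
        elif ch == "(" and not in_row:
            in_row, row, field = True, [], []
        elif ch == "," and in_row:
            row.append("".join(field))
            field = []
        elif ch == ")" and in_row:
            row.append("".join(field))
            yield row
            in_row = False
        elif in_row:
            field.append(ch)
        i += 1


def category_pairs(lines):
    """Yield (page_id, category) pairs, assuming cl_from and cl_to are the first two columns."""
    for line in lines:
        for row in parse_rows(line):
            yield int(row[0]), row[1]
```

A file object can be passed straight in, for example `category_pairs(open(path, encoding="utf-8", errors="replace"))`, so the dump is processed one line at a time. Note that `cl_from` is a numeric page ID, so turning it into a page title would need a second lookup against the page table dump.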
Roan Kattouw (Catrope)
Sorry, I had to delete your message to reply. It kept saying it was too long and that I needed to "fix that" or something.
But anyway, how do I import the file? When I try to import it I get the error
ERROR 1064 (42000) at line 12: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(14) NOT NULL, UNIQUE KEY `cl_from` (`cl_from`,`cl_to`), KEY `cl_timestamp` ' at line 5
and obviously I can't open the file in a text editor since it's way too big (over 700 megabytes).
Please help!
On Sun, May 11, 2008 at 9:28 AM, Azu AzuMao@gmail.com wrote:
and obviously I can't open the file in a text editor since it's way too big (over 700 megabytes).
Any decent text editor will be able to open arbitrarily large text files. If you're on Windows, try something like Notepad++. On Linux, the default editors seem to work fine for very large files, or at least gedit does.
You should, of course, also be able to import it to MySQL, but that's evidently trickier.
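If no editor at hand copes with the file, the lines around the reported error (line 12 of the dump) can be inspected without loading the whole file. This is a small Python sketch; `peek_lines` and its parameters are hypothetical names for this example, and the file is assumed to be UTF-8 text.

```python
from itertools import islice


def peek_lines(path, first=1, last=20, width=200):
    """Return lines first..last (1-based, inclusive), each cut to `width` characters."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # islice skips ahead lazily, so only the requested lines are kept in memory.
        return [line.rstrip("\n")[:width] for line in islice(handle, first - 1, last)]
```

Calling `peek_lines(r"E:\commonswiki-20080425-categorylinks.sql", 1, 20)` should show the `CREATE TABLE` statement that the error points at. Individual lines are truncated to `width` characters because the `INSERT` lines in such dumps can be very long.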
Thanks, that doesn't help though. Even if I do get it open in a text editor, you haven't told me how to actually fix it lol.
I've tried importing it directly from the command line (J:\MySQL\bin\mysql.exe -B -n -q --force -u root --password=XXXXXX<E:\commonswiki-20080425-categorylinks.sql) and I also tried importing it with SQLyog, and with Navicat.
I get this same error no matter what I try, and I don't know how to fix it. Please help.
On Sun, May 11, 2008 at 11:28 PM, Azu AzuMao@gmail.com wrote:
ERROR 1064 (42000) at line 12: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(14) NOT NULL, UNIQUE KEY `cl_from` (`cl_from`,`cl_to`), KEY `cl_timestamp` ' at line 5
You're probably using MySQL 5. Wikimedia uses some version of MySQL 4 (I can't recall exactly which) and importing dumps into MySQL 5 can cause problems.
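One possible workaround, sketched in Python under a loud assumption: the `(14)` that the server rejects is a display width on a `timestamp` column in the `CREATE TABLE` statement. The error text only shows the fragment after the type name, so this is a guess and not a confirmed diagnosis. The sketch copies the dump to a new file and drops such widths from non-`INSERT` lines; the function names are made up for this example.

```python
import re

# Matches "timestamp(14)" and similar; works on bytes so row data is copied untouched.
_TS_WIDTH = re.compile(rb"\b(timestamp)\s*\(\d+\)", re.IGNORECASE)


def strip_timestamp_width(line):
    """Remove a display width after `timestamp` on schema lines; leave INSERT lines alone."""
    if line.startswith(b"INSERT INTO"):
        return line
    return _TS_WIDTH.sub(rb"\1", line)


def rewrite_dump(src_path, dst_path):
    """Stream src_path to dst_path line by line, patching only the schema lines."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            dst.write(strip_timestamp_width(line))
```

The rewritten file can then be fed to the same `mysql.exe` command as before. If the import still fails at the same place, the assumption above was wrong and the column definition on line 12 of the dump needs a closer look.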
I think you want Wiktionary rather than Wikipedia. The appropriate link would be http://download.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-...
wikitech-l@lists.wikimedia.org