I'm happy to announce a new mirror for datasets other than the XML dumps.
This mirror comes to us courtesy of the Center for Research Computing,
University of Notre Dame, and covers everything under "other" [1], which includes
such goodies as Wikidata entity dumps, pageview counts, titles of all files
on each wiki (daily), titles of all articles of each wiki (daily), and the
so-called "adds-changes" dumps, among other things. You can access it at
http://wikimedia.crc.nd.edu/other/ so please do!
Ariel
[1] https://dumps.wikimedia.org/other/
Hi,
We were trying to run MediaWiki Dumper (mwdumper) and got this error:
ERROR 1064 (42000) at line 127478: You have an error in your SQL syntax;
check the manual that corresponds to your MySQL server version for the
right syntax to use near ''{{Orphan|date=April 2014}}\n\nThe \'\'\'lung
float test\'\'\', also called the ' at line 1
This is the command we ran:
java -server -classpath \
  /data/servers/data_load/lib/commons-compress.jar:/data/servers/data_load/lib/mwdumper.jar \
  org.mediawiki.dumper.Dumper --format=sql:1.5 \
  /data/servers/data_load/en/20160801/enwiki-20160801-pages-articles.xml.bz2 \
  | /usr/bin/mysql --max_allowed_packet=1G --default-character-set=utf8 \
  --force -h 127.0.0.1 -uXXXXXXX -pXXXXXXX -P 3306 -D wikimirror_en
The MySQL version we are using is 5.6.22.
Do you know how to fix this issue?
"lbzip2 -t
/public/dumps/public/wikidatawiki/20160801/wikidatawiki-20160801-pages-articles.xml.bz2"
succeeds for me on Labs. You should compare the checksum of your copy
with
https://dumps.wikimedia.org/wikidatawiki/20160801/wikidatawiki-20160801-sha…
(says c6a823508240d161e481e5d0045921421a9db5f1
wikidatawiki-20160801-pages-articles.xml.bz2 ).
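As a rough illustration of that comparison, here is a minimal Python sketch that hashes a local copy in chunks and checks it against a line in the "<sha1>  <filename>" layout quoted above. The helper names (sha1_of_file, matches_checksum_line) and the local path are hypothetical; only the hash value and the file name come from the message.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_line(path, line):
    """Compare a local file against one '<sha1>  <filename>' checksum line."""
    expected_hash, expected_name = line.split(None, 1)
    if os.path.basename(path) != expected_name.strip():
        raise ValueError("checksum line is for a different file")
    return sha1_of_file(path) == expected_hash.lower()


if __name__ == "__main__":
    # Hash and file name as quoted in the message; the local location is assumed.
    line = ("c6a823508240d161e481e5d0045921421a9db5f1  "
            "wikidatawiki-20160801-pages-articles.xml.bz2")
    local_copy = "wikidatawiki-20160801-pages-articles.xml.bz2"
    if os.path.exists(local_copy):
        ok = matches_checksum_line(local_copy, line)
        print("checksum OK" if ok else "checksum MISMATCH")
```

Reading in fixed-size chunks keeps memory use flat even for multi-gigabyte dump files.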
Nemo
Hi. I've run across an error while downloading dumps that I haven't
seen before (I've been downloading dumps regularly for over 5 years).
I've done some research on it, but haven't found much information.
It's a bit random, and from what I can tell, it's not failing for
client-side reasons. I'm a bit stumped, so I'm reaching out for any
help.
I provide more details below. My quick questions are:
* Has anyone else run across this error recently? Or just random
failed downloads? I first received reports of it on July 29th, and
encountered it myself several times today.
* Has anything changed recently on the Wikimedia dump server side that
might require TLSv1? Or anything similar?
Any detail or feedback would be helpful.
Thanks.
----
I'm downloading XML data dumps from https://dumps.wikimedia.org
through XOWA. Recently, random downloads fail with a
javax.net.ssl.SSLException: "SSL peer shut down incorrectly"
Some details:
* This seems to happen more often when downloading large files. I ran
across it five times today on an openSUSE box. Each file was over 5
GB.
** https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-pages…
** https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image…
** https://dumps.wikimedia.org/enwiki/20160801/enwiki-20160801-pages-articles.…
** https://dumps.wikimedia.org/enwiki/20160801/enwiki-20160801-pagelinks.sql.gz
* In addition, I know of one other person who also encountered this
error several times. This person was downloading Polish Wikipedia, and
encountered it on both Windows and Ubuntu. I believe they were in a
different part of the world from me (Poland vs. the East Coast of the
United States).
** https://dumps.wikimedia.org/plwiki/20160801/plwiki-20160801-pages-articles.…
* This error does not occur when initiating the connection (at
handshake). It occurs sometime during the download of the file (for
example, at the 85% mark). The actual percentage appears to vary
(i.e., not always at the 1 GB mark).
Some other observations:
* Restarting the download for the file seems to work, but that may
have just been "luck".
* I've downloaded other 5+ GB files without problems. For example,
https://dumps.wikimedia.org/wikidatawiki/20160801/wikidatawiki-20160801-pag…
. Again, this may have been just "luck"
* I've downloaded other "smaller" files such as the pageprops file
without problems.
* I've tried running the Java application with
"-Djavax.net.debug=all". This provides a lot of info, but nothing
particularly interesting. It seems to confirm that the server requires
TLSv1 (and not TLSv1.1 or TLSv1.2). Furthermore, I haven't been able
to reproduce the error while this debug flag is set.
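Since restarting the download seemed to work, one possible client-side workaround is to resume from the bytes already on disk instead of starting over. The following is only a sketch under assumptions: it assumes the server honours HTTP Range requests (not confirmed in this thread), it is written in Python for brevity rather than as XOWA code, and the function name download_with_resume and its parameters are hypothetical. It does not explain or fix the underlying SSL error.

```python
import http.client
import os
import time
import urllib.request


def download_with_resume(url, dest, max_attempts=5, chunk_size=1 << 20,
                         pause=5.0, opener=urllib.request.urlopen):
    """Download url to dest, resuming from the partial file after a dropped connection.

    Hypothetical helper; `opener` is injectable so the logic can be exercised offline.
    """
    for attempt in range(1, max_attempts + 1):
        offset = os.path.getsize(dest) if os.path.exists(dest) else 0
        request = urllib.request.Request(url)
        if offset:
            # Ask only for the bytes we do not have yet.
            request.add_header("Range", "bytes=%d-" % offset)
        try:
            with opener(request) as response:
                # 206 means the Range header was honoured; otherwise start over.
                resumed = offset > 0 and response.status == 206
                length = response.headers.get("Content-Length")
                expected = None
                if length is not None:
                    expected = int(length) + (offset if resumed else 0)
                with open(dest, "ab" if resumed else "wb") as out:
                    for chunk in iter(lambda: response.read(chunk_size), b""):
                        out.write(chunk)
            # A connection can also end early without raising, so check the size.
            size = os.path.getsize(dest)
            if expected is None or size >= expected:
                return size
        except (OSError, http.client.HTTPException) as exc:
            # ssl.SSLError and urllib.error.URLError are both OSError subclasses.
            print("attempt %d failed: %s" % (attempt, exc))
        if attempt < max_attempts:
            time.sleep(pause)
    raise RuntimeError("download did not complete after %d attempts" % max_attempts)
```

The sketch does not handle a file that is already complete before the first attempt, and a matching size does not prove the content is intact. Comparing the finished file against the published SHA-1 sums remains the real check.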