Hi. I've run across an error while downloading dumps, which I've not seen before. (I've been downloading dumps regularly for over 5 years). I've done some research on it, but haven't found much information. It's a bit random, and from what I can tell, it's not failing for client-side reasons. I'm a bit stumped, so I'm reaching out for any help.
I provide more details below. My quick questions are:
* Has anyone else run across this error recently? Or just random failed downloads? I first received reports of it on July 29th, and encountered it myself several times today.
* Has anything changed recently on the Wikimedia dump server side that might require TLSv1? Or anything similar?
Any detail or feedback would be helpful.
Thanks.
----
I'm downloading XML data dumps from https://dumps.wikimedia.org through XOWA. Recently, downloads have been failing at random with a javax.net.ssl.SSLException: "SSL peer shut down incorrectly".
Some details:
* This seems to happen more often when downloading large files. I ran across it five times today on an openSUSE box. Each file was over 5 GB.
** https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-pages-...
** https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
** https://dumps.wikimedia.org/enwiki/20160801/enwiki-20160801-pages-articles.x...
** https://dumps.wikimedia.org/enwiki/20160801/enwiki-20160801-pagelinks.sql.gz
* In addition, I know of one other person who also encountered this error several times. This person was downloading Polish Wikipedia, and encountered it on both Windows and Ubuntu. I believe they were in a different part of the world (Poland vs East Coast United States)
** https://dumps.wikimedia.org/plwiki/20160801/plwiki-20160801-pages-articles.x...
* This error does not occur when initiating the connection (at handshake). It occurs sometime during the download of the file (for example, at the 85% mark). The actual percentage appears to vary (i.e.: not always at the 1 GB mark)
Some other observations:
* Restarting the download for the file seems to work, but that may have just been "luck"
* I've downloaded other 5+ GB files without problems. For example, https://dumps.wikimedia.org/wikidatawiki/20160801/wikidatawiki-20160801-page... . Again, this may have been just "luck"
* I've downloaded other "smaller" files such as the pageprops file without problems.
* I've tried running the Java application with "-Djavax.net.debug=all". This provides a lot of info, but nothing particularly interesting. It seems to confirm that the server requires TLSv1 (and not TLSv1.1 or TLSv1.2). Furthermore, I haven't been able to reproduce the error while this debug flag is running.
Hi. Two more points:
* It looks like none of my files worked correctly on the 2nd try. (I was misreading the log files when I thought they worked the 2nd time). I'm going to try multiple times, as well as with other download mechanisms (wget, browser download). Someone has reported that it worked the 3rd time with wget, so maybe persistence is the key. :)
* I left the Java app running overnight. It failed with the error below. It's not that helpful, but I'm posting it in case anyone else notices anything. Note that I'm running with OpenJDK 1.8
----
openjdk version "1.8.0_91"
OpenJDK Runtime Environment (IcedTea 3.0.1) (suse-12.1-x86_64)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)
----
1560: C1 10 0B 0B CF C3 9E 5A 91 84 .......Z..
main, handling exception: javax.net.ssl.SSLException: SSL peer shut down incorrectly
%% Invalidated: [Session-1, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256]
main, SEND TLSv1.2 ALERT: fatal, description = unexpected_message
Padded plaintext before ENCRYPTION: len = 2
0000: 02 0A ..
main, WRITE: TLSv1.2 Alert, length = 26
[Raw write]: length = 31
0000: 15 03 03 00 1A 00 00 00 00 00 00 00 02 3C 52 6B .............<Rk
0010: 35 CF CE 06 AE 79 20 47 3A 0D 6D 31 FF 0B 6B 5....y G:.m1..k
main, called closeSocket()
download failed: src=https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pagelinks.sql.gz trg=/home/xowa/wiki/en.wikipedia.org/enwiki-latest-pagelinks.sql.gz err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
[trace]:
sun.security.ssl.InputRecord.readV3Record(InputRecord.java:596)
sun.security.ssl.InputRecord.read(InputRecord.java:532)
sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
java.io.BufferedInputStream.read(BufferedInputStream.java:345)
sun.net.www.MeteredStream.read(MeteredStream.java:134)
java.io.FilterInputStream.read(FilterInputStream.java:133)
sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3336)
java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
java.io.BufferedInputStream.read(BufferedInputStream.java:345)
gplx.core.ios.IoEngine_system.DownloadFil(Unknown Source)
gplx.core.ios.IoEngine_xrg_downloadFil.Exec(Unknown Source)
gplx.xowa.bldrs.cmds.utils.Xob_download_cmd.Cmd_run(Unknown Source)
gplx.xowa.bldrs.Xob_bldr.Run(Unknown Source)
gplx.xowa.bldrs.Xob_bldr.Invk(Unknown Source)
gplx.langs.gfs.GfsCore_.Exec(Unknown Source)
gplx.langs.gfs.GfsCore_.Exec(Unknown Source)
gplx.langs.gfs.GfsCore_.Exec(Unknown Source)
gplx.langs.gfs.GfsCore.ExecOne_to(Unknown Source)
gplx.xowa.apps.gfs.Xoa_gfs_mgr.Run_str_for(Unknown Source)
gplx.xowa.apps.gfs.Xoa_gfs_mgr.Run_str_for(Unknown Source)
gplx.xowa.apps.gfs.Xoa_gfs_mgr.Run_url_for(Unknown Source)
gplx.xowa.apps.gfs.Xoa_gfs_mgr.Run_url(Unknown Source)
gplx.xowa.apps.boots.Xoa_boot_mgr.Run_app(Unknown Source)
gplx.xowa.apps.boots.Xoa_boot_mgr.Run(Unknown Source)
gplx.xowa.Xoa_app_.Run(Unknown Source)
gplx.xowa.Xowa_main.main(Unknown Source)
I tried two files which took about 60 and 150 minutes to download. wget had to retry with partial data several times and completed the downloads after a few retries. A pattern emerges when you look at the timestamps in the typescript of the wget jobs: apparently, a cron job which seems to run at minutes 17 and 47 of every hour is restarting the web server.
GG
<pre>
$ fgrep 2016-08-06 typescript
--2016-08-06 09:37:05-- https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-pages-...
2016-08-06 09:47:19 (1.82 MB/s) - Connection closed at byte 1170013869. Retrying.
--2016-08-06 09:47:20-- (try: 2) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-pages-...
2016-08-06 10:17:29 (1.79 MB/s) - Connection closed at byte 4569103660. Retrying.
--2016-08-06 10:17:31-- (try: 3) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-pages-...
2016-08-06 10:27:36 (1.88 MB/s) - ‘commonswiki-20160801-pages-articles.xml.bz2’ saved [5761778656/5761778656]
--2016-08-06 14:30:35-- https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
2016-08-06 14:47:21 (1.92 MB/s) - Connection closed at byte 2025914028. Retrying.
--2016-08-06 14:47:22-- (try: 2) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
2016-08-06 15:48:01 (1.85 MB/s) - Connection closed at byte 9102130472. Retrying.
--2016-08-06 15:48:03-- (try: 3) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
2016-08-06 16:17:37 (1.90 MB/s) - Connection closed at byte 12637698981. Retrying.
--2016-08-06 16:17:40-- (try: 4) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
2016-08-06 17:06:17 (1.82 MB/s) - ‘commonswiki-20160801-image.sql.gz’ saved [18204308503/18204308503]
</pre>
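The timestamp pattern in the typescript can also be checked mechanically. The following is a minimal Python sketch, assuming wget log lines shaped like the ones above; the names `CLOSED`, `SAMPLE`, and `closed_minutes` are made up for illustration. It tallies the minute of the hour at which each "Connection closed" line was logged.

```python
import re
from collections import Counter

# Matches wget lines such as:
#   2016-08-06 09:47:19 (1.82 MB/s) - Connection closed at byte 1170013869. Retrying.
# and captures the minute of the hour.
CLOSED = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:(\d{2}):\d{2} \(.*?\) - Connection closed",
    re.MULTILINE,
)

# The five "Connection closed" lines from the typescript above.
SAMPLE = """\
2016-08-06 09:47:19 (1.82 MB/s) - Connection closed at byte 1170013869. Retrying.
2016-08-06 10:17:29 (1.79 MB/s) - Connection closed at byte 4569103660. Retrying.
2016-08-06 14:47:21 (1.92 MB/s) - Connection closed at byte 2025914028. Retrying.
2016-08-06 15:48:01 (1.85 MB/s) - Connection closed at byte 9102130472. Retrying.
2016-08-06 16:17:37 (1.90 MB/s) - Connection closed at byte 12637698981. Retrying.
"""


def closed_minutes(text):
    """Count how often a connection was closed at each minute of the hour."""
    return Counter(int(m.group(1)) for m in CLOSED.finditer(text))


if __name__ == "__main__":
    for minute, count in sorted(closed_minutes(SAMPLE).items()):
        print(f"minute {minute:02d}: {count} closed connection(s)")
```

On the five sample lines this reports minute 17 twice, minute 47 twice, and minute 48 once.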
Wow! Thanks Gerhard! That's brilliant! I've been staring at it for a while, but didn't even notice that pattern. Kudos!
I confirm the same on my side as well. I checked the XOWA logs, and they all fail at around the 17 or 47 minute mark. I excerpt below.
I've also been trying wget today, and the failures are also at the same minute mark. I also excerpt below.
I'm hoping this behavior is accidental, as I can't imagine that hard interrupts would be intentional. Hopefully, Ariel or someone else will shed more light.
----
20160805_124702.853 download failed: src=https://dumps.wikimedia.org/commonswiki/latest/commonswiki-latest-pages-arti... err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
20160805_141654.131 download failed: src=https://dumps.wikimedia.org/commonswiki/latest/commonswiki-latest-image.sql.... err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
20160805_141654.131 download failed: src=https://dumps.wikimedia.org/commonswiki/latest/commonswiki-latest-image.sql.... err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
20160805_234711.244 download failed: src=https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pagelinks.sql.gz err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
20160806_051747.080 download failed: src=https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pagelinks.sql.gz err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
20160806_154730.251 download failed: src=https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pagelinks.sql.gz err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
20160806_154730.251 download failed: src=https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pagelinks.sql.gz err=[err 0] <javax.net.ssl.SSLException> SSL peer shut down incorrectly
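One way to put a number on "around the 17 or 47 minute mark" is to measure how far each failure timestamp lies from the nearest mark. This is a minimal Python sketch, assuming XOWA log timestamps of the form `YYYYMMDD_HHMMSS.mmm` as in the excerpt above; `MARKS` and `seconds_from_mark` are illustrative names, not part of XOWA.

```python
from datetime import datetime

MARKS = (17, 47)  # minutes of the hour noted in this thread


def seconds_from_mark(stamp, marks=MARKS):
    """Signed offset in seconds from the nearest mark (negative = before it)."""
    t = datetime.strptime(stamp, "%Y%m%d_%H%M%S.%f")
    in_hour = t.minute * 60 + t.second
    # Also consider the marks of the previous and next hour, so that a
    # timestamp near the top of the hour is compared against the closest one.
    candidates = [m * 60 + shift for m in marks for shift in (-3600, 0, 3600)]
    nearest = min(candidates, key=lambda c: abs(in_hour - c))
    return in_hour - nearest


if __name__ == "__main__":
    for stamp in ("20160805_124702.853", "20160805_141654.131",
                  "20160805_234711.244", "20160806_051747.080",
                  "20160806_154730.251"):
        print(stamp, seconds_from_mark(stamp))
```

For the timestamps in the excerpt, every failure falls within a minute of one of the two marks.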
----
51% [=====================================================================> ] 9,376,136,876 1.91MB/s in 84m 43s
2016-08-06 14:47:19 (1.76 MB/s) - Connection closed at byte 9376136876. Retrying.
--2016-08-06 14:47:20-- (try: 2) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|208.80.154.11|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 18204308503 (17G), 8828171627 (8.2G) remaining [application/octet-stream]
Saving to: ‘commonswiki-20160801-image.sql.gz’
70% [+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++========================> ] 12,776,996,137 1.96MB/s in 29m 37s
2016-08-06 15:16:57 (1.83 MB/s) - Connection closed at byte 12776996137. Retrying.
--2016-08-06 15:16:59-- (try: 3) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|208.80.154.11|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 18204308503 (17G), 5427312366 (5.1G) remaining [application/octet-stream]
Saving to: ‘commonswiki-20160801-image.sql.gz’
89% [++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++=========================> ] 16,265,214,885 1.75MB/s in 29m 55s
2016-08-06 15:46:55 (1.85 MB/s) - Connection closed at byte 16265214885. Retrying.
--2016-08-06 15:46:58-- (try: 4) https://dumps.wikimedia.org/commonswiki/20160801/commonswiki-20160801-image....
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|208.80.154.11|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 18204308503 (17G), 1939093618 (1.8G) remaining [application/octet-stream]
Saving to: ‘commonswiki-20160801-image.sql.gz’
100%[++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++==============>] 18,204,308,503 1.42MB/s in 17m 6s
Saving to: ‘enwiki-20160801-pages-articles.xml.bz2’
45% [============================================================> ] 5,961,055,916 1.96MB/s in 49m 42s
2016-08-06 17:47:08 (1.91 MB/s) - Connection closed at byte 5961055916. Retrying.
--2016-08-06 17:47:09-- (try: 2) https://dumps.wikimedia.org/enwiki/20160801/enwiki-20160801-pages-articles.x...
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|208.80.154.11|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 13142511189 (12G), 7181455273 (6.7G) remaining [application/octet-stream]
Saving to: ‘enwiki-20160801-pages-articles.xml.bz2’
71% [+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++===================================> ] 9,419,685,161 1.90MB/s in 29m 38s
2016-08-06 18:16:47 (1.86 MB/s) - Connection closed at byte 9419685161. Retrying.
--2016-08-06 18:16:49-- (try: 3) https://dumps.wikimedia.org/enwiki/20160801/enwiki-20160801-pages-articles.x...
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|208.80.154.11|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 13142511189 (12G), 3722826028 (3.5G) remaining [application/octet-stream]
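The wget output shows each retry picking up where the previous attempt stopped, with the server answering `206 Partial Content`. A client that currently restarts from zero could in principle do the same by sending an HTTP `Range` header. What follows is a minimal Python sketch using only the standard library; it is not XOWA's code, the function names are hypothetical, and treating a `416` response as "file already complete" is an assumption.

```python
import http.client
import os
import time
import urllib.error
import urllib.request


def range_header(offset):
    """Range header value asking for everything from byte `offset` onward."""
    return f"bytes={offset}-"


def total_size(content_range):
    """Total length from a Content-Range value like 'bytes 100-199/200', else None."""
    total = (content_range or "").rsplit("/", 1)[-1]
    return int(total) if total.isdigit() else None


def download_with_resume(url, dest, max_tries=10, chunk=1 << 20, pause=5):
    """Download `url` to `dest`, resuming from the bytes already on disk."""
    for attempt in range(1, max_tries + 1):
        offset = os.path.getsize(dest) if os.path.exists(dest) else 0
        req = urllib.request.Request(url, headers={"Range": range_header(offset)})
        try:
            with urllib.request.urlopen(req, timeout=60) as resp:
                if resp.status == 206:
                    # Server honoured the Range request: append to the file.
                    expected = total_size(resp.headers.get("Content-Range"))
                    mode = "ab"
                else:
                    # Server ignored the Range request: start over.
                    length = resp.headers.get("Content-Length")
                    expected = int(length) if length else None
                    mode = "wb"
                with open(dest, mode) as out:
                    while True:
                        block = resp.read(chunk)
                        if not block:
                            break
                        out.write(block)
        except urllib.error.HTTPError as exc:
            if exc.code == 416:
                return True  # assumption: nothing left to fetch
            print(f"try {attempt}: HTTP {exc.code}")
        except (OSError, http.client.HTTPException) as exc:
            # Covers SSL errors, resets, timeouts, and incomplete reads.
            print(f"try {attempt}: {exc!r}; will resume")
        else:
            if expected is None or os.path.getsize(dest) >= expected:
                return True
        time.sleep(pause)
    return False
```

The size check after each attempt matters because a connection closed mid-transfer does not always raise an exception; comparing the bytes on disk against the advertised total catches a silently truncated read as well.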
There is no cron job that runs at those times. The puppet run on the dataset1001 host does run exactly at those times, so presumably something in one of the puppet jobs affects your downloads. I see nginx worker processes running since yesterday so it's definitely not a restart, graceful or otherwise. The puppet logs likewise do not indicate a restart or refresh of any services; in fact most of the time they indicate no changes whatsoever.
I'll look into this and track it here: https://phabricator.wikimedia.org/T142367 Please add your names to the ticket if you want to see updates as they are posted.
Ariel