[Toolserver-l] Dump-Mail-Service
Marcin Cieslak
saper at system.pl
Mon Jul 14 18:47:49 UTC 2008
Stefan Kühn wrote:
> Hello!
>
> At the moment I and some other users of the dumps look every
> day/week/month at http://dumps.wikimedia.org/ for a new dump.
Please see the following shell script I am using to fetch
"-latest-pages-articles.xml.bz2" from plwiki and plwikisource.
This script runs on a Solaris 5.9 machine.
Unfortunately, the dumps no longer bear a "Last-Modified" header, so
currently wget is downloading them every day. Is there a possibility to
have those headers back?
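#! /bin/sh
# Fetch the latest pages-articles dump for each wiki and unpack it
# only when the downloaded archive is newer than the extracted XML.
WIKIHOME=/home/wiki
HTTP_PROXY=http://my.proxy:8080/
export HTTP_PROXY

# Default to plwikisource and plwiki when no database names are given.
[ "$#" = "0" ] && set plwikisource plwiki
[ -d "${WIKIHOME}/log" ] || mkdir -p "${WIKIHOME}/log"

for DB in "$@"
do
    WIKI_XML="${WIKIHOME}/${DB}-latest-pages-articles.xml"
    WIKI_BZIP="${WIKI_XML}.bz2"
    LOGFILE="${WIKIHOME}/log/${DB}-`date +%Y%m%d-%H%M.log`"

    # -N (timestamping) is meant to skip the download when the local
    # copy is already as new as the one on the server.
    /usr/local/bin/wget -o "${LOGFILE}" -P "${WIKIHOME}" -N \
        "http://download.wikimedia.org/${DB}/latest/${DB}-latest-pages-articles.xml.bz2"

    if /usr/bin/test "${WIKI_BZIP}" -nt "${WIKI_XML}"
    then
        # The archive is newer: unpack it and copy its timestamp to the XML.
        /usr/bin/bzip2 -dc "${WIKI_BZIP}" > "${WIKI_XML}"
        /usr/bin/touch -r "${WIKI_BZIP}" "${WIKI_XML}"
    else
        # Nothing new was unpacked, so drop the log for this run.
        /usr/bin/rm "${LOGFILE}"
    fi
done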
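To see whether the server sends the header that "wget -N" depends on, a
check like the one below might help. It is only a sketch: the function
name has_last_modified is made up for illustration, and it assumes the
response headers arrive on standard input, for example from
"wget -S --spider", which prints them on stderr.

```shell
#! /bin/sh
# Hypothetical helper (not part of the script above): read HTTP response
# headers on stdin and succeed only if a Last-Modified header is present.
# Output goes to /dev/null instead of using "grep -q", which older
# Solaris /usr/bin/grep may not support.
has_last_modified() {
    grep -i 'Last-Modified:' > /dev/null
}

# Possible use against a live server (assumption: wget is in PATH):
#   wget -S --spider "$URL" 2>&1 | has_last_modified \
#       && echo "Last-Modified present, -N can compare timestamps"
```

If the check fails for a dump URL, wget has no server timestamp to
compare against, which would match the daily re-download described above.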
> Thanks for your help!
>
> Stefan Kühn
> http://de.wikipedia.org/wiki/User:Stefan_K%C3%BChn
>