On Thu, 26 May 2005, Kate Turner wrote:
> Andy Rabagliati wrote in
> gmane.science.linguistics.wikipedia.technical:
> > On Wed, 25 May 2005, Kate Turner wrote:
> > > i've started running an image dump for en.wp using a version of
> > > trickle with large-file support (the last one died after 2GB). if
> > > this works i'll set up regular image dumps again along with the db
> > > backups.
> > Can you set it up so it can be rsync'ed ?
> do you want to rsync the tar file of all images, or the image directory as
> it's stored on disk?
The latter.
> i don't see any benefit from the latter, and the former is not something
> that's feasible right now, although it may be doable in future...
I keep an identical tree this side, and rsync does the rest.
Even if you change the tree arrangement, I can write a script to
re-arrange this end. But .. don't :-)
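A re-arrangement script of that sort could be very small. The sketch below makes assumptions: it guesses the remote tree uses MediaWiki's two-level md5-of-filename layout, and the `flat/` and `hashed/` directory names (and the sample file) are invented for illustration:

```shell
# Sketch only: re-file a flat local mirror into MediaWiki's hashed
# layout, <first hex char of md5(name)>/<first two chars>/<name>.
mkdir -p flat hashed
echo "sample" > flat/Example.jpg          # stand-in for a mirrored image
for f in flat/*; do
  name=$(basename "$f")
  h=$(printf '%s' "$name" | md5sum | cut -c1-2)
  mkdir -p "hashed/${h:0:1}/${h}"
  mv "$f" "hashed/${h:0:1}/${h}/${name}"
done
```

Because the hash depends only on the file name, the same script maps any tree arrangement back deterministically, so the mirror never has to re-download a file just because it moved.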
If the archive has only changed by 5%, I only download 5%. I can ignore
archive trees and rescaled thumbnails. It will greatly save your
bandwidth and mine, at the expense of some CPU.
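That scheme maps directly onto a couple of rsync options. This is a local demonstration with invented names, assuming thumbnails and old versions live in `thumb/` and `archive/` subdirectories:

```shell
# Demonstration of the delta mirror (all names invented): -a copies
# only files that differ, and --exclude skips the rescaled thumbnails
# and archived old versions.
mkdir -p src/thumb src/archive dst
echo "full-size upload" > src/Example.jpg
echo "rescaled copy"    > src/thumb/Example-120px.jpg
rsync -a --exclude 'thumb/' --exclude 'archive/' src/ dst/
# Against a real server the same options would apply, e.g.:
#   rsync -a --exclude 'thumb/' --exclude 'archive/' \
#         rsync://dumps.example.org/images/ /mirror/images/
```

Re-running the command after a small change transfers only the changed files, which is where the "5% changed, 5% downloaded" saving comes from.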
> we currently have a lot more problems in terms of CPU(/disk) use than
> bandwidth, particularly in the area of images & the dumps service.
Bandwidth is the priority in Africa. However, the sheer volume of
information means that even an out-of-date copy of Wikipedia is very
usable. My worry sometimes is that a newer SQL dump refers to newer
pictures, where I have some perfectly appropriate ones here :-)
So perhaps I should use an SQL dump of the same vintage as the picture
archive, for 'least astonishment'.
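Pairing dumps "of the same vintage" is easy if both carry a date stamp in the name. The naming scheme below (`images-YYYYMMDD.tar`, `pages-YYYYMMDD.sql.gz`) is invented for illustration:

```shell
# Derive the matching SQL dump name from the image archive's date
# stamp (file names are hypothetical).
image_dump="images-20050525.tar"
stamp="${image_dump#images-}"   # strip prefix -> 20050525.tar
stamp="${stamp%.tar}"           # strip suffix -> 20050525
sql_dump="pages-${stamp}.sql.gz"
echo "$sql_dump"                # -> pages-20050525.sql.gz
```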
And, one day, there could be a trickle-back of edits, maybe moderated at
the remote end.
Eric has offered access to other (SQL ?) update methods, but I am busy
on other things and have not had time to investigate.
However, rsync is understandable to me, optimal for my needs, and
takes little of my time. And I'll just download SQL dumps.
Cheers, Andy!