Hi all,
Bot users are able to download database dumps of the Wikimedia projects at http://download.wikimedia.org/backup-index.html. These bzipped dumps can be fairly large (especially for the English language). Wouldn't it be handy to have a BitTorrent link to lower the load on the download server?
Greetings, Annabel
On 2/20/07, Annabel annabel@wikipedia.be wrote:
Hi all,
Bot users are able to download database dumps of the Wikimedia projects at http://download.wikimedia.org/backup-index.html. These bzipped dumps can be fairly large (especially for the English language). Wouldn't it be handy to have a BitTorrent link to lower the load on the download server?
It would definitely be handy. Does anyone know of a stable site which will host the tracker server for us (for free)? Surely there is one out there, since they'd only have to host the tracker server, not the file itself.
Anthony
Toolserver?
foundation-l mailing list foundation-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/foundation-l
Annabel wrote:
Bot users are able to download database dumps of the Wikimedia projects at http://download.wikimedia.org/backup-index.html. These bzipped dumps can be fairly large (especially for the English language). Wouldn't it be handy to have a BitTorrent link to lower the load on the download server?
This was recently discussed briefly on wikitech-l, and the tentative consensus seemed to be that, while that might be useful, it's not a high priority because downloads of the database dumps don't currently account for a large percentage of Wikimedia's total bandwidth consumption, and so aren't the first place to look for bandwidth savings.
-Mark
It's not high priority, that's right, but you should also look at it from the user's point of view. If a large download breaks at 85%, you have to redownload the whole file. I know that there are download managers, but with BitTorrent you have extra certainty about the validity of your file: every piece (1 MB or so) has its own SHA-1 hash, so a corrupt download is detected instead of being silently accepted. Lowering the bandwidth on the download server is just an extra advantage.
Annabel
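The per-piece check mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the code of any particular client: the 1 MiB piece size and the function names are made up for the example. BitTorrent stores one SHA-1 digest per piece, so a client can tell which piece is bad and fetch only that piece again.

```python
import hashlib

PIECE_LENGTH = 1024 * 1024  # assumed 1 MiB pieces; each torrent picks its own size


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def corrupt_pieces(data: bytes, expected: list[bytes],
                   piece_length: int = PIECE_LENGTH) -> list[int]:
    """Return the indexes of pieces whose digest differs from the expected one."""
    actual = piece_hashes(data, piece_length)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

With a list of expected digests taken from the .torrent file, corrupt_pieces(downloaded, expected) names exactly the pieces that need to be fetched again, rather than forcing a restart of the whole dump.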
Delirium wrote:
Annabel wrote:
Bot users are able to download database dumps of the Wikimedia projects at http://download.wikimedia.org/backup-index.html. These bzipped dumps can be fairly large (especially for the English language). Wouldn't it be handy to have a BitTorrent link to lower the load on the download server?
This was recently discussed briefly on wikitech-l, and the tentative consensus seemed to be that, while that might be useful, it's not a high priority because downloads of the database dumps don't currently account for a large percentage of Wikimedia's total bandwidth consumption, and so aren't the first place to look for bandwidth savings.
-Mark
It's not high priority, that's right, but you should also look at it from the user's point of view. If a large download breaks at 85%, you have to redownload the whole file. I know that there are download managers, but with BitTorrent you have extra certainty about the validity of your file: every piece (1 MB or so) has its own SHA-1 hash, so a corrupt download is detected instead of being silently accepted. Lowering the bandwidth on the download server is just an extra advantage.
A download manager is a better way to increase your chances of a clean download. BitTorrent only works well if lots of people are downloading at the same time (or at least at similar enough times that the people who have finished haven't given up seeding). I'm not sure that's the case with Wikipedia DB dumps.
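For reference, what a download manager does when it resumes a broken transfer can be sketched like this. It is a minimal sketch that assumes the server honours HTTP Range requests; the function names and the idea of appending to a partial file on disk are choices made for the example, not a description of any specific tool.

```python
import os
import urllib.request


def resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a request that asks only for the bytes missing from partial_path."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset:
        # "bytes=N-" asks for everything from byte N to the end of the file.
        request.add_header("Range", f"bytes={offset}-")
    return request


def resume_download(url: str, partial_path: str, chunk_size: int = 1 << 16) -> None:
    """Fetch the missing bytes and append them to partial_path."""
    request = resume_request(url, partial_path)
    with urllib.request.urlopen(request) as response:
        # 206 means the server sent only the requested range; 200 means it
        # ignored the header and is sending the whole file again.
        mode = "ab" if response.status == 206 else "wb"
        with open(partial_path, mode) as out:
            while chunk := response.read(chunk_size):
                out.write(chunk)
```

Note that resuming this way avoids restarting from zero but does not by itself verify the bytes already on disk; that still needs a checksum of the finished file.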
A download manager is a better way to increase your chances of a clean download. BitTorrent only works well if lots of people are downloading at the same time (or at least at similar enough times that the people who have finished haven't given up seeding). I'm not sure that's the case with Wikipedia DB dumps.
If the bandwidth isn't a problem, then you can use a web seed[1] to back the torrent up when there aren't enough regular seeders. Surely BitTorrent is the safest way to get uncorrupted downloads.
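A rough sketch of what a web-seeded torrent looks like on disk: the .torrent file is a bencoded dictionary, and a web seed is an ordinary HTTP URL listed under the "url-list" key next to the tracker's "announce" URL. The bencoder below is a minimal one written for this example, and make_metainfo, its parameters, and the 256 KiB piece size are assumptions, not anyone's actual tooling. It hashes the data in memory, which would need to be done in a streaming fashion for a multi-gigabyte dump.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, bytes/str, lists, and dicts in BitTorrent's bencode format."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must be written in sorted order.
        items = sorted(
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(name: str, data: bytes, tracker_url: str,
                  web_seed_url: str, piece_length: int = 256 * 1024) -> bytes:
    """Return the bytes of a single-file .torrent with one web seed."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return bencode({
        "announce": tracker_url,       # tracker that introduces peers to each other
        "url-list": [web_seed_url],    # plain HTTP source used as a fallback seed
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": pieces,          # concatenated 20-byte SHA-1 digests
        },
    })
```

The tracker and web seed URLs passed in would be whatever the volunteer host and the download server actually expose; the example deliberately leaves them as parameters.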
If the bandwidth isn't a problem, then you can use a web seed[1] to back the torrent up when there aren't enough regular seeders. Surely BitTorrent is the safest way to get uncorrupted downloads.
If you're using the Wikipedia server as the main seed, then you might as well download from it directly. BitTorrent only does so many checks because it's so easy for the data to get corrupted when you're downloading from so many people. A direct download is much safer.
On 2/21/07, Annabel annabel@wikipedia.be wrote:
It's not high priority, that's right, but you should also look at it from the user's point of view. If a large download breaks at 85%, you have to redownload the whole file. I know that there are download managers, but with BitTorrent you have extra certainty about the validity of your file: every piece (1 MB or so) has its own SHA-1 hash, so a corrupt download is detected instead of being silently accepted. Lowering the bandwidth on the download server is just an extra advantage.
Agreed, which brings me back to my question. Does anyone know of a stable site which will host the tracker server for us (for free)?
Anyone can host a BitTorrent file - it doesn't have to be the WMF. I imagine there must be a site out there which hosts tracker servers for free, but I actually can't find one that isn't geared toward piracy (and therefore likely unstable).
If someone just goes ahead and sets one up, and gives the WMF the URL to link to, then no one can complain about priorities, because it takes 5 seconds to add the URL :).
Anthony
On 2/22/07, Anthony wikilegal@inbox.org wrote:
Anyone can host a BitTorrent file - it doesn't have to be the WMF.
Well, here's a test. Let me know if this works. It's going to be incredibly slow, of course: http://torrents.brightercomputing.com/torrents-details.php?id=62
If someone with a fast connection wants to act as a seed, the original file is at http://download.wikimedia.org/eowiki/20070209/eowiki-20070209-pages-meta-cur...
And while you're at it, if you've got a fast connection, even if you don't want to be a seed, try making a .torrent file to get us started. I'm not downloading the English Wikipedia right now...
Anthony
On 2/22/07, Anthony wikilegal@inbox.org wrote:
http://torrents.brightercomputing.com/torrents-details.php?id=62
No-login URL: http://www.mcfly.org/tracker/eowiki-20070209-pages-meta-current.xml.bz2.torr...
wikimedia-l@lists.wikimedia.org