On Mar 17, 2004, at 15:58, Kelly Anderson wrote:
> I'm sorry if my ignorance is showing here, but I went to both
> sourceforge and gnutella.com and I still don't have any idea what
> gnutella is, other than it's a peer to peer networking protocol for
> file sharing. In other words, I can't figure out why Gnutella is
> different from Ares, the old Napster, or Bittorrent from an
> architectural point of view.
Gnutella is decentralized; it was designed in response to the classic
Napster, which could be and was shut down by suing the server runners
into oblivion. Search requests are passed from node to node to node in
a very inefficient fashion, but once a file is out there, the original
seeder need not remain on the network. Nodes generally provide many
files for peering (often all they have ever downloaded). The system was
designed for smallish files (up to several megabytes).
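That node-to-node query flooding can be sketched roughly like this. This is a toy model, not the actual Gnutella wire protocol; the node names, file name, and TTL value are made up for illustration:

```python
import uuid

class Node:
    """Toy Gnutella-style node: floods queries to its neighbors with a TTL."""
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.seen = set()  # message IDs already handled (loop suppression)

    def query(self, filename, ttl=3, msg_id=None, results=None):
        if msg_id is None:
            msg_id = uuid.uuid4()  # fresh ID for a new search
        if results is None:
            results = []
        if msg_id in self.seen or ttl <= 0:
            return results  # already saw this query, or it ran out of hops
        self.seen.add(msg_id)
        if filename in self.files:
            results.append(self.name)  # a "QueryHit" routed back to the asker
        for peer in self.neighbors:    # flood: every neighbor gets a copy
            peer.query(filename, ttl - 1, msg_id, results)
        return results

# Small chain a - b - c - d; the seeder (d) is reachable only by flooding.
a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d", files={"song.mp3"})
a.neighbors, b.neighbors = [b], [a, c]
c.neighbors, d.neighbors = [b, d], [c]
hits = a.query("song.mp3", ttl=4)
print(hits)  # ['d']
```

Every query hits every reachable node within the TTL whether or not it has the file, which is the inefficiency mentioned above; but once "d" has the file, the original seeder can vanish and the search still succeeds.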
BitTorrent is very centralized; it was designed for legitimate
distribution of large files to many simultaneous downloaders, using
peer-to-peer transfers simply as a way to save bandwidth for the
central server. There is no searching mechanism; you must connect to
the particular tracker server managing the torrent for the file you
want and ask for it specifically. If the tracker goes offline,
everything fails. Nodes make available for peered access only the files
they are in the process of downloading or have very recently finished
downloading (and not yet closed the client window on). The system was
designed for large files (tens, hundreds, or thousands of megabytes),
and fetches pieces of a file from multiple different peers
simultaneously when possible.
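The tracker dependence looks roughly like this. A minimal sketch of the HTTP announce request a client sends, assuming made-up values throughout: the tracker URL, info_hash, and peer_id below are hypothetical (a real info_hash is the 20-byte SHA-1 of the torrent's "info" dictionary, taken from the .torrent file):

```python
from urllib.parse import urlencode

# Hypothetical tracker and identifiers, for illustration only.
tracker_url = "http://tracker.example.org:6969/announce"
params = {
    "info_hash": b"\x12\x34" * 10,          # 20-byte torrent ID (made up)
    "peer_id":   b"-XX0001-123456789012",   # 20-byte client ID (made up)
    "port": 6881,        # where this client accepts incoming peers
    "uploaded": 0,       # bytes uploaded so far
    "downloaded": 0,     # bytes downloaded so far
    "left": 734003200,   # bytes still needed (e.g. a 700 MB dump)
    "event": "started",
}
announce = tracker_url + "?" + urlencode(params)
# The tracker replies with a bencoded peer list; pieces are then fetched
# from several of those peers in parallel. If the tracker is offline,
# this request fails and the download cannot even begin -- there is no
# search fallback.
print(announce)
```

Note there is no query by file name anywhere: the client must already hold the exact info_hash, which is why losing the tracker (or the .torrent file) stops everything.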
Don't know anything about Ares.
> Would someone who is familiar with both Gnutella and Bittorrent tell
> me why using Bittorrent for such a project would be stupid? It would
> certainly use less of Wikipedia's already strained bandwidth.
The overhead of torrenting individual articles or PDF booklets on
particular subjects from Wikipedia would likely far outweigh simple
HTTP, particularly since it's generally unlikely that many people would
be downloading the same (small) file (out of many thousands available)
simultaneously.
Hypothetically it might be useful for distributing large bulk dumps
(such as the current database dumps), if and only if more than one
person at a time is likely to be downloading them.
Bandwidth really isn't a problem for Wikipedia; we use a fair amount of
it (compared to Joe Bob's homepage, not compared to Yahoo) but it's not
"strained".
-- brion vibber (brion @ pobox.com)