Hello everybody,
I recently noticed that Wikipedia has started using a P2P-based system for distributing video, and since I am working on a similar system I contacted Michael Dale, who suggested that I send an email to this list about it.
The simplest way to describe the system is that it is conceptually very similar to HTTP with P2P support; the protocol is essentially HTTP with additional commands to handle P2P operations.
Files are stored on a server that also keeps track of other hosts where all or parts of the content can be found. Before downloading a file, clients will contact the server to obtain general file information, such as file and block size. Files are divided into blocks and clients request a list of hosts where the blocks they want to retrieve can be found. The list might contain only a fixed set of addresses (such as that of the server and any mirrors), or it can include the addresses of other clients that have retrieved the blocks previously. Clients attempt to automatically determine the best hosts from the list to retrieve the content from and later report successful block downloads to the server.
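To make that sequence concrete, here is a rough client-side sketch in TypeScript. The endpoint paths (/info, /hosts, /report) and the JSON shapes are invented placeholders, not the actual p2pd commands:

// Hypothetical sketch of the client-side flow described above. The endpoint
// paths (/info, /hosts, /report) and the JSON shapes are placeholders, not
// the actual p2pd protocol.

interface FileInfo {
  fileSize: number;  // total file size in bytes
  blockSize: number; // size of each block in bytes
}

async function downloadFile(server: string, path: string): Promise<Uint8Array> {
  // 1. Ask the server for general file information (file and block size).
  const info: FileInfo = await (await fetch(`${server}/info${path}`)).json();
  const numBlocks = Math.ceil(info.fileSize / info.blockSize);
  const file = new Uint8Array(info.fileSize);

  for (let block = 0; block < numBlocks; block++) {
    // 2. Ask the server which hosts (server, mirrors, other clients)
    //    currently have this block.
    const hosts: string[] =
      await (await fetch(`${server}/hosts${path}?block=${block}`)).json();

    // 3. Try the listed hosts until one delivers the block.
    //    (A real client would rank hosts by measured performance.)
    for (const host of hosts) {
      const start = block * info.blockSize;
      const end = Math.min(start + info.blockSize, info.fileSize) - 1;
      const res = await fetch(`${host}${path}`, {
        headers: { Range: `bytes=${start}-${end}` },
      });
      if (!res.ok) continue;
      file.set(new Uint8Array(await res.arrayBuffer()), start);

      // 4. Report the successful block download so the server can list this
      //    client as a source for the block.
      await fetch(`${server}/report${path}?block=${block}`, { method: "POST" });
      break;
    }
  }
  return file;
}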
The system also supports caches, similar to web caches, to reduce redundant network traffic.
More information can be found here: http://www.inet.no/shepherd/p2pd/
The software was previously extensively tested on PlanetLab, but now needs more testing after a major restructuring of the code. It also needs to be ported to Windows.
In other words, it is not yet ready for production use, but it would be interesting to discuss potential challenges, e.g. with browser integration, when a P2P-based protocol is used for video streaming. Not only will the protocol most likely not be directly supported by browsers, but to get the most benefit from a P2P system, users should ideally continue to share the content they have downloaded after they have finished watching it. This will require some kind of server process to keep running, perhaps independently of the browser. Especially for infrequently accessed video files, this would likely affect the degree to which a P2P system can reduce the server load.
The following page has links to a few documents that describe various usage scenarios and some simple ways of integrating p2pd with a browser: http://www.inet.no/shepherd/p2pd/docs.html
Any thoughts on these integration issues from the perspective of Wikipedia?
Regards,
Karl-Andre' Skevik
My quick thought would be to think about how the system could work over vanilla HTTP, i.e. clients can't really be expected to install things. If the files are split into chunks per keyframe set, could a JavaScript system drive reassembly? And/or could this work in conjunction with efforts to build an adaptive streaming protocol (http://blog.gingertech.net/2010/10/09/adaptive-http-streaming-for-open-codec...), where each chunk could potentially be pulled from different servers?
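A very rough sketch of what that reassembly could look like in the browser; the mirror list and the /chunks/<n> URL layout below are made up:

// Rough sketch of JavaScript-driven chunk reassembly in the browser; the
// mirror list and the /chunks/<n> URL layout are made-up placeholders.

const mirrors = [
  "https://mirror-a.example.org",
  "https://mirror-b.example.org",
];

async function playChunkedVideo(video: HTMLVideoElement, path: string,
                                numChunks: number): Promise<void> {
  const chunks: ArrayBuffer[] = [];
  for (let i = 0; i < numChunks; i++) {
    // Pull each keyframe-aligned chunk from a different server.
    const mirror = mirrors[i % mirrors.length];
    const res = await fetch(`${mirror}${path}/chunks/${i}`);
    chunks.push(await res.arrayBuffer());
  }
  // Reassemble the chunks and hand the result to the <video> element.
  // (A streaming player would instead append chunks to a MediaSource buffer
  // as they arrive rather than waiting for all of them.)
  video.src = URL.createObjectURL(new Blob(chunks, { type: "video/ogg" }));
}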
The p2p architecture shifts to a relatively small number of medium-sized HTTP support servers rather than a large number of p2p clients, i.e. something medium-sized hosts like universities and small ISPs could donate to, and have it work over normal HTTP.
I have also cc'ed people who worked on the p2p-next project: Arno Bakker, Riccardo, Diego and Jan Gerber. (You guys may want to join the Wikivideo-l list.) The p2p-next group is working on a new architecture for p2p distribution that works over UDP (so pretty different from the above proposal); they presented their work at OVC. Arno, is there a paper up somewhere that outlines the architecture of the new system you're working on?
peace, --michael
Hello Michael,
If you have, e.g., universities and ISPs donating resources (by running some sort of cache/mirror servers), it would be possible to do things a little more simply. If I understand you correctly, you could essentially combine the donated resources with an existing video server or server infrastructure to form a larger CDN that would help reduce the load on the video server. How the CDN communicated internally would not matter as long as it was able to accept normal HTTP requests from end users, and it would be possible to do something like scenario 3 on this page: http://www.inet.no/shepherd/p2pd/http_cdn.html
A university that wished to donate network resources would install a cache server that could accept requests for, e.g., Wikimedia video files via HTTP and then register the IP address/port number of the cache with Wikimedia, which would list it as a mirror. When the cache received a client request for a file it did not already have cached, it would retrieve the file before sending the data to the client. Rather than using HTTP internally, the video server and the various caches could use a P2P protocol to transmit data efficiently and reduce the load on the video server.
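For illustration, a minimal pull-through cache along those lines could look roughly like this (TypeScript on Node.js 18+; the origin URL and port are placeholders, and the cache-to-origin transfer would in practice use the P2P protocol rather than plain HTTP):

// Minimal pull-through cache sketch: serve files over plain HTTP to clients,
// fetching each file from the origin the first time it is requested.
// The origin URL and listen port are placeholders.
import { createServer } from "node:http";

const ORIGIN = "https://video.example.org"; // placeholder origin server
const cache = new Map<string, Buffer>();    // in-memory; a real cache would use disk

createServer(async (req, res) => {
  const path = req.url ?? "/";
  if (!cache.has(path)) {
    // Cache miss: retrieve the file from the origin before serving the client.
    const upstream = await fetch(`${ORIGIN}${path}`);
    if (!upstream.ok) {
      res.writeHead(upstream.status).end();
      return;
    }
    cache.set(path, Buffer.from(await upstream.arrayBuffer()));
  }
  const body = cache.get(path)!;
  res.writeHead(200, { "Content-Length": body.length });
  res.end(body);
}).listen(8080); // the address/port the cache would register with Wikimedia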
Additional benefits of an approach like this are that only the requested files are copied to the caches, and that end users would not need to install any new software; it would simply look like there were many normal HTTP mirrors. A drawback, compared to a system where all the caches have a copy of all the files on the video server, would potentially be extra latency each time a cache needs to retrieve data it does not have cached.
Having adaptive streaming would be quite interesting. As long as the client does the adaptation or stream selection, there should be no problem supporting this in a P2P-based system. The system would, however, probably be most efficient if it supported variable block sizes, with each chunk corresponding to one block.
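As a rough sketch of client-driven selection, with made-up variant URLs and each chunk mapping onto one block:

// Sketch of client-side stream selection between quality variants; the
// variant URLs and chunk layout are placeholders. Each chunk would map to
// one (variable-sized) block in the P2P system.

const variants = [
  { bitrate: 400_000,   base: "https://mirror.example.org/video_low" },
  { bitrate: 1_200_000, base: "https://mirror.example.org/video_high" },
];

async function fetchChunkAdaptive(index: number, throughputEstimate: number) {
  // Pick the highest-bitrate variant the current throughput estimate can sustain.
  const variant =
    variants.filter(v => v.bitrate < throughputEstimate).pop() ?? variants[0];

  const start = performance.now();
  const res = await fetch(`${variant.base}/chunks/${index}`);
  const data = await res.arrayBuffer();

  // Measure the throughput (bits/s) of this download to drive the next decision.
  const seconds = (performance.now() - start) / 1000;
  return { data, throughput: (data.byteLength * 8) / seconds };
}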
Regards,
Karl-Andre' Skevik
On 14 October 2010 20:04, Michael Dale mdale@wikimedia.org wrote:
My quick thought would be to think about how the system could work over vanilla HTTP, i.e. clients can't really be expected to install things. If the files are split into chunks per keyframe set, could a JavaScript system drive reassembly? And/or could this work in conjunction with efforts to build an adaptive streaming protocol (http://blog.gingertech.net/2010/10/09/adaptive-http-streaming-for-open-codec...), where each chunk could potentially be pulled from different servers?
That is a challenge.
The p2p architecture shifts to a relatively small number of medium-sized HTTP support servers rather than a large number of p2p clients, i.e. something medium-sized hosts like universities and small ISPs could donate to, and have it work over normal HTTP.
Yep, economies of scale made server bandwidth way more attractive.
I have also cc'ed people who worked on the p2p-next project: Arno Bakker, Riccardo, Diego and Jan Gerber. (You guys may want to join the Wikivideo-l list.) The p2p-next group is working on a new architecture for p2p distribution that works over UDP (so pretty different from the above proposal); they presented their work at OVC. Arno, is there a paper up somewhere that outlines the architecture of the new system you're working on?
Well, Arno is on vacation somewhere in the US. There is an IETF draft outlining the transport protocol (not the entire architecture): https://datatracker.ietf.org/doc/draft-grishchenko-ppsp-swift/ There is also a mailing list for the project: https://listserv.tudelft.nl/mailman/listinfo/swift-ewi The general architecture itself may be described as "practical content-centric networking" or "federated CDN". While it is very much P2P-like inside (server-to-server), it speaks HTTP to regular users.