Thank you for your attention! We would love to talk with you.
Regarding the Hangout meeting, we are available Monday through Friday next week and in the weeks after. We are in US Eastern Time, so we are mostly available between 10am and 8pm EST.
On Sat, Nov 28, 2015 at 3:50 AM, Pine W wiki.pine@gmail.com wrote:
Thanks for this initiative.
I think that the concerns at the moment would be in the domains of privacy, security, lack of WMF analytics instrumentation, and WMF fundraising limitations.
That said, looking in the longer term, a number of us in the community are interested in decreasing our dependencies on the Wikimedia Foundation as insurance against possible catastrophes and as a backup plan in case of another significant WMF dispute with the community. It might be worth exploring the options for setting up Wikipedia on infrastructure outside of WMF. I would be interested in talking with you to discuss this further; please let me know if you have time for a Hangout session in early to mid December.
Thank you for your interest!
Pine
On Nov 27, 2015 10:50 PM, "Yeongjin Jang" yeongjinjanggrad@gmail.com wrote:
Hi,
I am Yeongjin Jang, a Ph.D. Student at Georgia Tech.
In our lab (SSLab, https://sslab.gtisc.gatech.edu/), we are working on a project called B2BWiki, which enables users to share the contents of Wikipedia through WebRTC (peer-to-peer sharing).
The website is here: http://b2bwiki.cc.gatech.edu/
The project aims to help Wikipedia by donating computing resources from the community; users can donate their traffic (by P2P communication) and storage (indexedDB) to reduce the load of Wikipedia servers. For larger organizations, e.g. schools or companies that have many local users, they can donate a mirror server similar to GNU FTP servers, which can bootstrap peer sharing.
The potential benefits that we see are the following.
- Users can easily donate their resources to the community; they just visit the website.
- Users can get a performance benefit if a page is loaded from multiple local peers or a local mirror (page load time gets faster!).
- Wikipedia can reduce its server workload, network traffic, etc.
- Local network operators can reduce transit traffic (e.g. the cost of delivering traffic to the outside).
While we are working on enhancing the implementation, we would like to ask for the opinions of actual Wikipedia developers. For example, we want to know whether our direction is correct (will it actually reduce the load?), or whether there are other concerns we missed that could prevent this system from working as intended. We really want to do meaningful work that actually helps run Wikipedia!
Please feel free to give us any suggestions, comments, etc. If you want to express your opinion privately, please contact sslab@cc.gatech.edu.
Thanks,
--- Appendix ---
I added some detailed information about B2BWiki in the following.
# Accessing data
When accessing a page on B2BWiki, the browser will query peers first.
1) If there are peers that hold the content, a peer-to-peer download happens.
2) Otherwise, if there is no peer, the client will download the content from the mirror server.
3) If the mirror server does not have the content, it downloads it from the Wikipedia server (one access for the first download, and one per update).
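To make the access order concrete, here is a minimal Python sketch of the fallback chain described above. It is only an illustration, not the B2BWiki code: peers are modeled as plain dictionaries, and the names Mirror, fetch_page, and origin_hits are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple


class Mirror:
    """Hypothetical mirror: serves from its cache, and contacts the
    origin (Wikipedia) only when a page is missing from the cache."""

    def __init__(self, origin: Callable[[str], bytes]) -> None:
        self.origin = origin
        self.cache: Dict[str, bytes] = {}
        self.origin_hits = 0  # how many times the origin was contacted

    def get(self, page_name: str) -> bytes:
        if page_name not in self.cache:
            # Step 3: the mirror, not the client, goes to the origin.
            self.cache[page_name] = self.origin(page_name)
            self.origin_hits += 1
        return self.cache[page_name]


def fetch_page(
    page_name: str,
    peers: Iterable[Dict[str, bytes]],
    mirror: Mirror,
) -> Tuple[str, bytes]:
    """Return (source, content), trying peers first and the mirror second."""
    # Step 1: ask each peer for the page.
    for peer_store in peers:
        content = peer_store.get(page_name)
        if content is not None:
            return "peer", content
    # Step 2: no peer holds the page, so fall back to the mirror.
    return "mirror", mirror.get(page_name)
```

With this shape, repeated requests for the same page reach the origin once, until the mirror refreshes its copy.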
# Peer lookup
To enable content lookup for peers, we manage a lookup server that holds a page_name-to-peer map. A client (a user's browser) can query the list of peers that currently hold the content and select a peer by its freshness (the hash/timestamp of the content), by the top 2 octets of its IP address (to figure out whether it is a local peer or not), etc.
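As a rough illustration of peer selection, the Python sketch below ranks lookup results so that peers sharing the client's top two IP octets come first, and newer copies come before older ones within each group. The ordering rule and the names PeerEntry, ip_prefix, and rank_peers are assumptions made for this example; the actual selection policy may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeerEntry:
    """Hypothetical record returned by the lookup server for one peer."""
    peer_id: str
    ip_prefix: str    # top two octets of the peer's IPv4 address
    sha1: str         # checksum of the copy the peer holds
    timestamp: float  # when that copy was stamped (seconds since epoch)


def ip_prefix(ip: str) -> str:
    """Return the top two octets of a dotted IPv4 address."""
    return ".".join(ip.split(".")[:2])


def rank_peers(entries: List[PeerEntry], client_ip: str) -> List[PeerEntry]:
    """Sort peers: local ones first, then the freshest copy first."""
    local = ip_prefix(client_ip)
    # False sorts before True, so matching prefixes come first;
    # the negated timestamp puts newer copies ahead of older ones.
    return sorted(entries, key=lambda e: (e.ip_prefix != local, -e.timestamp))
```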
# Update and integrity check
The mirror server updates its content once per day (it can be configured to update every hour, etc.). The update check is done by sending an If-Modified-Since header to the Wikipedia server. On retrieving content from Wikipedia, the mirror server stamps it with a timestamp and a SHA-1 checksum to ensure the freshness of the data and its integrity. When a client looks up and downloads content from peers, it compares the SHA-1 checksum of the data with the checksum from the lookup server.
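The stamping and checking steps can be sketched in Python with only the standard library. This is an illustration under assumptions, not the deployed code: the function names conditional_headers, stamp, and verify and the dictionary layout are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def conditional_headers(last_fetch: datetime) -> Dict[str, str]:
    """Build the If-Modified-Since header for an update check.

    last_fetch must be a timezone-aware datetime in UTC.
    """
    return {"If-Modified-Since": format_datetime(last_fetch, usegmt=True)}


def stamp(content: bytes, fetched_at: datetime) -> Dict[str, str]:
    """Record when the mirror fetched the content and its SHA-1 checksum."""
    return {
        "sha1": hashlib.sha1(content).hexdigest(),
        "timestamp": fetched_at.isoformat(),
    }


def verify(content: bytes, expected_sha1: str) -> bool:
    """Client-side check: does the peer's data match the lookup server's checksum?"""
    return hashlib.sha1(content).hexdigest() == expected_sha1
```

A client would call verify on data received from a peer and discard it on a mismatch, for example by trying another peer or the mirror.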
In this setting, users can get older data (they can configure how much staleness to tolerate, e.g. 1 day, 3 days, 1 week, etc.), and integrity is guaranteed by the mirror/lookup server.
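The staleness tolerance could be expressed as a simple age check, as in the Python sketch below. The function name is_fresh_enough and its parameters are hypothetical and only illustrate the idea of a user-configurable maximum age.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh_enough(
    stamped_at: datetime,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if a copy stamped at stamped_at is within the user's tolerance.

    Both datetimes are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - stamped_at <= max_age
```

For example, a user who tolerates copies up to 3 days old would pass max_age=timedelta(days=3), and a copy stamped two days ago would be accepted.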
More detailed information can be obtained from the following website.
http://goo.gl/pSNrjR (URL redirects to SSLab@gatech website)
Please feel free to give us any suggestions, comments, etc.
Thanks,
Yeongjin Jang
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l