Re: [Wikitech-l] switching to something better than irc.wikimedia.org

1 Mar 2013

      On Friday, March 1, 2013, Petr Bena wrote:
...
web frontend you say?
if you compare the raw data of irc protocol (1 rc feed message) and
raw data of a http request and response for one page consisting only
of that 1 rc feed message, you will see a huge difference in size and
performance.
I was suggesting it for websockets or a long poll, so the above comparison
isn't relevant. The connection is established once, with its protocol overhead. It
stays open and messages are continually pushed from the server. It is not a web
request for a page containing one rc message.
Also all kinds of authentication required doesn't seem like an
improvement to me. It will only complicate what is simple now. Have
there been many attempts to abuse irc.wikimedia.org so far? There is
no authentication at all.
Maybe none is needed but I don't think the irc feed interests anyone
outside of a very small community. Doing something a little more modern
might attract different uses. It might not, but I have no idea.
...
On Fri, Mar 1, 2013 at 5:46 PM, Asher Feldman <afeldman@wikimedia.org>
wrote:
...
I don't think a custom daemon would actually be needed.
http://redis.io/topics/pubsub
While I was at flickr, we implemented a pubsub based system to push
notifications of all photo uploads and metadata changes to google using
redis as the backend. The rate of uploads and edits at flickr in 2010 was
orders of magnitude greater than the rate of edits across all wmf
projects.
...
Publishing to a redis pubsub channel does grow in cost as the number of
subscribers increases but I don't see a problem at our scale. If so, there
are ways around it.
We are planning on migrating the wiki job queues from mysql to redis in the
next few weeks, so it's already a growing piece of our infrastructure.  I
think the bulk of the work here would actually just be in building
a frontend webservice that supports websockets / long polling, provides a
clean api, and preferably uses oauth or some form of registration to ward
off abuse and allow us to limit the growth of subscribers as we scale.
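
To make the fan-out cost concrete, here is a minimal in-memory sketch of the publish/subscribe pattern in Python. It is not Redis and does not use a Redis client; the `Broker` class and the channel name `rc.enwiki` are made up for illustration. What it shows is that a single publish does work proportional to the number of subscribers on that channel.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory pub/sub broker (illustrative only, not Redis)."""

    def __init__(self):
        # channel name -> list of per-subscriber inboxes
        self._channels = defaultdict(list)

    def subscribe(self, channel):
        """Register a new subscriber and return its inbox."""
        inbox = deque()
        self._channels[channel].append(inbox)
        return inbox

    def publish(self, channel, message):
        """Copy the message to every subscriber; return how many received it."""
        inboxes = self._channels.get(channel, [])
        for inbox in inboxes:  # one append per subscriber
            inbox.append(message)
        return len(inboxes)


broker = Broker()
bot_a = broker.subscribe("rc.enwiki")  # hypothetical channel name
bot_b = broker.subscribe("rc.enwiki")
delivered = broker.publish("rc.enwiki", "edit: Example by Alice")
```

Returning the receiver count mirrors the reply of Redis's own PUBLISH command, which reports how many clients received the message.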
On Friday, March 1, 2013, Petr Bena wrote:
...
I still don't see it as too complex. It's a matter of month(s) for
volunteers with limited time.
However, I don't quite see what is so complicated about the last 2 points.
Given the frequency of updates, it's simplest to have the client
(user / bot / service that needs to read the feed) open a persistent
connection to the server (dispatcher), which forks itself just as sshd does,
and the new process handles all requests from this client. The client
somehow specifies what kind of feed it wants to have (that's the
registration part) and the forked dispatcher keeps it updated with
information from the cache.
Nothing hard. And what's the problem with multithreading huh? :) BTW I
don't really think there is a need for multithreading at all, but even
if there was, it shouldn't be so hard.
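
The per-client handler idea can be sketched without fork() or threads. The example below swaps in a different technique from the one described: one asyncio coroutine per client instead of one forked process. The client names, the `wiki` and `title` fields, and the in-memory queues standing in for network connections are all assumptions made for illustration.

```python
import asyncio


async def client_session(wanted_wiki, inbox, received):
    """Serve one client: forward only the events for the feed it asked for."""
    while True:
        event = await inbox.get()
        if event is None:  # sentinel: the feed is shutting down
            break
        if event["wiki"] == wanted_wiki:
            received.append(event["title"])


async def run_demo():
    # Hypothetical clients, each "registering" by naming the wiki it wants.
    wanted = {"bot_en": "enwiki", "bot_de": "dewiki"}
    inboxes = {name: asyncio.Queue() for name in wanted}
    received = {name: [] for name in wanted}
    sessions = [
        asyncio.create_task(
            client_session(wanted[name], inboxes[name], received[name])
        )
        for name in wanted
    ]
    events = [
        {"wiki": "enwiki", "title": "Example"},
        {"wiki": "dewiki", "title": "Beispiel"},
        {"wiki": "enwiki", "title": "Sandbox"},
    ]
    # The dispatcher pushes every event to every session, then signals shutdown.
    for event in events + [None]:
        for inbox in inboxes.values():
            inbox.put_nowait(event)
    await asyncio.gather(*sessions)
    return received


result = asyncio.run(run_demo())
```

Each session keeps its own filter, which is the "registration part" in miniature; a real service would read the filter from the client's connection rather than from a dictionary.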
On Fri, Mar 1, 2013 at 3:47 PM, Tyler Romeo <tylerromeo@gmail.com> wrote:
On Fri, Mar 1, 2013 at 9:16 AM, Petr Bena <benapetr@gmail.com> wrote:
I have not yet found a good and stable library for JSON parsing in c#,
should you know some let me know :)
Take a look at http://www.json.org/. They have a list of implementations
for different languages.
However, I disagree with "I feel like such a project would take an
insane amount of resources to develop." If we don't make it
insanely complicated, it won't take an insane amount of time ;). The
cache daemon could be memcached, which is already written and stable.
The listener is a simple daemon that just listens on UDP, parses the data
from mediawiki and stores it in memcached in some universal format,
and the dispatcher is just a process that takes the data from the cache,
converts it to the specified format and sends it to the client.
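
The listener and dispatcher steps could look roughly like the Python sketch below. The thread does not give the layout of the data MediaWiki sends over UDP, so the tab-separated layout here is invented, as are the function names, the `rc:1` key, and the two output formats; a plain dict stands in for memcached.

```python
import json


def parse_datagram(datagram: bytes) -> dict:
    """Listener step: turn one raw datagram into a format-neutral record.

    Assumes a made-up tab-separated layout: wiki, title, user.
    """
    wiki, title, user = datagram.decode("utf-8").rstrip("\n").split("\t")
    return {"wiki": wiki, "title": title, "user": user}


def render(record: dict, fmt: str) -> str:
    """Dispatcher step: convert a cached record to the client's format."""
    if fmt == "json":
        return json.dumps(record, sort_keys=True)
    if fmt == "text":
        return f"[{record['wiki']}] {record['title']} edited by {record['user']}"
    raise ValueError(f"unsupported format: {fmt}")


cache = {}  # stand-in for the cache daemon (a plain dict, not memcached)
cache["rc:1"] = parse_datagram(b"enwiki\tExample\tAlice\n")
as_json = render(cache["rc:1"], "json")
as_text = render(cache["rc:1"], "text")
```

Keeping the stored record format-neutral means a new output format only needs a new branch in `render`, not a change to the listener.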
Here's a quick list of things that are basic requirements we'd have to
implement:

Multi-threading, which is in and of itself a pain in the a**.
Some sort of queue for messages, rather than hoping the daemon can send out every message in realtime.
Ability for clients to register with the daemon (and a place to store a client list)
Multiple methods of notification (IRC would be one, XMPP might be a candidate, and a simple HTTP endpoint would be a must).
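
As a rough illustration of the queue, client-list, and notification-method requirements, here is a small Python sketch. The `Relay` class, the `#rc` channel name, and the two delivery callbacks are hypothetical; the callbacks only build strings and dicts in memory and do not speak real IRC or HTTP.

```python
import queue


class Relay:
    """Toy relay: a bounded message queue plus a registry of client callbacks."""

    def __init__(self, max_pending=1000):
        self._pending = queue.Queue(maxsize=max_pending)
        self._clients = {}  # client name -> callable that delivers one message

    def register(self, name, deliver):
        """Add a client to the registry under the given name."""
        self._clients[name] = deliver

    def enqueue(self, message):
        """Accept a message; return False (dropping it) if the queue is full."""
        try:
            self._pending.put_nowait(message)
            return True
        except queue.Full:
            return False

    def flush(self):
        """Deliver every queued message to every client; return delivery count."""
        sent = 0
        while True:
            try:
                message = self._pending.get_nowait()
            except queue.Empty:
                return sent
            for deliver in self._clients.values():
                deliver(message)
                sent += 1


irc_lines, http_bodies = [], []
relay = Relay(max_pending=2)
relay.register("irc", lambda m: irc_lines.append(f"PRIVMSG #rc :{m}"))
relay.register("http", lambda m: http_bodies.append({"message": m}))
accepted = [relay.enqueue(m) for m in ("edit 1", "edit 2", "edit 3")]
sent = relay.flush()
```

Dropping messages when the queue is full is only one possible policy; blocking the producer or discarding the oldest entry are alternatives with different trade-offs.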
Just those basics aren't an easy task, especially considering that unless WMF
allocates resources to it, the project would be run solely by those who have
enough free time. Also, I wouldn't use memcached as a caching daemon,
primarily because I'm not sure such an application even needs a caching
daemon. All it does is relay messages.
*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerromeo@gmail.com
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
