While I was at Flickr, we implemented a pubsub-based system, with Redis
as the backend, to push notifications of all photo uploads and metadata
changes to Google. The rate of uploads and edits at Flickr in 2010 was
orders of magnitude greater than the rate of edits across all WMF projects.
Publishing to a Redis pubsub channel does grow in cost as the number of
subscribers increases, but I don't see that being a problem at our scale.
If it becomes one, there are ways around it.
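
To make the cost point concrete, here is a minimal in-process sketch of
pubsub fan-out. It is not Redis and not what Flickr ran; the Channel class
and its method names are hypothetical. It only shows why each publish gets
more expensive as subscribers are added: the message has to be handed to
every subscriber.

```python
from collections import deque


class Channel:
    """Toy in-process stand-in for a pubsub channel (hypothetical, not Redis)."""

    def __init__(self):
        self.subscribers = []  # one inbox (deque) per subscriber

    def subscribe(self):
        """Register a new subscriber and return its inbox."""
        inbox = deque()
        self.subscribers.append(inbox)
        return inbox

    def publish(self, message):
        """Deliver message to every inbox; work is O(number of subscribers)."""
        for inbox in self.subscribers:
            inbox.append(message)
        # Report how many subscribers received the message.
        return len(self.subscribers)


channel = Channel()
inboxes = [channel.subscribe() for _ in range(3)]
delivered = channel.publish("edit: enwiki Main_Page")
```

With three subscribers, one publish does three deliveries; the per-message
cost is linear in the subscriber count, which is the growth I mean above.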
We are planning to migrate the wiki job queues from MySQL to Redis in the
next few weeks, so it's already a growing piece of our infrastructure. I
think the bulk of the work here would actually just be in building
a frontend web service that supports WebSockets / long polling, provides a
clean API, and preferably uses OAuth or some form of registration to ward
off abuse and allow us to limit the growth of subscribers as we scale.
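
As a rough illustration of the "some form of registration" idea, the
sketch below caps the number of subscribers by handing out tokens. It is
not OAuth, and SubscriberRegistry, max_subscribers and the token scheme
are all assumptions made up for this example.

```python
import secrets


class SubscriberRegistry:
    """Hypothetical registry that hands out tokens up to a fixed cap."""

    def __init__(self, max_subscribers):
        self.max_subscribers = max_subscribers
        self.tokens = set()

    def register(self):
        """Return a new token, or None once the subscriber cap is reached."""
        if len(self.tokens) >= self.max_subscribers:
            return None
        token = secrets.token_hex(16)
        self.tokens.add(token)
        return token

    def is_allowed(self, token):
        """Check a token before accepting a WebSocket / long-poll request."""
        return token in self.tokens

    def revoke(self, token):
        """Drop an abusive or departed subscriber, freeing a slot."""
        self.tokens.discard(token)
```

The frontend would call is_allowed() before accepting a connection, and
raising max_subscribers is how we would let the subscriber count grow on
our own schedule.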
On Friday, March 1, 2013, Petr Bena wrote:
I still don't see it as too complex. It's a matter of month(s) for
volunteers with limited time.
However, I don't quite see what is so complicated about the last 2 points.
Given the frequency of updates, it's simplest to have the client
(user / bot / service that needs to read the feed) open a persistent
connection to the server (dispatcher), which forks itself just as sshd
does, with the new process handling all requests from this client. The
client somehow specifies what kind of feed it wants (that's the
registration part) and the forked dispatcher keeps it updated with
information from the cache.
Nothing hard. And what's the problem with multithreading huh? :) BTW I
don't really think there is a need for multithreading at all, but even
if there was, it shouldn't be so hard.
On Fri, Mar 1, 2013 at 3:47 PM, Tyler Romeo <tylerromeo@gmail.com> wrote:
On Fri, Mar 1, 2013 at 9:16 AM, Petr Bena <benapetr@gmail.com> wrote:
I have not yet found a good and stable library for JSON parsing in C#;
if you know of one, let me know :)
Take a look at http://www.json.org/. They have a list of implementations
for different languages.
However, I disagree with "I feel like such a project would take an
insane amount of resources to develop." If we don't make it insanely
complicated, it won't take an insane amount of time ;). The cache daemon
could be memcached, which is already written and stable. The listener is
a simple daemon that just listens on UDP, parses the data from MediaWiki
and stores it in memcached in some universal format, and the dispatcher
is just a process that takes the data from the cache, converts it to the
specified format and sends it to the client.
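
A minimal sketch of the listener / cache / dispatcher split described
above, under loud assumptions: a plain dict stands in for memcached, the
datagram is assumed to be a JSON object with hypothetical wiki, user and
title fields (not the actual MediaWiki UDP format), and the function names
are made up for illustration. No sockets are opened here.

```python
import json


def listen(datagram: bytes, cache: dict, key: int) -> None:
    """Listener step: parse one datagram and store it in a neutral form.

    The JSON payload shape is an assumption for this sketch.
    """
    cache[key] = json.loads(datagram.decode("utf-8"))


def dispatch(cache: dict, key: int, fmt: str) -> str:
    """Dispatcher step: read from the cache and convert to the client's format."""
    change = cache[key]
    if fmt == "json":
        return json.dumps(change, sort_keys=True)
    if fmt == "text":
        return "[{wiki}] {user} edited {title}".format(**change)
    raise ValueError("unsupported format: " + fmt)


cache = {}  # stand-in for memcached
listen(b'{"wiki": "enwiki", "user": "Example", "title": "Main Page"}', cache, 1)
line = dispatch(cache, 1, "text")
```

The point of the split is that the listener never needs to know which
formats clients ask for; only the dispatcher does.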
Here's a quick list of things that are basic requirements we'd have to
implement:
- Multi-threading, which is in and of itself a pain in the a**.
- Some sort of queue for messages, rather than hoping the daemon can
send out every message in realtime.
- Ability for clients to register with the daemon (and a place to
store a client list)
- Multiple methods of notification (IRC would be one, XMPP might be a
candidate, and a simple HTTP endpoint would be a must).
Just those basics aren't an easy task, especially considering that,
unless WMF allocates resources to it, the project would be run solely by
those who have enough free time. Also, I wouldn't use memcached as a
caching daemon, primarily because I'm not sure such an application even
needs a caching daemon. All it does is relay messages.
*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerromeo(a)gmail.com
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l