On Friday, March 1, 2013, Petr Bena wrote:
web frontend you say?
if you compare the raw data of the IRC protocol (1 rc feed message) with the raw data of an HTTP request and response for one page consisting only of that 1 rc feed message, you will see a huge difference in size and performance.
I was suggesting it for websockets or a long poll, so the above comparison isn't relevant. The connection is established once, with its protocol overhead. It then stays open and messages are continually pushed from the server. It is not a web request for a page containing one rc message.
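As a rough illustration of the persistent-connection point, the sketch below compares the bytes needed to carry one rc message as an IRC PRIVMSG line and as a single unmasked server-to-client WebSocket text frame (header sizes as defined in RFC 6455). The channel name, sender prefix and message text are made-up placeholders, not real feed data, and the one-time connection handshake is deliberately left out.

```python
# Sketch only: channel, prefix and message below are hypothetical placeholders.

def irc_line_size(message: str,
                  channel: str = "#en.wikipedia",
                  prefix: str = "rc!~rc@example.invalid") -> int:
    """Bytes on the wire for one IRC PRIVMSG line carrying `message`."""
    line = f":{prefix} PRIVMSG {channel} :{message}\r\n"
    return len(line.encode("utf-8"))


def websocket_frame_size(message: str) -> int:
    """Bytes for one unmasked (server-to-client) WebSocket text frame."""
    payload = len(message.encode("utf-8"))
    if payload < 126:
        header = 2    # 7-bit length fits in the base header
    elif payload < 65536:
        header = 4    # base header + 16-bit extended length
    else:
        header = 10   # base header + 64-bit extended length
    return header + payload


msg = "[[Example page]] https://example.invalid/diff/1 * ExampleUser * (+42) example edit"
print("IRC line bytes:       ", irc_line_size(msg))
print("WebSocket frame bytes:", websocket_frame_size(msg))
```

Once the connection is open, the per-message framing cost is a few bytes in either case, which is a different situation from issuing a full HTTP request and response per message.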
Also, requiring all kinds of authentication doesn't seem like an improvement to me. It will only complicate what is simple now. Have there been many attempts to abuse irc.wikimedia.org so far? There is no authentication at all.
Maybe none is needed but I don't think the irc feed interests anyone outside of a very small community. Doing something a little more modern might attract different uses. It might not, but I have no idea.
On Fri, Mar 1, 2013 at 5:46 PM, Asher Feldman <afeldman@wikimedia.org> wrote:
I don't think a custom daemon would actually be needed.
While I was at flickr, we implemented a pubsub based system to push notifications of all photo uploads and metadata changes to google using redis as the backend. The rate of uploads and edits at flickr in 2010 was orders of magnitude greater than the rate of edits across all wmf projects.
Publishing to a redis pubsub channel does grow in cost as the number of subscribers increases, but I don't see a problem at our scale. If it does become one, there are ways around it.
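To show why publish cost grows with the subscriber count, here is a minimal in-process stand-in for a pubsub channel. It is not Redis and uses no Redis client; the class name, channel name and callbacks are assumptions made up for illustration. The one detail borrowed from Redis is that publish reports how many subscribers received the message, as the PUBLISH command does.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PubSub:
    """Toy in-memory pubsub hub (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        """Register a callback that is invoked for every message on `channel`."""
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver `message` to each subscriber; work is linear in their number."""
        receivers = self._subscribers.get(channel, [])
        for callback in receivers:
            callback(message)
        return len(receivers)


hub = PubSub()
inbox_a: List[str] = []
inbox_b: List[str] = []
hub.subscribe("rc.enwiki", inbox_a.append)
hub.subscribe("rc.enwiki", inbox_b.append)
delivered = hub.publish("rc.enwiki", "edit: Example page")
print("delivered to", delivered, "subscribers")
```

Each publish loops over every subscriber, so doubling the subscribers doubles the delivery work per message.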
We are planning on migrating the wiki job queues from mysql to redis in the next few weeks, so it's already a growing piece of our infrastructure. I think the bulk of the work here would actually just be in building a frontend webservice that supports websockets / long polling, provides a clean api, and preferably uses oauth or some form of registration to ward off abuse and allow us to limit the growth of subscribers as we scale.
On Friday, March 1, 2013, Petr Bena wrote:
I still don't see it as too complex. It's a matter of month(s) for volunteers with limited time.
However, I don't quite see what is so complicated about the last 2 points. Given the frequency of updates, it's simplest to have the client (user / bot / service that needs to read the feed) open a persistent connection to the server (dispatcher), which forks itself just as sshd does, and the new process handles all requests from this client. The client somehow specifies what kind of feed it wants to have (that's the registration part) and the forked dispatcher keeps it updated with information from the cache.
Nothing hard. And what's the problem with multithreading huh? :) BTW I don't really think there is a need for multithreading at all, but even if there was, it shouldn't be so hard.
On Fri, Mar 1, 2013 at 3:47 PM, Tyler Romeo <tylerromeo@gmail.com> wrote:
On Fri, Mar 1, 2013 at 9:16 AM, Petr Bena <benapetr@gmail.com> wrote:
I have not yet found a good and stable library for JSON parsing in C#; should you know of one, let me know :)
Take a look at http://www.json.org/. They have a list of implementations for different languages.
However, I disagree with "I feel like such a project would take an insane amount of resources to develop." If we don't make it insanely complicated, it won't take an insane amount of time ;). The cache daemon could be memcached, which is already written and stable. The listener is a simple daemon that just listens on UDP, parses the data from mediawiki and stores it in memcached in some universal format, and the dispatcher is just a process that takes the data from the cache, converts it to the specified format and sends it to the client.
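The listener / cache / dispatcher split described above could be sketched roughly as follows. Everything specific here is an assumption: the tab-separated datagram layout and field names are invented (this is not MediaWiki's actual UDP output), a plain list stands in for memcached, and the two output formats are just examples of a "specified format".

```python
import json
from typing import Dict, List

# Hypothetical field order for an incoming datagram (not MediaWiki's real format).
FIELDS = ("wiki", "title", "user", "comment")


def parse_datagram(data: bytes) -> Dict[str, str]:
    """Listener side: turn one raw datagram into a format-neutral dict."""
    parts = data.decode("utf-8").rstrip("\n").split("\t")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def render(event: Dict[str, str], fmt: str) -> str:
    """Dispatcher side: convert a cached event to the format a client asked for."""
    if fmt == "json":
        return json.dumps(event, sort_keys=True)
    if fmt == "text":
        return "[[{title}]] by {user} on {wiki}: {comment}".format(**event)
    raise ValueError(f"unknown format: {fmt}")


cache: List[Dict[str, str]] = []  # stand-in for memcached

cache.append(parse_datagram(b"enwiki\tExample page\tExampleUser\tfixed typo\n"))
for event in cache:
    print(render(event, "json"))
    print(render(event, "text"))
```

Keeping the cached record format-neutral means adding another client format only touches the dispatcher side.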
Here's a quick list of things that are basic requirements we'd have to implement:
- Multi-threading, which is in and of itself a pain in the a**.
- Some sort of queue for messages, rather than hoping the daemon can send out every message in realtime.
- Ability for clients to register with the daemon (and a place to store a client list)
- Multiple methods of notification (IRC would be one, XMPP might be a candidate, and a simple HTTP endpoint would be a must).
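The queue and client-registration requirements in the list above can be sketched together. This is a single-threaded toy under stated assumptions: the class and method names are hypothetical, clients are identified by a plain string, and a slow client simply loses its oldest pending messages once its queue is full.

```python
from collections import deque
from typing import Deque, Dict, List


class ClientRegistry:
    """Hypothetical registry keeping one bounded message queue per client."""

    def __init__(self, max_pending: int = 1000) -> None:
        self._max_pending = max_pending
        self._queues: Dict[str, Deque[str]] = {}

    def register(self, client_id: str) -> None:
        """Add a client; registering twice keeps the existing queue."""
        self._queues.setdefault(client_id, deque(maxlen=self._max_pending))

    def unregister(self, client_id: str) -> None:
        """Remove a client and discard anything still queued for it."""
        self._queues.pop(client_id, None)

    def enqueue(self, message: str) -> None:
        """Queue a message for every client; full queues drop their oldest entry."""
        for queue in self._queues.values():
            queue.append(message)

    def drain(self, client_id: str) -> List[str]:
        """Return and clear the messages pending for one client."""
        queue = self._queues[client_id]
        pending = list(queue)
        queue.clear()
        return pending


registry = ClientRegistry(max_pending=2)
registry.register("bot-a")
for text in ("m1", "m2", "m3"):
    registry.enqueue(text)
pending = registry.drain("bot-a")
print(pending)
```

The bound on each queue is what keeps one stalled client from growing the daemon's memory without limit; whether dropping the oldest messages is acceptable is a design choice the thread does not settle.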
Just those basics aren't an easy task, especially considering that unless WMF allocates resources to it, the project would be run solely by those who have enough free time. Also, I wouldn't use memcached as a caching daemon, primarily because I'm not sure such an application even needs a caching daemon. All it does is relay messages.
--
Tyler Romeo
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerromeo@gmail.com
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l