Wikitech-l July 2013

wikitech-l@lists.wikimedia.org

158 participants
163 discussions

Flow with a Tow Truck
by C. Scott Ananian 29 Jul '13

29 Jul '13

I've mentioned this informally to a few people, but it came up again in discussion and I thought maybe I'd bring the idea to a wider audience. TowTruck (https://towtruck.mozillalabs.com/) is a realtime collaboration framework developed by Mozilla (but not firefox-specific). It provides the real-time communication infrastructure for collaboration on websites. It could be an interesting foundation for something like Flow. TowTruck has a plug-in architecture for custom editors. Visual Editor would use this to handle the synchronization within a visual editor widget. Tow Truck would provide the "find a friend", "real time chat" and other collaboration plumbing. Flow would provide the overall UX design for the collaborative process, especially the non-real-time archival parts. Anyway, I think Tow Truck is an interesting project, and it would be interesting to try to avoid reinventing parts of the wheel. I'm interested in others' thoughts. --scott -- (http://cscott.net)

3 3

Collaborative machine translation for Wikipedia -- proposed strategy
by David Cuenca 29 Jul '13

29 Jul '13

After Erik's email about supporting open source machine translation [1], I've been researching options and having talks with several machine translation researchers about what would be the best way to integrate MT into Wikipedia. Unfortunately I couldn't find a single solution that, on its own, would fulfill all requirements (specially being open source). On the plus side, there is a set of technologies that if integrated, they could provide a positive and reliable experience. It would be a hard way to get there, and even so, it might be worth exploring. This is the preliminary draft: https://meta.wikimedia.org/wiki/Collaborative_Machine_Translation_for_Wikip… It is a shame that the talk about "Supporting translation of Wikipedia content" [2] has not been accepted for WM13. Hopefully there will be enough interest to discuss this topic there anyways. Micru [1] http://thread.gmane.org/gmane.org.wikimedia.foundation/65605 [2] https://wikimania2013.wikimedia.org/wiki/Submissions/Supporting_translation…

4 7

Re: [Wikitech-l] [tools] Bot dispatcher
by Petr Bena 29 Jul '13

29 Jul '13

I somehow clicked reply instead of reply to all, my response is bellow... On Mon, Jul 29, 2013 at 1:14 PM, Petr Bena <benapetr(a)gmail.com> wrote: > On Mon, Jul 29, 2013 at 1:04 PM, Antoine Musso <hashar+wmf(a)free.fr> wrote: >> Le 28/07/13 18:35, Petr Bena a écrit : >>> I think you kind of misunderstood my proposal hashar :) I know that, >>> IRC feed is where the dispatcher is going to take data from, the >>> difference is, that dispatcher is a special service for bot operators, >>> that allow them to subscribe for selected pages / authors (even using >>> regular expressions) and it would filter these for them from RC feed >>> (currently the IRC version) and fill them up in a redis queue they >>> specify in a format they prefer. >> >> Petan, MzMcBribe, Ori and I had an IRC discussion on that topic this >> morning. Here is a quick summary. >> >> >> >> What I dislike in your proposal is that you are still relying on the IRC >> feed service which is not the best way to publish metadata. It is really >> meant to be consumed by IRC client for friendly human displaying. >> > > As I said on irc, the source code is very flexible, and indeed I am > now relying on the /only/ feed service we have in this moment, which > is IRC feed. No matter if we like it or not, it's the only service we > have and I MUST use it because there is no other thing. Once there is > anything better I can use that instead of IRC. > >> For the context the related code is in RecentChange::getIRCLine() and as >> an exemple there is the title formatting: >> >> "\00314[[\00307$title\00314]]"; >> >> Not easily parseable. Moreover the code has plenty of exceptions and >> craft a URL for end user to click. >> >> >> As I understood it, your bot would parse the horrible IRC syntax, craft >> some JSON and write it inn Redis for bots to consume. Thus bots authors >> will no more have to care about IRC format. That is an improvement, but >> we can do better. >> > > That is sort of true. The dispatcher will convert the current irc > message to some serializable class item. That can be serialized to > whatever format the bot developer who is target consumer prefer. In > this moment plain text (separated values with pipe) / xml and json are > available > >> Instead, we could have MediaWiki send JSON directly. Victor Vasiliev >> propsed a change to provide a JSON feed: >> >> https://gerrit.wikimedia.org/r/#/c/52922 >> >> We could have that feed send to EventLogging zero mqueue, and write >> subscribers to it that would put the RC events in Redis. >> > > That's indeed interesting, for dispatcher this means only that the > current parser of edits would be replaced with json parser (instead of > irc parser). However the subscribers you talk about is exactly what > dispatcherd is doing now (its existence kind of kills the requirement > of bot developers to create their own, which may be a lot of work). > People can subscribe to RC feed using a simple 2 line (in future > hopefully 1 line) command in terminal, which automagically creates a > redis queue filled with edits, see > https://wikitech.wikimedia.org/wiki/Bot_Dispatcher#Example_usage > >> To achieve that: >> >> - we need Victor patch to be polished up and deployed >> - find out what need to be written to Redis (one queue per bot? A >> shared queue?) >> - write a zmq subscriber to publish in Redis >> >> Eventually provide some library for bots author to easily query their >> Redis queue. >> >> >> In the end you have: >> - a very robust feeding system which is on par with the other events >> feeds we are already maintaining >> - got rid of IRC formatting >> - nice JSON out of the box :-] >> >> >> -- >> Antoine "hashar" Musso

1 0

[tools] Bot dispatcher
by Petr Bena 29 Jul '13

29 Jul '13

Hi, I have an idea for new service we could implement on tools project that would greatly save system resources. I would like to have some kind of feedback. Imagine a daemon similar to inet.d It would watch the recentchages of ALL wikis we have @wm and users could subscribe (using web browser or some terminal interface) to this service, so that on certain events (page X was modified), this bot dispatcher would do something (submit their bot on grid / sent some signal / tcp packet somewhere / insert data to redis etc etc). This way bot designers could very easily hook their bots to certain events without having to write their own "wiki-watchers". This would be extremely useful, not just for bots that should be triggered on event (someone edit some page) but also for bots that run periodically. For example: archiving bot is now running in a way, that it checks ALL pages where the template for archiving is. No matter if these talk pages are dead for years or not. Using such a dispatcher, everytime when a talk page was modified some script or process could be launched that would add it to some queue (redis like, even the dispatcher could have this as an event so that no process would need to be launched) and archiving bot would only check these pages that are active, instead of thousands dead pages. This way we could very efficiently schedule bots and save ton of system resources (cpu / memory / IO / network / even production servers load). It would also make it far easier for bot operators to create new tasks / bots as they would not need to program "wiki-watchers" themselves. What you think about it?

5 14

Updating Wikipedia based on Wikidata changes
by Denny Vrandečić 29 Jul '13

29 Jul '13

Hi, sorry for another long Email today. Currently, when you change a Wikidata item, its associated Wikipedia articles get told to update, too. So your change to the IMDB ID of a movie in Wikidata will be pushed to all language versions of that article on Wikipedia. Yay! There are two use cases that currently are not possible: * a Wikipedia article on a city might display the mayor. Now someone changes on Wikidata the label of the mayor - the Wikipedia article will get updated the next time the page is rendered, but there is no active update of the page. * a Wikipedia article might want to include data about another item than the associated item - most importantly for references, where I might be interested in the author of a book, it's year of publication, etc. This feature is currently disabled (even though it would be trivial to switch it on) because this information would only get updated when the page is actively rerendered. In order to enable these use cases we need to track on which pages (on Wikipedia) an item (from Wikidata) is used. We are thinking of doing this in two tables: * EntityUsage: one table per client. It has two columns, one with the pageId and one with the entityId, indexed on both columns (and one column with a pk, I guess, for OSC). * Subscriptions: one table on the client. It has two columns, one with the pageId and one with the siteId, indexed on both columns (and one column with a pk, I guess, for OSC). EntityUsage is a potentially big table (something like pagelinks-size). On a change on Wikidata, Wikidata consults the Subscriptions table, and based on that it dispatches the changes to all clients listed there for a given change. Then the client receives the changes and based on the EntityUsage table performs the necessary updates. We wanted to ask for input on this approach, and if you see problems or improvements that we should put in. Cheers, Denny -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

2 4

python vs php
by Petr Bena 29 Jul '13

29 Jul '13

Lot of people hate these discussions I <3 them. Can someone tell me some pros and cons of using python over php? I recently heard from several people that python is even better than php for website developement so I am wondering if that is actually true. Someone has experience with that?

11 14

Bugzilla Weekly Report
by reporter 29 Jul '13

29 Jul '13

1 0

On your python vs php talk
by Виталий Филиппов 29 Jul '13

29 Jul '13

Hi! The question is a good base for a holy war :-) I want to say that PHP has some advantages Python will never have - it's very simple in deployment, there is no fuss with library versions, nearly all needed features are already built-in including a good SAPI (!) so you don't need wsgi, psgi and etc, you don't need any virtualenvs for deploying because nobody typicaaly uses pear libraries :-) PHP is faster (if you don't take pypy and etc in account). Also I personally HATE block formatting using indentation. It's so silly idea no more language in the world has. Also for example I don't like python's strict typization ideas (for example it throws exception if you concatenate long and str using +). PHP is simple and has no such problems. And for webdev I don't like frameworks, either. I.e I don't like them at all - because I always feel they are trying to restrict me. So Django is not an argument for me, and may be not an argument for you too. And definitely you can't say django is just better than php. What php misses it's the builtin metaprogramming, but in 99% cases you should better write code instead of doing metaprogramming. So for webdev my opinion is that php is MUCH better than python.

6 10

Re: [Wikitech-l] On your python vs php talk
by vitalif＠yourcmc.ru 29 Jul '13

29 Jul '13

It's not "bad" design. It's "bad" only theoretically and just different from strongly-typed languages. I like its "inconsistent" function names - for a lot of functions they're similar to C and in most cases they're very easy to remember, as opposed to some other languages, including python (!!). Of course there are some nuances, but they're in any language. And I personally think "10" is semantically equal to 10 in most cases, so comparison is not a problem, either. You just need to be slightly more accurate while writing things. And my main idea is that only a statically typed should try to be strict. And python very oddly tries to be strict in some places while being dynamically typed. Look, it doesn't concatenate string and long - even Java does that!

2 2

Remove 'visualeditor-enable' from $wgHiddenPrefs
by Bartosz Dziewoński 27 Jul '13

27 Jul '13

This isn't an appropriate list for this, but MaxSem and hashar told me to post it here anyway, so here goes. There's a patch[1] to remove 'visualeditor-enable' from $wgHiddenPrefs, essentially allowing for disabling VE on a per-user basis again. It has overwhelming community support, but the VisualEditor team is refusing to acknowledge it, and ops say it's "none of their business". Can something be done about it? [1] https://gerrit.wikimedia.org/r/#/c/73565/ -- Matma Rex

39 100

← Newer
1
2
3
4
5
6
7
8
...
17
Older →

Jump to page:

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Wikitech-l July 2013