Please note that the session cookie name on WMF wikis has changed from
<wiki>_session to <wiki>Session. That is, for example, enwiki_session
to enwikiSession.
This was done in response to a security issue -- we had to find some
way to rapidly cause all session IDs to be regenerated. Changing the
cookie name was the simplest way to achieve that. More details should
become available once we are sure that the security issue is fixed.
-- Tim Starling
With the rollout of 1.22wmf13 to the Wikisource sites last night, I am
now seeing sporadic rendering of mw-customeditbutton onto the
traditional toolbar. Sometimes I have to refresh a loaded page three or
four times to get the additional buttons to render onto the toolbar. I
have seen this at other times with Pathoschild's regex scripts not
loading, but this is the first time that the (presumably)
ResourceLoader issue has been so recalcitrant with the toolbar.
Any suggestions on a resolution would be appreciated.
Regards, Billinghurst
Hello, visual editor developers and language engineers.
As you know, some scripts are not supported by VisualEditor yet. I know you are working very hard to deploy VisualEditor to other projects.
I started a table on Meta, http://meta.wikimedia.org/wiki/Visual_Editor/Asian_language_support, to share the current state of non-Roman script support in VisualEditor. I met users of various scripts at an Asian meetup at Wikimania HK. We talked about the issue and hoped to assist the effort. I will try to summarize the status and ask the Asian Wikimedians who use these scripts to test VE.
I would like to hear what the VisualEditor team and the Language Engineering team think about this page.
I think the scope of the table could be expanded to include the other non-Roman scripts.
Cheol
Hi all,
If you haven't been following along with development, we're working on
replacing MWSearch/lsearchd with CirrusSearch (powered by
Elasticsearch). Nik and I have been working on this for the last
several weeks, and now it's ready for wider testing outside of labs.
This is an invitation for you to play with some totally alpha software
and report all kinds of bugs you may find. So hop on over to
https://test2.wikipedia.org and mess around with Special:Search and the
API's list search and prefix search.
When testing this, it's useful to compare results from the old and new
engines.
I added a query parameter that lets you toggle between them. Just add
srbackend=LuceneSearch or srbackend=CirrusSearch to force one or the other.
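Comparing the two backends amounts to issuing the same query twice with different srbackend values. Here's a minimal sketch of building such request URLs (action/list/srsearch are the standard search API parameters; the query string itself is just an example):

```python
from urllib.parse import urlencode

API = "https://test2.wikipedia.org/w/api.php"

def search_url(query, backend):
    """Build a list=search API request pinned to one engine via the
    srbackend parameter described above."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srbackend": backend,  # "LuceneSearch" or "CirrusSearch"
        "format": "json",
    }
    return API + "?" + urlencode(params)

# Fetch both URLs and diff the result sets to spot ranking differences.
```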
All bugs can be filed in BZ under the CirrusSearch component[0].
Depending on how things go, we'd like to roll this out for mw.org on
the 28th. Happy bug hunting :D
-Chad
[0]
https://bugzilla.wikimedia.org/enter_bug.cgi?product=MediaWiki%20extensions…
On Sat, Aug 17, 2013 at 8:40 PM, Ken Snider <ksnider(a)wikimedia.org> wrote:
>
> On Aug 17, 2013, at 1:33 PM, rupert THURNER <rupert.thurner(a)gmail.com> wrote:
>
>> hi faidon, i do not think you personally and WMF are particularly
>> helpful in accepting contributions. because you:
>> * do not communicate openly the problems
>> * do not report upstream publically
>> * do not ask for help, and even if it gets offered you just ignore it
>> with quite some arrogance
>
> Rupert, please don't call out or attack specific people. We're all on the same team, and I can
...
Let me change the title, as this is not about site hardening any more.
> Further, Ops in general, and Faidon in particular, routinely report issues upstream. Our recent bug reports or patches to Varnish and Ceph are two examples that easily come to mind. Faidon was (rightly) attempting to restore service first ...
Yes Ken, you are right, let's stick to the issues at hand:
(1) By when will you finally decide to invest the 10 minutes and
properly trace the gitblit application? You have the commands in the
ticket:
https://bugzilla.wikimedia.org/show_bug.cgi?id=51769
(2) By when will you adjust your operating guidelines, so it is clear
to Faidon, Ariel and others that 10 minutes of tracing an application
to get a holistic view is mandatory _before_ restoring a service that
goes down this often, and for days each time? Ten extra minutes will
not be noticed when the service has already been gone for more than a
day.
(3) How will you handle offers of help from the community in the
future? As in the gitblit case, I offered to help trace the problem
while the service was down. Max Semenik has now reported that gitblit
should set rel="nofollow".
Best regards, rupert.
On 2013-08-18 1:04 PM, Bjoern Hoehrmann wrote:
> an elision mark that does not explain itself. Makes you come across as
> "hit send too early".
My email client appears to have decided to post an early draft of the
messages I sent on Friday. Sorry about that. Please ignore.
For the record, I have read everything that was sent in response to
those messages but probably won't get around to responding till Monday.
zw
Moving this to a separate thread for discussion.
On Sat, Aug 17, 2013 at 5:14 PM, Chris Steipp <csteipp(a)wikimedia.org> wrote:
> Yeah, but I *think* that one can be solved without affecting editors..
>
Just to go over ResourceLoader first: right now the main inline JS we
have is the load information for mw.config and mw.user.options, plus
some bootstrap code (one piece at the beginning and one at the end of
the file).
The main problem is that many of these things are determined during the
page request. We'd have to figure out how to move them to an
independent JS file while preserving whatever the values are supposed
to be.
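One hedged sketch of that direction: render the per-page values as the body of a separate script response instead of inline markup. The server-side function and endpoint shape below are hypothetical; only mw.config.set and mw.user.options.set are the real client-side APIs.

```python
import json

def render_startup_module(config_vars, user_options):
    """Emit the values normally inlined into the page (mw.config and
    mw.user.options) as the body of a standalone JS response, so the
    HTML itself carries no inline <script>. The server-side shape here
    is hypothetical; the client calls are the real mw APIs."""
    return (
        "mw.config.set(%s);\nmw.user.options.set(%s);"
        % (json.dumps(config_vars), json.dumps(user_options))
    )
```

The hard part, as noted above, is that these values are computed during the page request, so such an endpoint would need access to the same request-time state (page title, user, skin) to reproduce them.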
> Building something to let them style, but in a way that inline CSS isn't
> allowed by the CSP, is something I haven't figured out yet.
>
Here's really the only thing I can think of: whenever the parser
encounters inline styling, remove the styling and add an id attribute
to the element if it doesn't already have one. Then have the client
load something along the lines of /load.php?page=Main page, or
something like that, which will instead extract that CSS and assign it
to the associated ids. The main issue would be making sure that the
method by which element ids are generated is deterministic, so that
they match up on the page and in the CSS file.
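A minimal sketch of that idea, assuming a position+content hash for the deterministic ids (the mw-style- prefix is made up, and elements that already carry an id are ignored for simplicity):

```python
import hashlib
import re

def extract_inline_styles(html):
    """Strip style="..." attributes, assign each styled element a
    deterministic id, and collect the removed declarations as CSS
    rules keyed on those ids."""
    rules = []
    counter = [0]

    def repl(match):
        style = match.group(1)
        # Hash of position + declarations, so the page and the
        # generated stylesheet agree on every render.
        elem_id = "mw-style-" + hashlib.sha1(
            ("%d:%s" % (counter[0], style)).encode("utf-8")
        ).hexdigest()[:8]
        counter[0] += 1
        rules.append("#%s { %s }" % (elem_id, style))
        return 'id="%s"' % elem_id

    stripped = re.sub(r'style="([^"]*)"', repl, html)
    return stripped, "\n".join(rules)
```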
--
Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science
www.whizkidztech.com | tylerromeo(a)gmail.com
(Please see the thread titled "Wikimedia's anti-surveillance plans: site
hardening" for who I am and some general context.)
Once Wikipedia is up to snuff with all the site hardening I recommended
in the other thread, there remain two significant information leaks
(and probably others, but these two are going to be a big project all
by themselves, so let's worry about them first). One is hostnames, and
the other is page(+resource) length.
Server hostnames are transmitted over the net in cleartext even when TLS
is in use (because DNS operates in cleartext, and because the cleartext
portion of the TLS handshake includes the hostname, so the server knows
which certificate to send down). The current URL structure of
*.wiki[pm]edia.org exposes sensitive information in the server hostname:
for Wikipedia it's the language tag, for Wikimedia the subproject.
Language seems like a serious exposure to me, potentially enough all by
itself to finger a specific IP address as associated with a specific
Wikipedia user handle. I realize how disruptive this would be, but I
think we need to consider changing the canonical Wikipedia URL format to
https://wikipedia.org/LANGUAGE/PAGENAME.
For *.wikimedia.org it is less obvious what should be done. That domain
makes use of subdomain partitioning to control the same-origin policy
(for instance, upload.wikimedia.org needs to be a distinct hostname from
everything else, lest someone upload e.g. a malicious SVG that
exfiltrates your session cookies) so it cannot be altogether
consolidated. However, knowing (for instance) whether a particular user
is even *aware* of Commons or Meta may be enough to finger them, so we
need to think about *some* degree of consolidation.
---
Just how much information is exposed by page length (and how to best
mitigate it) is a live area of basic research. It happens to be *my*
area of basic research, and I would be interested in collaborating with
y'all on locking it down (it would make a spiffy case study for my
thesis :-) but I must emphasize that *we don't know if it is possible to
prevent this attack*.
I recommend that everyone interested in this topic read these articles:
http://hal.archives-ouvertes.fr/docs/00/74/78/41/PDF/johnny2hotpet-finalcam…
discusses why Web browsing history is sensitive information in general.
http://kpdyer.com/publications/oakland2012.pdf and
http://www.freehaven.net/anonbib/cache/ccs2012-fingerprinting.pdf
demonstrate how page length can reveal page identity and debunk a
number of "easy" fixes; their reference lists are good portals to the
literature. Finally,
http://hal.inria.fr/docs/00/73/29/55/PDF/RR-8067.pdf demonstrates a
related but perhaps even more insidious attack, whereby the eavesdropper
learns the *user identity* of someone on a social network by virtue of
the size of their profile photo.
This last article raises a critical point. To render Wikipedia genuinely
secure against traffic analysis, it is not sufficient for the
eavesdropper to be unable to identify *which pages* are being read or
edited. The eavesdropper may also be able to learn and make use of the
answers to questions such as:
* Given an IP address known to be communicating with WP/WM, whether
or not there is a logged-in user responsible for the traffic.
* Assuming it is known that a logged-in user is responsible for some
traffic, *which user it is* (User: handle) or whether the user has
any special privileges.
* State transitions between uncredentialed and logged-in (in either
direction).
* State transitions between reading and editing.
This is unlikely to be an exhaustive list. If we are serious about
defending against traffic analysis, one of the first things we should do
is have a bunch of experienced editors and developers sit down and work
out an exhaustive list of things we don't want to reveal. (I have only
ever dabbled in editing Wikipedia.)
Now, once this is pinned down, theoretically, yes, the cure is padding.
However, the padding inherent in TLS block cipher modes is *not*
adequate; it's normally strictly "round up to the nearest multiple of 16
bytes", which has been shown to be completely inadequate. One of the
above papers talks about patching GnuTLS to pad randomly by up to 256
bytes, but this too is probably insufficient.
Random padding, in fact, is no good at all. The adversary can simply
average over many pageloads and extract the true length. What's actually
needed is to *bin* page (+resource) sizes such that any given load could
be a substantial number of different pages.
http://hal.inria.fr/docs/00/73/29/55/PDF/RR-8067.pdf also discusses how
this can be done in principle. The project - and I emphasize that it
would be a *project* - would be to arrange for MediaWiki (the software)
to do this binning automatically, such that the adversary cannot learn
anything useful either from individual traffic bursts or from a sequence
of such bursts, without bulking up overall data transfer sizes too much.
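As a toy illustration of size binning (the geometric factor here is an arbitrary choice for the sketch, not a recommendation from the papers above): round every response size up to the next bucket on a geometric scale, so that many distinct pages share each padded size.

```python
import math

def pad_to_bin(size, factor=1.25):
    """Return the padded size for a response: the smallest bucket on a
    geometric scale that is >= size. Unlike random padding, repeated
    loads of the same page always yield the same padded size, so
    averaging over many loads reveals nothing extra."""
    if size <= 0:
        return 1
    exponent = math.ceil(math.log(size, factor))
    # max() guards against floating-point rounding at bucket edges.
    return max(int(math.ceil(factor ** exponent)), size)
```

Two pages of 1000 and 1005 bytes, for example, land in the same bucket and become indistinguishable by length alone; the factor trades bandwidth overhead against bucket coarseness.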
Again, this is something I am interested in helping with, provided it is
understood that this is a live research question, that success is by no
means guaranteed, and that success may come at the expense of other
desirable qualities (such as not transmitting resources only needed for
the editor until someone actually clicks an edit link).
zw
On Sat, Aug 17, 2013 at 5:33 PM, rupert THURNER
<rupert.thurner(a)gmail.com> wrote:
moved up from below because I'm answering your points in the context of gitblit:
> On Sat, Aug 17, 2013 at 12:47 PM, Faidon Liambotis <faidon(a)wikimedia.org> wrote:
> let me give you an example as well. git.wikimedia.org broke, and you,
> faidon, did _absolutely nothing_ to give good feedback to upstream to
> improve the gitblit software.
> * upstream bug: https://code.google.com/p/gitblit/issues/detail?id=294
I don't think filing that was useful or appropriate. That's akin to a
user of [[Diplopedia]] or portlandwiki.org reporting a site outage at
WMF bugzilla.
> hi faidon, i do not think you personally and WMF are particularly
> helpful in accepting contributions. because you:
> * do not communicate openly the problems
What do you call https://bugzilla.wikimedia.org/show_bug.cgi?id=49371 ?
> * do not report upstream publically
and https://code.google.com/p/gitblit/issues/detail?id=274 ?
> * do not ask for help, and even if it gets offered you just ignore it
> with quite some arrogance
Help is *very* often welcome and accepted. (It sometimes takes a while
to get outside submissions reviewed, and sometimes not; there are
certainly people who can help if something's gotten stuck, e.g.
Sumana.)
As Ken said, getting the site working again is often a higher priority
than making stack traces. This is especially true because this wasn't a
new issue: time had already been put into investigating it, tweaking
various parameters, and working with upstream, and it wasn't hard to
reproduce the issue. (I.e., the stack trace could be made at other
times if needed, and could wait to be made by someone already
investigating the problem. I don't know offhand how involved Faidon has
been with the gitblit investigations, but I know Chad was very (more?)
involved.)
-Jeremy