I'm happy to announce the availability of the second beta release of the
new MediaWiki 1.19 release series.
Please try it out and let us know what you think. Don't run it on any
wikis that you really care about, unless you are both very brave and
very confident in your MediaWiki administration skills.
MediaWiki 1.19 is a large release that contains many new features and
bug fixes. This is a summary of the major changes of interest to users.
You can consult the RELEASE-NOTES-1.19 file for the full list of changes
in this version.
Five security issues were discovered.
It was discovered that the API had a cross-site request forgery (CSRF)
vulnerability in the block/unblock modules. It was possible for a user
account with block privileges to block or unblock another user without
providing a token.
For more details, see https://bugzilla.wikimedia.org/show_bug.cgi?id=34212
It was discovered that the resource loader can leak certain kinds of private
data across domain origin boundaries by providing the data as an executable
JavaScript file. In MediaWiki 1.18 and later, this includes the leaking of
CSRF protection tokens, which allows compromise of the wiki's user accounts,
for example by changing the user's email address and then requesting a
password reset.
For more details, see https://bugzilla.wikimedia.org/show_bug.cgi?id=34907
Jan Schejbal of Hatforce.com discovered a cross-site request forgery (CSRF)
vulnerability in Special:Upload. Modern browsers (since at least as early as
December 2010) are able to post file uploads without user interaction,
violating previous security assumptions within MediaWiki.
Depending on the wiki's configuration, this vulnerability could lead to
further compromise, especially on private wikis where the set of allowed
file types is broader than on public wikis. Note that CSRF allows compromise
of a wiki from an external website even if the wiki is behind a firewall.
For more details, see https://bugzilla.wikimedia.org/show_bug.cgi?id=35317
George Argyros and Aggelos Kiayias reported that the method used to generate
password reset tokens was not sufficiently secure. MediaWiki now uses more
secure random number generators, depending on what is available on the
platform. Windows users are strongly advised to install either the openssl
extension or the mcrypt extension for PHP so that MediaWiki can take
advantage of the cryptographic random number facility provided by Windows.
Any extension developers using mt_rand() to generate random numbers in
contexts where security is required are encouraged to instead make use of
the MWCryptRand class introduced with this release.
For more details, see https://bugzilla.wikimedia.org/show_bug.cgi?id=35078
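For extension developers, here is a minimal sketch of the kind of change
involved, assuming a 32-character hexadecimal token is wanted; see the
MWCryptRand class itself for the exact method signatures:

// Before: mt_rand() is predictable and unsuitable for security-sensitive tokens
$token = md5( mt_rand() . time() );

// After (sketch): use the MWCryptRand class introduced in this release
$token = MWCryptRand::generateHex( 32 );
if ( !MWCryptRand::wasStrong() ) {
    // The randomness source available on this platform was not
    // cryptographically strong; worth logging for the site admin.
    wfDebug( "MWCryptRand: no strong randomness source available.\n" );
}

MWCryptRand picks the strongest source available at runtime (such as the
openssl or mcrypt extensions when present), which is also why installing one
of those on Windows matters.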
A long-standing bug in the wikitext parser (bug 22555) was discovered to have
security implications. In the presence of the popular CharInsert extension, it
leads to cross-site scripting (XSS). XSS may be possible with other extensions
or perhaps even the MediaWiki core alone, although this is not confirmed at
this time. A denial-of-service attack (infinite loop) is also possible
regardless of configuration.
For more details, see https://bugzilla.wikimedia.org/show_bug.cgi?id=35315
*********************************************************************
What's new?
*********************************************************************
MediaWiki 1.19 brings the usual host of bugfixes and new features. A
comprehensive list of what's new is in the release notes.
* Bumped MySQL version requirement to 5.0.2.
* Disable the partial HTML and MathML rendering options for Math, and render
as PNG by default. (MathML mode was so incomplete most people thought it
simply didn't work.)
* New skins/common/*.css files usable by skins, instead of having to copy
piles of generic styles from MonoBook or Vector's CSS.
* The default user signature now contains a talk link in addition to the
user link.
* Searching for blocked usernames in the block log is now clearer.
* Better timezone recognition in user preferences.
* Extensions can now participate in the extraction of titles from URL paths.
* The command-line installer supports various RDBMSes better.
* The interwiki links table can now also be accessed when the interwiki cache
is in use (used in the API and the Interwiki extension).
Internationalization
--------------------
* More gender support (for instance in user lists).
* Added languages: Canadian English.
* The language converter was improved; for example, it now works based on the
page content language.
* Time and number-formatting magic words also now depend on the page
content language.
* Bidirectional support further improved after 1.18.
Release notes
-------------
Full release notes:
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob_plain;f=RELEASE-NOTES-1.19;hb=1.19.0beta2
https://www.mediawiki.org/wiki/Release_notes/1.19
Coinciding with these security releases, the MediaWiki source code repository
has moved from SVN (at https://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3)
to Git (https://gerrit.wikimedia.org/gitweb/mediawiki/core.git), so the
relevant commits for these releases will not be appearing in our SVN
repository. If you use SVN checkouts of MediaWiki for version control, you
need to migrate these to Git.
If you are using tarballs, there should be no change in the process for you.
Please note that all WMF-deployed extensions have also been migrated to Git,
along with some other non-WMF-maintained ones.
Please bear with us: some of the Git-related links for this release may not
work immediately, but they should later on.
To do a simple Git clone, the command is:
git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git
More information is available at https://www.mediawiki.org/wiki/Git
For more help, please visit the #mediawiki IRC channel on freenode.net
(irc://irc.freenode.net/mediawiki) or email the mediawiki-l mailing list
at mediawiki-l(a)lists.wikimedia.org.
**********************************************************************
Download:
http://download.wikimedia.org/mediawiki/1.19/mediawiki-1.19.0beta2.tar.gz
Patch to previous version (1.19.0beta1), without interface text:
http://download.wikimedia.org/mediawiki/1.19/mediawiki-1.19.0beta2.patch.gz
Interface text changes:
http://download.wikimedia.org/mediawiki/1.19/mediawiki-i18n-1.19.0beta2.patch.gz
GPG signatures:
http://download.wikimedia.org/mediawiki/1.19/mediawiki-1.19.0beta2.tar.gz.sig
http://download.wikimedia.org/mediawiki/1.19/mediawiki-1.19.0beta2.patch.gz.sig
http://download.wikimedia.org/mediawiki/1.19/mediawiki-i18n-1.19.0beta2.patch.gz.sig
Public keys:
https://secure.wikimedia.org/keys.html
Hey,
The new version of git-review released today (1.22) includes a patch I
wrote that makes it possible to work against a single 'origin' remote. This
amounts to a workaround for git-review's tendency to frighten you into
thinking you're about to submit more patches than the ones you are working
on. It makes git-review more pleasant to work with, in my opinion.
To enable this behavior, you first need to upgrade to the latest version of
git-review, by running "pip install -U git-review". Then you need to create
a configuration file: either /etc/git-review/git-review.conf (system-wide)
or ~/.config/git-review/git-review.conf (user-specific).
The file should contain these two lines:
[gerrit]
defaultremote = origin
Once you've made the change, any new Gerrit repos you clone using an
authenticated URI will just work.
You'll need to perform an additional step to migrate existing repositories.
In each repository, run the following commands:
git remote set-url origin $(git config --get remote.gerrit.url)
git remote rm gerrit
git review -s
Hope you find this useful.
TL;DR: A few ideas follow on how we could possibly help legit editors
contribute from behind Tor proxies. I am just conversant enough with
the security problems to make unworkable suggestions ;-), so please
correct me, critique & suggest solutions, and perhaps volunteer to help.
The current situation:
https://en.wikipedia.org/wiki/Wikipedia:Advice_to_users_using_Tor_to_bypass…
We generally don't let anyone edit or upload from behind Tor; the
TorBlock extension stops them. One exception: a person can create an
account, accumulate lots of good edits, and then ask for an IP block
exemption, and then use that account to edit from behind Tor. This is
unappealing because then there's still a bunch of in-the-clear editing
that has to happen first, and because then site functionaries know that
the account is going to be making controversial edits (and could
possibly connect it to IPs in the future, right?). And right now
there's no way to truly *anonymously* contribute from behind Tor
proxies; you have to log in. However, since JavaScript delivery is hard
for Tor users, I'm not sure how much editing from Tor -- vandalism or
legit -- is actually happening. (I hope for analytics on this and thus
added it to https://www.mediawiki.org/wiki/Analytics/Dreams .) We know
at least that there are legitimate editors who would prefer to use Tor
and can't.
People have been talking about how to improve the situation for some
time -- see http://cryptome.info/wiki-no-tor.htm and
https://lists.torproject.org/pipermail/tor-dev/2012-October/004116.html
. It'd be nice if it could actually move forward.
I've floated this problem past Tor and privacy people, and here are a
few ideas:
1) Just use the existing mechanisms more leniently. Encourage the
communities (Wikimedia & Tor) to use
https://en.wikipedia.org/wiki/Wikipedia:Request_an_account (to get an
account from behind Tor) and to let more people get IP block exemptions
even before they've made any edits (< 30 people have gotten exemptions
on en.wp in 2012). Add encouraging "get an exempt account" language to
the "you're blocked because you're using Tor" messaging. Then if
there's an uptick in vandalism from Tor then they can just tighten up again.
2) Encourage people with closed proxies to re-vitalize
https://en.wikipedia.org/wiki/Wikipedia:WOCP . Problem: using closed
proxies is okay for people with some threat models but not others.
3) Look at Nymble - http://freehaven.net/anonbib/#oakland11-formalizing
and http://cgi.soic.indiana.edu/~kapadia/nymble/overview.php . It would
allow Wikimedia to distance itself from knowing people's identities, but
still allow admins to revoke permissions if people acted up. The user
shows a real identity, gets a token, and exchanges that token over Tor
for an account. If the user abuses the site, Wikimedia site admins can
blacklist the user without ever being able to learn who they were or
what other edits they made. More: https://cs.uwaterloo.ca/~iang/ . Ian
Goldberg's, Nick Hopper's, and Apu Kapadia's groups are all working on
Nymble or its derivatives. It's not ready for production yet, I bet,
but if someone wanted a Big Project....
3a) A token authorization system (perhaps a MediaWiki extension) where
the server blindly signs a token, and then the user can use that token
to bypass the Tor blocks. (Tyler mentioned he saw this somewhere in a
Bugzilla suggestion; I haven't found it.)
4) Allow more users the IP block exemption, possibly even automatically
after a certain number of unreverted edits, but with some kind of
FlaggedRevs integration; Tor users can edit but their changes have to be
reviewed before going live. We could combine this with (3); Nymble
administrators or token-issuers could pledge to review edits coming from
Tor. But that latter idea sounds like a lot of social infrastructure to
set up and maintain.
Thoughts? Are any of you interested in working on this problem? #tor on
the OFTC IRC server is full of people who'd be interested in talking
about this.
--
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation
As multilingual content grows, interlanguage links become longer on
Wikipedia articles. Articles such as "Barack Obama" or "Sun" have more than
200 links, and that becomes a problem for users who often switch among
several languages.
As part of the future plans for the Universal Language Selector, we were
considering the following:
- Show only a short list of the relevant languages for the user, based on
geo-IP, previous choices and browser settings of the current user. The
language the user is looking for will be there most of the time.
- Include a "more" option to access the rest of the languages for which
the content exists with an indicator of the number of languages.
- Provide a list of the rest of the languages that users can easily scan
(grouped by script and region so that alphabetical ordering is possible),
and search (allowing users to search for a language name in another language,
using ISO codes or even making typos).
I have created a prototype <http://pauginer.github.io/prototype-uls/#lisa> to
illustrate the idea. Since this is not connected to the MediaWiki backend,
it lacks the advanced capabilities described above, but you can get the idea.
If you are interested in the missing parts, you can check the flexible
search and the list of likely languages ("common languages" section) on the
language selector used at http://translatewiki.net/, which is connected to
the MediaWiki backend.
As part of the testing process for the ULS language settings, I included a
task to also test the compact interlanguage designs. Users seem to
understand their use (view
recording<https://www.usertesting.com/highlight_reels/qPYxPW1aRi1UazTMFreR>),
but I wanted to get some feedback for changes affecting such an important
element.
Please let me know if you see any possible concern with this approach.
Thanks
--
Pau Giner
Interaction Designer
Wikimedia Foundation
All,
The developer team at Wikimedia is making some changes to how accounts
work, as part of our ongoing efforts to provide new and better tools
for our users (like cross-wiki notifications). These changes will mean
users have the same account name everywhere, will let us give you new
features that will help you edit & discuss better, and will allow more
flexible user permissions for tools. One of the pre-conditions for
this is that user accounts will now have to be unique across all 900
Wikimedia wikis.[0]
Unfortunately, some accounts are currently not unique across all our
wikis, but instead clash with other users who have the same account
name. To make sure that all of these users can use Wikimedia's wikis
in future, we will be renaming a number of accounts to have "~" and
the name of their wiki added to the end of their account name. This
change will take place on or around 27 May. For example, a user called
“Example” on the Swedish Wiktionary who will be renamed would become
“Example~svwiktionary”.
All accounts will still work as before, and will continue to be
credited for all their edits made so far. However, users with renamed
accounts (whom we will be contacting individually) will have to use
the new account name when they log in.
It will now only be possible for accounts to be renamed globally; since all
accounts must be globally unique, the RenameUser tool will no longer work on
a local basis and will therefore be withdrawn from bureaucrats' tool sets.
Once this takes place, it will still be possible for users who do not like
their new user name to ask on Meta for their account to be renamed further.
A copy of this note is posted to meta [1] for translation. Please
forward this to your local communities, and help get it translated.
Individuals who are affected will be notified via talk page and e-mail
notices nearer the time.
[0] - https://meta.wikimedia.org/wiki/Help:Unified_login
[1] - https://meta.wikimedia.org/wiki/Single_User_Login_finalisation_announcement
Yours,
--
James D. Forrester
Product Manager, VisualEditor
Wikimedia Foundation, Inc.
jforrester(a)wikimedia.org | @jdforrester
I'm writing to report on progress with MediaWiki-Vagrant, and to tell
you a bit about how it could be useful to you. If you've looked at
MediaWiki-Vagrant in the past and thought, 'big deal -- I know how to
configure MediaWiki', this e-mail is for you.
But first, a small snippet to whet your appetite:
# == Class: role::mobilefrontend
# Configures MobileFrontend, the MediaWiki extension which powers
# Wikimedia mobile sites.
class role::mobilefrontend {
    include role::mediawiki
    include role::eventlogging

    mediawiki::extension { 'MobileFrontend':
        settings => {
            wgMFForceSecureLogin => false,
            wgMFLogEvents        => true,
        }
    }
}
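For comparison, here is roughly what that role amounts to if you were
hand-editing LocalSettings.php yourself -- a sketch, assuming the two
extensions are already cloned and using the require_once entry points of
the time:

# Rough hand-written equivalent of role::mobilefrontend (sketch only)
require_once( "$IP/extensions/EventLogging/EventLogging.php" );
require_once( "$IP/extensions/MobileFrontend/MobileFrontend.php" );
$wgMFForceSecureLogin = false;
$wgMFLogEvents = true;

The difference is that the Puppet role is a description you can review,
share, and re-apply, rather than state that lives in a hand-edited file.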
MW-V has evolved to become a highly organized and consistent Puppet
code base for describing MediaWiki development environments. Puppet,
you'll recall, is the same configuration management and software
automation tool that TechOps uses to run the cluster. Puppet provides
a domain-specific language for articulating software configuration in
a declarative way. You tell Puppet what resources are configured on
your machine and what relationships inhere among them, and Puppet
takes your description and executes it.
MW-V uses Puppet to automate the configuration of MediaWiki & related
bits of software. But MW-V goes a bit further, too: it exploits the
flexibility of Puppet's syntax and semantics to give MediaWiki
developers a set of affordances for describing their own development
setup. The description takes the concrete form of a Puppet 'role'.
These can be submitted as patches against the MW-V repo and thus
shared with others. The combination of Vagrant / Puppet / VirtualBox
makes it quite easy to select and swap machine roles.
Roles can be toggled by adding or removing a line, 'include
role::<name>', from puppet/manifests/site.pp. The current set of roles
are listed here:
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/vagrant.git;a=blob;f=pupp…
They include a VisualEditor role that is powered by a local Parsoid
instance, and roles for Selenium testing, Echo, MobileFrontend,
GettingStarted, and EventLogging.
If you're interested in checking it out:
0. Delete any old instances.
1. Download & install VirtualBox: https://www.virtualbox.org/wiki/Downloads
2. Download & instal Vagrant: http://downloads.vagrantup.com/tags/v1.2.2
3. git clone https://gerrit.wikimedia.org/r/p/mediawiki/vagrant.git
4. Edit puppet/manifests/site.pp and uncomment the role you want to check out.
5. Run 'vagrant up' from the repository's root directory.
Finally: wait! The first run takes an obnoxiously long time (15-20
minutes). Subsequent runs are much faster (~5 seconds if there are no
major configuration changes).
The documentation on MediaWiki.org is still meager, but it is
concentrated here: http://www.mediawiki.org/wiki/Mediawiki-Vagrant
If you're interested in using MW-V to document and automate your
development setup, get in touch -- I'd be happy to help. I'm 'ori-l'
on IRC, and I'll be at the Amsterdam Hackathon later this month too.
Finally, if you're interested in how this relates to Labs (short
answer: the use-cases are largely complementary), Andrew Bogott and I
have an open submission for a Wikimania talk on this subject:
https://wikimania2013.wikimedia.org/wiki/Submissions/Turnkey_Mediawiki_Test…
Hi everyone!
How do you test Vector-compatible skins? I mean, there are a lot of CSS
classes embedded in Vector and in the extensions, so I guess some test data
should exist somewhere.
-----
Yury Katkov, WikiVote
Hi all,
I'd like to announce a recently created tool that might help the Wikimedia
technical community find stuff more easily. Sometimes relevant information
is buried in IRC chat logs, messages in any of several mailing lists, pages
in mediawiki.org, commit messages, etc. This tool (essentially a custom
Google search engine that filters results to a few relevant URL patterns)
is aimed at relieving this problem. Test it here: http://hexm.de/mw-search
The motivation for the tool came from a post by Niklas [1], specifically
the section "Coping with the proliferation of tools within your community".
In the comments section, Nemo announced his initiative to create a custom
Google search to fit at least some of the requirements presented in that
section, and I've offered to help him tweak it further. The URL list is
still incomplete and can be customized by editing the page
http://www.mediawiki.org/wiki/Wikimedia_technical_search (syncing with the
actual engine will still have to happen by hand, but should be quick).
Besides feedback on whether the engine works as you'd expect, I would like
to start some discussion about the ability for Google's bots to crawl some
of the resources that are currently included in the URL filters, but return
no results. For example, the IRC logs at bots.wmflabs.org/~wm-bot/logs/.
Some workarounds are used (e.g. using github for code search since gitweb
isn't crawlable) but that isn't possible for all resources. What can we do
to improve the situation?
--Waldir
1.
http://laxstrom.name/blag/2013/02/11/fosdem-talk-reflections-23-docs-code-a…
Sorry to reply on a thread that will probably not sort nicely on the
mailman web interface or threading mail clients. Anybody know of an
easy way to reply to digest email in Gmail such that mailman will
retain threading?
I'll be working on the conversion of Wikipedia Zero banners to ESI.
There will be good lessons here I think for looking at dynamic loading
in other parts of the interface.
Arthur, to address your questions in the parent to Mark's reply:
"is there any reason you'd need serve different HTML than what is
already being served by MobileFrontend?"
Currently, some languages are not whitelisted at carriers, meaning
that users may get billed when they hit <language>.m.wikipedia.org or
<language>.zero.wikipedia.org. Thus, a number of <a href>s are
rewritten to include interstitials if the link is going to result in a
charge. By the way, we see some shortcomings in the existing rewrites
that need to be corrected (e.g., some URLs don't have interstitials,
but should), but that's a separate bug.
My thinking is that we start intercepting all clicks with JavaScript if
it's a JavaScript-capable browser, or otherwise via a default interceptor
at a URL on the corresponding <language>.(m|zero).wikipedia.org subdomain.
In either case, if the destination link is on a non-whitelisted Wikipedia
domain for the carrier, or if the link is external to Wikipedia, the user
should land at an interstitial page hosted on the same whitelisted subdomain
from which the user came.
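Purely to illustrate that decision (this is not the extension's actual code;
the function and page names below are invented for the example):

// Hypothetical sketch only -- names and parameters are invented
function rewriteLinkForCarrier( $href, $carrier ) {
    if ( isWhitelistedForCarrier( $href, $carrier ) ) {
        // Zero-rated for this carrier: leave the link untouched
        return $href;
    }
    // Otherwise route through an interstitial on the current (whitelisted)
    // subdomain that warns the user about possible charges first
    return '/wiki/Special:ZeroInterstitial?to=' . urlencode( $href );
}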
"Out of curiosity, is there WAP support in Zero? I noticed some
comments like '# WAP' in the varnish acls for Zero, so I presume so.
Is the Zero WAP experience different than the MobileFrontend WAP
experience?"
No special WAP considerations in ZeroRatedMobileAccess above and
beyond MobileFrontend, as I recall. The "# WAP" comments are just for
us to remember in case a support case comes up with that particular
carrier.
We'll want to keep in mind impacts for USSD/SMS support. I think
Jeremy had some good conversations at the Wikimedia Hackathon in
Amsterdam that will help him to refine how his middleware receives and
transforms content.
Mark Bergsma mark at wikimedia.org
Fri May 31 09:44:48 UTC 2013
>> * feature phones -- HTML only, the banner is inserted by the ESI
>> ** for carriers with free images
>> ** for carriers without free images
>>
>
> What about including ESI tags for banners for smart devices as well as
> feature phones, then either use ESI to insert the banner for both device
> types or, alternatively, for smart devices don't let Varnish populate the
> ESI chunk and instead use JS to replace the ESI tags with the banner? That
> way we can still serve the same HTML for smart phones and feature phones
> with images (one less thing for which to vary the cache).
I think the verdict is still out on whether it's better to use ESI for
banners in Varnish or use JS for that client-side. I guess we'll have
to test and see.
> Are there carrier-specific things that would result in different HTML for
> devices that do not support JS, or can you get away with providing the same
> non-js experience for Zero as MobileFrontend (aside from the
> banner, presumably handled by ESI)? If not currently, do you think its
> feasible to do that (eg make carrier-variable links get handled via special
> pages so we can always rely on the same URIs)? Again, it would be nice if
> we could just rely on the same HTML to further reduce cache variance. It
> would be cool if MobileFrontend and Zero shared buckets and they were
> limited to:
>
> * HTML + images
> * HTML - images
> * WAP
That would be nice.
> Since we improved MobileFrontend to no longer vary the cache on X-Device,
> I've been surprised to not see a significant increase in our cache hit
> ratio (which warrants further investigation but that's another email). Are
> there ways we can do a deeper analysis of the state of the varnish cache to
> determine just how fragmented it is, why, and how much of a problem it
> actually is? I believe I've asked this before and was met with a response
> of 'not really' - but maybe things have changed now, or others on this list
> have different insight. I think we've mostly approached the issue with a
> lot more assumption than informed analysis, and if possible I think it
> would be good to change that.
Yeah, we should look into that. We've already flagged a few possible
culprits, and we're also working on the migration of the desktop wiki
cluster from Squid to Varnish, which has some of the same issues with
variance (sessions, XVO, cookies, Accept-Language...) as
MobileFrontend does. After we've finished migrating that and confirmed
that it's working well, we want to unify those clusters'
configurations a bit more, and that by itself should give us
additional opportunity to compare some strategies there.
We've since also figured out that the way we calculate cache
efficiency with Varnish is not exactly ideal; unlike Squid, cache
purges are done as HTTP requests to Varnish. Therefore in Varnish,
those cache lookups are counted towards the cache hit rate, which
isn't very helpful. To make things worse, the few hundred purges a
second vs actual client traffic matter a lot more on the mobile
cluster (with much less traffic but a big content set) than they do
for our other clusters. So until we can factor that out in the Varnish
counters (might be possible in Varnish 4.0), we'll have to look at
other metrics.
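To illustrate with made-up numbers: if the mobile cluster handles 2,000
client requests a second and also receives 400 purges a second, purges are
already 1 in 6 of all the cache lookups Varnish counts (400 / 2,400), so the
reported hit ratio tracks the purge stream almost as much as it tracks real
client traffic. The same 400 purges a second against a cluster doing 50,000
client requests a second would barely move the number.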
More useful therefore is to check the actual backend fetches
("backend_req"), and these appear to have gone down some. Annoyingly,
every time we restart a Varnish instance we get a spike in the Ganglia
graphs, making the long-term graphs pretty much unusable. To fix that
we'll either need to patch Ganglia itself or move to some other stats
engine (statsd?). So we have a bit of work to do there on the Ops
front.
Note that we're about to replace all Varnish caches in eqiad with
(fewer) newer, much bigger boxes, and we've decided to also upgrade
the 4 mobile boxes with those same specs. And we're also doing that in
our new west coast caching data center as well as esams. This will
increase the mobile cache size a lot, and will hopefully help by
throwing resources at the problem.
--
Mark Bergsma <mark at wikimedia.org>
Lead Operations Architect
Wikimedia Foundation
I was going through our code contemplating dropping XHTML 1.1 support and
ran into the RDFa support stuff and realized how out of date and limited
it is.
I've put together an RFC proposing that we replace our code, which appears to
be based on the 2008 RDFa 1.0 spec, with RDFa 1.1, and expand our RDFa support.
https://www.mediawiki.org/wiki/Requests_for_comment/Update_our_code_to_use_…
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]