Robert Rohde:
Getting back to Wikimedia, it appears correct that the Wikistats code
is designed to run from the compressed files ... (source linked from [1]).
As you suggest, one could use the properties of the .bz2 format to
parallelize that. I would also observe that parsers tend to be
relatively slow, while decompressors tend to be relatively fast.
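Purely as an illustration of that .bz2 property (independent streams in a
multistream file can be decompressed separately), a minimal Python sketch
could look like the one below. The stream offsets are assumed to come from
a multistream index file or a prior scan; this is not actual Wikistats code.

    import bz2
    import os
    from multiprocessing import Pool

    def decompress_stream(args):
        # Read one complete bz2 stream from the file and decompress it.
        path, start, end = args
        with open(path, "rb") as f:
            f.seek(start)
            return bz2.decompress(f.read(end - start))

    def parallel_decompress(path, stream_offsets):
        # stream_offsets: sorted byte positions where each bz2 stream
        # begins (assumed input, e.g. from a multistream index file).
        ends = stream_offsets[1:] + [os.path.getsize(path)]
        jobs = [(path, s, e) for s, e in zip(stream_offsets, ends)]
        with Pool() as pool:
            # Streams are independent, so they decompress on all cores;
            # results come back in file order.
            yield from pool.imap(decompress_stream, jobs)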
Some additional notes:
Yes, Wikistats processes compressed dumps.
Nowadays these are mostly stub dumps.
Most monthly metrics can be collected from these, with a few exceptions
like word count.
For stub dumps, decompression is the major resource hog;
for full dumps, some heavy regexps also contribute considerably.
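One way to at least overlap those two costs (a sketch only, not how
Wikistats is actually structured): decompress in a child process and do
the pattern matching in the parent, so the two run on separate cores.
The file name below is hypothetical; for gzip-compressed stubs, zcat
would take the place of bzcat.

    import re
    import subprocess

    REVISION_RE = re.compile(rb"<revision>")

    def count_revisions(stub_dump_path):
        # Decompress in a separate process (requires bzcat on the PATH),
        # while the regexp matching happens in this process.
        proc = subprocess.Popen(["bzcat", stub_dump_path],
                                stdout=subprocess.PIPE)
        count = 0
        for line in proc.stdout:
            if REVISION_RE.search(line):
                count += 1
        proc.wait()
        return count

    # e.g. count_revisions("somewiki-latest-stub-meta-history.xml.bz2")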
Wikistats could benefit a lot from parallelization (although these days
dump production for larger wikis is generally the bottleneck).
The first thing I would want to look into (some day) is running the count
scripts for several wikis in parallel (see the sketch below).
All intermediate data are stored in CSV files, often one file for one
metric for all languages,
so decoupling the per-wiki runs and aggregating the results as a
post-processing step is simple.
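A rough sketch of that idea (Python rather than the Perl the real scripts
use; the script name, wiki list and file names are made up): run the
per-wiki counts in separate processes, then merge the per-wiki CSVs into
one file per metric afterwards.

    import csv
    import subprocess
    from concurrent.futures import ProcessPoolExecutor

    WIKIS = ["enwiki", "dewiki", "frwiki", "jawiki"]  # hypothetical subset

    def run_counts(wiki):
        # Placeholder invocation; script name and arguments are made up.
        subprocess.run(["perl", "count_script.pl", wiki], check=True)
        return wiki, f"counts_{wiki}.csv"             # assumed output file

    def aggregate(results, out_path):
        # Post-processing: merge per-wiki CSVs into one file for one
        # metric, tagging each row with the wiki it came from.
        with open(out_path, "w", newline="") as out:
            writer = csv.writer(out)
            for wiki, path in results:
                with open(path, newline="") as f:
                    for row in csv.reader(f):
                        writer.writerow([wiki] + row)

    if __name__ == "__main__":
        with ProcessPoolExecutor(max_workers=4) as pool:
            results = list(pool.map(run_counts, WIKIS))
        aggregate(results, "metric_all_wikis.csv")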
Running several count threads on one machine might tax memory.
Some hashes are huge (much has been externalized, but e.g. edits per
user per namespace is still kept as an in-memory hash).
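As a toy illustration of what such a hash holds (Python sketch, not the
actual Perl structure): one counter per (user, namespace) pair. For a
large wiki that means millions of keys, and running several wikis in one
process multiplies that footprint.

    from collections import defaultdict

    # One counter per (user, namespace) pair.
    edits_per_user_ns = defaultdict(int)

    def tally(user, namespace):
        edits_per_user_ns[(user, namespace)] += 1

    tally("ExampleUser", 0)     # article namespace
    tally("ExampleUser", 1)     # talk namespace
    tally("203.0.113.5", 0)     # anonymous editor, article namespace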
The basic structure dates from the time when a full archive dump for the
English Wikipedia was processed in minutes rather than months.
There have been a lot of optimizations, but the general setup is still
like this:
Every month, all counts for the past 10 years are reproduced from scratch.
Wikistats basically has no memory of earlier runs.
This probably sounds crazy; incremental processing has been suggested
more than once.
The main reason to keep it this way is: every so often new functionality
is added to the scripts (and the occasional bug fix).
In order to have those new counts for the full history, we would need to
rerun from scratch every so often anyway.
People have asked me how the counts can change from month to month.
Same answer: counts are redone for all months, and newer dumps will have
more deletions for earlier months.
This mostly affects the last two months, though: nearly all deletions
occur within a month or two.
In the early years deletions were very rare; most were done to prevent
court orders (privacy).
Nowadays deletionism has taken hold.
Still, Wikistats treats deleted content as 'should not have been there in
the first place'.
This makes our editor counts somewhat conservative; basically it skews
the activity patterns in favor of good content contributors.
Erik Zachte