Wikitech-l July 2013

wikitech-l@lists.wikimedia.org

158 participants
162 discussions

What about feature poll?
by Yury Katkov 10 Jul '13

10 Jul '13

Hi everyone! It seems that the for-profit company can be as open as the community! I really like the idea of feature poll of the Blue Spice [1] I wonder: have anyone tried to run something similar for MediaWiki? In Semantic MediaWiki we also run polls and surveys from time to time, here is the one: http://semantic-mediawiki.org/wiki/SMW_users_survey_results I think that such surveys can have surprising results for the developers and they are surely nice steps to openness in MediaWiki production process. [1] http://answers.blue-spice.org/index.php/questions ----- Yury Katkov, WikiVote

1 0

Re: [Wikitech-l] [Xmldatadumps-l] Suggested file format of new incremental dumps
by Petr Onderka 10 Jul '13

10 Jul '13

On Mon, Jul 8, 2013 at 6:53 AM, Randall Farmer <randall(a)wawd.com> wrote: > > Keeping the dumps in a text-based format doesn't make sense, because > that can't be updated efficiently, which is the whole reason for the new > dumps. > > First, glad to see there's motion here. > > It's definitely true that recompressing the entire history to .bz2 or .7z > goes very, very slowly. Also, I don't know of an existing tool that lets > you just insert new data here and there without compressing all of the > unchanged data as well. Those point towards some sort of format change. > > I'm not sure a new format has to be sparse or indexed to get around those > two big problems. > > For full-history dumps, delta coding (or the related idea of long-range > redundancy compression) runs faster than bzip2 or 7z and produces good > compression ratios on full-history dumps, based on some tests<https://www.mediawiki.org/wiki/Dbzip2#rzip_and_xdelta3> > . (I'm going to focus mostly on full-history dumps here because they're > the hard case and one Ariel said is currently painful--not everything here > will apply to latest-revs dumps.) > > For inserting data, you do seemingly need to break the file up into > independently-compressed sections containing just one page's revision > history or a fragment of it, so you can add new diff(s) to a page's > revision history without decompressing and recompressing the previous > revisions. (Removing previously-dumped revisions is another story, but it's > rarer.) You'd be in new territory just doing that; I don't know of existing > compression tools that really allow that. > > You could do those two things, though, while still keeping full-history > dumps a once-every-so-often batch process that produces a sorted file. The > time to rewrite the file, stripped of the big compression steps, could be > bearable--a disk can read or write about 100 MB/s, so just copying the 70G > of the .7z enwiki dumps is well under an hour; if the part bound by CPU and > other steps is smallish, you're OK. > > A format like the proposed one, with revisions inserted wherever there's > free space when they come in, will also eventually fragment the revision > history for one page (I think Ariel alluded to this in some early notes). > Unlike sequential read/writes, seeks are something HDDs are sadly pretty > slow at (hence the excitement about solid-state disks); if thousands of > revisions are coming in a day, it eventually becomes slow to read things in > the old page/revision order, and you need fancy techniques to defrag (maybe > a big external-memory sort <http://en.wikipedia.org/wiki/External_sorting>) > or you need to only read the dump on fast hardware that can handle the > seeks. Doing occasional batch jobs that produce sorted files could help > avoid the fragmentation question. > These are some interesting ideas. You're right that the copying the whole dump is fast enough (it would probably add about an hour to a process that currently takes several days). But it would also pretty much force the use of delta compression. And while I would like to use delta compression, I don't think it's a good idea to be forced to use it, because I might not have the time for it or it might not be good enough. Because of that, I decided to stay with my indexed approach. > There's a great quote about the difficulty of "constructing a software > design...to make it so simple that there are obviously no deficiencies." > (Wikiquote came through with the full text/attribution, of course<http://en.wikiquote.org/wiki/C._A._R._Hoare>.) > I admit it's tricky and people can disagree about what's simple enough or > even what approach is simpler of two choices, but it's something to strive > for. > > Anyway, I'm wary about going into the technical weeds of other folks' > projects, because, hey, it's your project! I'm trying to map out the > options in the hope that you could get a product you're happier with and > maybe give you more time in a tight three-month schedule to improve on your > work and not just complete it. Whatever you do, good luck and I'm > interested to see the results! > Feel free to comment more. I am the one implementing the project, but that's all. Input from others is always welcome. Petr Onderka

1 0

[Language Engineering] Reminder: Office hour on July 10, 2013 at 1700 UTC/1000 PDT
by Runa Bhattacharjee 10 Jul '13

10 Jul '13

Hello, This is a reminder that the Language Engineering team will be hosting an IRC office hour later today, i.e. July 10, 2013 at 1700 UTC/1000 PDT on #wikimedia-office (Freenode). Thanks Runa Agenda: ====== 1. ULS Rollout 2. Other updates 3. Q/A - We shall be taking questions during the session. Questions can also be sent to runa at wikimedia dot org or siebrand at wikimedia dot org before the event and can be addressed during the office-hour. ---------- Forwarded message ---------- From: Runa Bhattacharjee <rbhattacharjee(a)wikimedia.org> Date: Wed, Jul 3, 2013 at 10:11 PM Subject: [Language Engineering] Office hour on July 10, 2013 at 1700 UTC/1000 PDT To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>, Wikimedia Mailing List <wikimedia-l(a)lists.wikimedia.org>, MediaWiki internationalisation <mediawiki-i18n(a)lists.wikimedia.org> Hello, The Wikimedia Language Engineering team [1] invites everyone to join the team’s monthly office hour on July 10, 2013 at 1700 UTC/ 1000 PDT on #wikimedia-office. During this session we would be talking about some of our recent activities, including the Universal Language Selector (ULS) rollout and updates from the ongoing projects. See you all at the IRC office hour! regards, Runa Event Details: ========== Date: July 10, 2013 (Wednesday) Time: 1700-1800 UTC, 1000-1100 AM PDT IRC channel: #wikimedia-office on irc.freenode.net Agenda: 1. ULS Rollout 2. Other updates 3. Q/A - We shall be taking questions during the session. Questions can also be sent to runa at wikimedia dot org or siebrand at wikimedia dot org before the event and can be addressed during the office-hour. [1] http://wikimediafoundation.org/wiki/Language_Engineering_team -- Language Engineering - Outreach and QA Coordinator Wikimedia Foundation

1 0

Visual Editor Roadmap
by Derric Atzrott 09 Jul '13

09 Jul '13

Good day all, I was just taking a look at the Roadmap page[0] and the Engineering goals page[1] and noticed that on both the Visual Editor section ends right about now. I found these from the FAQ page [2] so we may want to update that with links to any pages that any of you are able to locate. Is there anywhere that I can find the Roadmap going forward until the end of the year? [0] https://www.mediawiki.org/wiki/Roadmap#VisualEditor [1] https://www.mediawiki.org/wiki/Wikimedia_Engineering/2013-14_Goals#VisualEd… [2] https://www.mediawiki.org/wiki/Help:VisualEditor/FAQ Thank you, Derric Atzrott Computer Specialist Alizee Pathology

3 3

How to get the number of pages in a category
by Daniel Mietchen 09 Jul '13

09 Jul '13

Hello together, in the framework of a GLAM project, we are looking for ways to (1) identify the number of pages in a given category - including via subcategories - on a given wiki (2) get the pageview stats for all these pages, including on aggregate (3) do the above across languages or projects (4) estimate what outcomes to expect in terms of Wikipedia pageviews and related metrics after an image donation of X files to a given category on Commons. I assume that part of it is available via the API but couldn't find anything close enough. Any pointers would be appreciated. Thanks and cheers, Daniel

3 2

Problems Updating from 1.18 to 1.21.1
by Derric Atzrott 09 Jul '13

09 Jul '13

Good morning all, I have a question about a problem that cropped up during my update from 1.18 to 1.21.1. With one exception everything went smoothly during my update, but now all of my images, appear to be without thumbnails and inadvertently using File protocol links. The generated source for embedded images looks like this: <p>[<a rel="nofollow" class="external text" href="File:ReportedTime.jpg%7C451px%7CReported">time per activity on Project</a>]</p> Generated from: [[File:ReportedTime.jpg|451px|Reported time per activity on Project]] Any idea what I may have done wrong. Is there a new setting that I may have missed? Has anyone ever seen this sort of issue before? Thank you, Derric Atzrott Computer Specialist Alizee Pathology

2 2

Are revision tags in the dump files?
by Robert Rohde 09 Jul '13

09 Jul '13

Various parts of Mediawiki will apply tags to specific edits in recent changes and histories. For example, the recently introduced Visual Editor is adding Tag: VisualEditor to all of its edits. Are such tags included in the XML dumps of Wikipedia? It will be a while before a new dump of enwiki is released, but once it is ready, I'm wondering if we can use it to track the adoption of Visual Editor by looking for such Tags in the dump file. Are the Tags included, and if so, which dump files are they contained in? I've looked briefly at the dump documentation and didn't see any mention of Tags, and I don't recall noticing them during any of the times I've worked with dump files in the past. -Robert Rohde

3 4

Re: [Wikitech-l] Problems Updating from 1.18 to 1.21.1
by Derric Atzrott 09 Jul '13

09 Jul '13

>Good morning all, > >I have a question about a problem that cropped up during my update from 1.18 to 1.21.1. > >With one exception everything went smoothly during my update, but now all of my images, appear to be without thumbnails and inadvertently using File protocol links. > >...snip... > >Any idea what I may have done wrong. Is there a new setting that I may have missed? Has anyone ever seen this sort of issue before? > >Thank you, >Derric Atzrott I determined the issue. I had $wgUrlProtocols[] = "file:"; in my LocalSettings.php from an earlier attempt to get File protocol links working (would have required me to write Firefox and Chrome extensions so it was abandoned as too much effort to securely manage). Thank you, Derric Atzrott

1 0

Features vs. Internet Explorers
by Chris McMahon 09 Jul '13

09 Jul '13

In recent times QA has expanded our automated cross-browser testing: we re-organized our builds, pointed the tests to beta labs wikis as well as test2wiki, and we've written a number of new tests. In the course of that a lot of our builds for Internet Explorer versions began to fail. I've just cleaned up most of the build failures and discovered what I think are some interesting facts about support for older versions of Internet Explorer across the set of WMF features. Not supported in IE6: AFTv5 by design VisualEditor by design UniversalLanguageSelector by design Interlanguage "Add links" feature known issue https://bugzilla.wikimedia.org/show_bug.cgi?id=49139 PageTriage by design PDF export broken: https://bugzilla.wikimedia.org/show_bug.cgi?id=49485 Page editing: degraded UI ACUX display garbled but functions, known issue Not supported in IE7: VisualEditor by design UniversalLanguageSelector by design Interlanguage "Add links" feature known issue https://bugzilla.wikimedia.org/show_bug.cgi?id=49139 PageTriage by design PDF export broken: https://bugzilla.wikimedia.org/show_bug.cgi?id=49485 Not supported in IE8: VisualEditor by design UniversalLanguageSelector minor issue https://bugzilla.wikimedia.org/show_bug.cgi?id=49447 Interlanguage "Add links" feature known issue https://bugzilla.wikimedia.org/show_bug.cgi?id=49139 Not supported in IE9: AFTv5 broken for now https://bugzilla.wikimedia.org/show_bug.cgi?id=49445(AFTv5 has a history of IE9-only issues) VisualEditor broken for now https://bugzilla.wikimedia.org/show_bug.cgi?id=49187 Interlanguage "Add links" feature known issue https://bugzilla.wikimedia.org/show_bug.cgi?id=49139 One other interesting note: we have an effective test for GuidedTour (it has turned up regression bugs) that runs properly across all the browsers, so thanks E3 team.

4 4

Global AbuseFilter testing!
by hoo 09 Jul '13

09 Jul '13

Here's a copy of a mail I just sent to stewards-l about the trial deployment of global AbuseFilters: Hello, after a long time we're finally confident that the AbuseFilter extension is in a state in which it can be used for global filters. Therefore I'm happy to announce that from now on global AbuseFilters can be used on (some) Wikimedia Wikis. The filters can be created and edited by Stewards using the normal AbuseFilter interface on meta ( https://meta.wikimedia.org/wiki/Special:AbuseFilter ) and basically work like local filters (you only have to set the "Global filter" flag). Although global AbuseFilters are already in use on Wikimedia Labs for quite some time we would like you to start using them slowly, preferable with logging-only filters to prevent unforeseeable damage. Global filters are yet enabled on: metawiki, testwiki, test2wiki, mediawikiwiki (They will only filter changes on these wikis... Of course this trial will later be extended to further Wikis and it's planned to cover all Wikis at some point) Please notice that it's not yet possible to create custom warning message for global filters ( https://bugzilla.wikimedia.org/show_bug.cgi?id=45164 ) and that global filters can't yet be enabled/ disabled for certain wikis only ( https://bugzilla.wikimedia.org/show_bug.cgi?id=41172 ). Cheers, Marius Hoch (Hoo man)

3 3

Jump to page:

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Wikitech-l July 2013