Please find below the notes for the internationalisation bug triage
held on Wednesday, September 14, 13:00 UTC in #wikimedia-dev on
Freenode IRC.
9 people participated in the triage. Two of the participants were new
to MediaWiki. I'm very happy that the triage attracted new potential
MediaWiki developers!
The triage was to have two sections, each taking about 30 minutes.
We ended up with a "first half" of about 55 minutes and a second half
of about 20 minutes.
Next week's triage is on WikiBooks:
http://www.mediawiki.org/wiki/Bug_management/Triage
== Message documentation and message related coding ==
https://bugzilla.wikimedia.org/show_bug.cgi?id=16026 --
MediaWiki:Revision-info should accept wikimarkup
* Volunteered by Amir.
https://bugzilla.wikimedia.org/show_bug.cgi?id=16111 --
MediaWiki:Cascadeprotected and MediaWiki:Cascadedprotectedwarning should
take the same parameters
* Volunteered by Amir.
https://bugzilla.wikimedia.org/show_bug.cgi?id=16175 -- Clean up the
rendering of messages displayed at the top of the edit window
* Volunteered by Amir. May be a difficult one, as these messages are
used a lot in Wikimedia projects.
https://bugzilla.wikimedia.org/show_bug.cgi?id=17148 -- The warning about
editing a semi-protected page can display an irrelevant edit summary
* Volunteered by Brian Wolff
https://bugzilla.wikimedia.org/show_bug.cgi?id=17865 -- Mismatched input
syntax for Cite error messages
* Some discussion. Cite is seen as scary! No one to take this yet.
Please give this some TLC.
https://bugzilla.wikimedia.org/show_bug.cgi?id=28557 -- Message
documentation for Extension:AddMediaWizard needed
* Volunteered by Blucal.
https://bugzilla.wikimedia.org/show_bug.cgi?id=25608 --
MediaWiki:Fundraiserstats-tab-ytd should not contain (USD)
* Goes to fundraising triage. A concern was voiced that i18n does not
appear to get the attention it needs. The Localisation team will get
in contact with the fundraising developers when they are in San
Francisco in two weeks to discuss.
https://bugzilla.wikimedia.org/show_bug.cgi?id=29357 -- CategoryTree
should have built-in localizable support for pretty Categorytree-member-num
* Amir volunteered for this. Brian is prepared to help him out where
needed.
https://bugzilla.wikimedia.org/show_bug.cgi?id=29927 -- CentralAuth using
wrong Language on Special:MergeAccount
* Akshayagarwal volunteered for this, and has committed changes in
<http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97168>
https://bugzilla.wikimedia.org/show_bug.cgi?id=30729 -- Not all numbers
are localised in AbuseFilter
* srikanthlogic has volunteered for this with Brian Wolff as mentor.
https://bugzilla.wikimedia.org/show_bug.cgi?id=29170 -- [[MediaWiki:Enotif
body]] needs GENDER support
* Some discussion, but nothing final.
== Harder issues and discussion ==
https://bugzilla.wikimedia.org/show_bug.cgi?id=28428 -- Allow saving pages
with LRM and RLM in titles, showing a warning and requiring a user right
https://bugzilla.wikimedia.org/show_bug.cgi?id=28411 -- titles of articles
with LTR titles in RTL wikis may be displayed incorrectly in categories
and special pages
* It looks like a schema change is not needed, and that the
page_props table can and should be used. Use batches where needed;
integrate that in Linker, LinkBatch and probably Title as well.
Ambitious according to Roan. Roan agrees that having displaytitle as a
page_ field makes more sense *conceptually*. In practice, he prefers
page_props because that avoids a schema change and is still cheap.
Niklas indicated the discussion gave him ideas on how to work towards
resolving these issues.
The other announced issues were not discussed.
Thanks to everyone participating. Looking forward to the results!
--
Siebrand Mazeland
Product Manager Localisation
Wikimedia Foundation
M: +31 6 50 69 1239
Skype: siebrand
Looking over an extension that was already badly coded, I realized
there's yet another type of injection vulnerability we have to consider
when coding: CSS injection vulnerabilities.
Normally MediaWiki sanitizes any style="" attribute created from user
input. Things like background-image are stripped out. They can be used
to track users, as a type of spam, and if you're hitting IE users it's
possible you could do even more using an .htc file. Oh right, and of
course there's the lovely IE expression(...), which allows raw JavaScript
to be injected right into CSS.
Some extensions, however, may try to build style strings for their own
HTML output. For example, an extension might want to let the user
customize the width and height of a div it happens to be outputting.
To do so, the extension will likely concatenate the user's input into
some CSS using something like `"width: {$width}; height: {$height};"`.
The problem, of course, is that if the user's input for the width
includes something like `5px; background-image: url(...)`, they can
inject a background image that will not be filtered out by our normal
code.
And just a note: yes, extensions using our nice, elegant Html interface
to avoid normal XSS injections can still be bitten by this. If you use
something like `Html::element( 'div', array( 'style' =>
"width: {$width}; height: {$height};" ), '...' );`, it can still be
exploited.
Things needed to fix issues like this:
- Any style attribute you generate using user input should be properly
sanitized using Sanitizer::checkCss.
- Keep in mind that checkCss only rejects things like url structures,
expressions, etc.; it does not ensure that only the properties you
specified are used. So it's possible that users can still do less
malicious things, like passing `500px; position: absolute; ...` as a
width to inject other CSS into your style. I suppose it's already
possible to absolutely position a Flash video somewhere it's not
supposed to be by wrapping a div around it, so that's not really as big
a deal. But if you have CSS styles, or some special case in your code
where you might not want a style to be overridden or added, keep in
mind that a user can still inject those properties.
- We should probably consider building an interface such as
`Html::element( 'div', array( 'style' => array( 'width' => $width,
'height' => $height ) ), '...' );` into core.
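As a rough sketch of what such an array-based interface could do
internally, here is a standalone style builder that validates each
property name and value separately, so a value can never smuggle in an
extra declaration. The names here are hypothetical illustration, not
the actual MediaWiki Html or Sanitizer API; a real implementation
would reuse Sanitizer::checkCss for the value checks.

```php
<?php
// Illustrative sketch only: build a style attribute from an
// associative array of property => value. Because each value is
// validated on its own, ';'-based breakouts are impossible.
function buildStyle( array $props ) {
	$out = array();
	foreach ( $props as $name => $value ) {
		// Allow only simple property names and simple values
		// (keywords, lengths, percentages, hex colours). Anything
		// containing ';', ':', '(' etc. is dropped entirely.
		if ( preg_match( '/^[a-z-]+$/', $name )
			&& preg_match( '/^[#%.\w\s-]+$/', $value )
		) {
			$out[] = "$name: $value";
		}
	}
	return implode( '; ', $out );
}

// The attack from earlier is neutralised: the declaration is dropped.
echo buildStyle( array( 'width' => '5px; background-image: url(evil)' ) ), "\n"; // (empty)
// Legitimate values pass through unchanged:
echo buildStyle( array( 'width' => '5px', 'height' => '10em' ) ), "\n"; // width: 5px; height: 10em
```

The point of the array shape is that the caller never gets the chance
to concatenate untrusted text into declaration syntax in the first
place.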
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Hi folks,
I just wanted to quickly highlight the plan that we have for deploying
MediaWiki 1.18 to the Wikimedia sites:
* Monday, September 19 (-20), 23:00-01:00 UTC (4pm-6pm PDT):
MediaWiki 1.18 deployment to test2
* Wednesday, September 21 (-22), 23:00-03:00 UTC (4pm-8pm PDT):
MediaWiki 1.18 stage 1 deployment (simplewiki, simplewiktionary,
usabilitywiki, strategywiki, mediawikiwiki, hewikisource)
* Monday, September 26 (-27), 23:00-03:00 UTC (4pm-8pm PDT):
MediaWiki 1.18 stage 2 deployment (metawiki, enwikiquote, enwikibooks,
betawikiversity, eowiki, nlwiki)
* Tuesday, October 4 (-5), 23:00-03:00 UTC (4pm-8pm PDT): MediaWiki
1.18 stage 3 deployment (remaining wikis)
More about MediaWiki 1.18 can be found here:
http://www.mediawiki.org/wiki/MediaWiki_1.18
The main blocker right now? Code review:
http://www.mediawiki.org/wiki/MediaWiki_roadmap/1.18/Revision_report
See Mark's email earlier today for more detail on this. Please chip
in where you can.
Thanks!
Rob
Hi
I am trying to pull up a list of page IDs from the snapshot which belong
to a specific category. Basically I am trying to pull up pages which are
on book portal pages. I looked at the snapshot MySQL DB to see which
tables I can use, but the fields of the category and categorylinks
tables didn't make any sense to me in regard to what they stand for, so
I was wondering if somebody could help me with the SQL.
thanks
priyank
We would like to begin deployment of the 1.18 code base on Monday, but
we've fallen a little behind where we need to be.
We want to have all code reviewed and all FIXMEs fixed by Friday night
so that we can enjoy the weekend, relax and prepare to push out the 1.18
code base to the first few wikis on Monday.
As of Thursday night, New York City time, we have 16 revisions left to
review and 4 FIXMEs left to fix.
More eyes on these problem revisions would help so here are the last 20
revisions that need work along with their committers and commit
summaries:
FIXMEs
==============
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/86312 (reedy)
We need a title object for parsing, do one against the message key
Doesn't seem to be the best way, but it's the most applicable. If I
abused $wgTitle, Chad would come and beat me too ;)
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/86735 (reedy)
Followup to r86312
<ialex> Reedy: that rev is breaking usage of {{PAGENAME}} in
messages, such as in MediaWiki:Noarticles
Allowing optional passing in of a Title object (like it may be set
in Message), but if it's not set, or not a title object, fall back
and use $wgTitle (I'm sorry!)
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/91518 (robin)
(bug 6100; follow-up to r91315) Being bold and removing
$wgBetterDirectionality (and dependent wfUILang) in core, as most or
all work is finished. Also:
* Introduce classes mw-float-end, mw-float-start so we don't have to
use inline css depending on wfUILang()/$wgLang (see HistoryPage
and SpecialFileDuplicateSearch)
* Add direction mark to protection log
* Remove specialpageattributes as it is obsoleted by this commit
(also fixes bug 28572)
* Add two direction marks in wfSpecialList, which makes ltr links on
rtl wiki (and vice versa) display nicely as well (only on those
special pages however)
* Revert r91340 partially: use mw-content-ltr/rtl class anyway in
shared.css. Both ways have their [dis]advantages...
* Set the direction of input fields by default to the content
language direction (except buttons etc.) in shared.css
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/91561 (bawolff)
(Bug 19725) Do not include suppressed edits in the "View X deleted
edits" message, and when doing prefix search of special:undelete.
I'm not 100% sure this is the right thing to do, see the bug for the
details. But basically this doesn't include an edit in the count if
its text is hidden and its hidden from admins. (Not sure if it
should not be included only if everything is hidden). Its also weird
to show people different things depending if they have suppress
rights, without really indicating that.
Minor db note: This causes the query to no longer use a covering
index. I don't think that matters but just thought i'd mention.
p.s. The upload page show deleted edits link is broken right now,
(from before) I'll fix in a follow-up.
Needs Review:
==============
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/84591 (aaron)
* Made BeforeParserMakeImageLinkObj/BeforeGalleryFindFile let hooks
set sha1 parameter
* Made FlaggedRevs specify files by sha1,timestamp to handle renames
with no redirects. This makes them handled as well as templates in
this regard. (bug 27836)
* Moved BeforeGalleryFindFile hook to proper place (don't trigger
for non-NS_FILE titles)
* Removed unused mRevisionId field from ImageGallery
* Removed old hotfix from makeMediaLinkObj(); all the current
callers would crash beforehand if the title was null anyway
* Updated hook docs (some prior params were missing)
* Broke some long lines and cleaned up some whitespace
* TODO: track file info in core rather than fr_fileSHA1Keys and
ugly, duplicated, queries. This should be easy to do now.
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/84610 (aaron)
* Put parser output file version tracking to core
* Added some ParserOutput accessors
* A few cleanups to fetchFile()
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/84619 (aaron)
Follow-up r84610: removed fr_fileSHA1Keys and use core file version
tracking instead. Handles query spam FIXME in code.
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/88780 (aaron)
* Follow-up r88740:
* Fixed parse() arguments in getRevIncludes()
* Changed clearTagHook() to avoid preprocessed-xml cache corruption
* Check current version cache in getRevIncludes()
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97022 (aaron)
Possible fix for issue reported in r83762. I had a hard time
confirming the problem/fix with profiling. Committing this now to
get more attention.
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97077 (aaron)
FU r97022: Fallback to empty array for getExtraSortFields() result
in __construct() if no value is set for the sort order/type.
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97034 (brion)
* (bug 6722) Spacing fixes for math functions with/without parens
* (bug 18912) Add math support for \sen Spanish variant of \sin
* (bug 18912) Fix spacing for \operatorname in math
Reapplies r86962, r87117, r87936, r87941 plus some parser tests.
Note that further batch testing to identify any other potential
problems due to the spacing tweaks is a good idea!
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/74966 (catrope)
First shot at porting Monobook to Vector. The only
non-straightforward part is moving from a separate rtl.css file to a
main.css file we can safely Janus. This is half-done right now.
Notable things:
* No longer including rtl.css, instead relying on Janused main.css
* Renamed external.png to external-ltr.png so Janus will pick up on
external-rtl.png (already exists)
* Added @embed to all images, add @noflip where needed (may not have
covered all cases)
* Pulled some things from rtl.css into main.css such that they're
no-ops in LTR but are needed in RTL. Example: body {
direction:ltr;
* Killed some RTL-specific rules in main.css that were superseded in
main.css
* Changed padding: 0 foo; to padding-right: foo; so it gets Janused
right
* Commented out loading of IE*Fixes.css. Still need to incorporate
these into main.css somehow
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/96560 (catrope)
OggHandler: Address issues with protocol-relative URLs. Live hack in
r96400.
* Use $wgExtensionAssetsPath instead of "$wgScriptPath/extensions"
* Use wfExpandUrl() rather than the DIY method of detecting whether
the URL needs expanding and prepending $wgServer. Left the
detection for $wgCortadoJarFile alone since that needs $scriptPath
prepended to it if it's not absolute
* Expand $this->videoUrl using PROTO_RELATIVE instead of r96400 's
PROTO_CURRENT to avoid cache pollution, and add a protocol the
second the URL arrives on the JS side
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97150 (catrope)
Update jquery.tablesorter for r97145: emulate <thead> if there is no
<thead> in the HTML, by walking down the table starting at the first
row and moving rows to the <thead> as long as all of its cells are
<th>s (or the row is empty). Also fix and simplify the sortbottom
code, which was incorrectly creating multiple <tfoot> elements if
there were multiple sortbottom rows.
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/92054 (diebuche)
Render category links as an HTML list. Bug 12261. Based on patch by
Thana & Bergi. It's removing the textual pipe separator, wrapping
the links inside li elements and adding a left 1px border as
separator. This makes it much easier to manipulate the list via JS
or CSS and is also semantically correct
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/89220 (platonides)
Remove Cite singleton. Store it inside each associated parser at
$parser->extCite This fixes bug 20748 and bug 15819 without breaking
the other tests. Reverts r88971. The conflict with CategoryTree was
the old problem of a message being called inside of a parser
callback, this time with clearState for which the hook is global.
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/89706 (platonides)
Reinstate r79122 (fix for bug 14404), reverting r83868. The real bug
seem to have been r86131, fixed in r88902 (1.17) and r88902 (1.18).
This is not merged with the r86131 change to
Article::getParserOptions() since I don't see the point for the new
function yet. Reenabled its test ArticleTablesTest which was
disabled in r85618
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/86304 (reedy)
* (bug 28532) wfMsgExt() and wfMsgWikiHtml() use $wgOut->parse()
* (bug 16129) Transcluded special pages expose strip markers when
they output parsed messages
Also adding some related documentation during my travels around the
code
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97192 (robin)
Use wfSpecialList() so it displays properly when user direction !=
content direction, and call Linker function statically.
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/97145 (tstarling)
Reverted r85922 and related: new doTableStuff(). I copied in the old
doTableStuff() from before r85922 and reverted all parser test
changes that looked vaguely related. Apologies to Platonides, since
some of his parser tests appeared to be relevant to the old parser,
but it's simplest to just revert all the related changes and then
re-add any useful tests later. See CR r85922 for full rationale.
Hi folks,
we've been using LimeSurvey for the last couple of years to run
Wikimedia surveys, but as we've pushed it to its limits, we've surfaced
numerous problems, including security vulnerabilities and major
concurrency issues. It's got many of the right capabilities and may
be fixable, but it would take a very significant amount of effort to
do so.
Do folks on this list have experience with other good open source
survey tools that you've used at scale, or seen used at scale?
Yes, there's always the option of rolling our own (perhaps building on
Jeroen's new survey extension for MediaWiki) -- but do keep in mind
that surveys can get pretty technically complex, so that may be
prohibitive for some of the surveys we're planning to run.
We're looking at both open and proprietary solutions, but if anyone
has any tips, they'd be much appreciated. I'd like to avoid lock-in to
a proprietary solution if at all possible. :-)
Thanks,
Erik
--
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
Hello!
Over the last few weeks, Yusuke Matsubara, Shawn Walker, Aaron Halfaker and
Fabian Kaelin (who are all Summer of Research fellows)[0] have worked hard
on a customized stream-based InputFormatReader that allows parsing of both
bz2 compressed and uncompressed files of the full Wikipedia dump (dump file
with the complete edit histories) using Hadoop. Prior to WikiHadoop and the
accompanying InputFormatReader it was not possible to use Hadoop to analyze
the full Wikipedia dump files (see the detailed tutorial / background for an
explanation why that was not possible).
This means:
1) We can now harness Hadoop's distributed computing capabilities in
analyzing the full dump files.
2) You can send either one or two revisions to a single mapper, so it's
possible to diff two revisions and see what content has been added /
removed.
3) You can exclude namespaces by supplying a regular expression.
4) We are using Hadoop's Streaming interface, which means people can use
this InputFormatReader from different languages such as Java, Python,
Ruby and PHP.
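As a rough illustration of that streaming interface, here is a minimal
mapper sketch in PHP for a per-user edit-count job. The record format
WikiHadoop actually hands to mappers is described in the tutorial; the
<username> extraction below is an assumption made purely for
illustration.

```php
<?php
// Sketch of a Hadoop Streaming mapper written in PHP, one of the
// languages the streaming interface supports. Assumption: the mapper
// receives revision XML containing <username> elements; the real
// record format is documented in the WikiHadoop tutorial.
function mapRevisions( $xml ) {
	$pairs = array();
	if ( preg_match_all( '!<username>([^<]*)</username>!', $xml, $m ) ) {
		foreach ( $m[1] as $user ) {
			// Emit "user<TAB>1"; Hadoop groups pairs by key
			// before they reach the reducer.
			$pairs[] = "$user\t1";
		}
	}
	return $pairs;
}

// A real job would read stdin:
//   foreach ( mapRevisions( stream_get_contents( STDIN ) ) as $l ) { echo "$l\n"; }
// Demonstrated here on an inline sample record instead:
$sample = '<revision><contributor><username>Example</username></contributor></revision>';
foreach ( mapRevisions( $sample ) as $line ) {
	echo $line, "\n";
}
```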
The source code is available at: https://github.com/whym/wikihadoop
A more detailed tutorial and installation guide is available at:
https://github.com/whym/wikihadoop/wiki
(Apologies for cross-posting to wikitech-l and wiki-research-l)
[0] http://blog.wikimedia.org/2011/06/01/summerofresearchannouncement/
Best,
Diederik
One of the things I believe we're missing from our skin system is a
template language.
PHP embedding is excessively verbose, and it makes it impossible to
contemplate letting users upload skins at whim in a farm scenario.
There are also other things we can't do with PHP that we can do with a
template language, such as preprocessing the template to extract
information about the skin that would otherwise have required extra
config, and using that to keep backwards compatibility based on what a
skin has left out while adding new features (e.g. adding a pageicons
area, but detecting old skins that don't have one and automatically
combining it with the title key instead).
As I see it, a good template language for us would need to satisfy the
following requirements:
- No evaluatable PHP in the syntax. Such a thing would all but preclude
the safe-skins-only scenario.
- We should be able to analyze a template; i.e. the syntax should not be
one that prevents us from knowing beforehand what 'region' tags are in a
template, or whether, say, a 'pageicon' key is omitted.
- The syntax should be context sensitive; I don't want to see anything
where we're explicitly HTML-escaping, URL-encoding, etc. The syntax
should understand whether it's in an HTML context, an attribute, a href,
a class, even an inline url(). I.e. the syntax and markup must mingle
together and be dependent on each other.
- The syntax must support calling back to the host rather than requiring
all the data pre-generated up front. Being able to call back is an
absolute necessity for the i18n system, and it's also necessary
because I intend to have things like other forms of navigation than the
sidebar, and we wouldn't want to pre-parse every single one of those
when a skin will typically only use one. (This precludes XSLT without
even needing to debate whether the language is powerful or clean to use.)
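To make the context-sensitivity requirement concrete, here is a toy PHP
illustration of why a single escape function can't serve every context.
The function name and context labels are hypothetical, not a proposed
API:

```php
<?php
// Toy illustration: the same user-supplied value needs a different
// transformation depending on where in the output it lands.
function escapeFor( $context, $value ) {
	switch ( $context ) {
		case 'text': // element body: & and <> matter, quotes do not
			return htmlspecialchars( $value, ENT_NOQUOTES );
		case 'attr': // quoted attribute: quotes must also be escaped
			return htmlspecialchars( $value, ENT_QUOTES );
		case 'url':  // inside a href or url(): percent-encode
			return rawurlencode( $value );
	}
	return $value;
}

echo escapeFor( 'text', 'a "b" & c' ), "\n"; // a "b" &amp; c
echo escapeFor( 'attr', 'a "b" & c' ), "\n"; // a &quot;b&quot; &amp; c
echo escapeFor( 'url',  'a "b" & c' ), "\n"; // a%20%22b%22%20%26%20c
```

A template engine that tracks which of these contexts it is currently
emitting into can pick the right transformation automatically, which is
the point of the requirement above.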
I've already looked over all the existing template languages I could
find and found none satisfactory. So in essence what I'm asking here is
NOT for people to look for a template library to use, but for ideas on
the best template language syntax we can implement for our purposes.
XSLT is already out. And our parser is so complex we would never want to
run it on each page; the syntax is also excessive for these purposes,
and it can't be analyzed easily either. So the suggestion to use
WikiText for the template syntax is out too, before anyone suggests it.
This is the template language I've been thinking of for a while:
http://www.mediawiki.org/wiki/User:Dantman/Skinning_system/Monobook_template
It fulfills the requirements, though it could use some small tweaks;
e.g. I might want to rethink mw:header, mw:content, and mw:icon, which
aren't first-level things.
But my biggest problem with it is that I never included a plan for how
to let you do things like omit the login/logout/createaccount links from
personal_tools and instead make them button-styled links located elsewhere.
The best idea I could come up with was to include the functionality of
WikiScripts/Lua/JS, or whatever language we decide to include in our
WikiText templates, to deal with Tim's regrets about creating parser
functions.
So, I'd love to see other people's ideas on the best syntax for our skin
system.
Or ideas or comments on how to fix the problem with the syntax I came up
with.
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]