Hi,
There's a thing I've been doing for exactly one year now, and some people
on this list may find it interesting: I've been counting how many
article-space edits in the Hebrew Wikipedia added a <nowiki> tag.
These tags are very rarely needed in articles, but they are often added in
edits that go through Parsoid (VisualEditor and ContentTranslation).
Experienced editors complained that they are added too frequently and that
they have to fix them manually, so I started meticulously counting _how_
frequently, and also _why_ they are added, so that I could report Parsoid
/ VE / ContentTranslation bugs in the hope of reducing their number.
I did the counting by checking Recent Changes every day for edits tagged
"nowiki" (a tag added by a locally defined AbuseFilter when a main-space
edit has a <nowiki> tag in the new text), and examining every diff.
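The per-edit check is simple enough to sketch. This illustrative Python function (not the actual AbuseFilter rule) flags an edit when the new text contains more <nowiki> tags than the old text did:

```python
import re

# Matches <nowiki>, <nowiki/>, and <nowiki ...> variants,
# case-insensitively, roughly the way MediaWiki tolerates tag spellings.
NOWIKI_RE = re.compile(r"<\s*nowiki\b[^>]*>", re.IGNORECASE)

def adds_nowiki(old_text: str, new_text: str) -> bool:
    """Return True if the edit introduces more <nowiki> tags than it removes."""
    return len(NOWIKI_RE.findall(new_text)) > len(NOWIKI_RE.findall(old_text))
```

An AbuseFilter rule expresses the same idea declaratively over added lines; this is just the shape of the test.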
The full analyzed and sorted results are at https://he.wikipedia.org/wiki
/WP:VE/nowiki . I did my best to translate the most essential parts to
English, but please ask me if you have any more questions.
A summary of findings:
* There are on average about 3000 article-space edits in the Hebrew
Wikipedia per day.[1]
* There are on average about 450 edits with the VisualEditor tag in the
Hebrew Wikipedia per day.[2]
* There are rarely more than 20 edits per day that contain <nowiki>, and
usually far fewer than that.
* The most common reason for the appearance of <nowiki> is writing two
apostrophes ('') instead of a double quotation mark (").[3] It's remarkable
how many people make this mistake, although it may be more common in
Hebrew because of the peculiar ways quote characters are used in the
language and how they appear on common keyboards.
* The other most common reason is what I call "bad links" and "wrong
links". Both involve letters added after internal links, with a <nowiki/>
added immediately after the closing ']]'; for an explanation of the
difference between "bad" and "wrong", see the linked page. Counted
together, these two categories of errors are the most common cause of the
appearance of <nowiki>.
* After the above, the most common reasons are vandalism (which I don't
consider an issue with VisualEditor or Parsoid) and mistakes in the wiki
syntax of template parameters.
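To see why the two-apostrophe mistake above produces <nowiki>: in wikitext, '' is the markup for italics, so when a user types '' as a quotation mark, Parsoid has to wrap it in <nowiki> to keep it literal. A rough Python illustration of that escaping logic (not Parsoid's actual code):

```python
import re

def escape_literal_apostrophes(text: str) -> str:
    """Wrap runs of two or more apostrophes in <nowiki> so they stay
    literal instead of being parsed as italic/bold markup."""
    return re.sub(r"'{2,}", lambda m: "<nowiki>%s</nowiki>" % m.group(0), text)
```

Parsoid only escapes apostrophes that would otherwise change the parse, but the effect on a ''-as-quote edit is the same: a <nowiki> pair appears in the saved wikitext.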
As a result of this work I reported many Parsoid and VisualEditor bugs, and
their excellent developers fixed a bunch: wiki syntax pasted into
VisualEditor is now correctly auto-converted in a DWIM way; empty runs of
<nowiki>'''</nowiki> are no longer created when somebody makes text bold
but doesn't write anything; _some_ bugs related to ISBN and external-link
handling were fixed (though a few remain); and more.
Something similar was also done in the French Wikipedia[4] for some
time, but it hasn't been updated since August 2015 :(
I wish I could do it for other languages, but there's no chance that I'll
find time for that. However, if anybody volunteers to do it for the
Wikipedia in their language, I'll be very happy to help you get started.
I'd be super-interested to know how it is in English, Spanish, Dutch,
Polish, Czech, Russian, Hungarian, and any other language. It takes no more
than five minutes per day with the Hebrew Wikipedia's volume of edits,
though the time needed for other languages will probably differ.
P.S. I'm stupid, please correct my queries if they are wrong.
[1] select substring(rev_timestamp, 1, 8) as rev_date, count(rev_id)
    from revision, page
    where page_id = rev_page
    and page_namespace = 0
    and rev_timestamp > '20160100000000'
    group by rev_date
    order by rev_date;
[2] select substring(rev_timestamp, 1, 8) as rev_date, count(rev_id)
    from revision, page, change_tag
    where page_id = rev_page
    and page_namespace = 0
    and rev_timestamp > '20160100000000'
    and ct_tag = 'visualeditor'
    and ct_rev_id = rev_id
    group by rev_date
    order by rev_date;
[3] https://phabricator.wikimedia.org/T106641
[4]
https://fr.wikipedia.org/w/index.php?title=Wikip%C3%A9dia:%C3%89diteurVisue…
Nowiki&action=history
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
“We're living in pieces,
I want to live in peace.” – T. Moore
Annoyed by the difficulties of tracking events in the Wikimedia tech
community? Or by the difficulties of announcing events in an effective way?
Check this out:
Consolidate the many tech events calendars in Phabricator's calendar
https://phabricator.wikimedia.org/T1035
The hypothesis is that it is worth improving the current situation with
calendars in the Wikimedia tech community, and that Phabricator Calendar is
the best starting point. If we get a system that works for Wikimedia Tech,
I believe we can get a system that works for the rest of Wikimedia,
probably with some extra steps.
The Technical Collaboration team has some budget that we could use to fund
the Phabricator maintainers to prioritize some improvements in their
Calendar. If you think this is a bad idea and/or you have a better one,
please discuss in the task (preferred) or here. If you think this is a good
idea, your reassuring feedback is welcome too. ;)
Thank you!
--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
Hello,
We have more and more MediaWiki PHPUnit tests, which is great, but the
test runner is crippled by a number of performance issues that make it
rather slow. Anyone who has run the 10k+ tests knows it is no fun.
What a surprise when this morning I noticed that Ori Livneh (WMF
Performance team) had sent a series of patches that definitely speed up
the test run, ranging from removing a sleep() to implementing base32 in
plain PHP.
The speed-up will benefit everyone and will make test reports come back
faster when patchsets are proposed in Gerrit. Ori kindly grouped them
under the topic 'unit-tests-perf'. Please take some time to review them:
https://gerrit.wikimedia.org/r/#/q/project:mediawiki/core+topic:unit-tests-…
Thx Ori!
--
Antoine "hashar" Musso
Dear Wikimedia technical community,
If you have ever thought of organizing a small event, printing some
stickers, or running any other activity for the good of Wikimedia that
costs money, Rapid Grants might now be the simple solution you were
looking for. Details below.
---------- Forwarded message ----------
From: Alex Wang <awang(a)wikimedia.org>
Date: Wed, May 18, 2016 at 5:11 AM
Subject: [Wikimedia-l] Announcing Rapid Grants
To: Wikimedia Mailing List <wikimedia-l(a)lists.wikimedia.org>
Hello Wikimedians,
We are excited to announce the launch of a new Wikimedia Foundation grants
program, Rapid Grants!
Rapid grants fund Wikimedia community members -- individuals, groups, or
organizations contributing to Wikimedia projects -- to organize projects
throughout the year for up to USD 2,000. Projects can include experiments
or standard needs that don't need broad review to get started. Applications
are reviewed weekly by WMF staff.
Read more about the new program and apply here:
https://meta.wikimedia.org/wiki/Grants:Project/Rapid
Questions? Email rapidgrants(a)wikimedia.org
For more information about next steps and important dates for the grants
program redesign, please visit:
https://meta.wikimedia.org/wiki/Grants:IdeaLab/Reimagining_WMF_grants/Imple…
Cheers,
Alex
--
Alexandra Wang
Program Officer
Community Resources
Wikimedia Foundation <http://wikimediafoundation.org/wiki/Home>
+1 415-839-6885
Skype: alexvwang
--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
Hi,
I'm a student from Chennai, India, and my project is related to
performing image processing on the images on Wikimedia Commons to automate
categorization. DrTrigon wrote the catimages.py script a few years ago
in the old pywikipedia-bot framework. I'll be working on porting the
script to the pywikibot-core framework, updating its dependencies, and
using newer techniques where possible.
catimages.py is a script that analyzes an image using various
computer-vision algorithms and assigns categories to the image on Commons.
For example, it uses algorithms that detect faces, barcodes, etc., to
categorize images into Category:Unidentified People, Category:Barcode,
and so on.
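As an illustration of the kind of detector-to-category mapping involved (the detector names and the mapping here are hypothetical simplifications, not the script's actual tables):

```python
# Hypothetical mapping from detector hits to Commons categories;
# catimages.py's real tables are considerably more elaborate.
DETECTOR_CATEGORIES = {
    "face": "Category:Unidentified People",
    "barcode": "Category:Barcode",
}

def categories_for(detections):
    """Map a list of detector hits (e.g. from OpenCV-based detectors)
    to a sorted, de-duplicated list of category names, skipping
    detectors we have no category for."""
    return sorted({DETECTOR_CATEGORIES[d] for d in detections
                   if d in DETECTOR_CATEGORIES})
```

The interesting work is in the detectors themselves; the categorization step is essentially a lookup like this.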
If you have any suggestions and categorizations you think might be useful
to you, drop in at #gsoc-catimages on freenode or my talk page[0]. You can
find out more about me on User:AbdealiJK[1] and about the project
at T129611[2].
Regards
[0] - https://commons.wikimedia.org/wiki/User_talk:AbdealiJK
[1] - https://meta.wikimedia.org/wiki/User:AbdealiJK
[2] - https://phabricator.wikimedia.org/T129611
Hi,
For the past few weeks I've been working[1] on rewriting Linker::link()
to be non-static, use LinkTarget/TitleValue and some of the other fancy
new services stuff. Yay!
For the most part, you'd use it in similar ways:
Linker::link( $title, $html, $attribs, $query );
is now:
$linkRenderer = MediaWikiServices::getInstance()
->getHtmlPageLinkRenderer();
$linkRenderer->makeLink( $title, $html, $attribs, $query );
And there are makeKnownLink() and makeBrokenLink() entry points as well.
Unlike Linker::link(), there is no $options parameter to pass in every
time a link needs to be made. Those options are set on the
HtmlPageLinkRenderer instance, and then applied to all links made using
it. MediaWikiServices has an instance using the default settings, but
other classes like Parser will have their own that should be used[2].
I'm also deprecating the two hooks called by Linker::link(), LinkBegin
and LinkEnd. They are being replaced by the mostly-equivalent
HtmlPageLinkRendererBegin and HtmlPageLinkRendererEnd hooks. More
details are in the commit message. [3] is an example conversion for
Wikibase.
The commit is still a WIP because I haven't gotten around to writing
specific tests for it (it passes all the pre-existing Linker and parser
tests though!), and will be doing that in the next few days.
Regardless, reviews / comments / feedback on [1] are appreciated!
[1] https://gerrit.wikimedia.org/r/#/c/284750/
[2] https://gerrit.wikimedia.org/r/#/c/288572/
[3] https://gerrit.wikimedia.org/r/#/c/288674/
-- Legoktm
Adam noticed that I broke the installer when introducing MediaWikiServices
(see T135169). In particular, localization and CSS stopped working. Here's the
fix:
https://gerrit.wikimedia.org/r/#/c/288648/
Please review!
--
Daniel Kinzler
Senior Software Developer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
Hello All,
I am Sriharsh ( I go by the nick of darthbhyrava on IRC and on
Phabricator), and this is a rather late introduction of myself. :P
I am a second-year undergrad pursuing a dual degree (B.Tech in Computer
Science and MS in Computational Linguistics) at IIIT-Hyderabad. I was
fortunate enough to be selected as a Google Summer of Code intern at
Wikimedia this year, and I will be working on implementing pywikibot
support for the Thanks extension. :)
I would like to thank my mentor jayvdb for helping me extensively since
February, and for being one of the primary reasons for my getting selected.
I would also like to thank polybuildr, my friend from college, who
introduced me to the world of Wikimedia, fellow intern AbdealiJK for his
help, and all
the other people who replied to my doubts on IRC or reviewed my patches or
commented on my tasks. Thank you all, for giving me an opportunity to work
on Thanks! :)
It's been a while into the Community Bonding period, and I have enjoyed it
so far. I have realized the demands of the project, and look forward to the
challenge of implementing Pywikibot-Thanks over the next two months. This
being my first GSoC experience, I look forward to learning a lot over the
summer and beyond!
Thank you for your time.
Sriharsh Bhyravajjula
IIIT-Hyderabad
P.S: You can take a look at my proposal and progress at
https://phabricator.wikimedia.org/T130585, or read my blog on my wonderful
Wikimedia experience so far at http://bhyrava.me/code . Thank you once
again!
Hey,
Here is the weekly update for the Revision Scoring project for the week of
May 9th through May 15th.
*New developments:*
- We now have a dedicated help page describing how to request support
for new languages[1]
- We deployed a new version of revscoring and ORES. The biggest
improvement is speed; the gains may vary between wikis, but for the
English Wikipedia it is about 20% [2]
- We are pre-generating lists of bad words for different languages. [3]
- shinken and icinga now report outages and recoveries in #wikimedia-ai
-- our main work channel[4]
*Maintenance and robustness:*
- Soon, your unlabeled edits in Wikilabels will be made available to
others after 24 hours. [5]
- We improved logging of scoring errors in ORES [6]
1. https://phabricator.wikimedia.org/T135179
2. https://phabricator.wikimedia.org/T135381
3. https://phabricator.wikimedia.org/T134629
4. https://phabricator.wikimedia.org/T134726
5. https://phabricator.wikimedia.org/T134619
6. https://phabricator.wikimedia.org/T135399
For the last decade we've supported uploading SVG vector images to
MediaWiki, but we serve them to browsers as rasterized PNGs. Recently,
display resolutions have been going up and up, but so has concern about
low-bandwidth mobile users.
This means we'd like sharper icons and diagrams on high-density phone
displays, but we are leery of adding extra srcset entries with 3x or 4x
size PNGs, which could become very large. (In fact, MobileFrontend
currently strips even the 1.5x and 2x renderings we have now, making
diagrams very blurry on many mobile devices; see
https://phabricator.wikimedia.org/T133496 - a fix is in the works.)
Here's the base bug for SVG client side rendering:
https://phabricator.wikimedia.org/T5593
I've turned it into an "epic" story tracking task and hung some blocking
tasks off it; see those for more details.
TL;DR stop reading here. ;)
One of the basic problems in the past was reliably showing SVGs natively
in an <img>, with the same behavior as before, without using JavaScript
hacks or breaking the HTML caching layer. This is neatly resolved for
current browsers by the "srcset" attribute -- the same one we use to
specify higher-resolution rasterizations. If, instead of PNGs at 1.5x and
2x density, we specify an SVG at 1x, the SVG will be loaded instead of the
default PNG.
Since all srcset-supporting browsers allow SVG in <img>, this should "just
work", and it will be more compatible than using the experimental
<picture> element or the classic <object>, which handles events
differently. Older browsers will still see the PNG, and we can tweak the
jquery.hidpi srcset polyfill to test for SVG support, to avoid breaking
some older browsers.
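In markup terms, the idea is roughly this (the file names are made up for illustration; they stand in for whatever the thumbnail handler emits):

```html
<!-- Browsers that understand srcset and SVG-in-img load the SVG;
     everything else falls back to the rasterized PNG in src. -->
<img src="Diagram.svg.png"
     srcset="Diagram.svg 1x"
     width="400" height="300"
     alt="Diagram">
```
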
This should let us start testing client-side SVG via a beta feature (with
parser cache split on the user pref) at which point we can gather more
real-world feedback on performance and compatibility issues.
Rendering consistency across browser engines is a concern. Supposedly,
modern browsers are more consistent than librsvg, but we haven't done a
compatibility survey to confirm this or to identify problematic
constructs. This is probably worth doing.
Performance is a big question. While clean simple SVGs are often nice and
small and efficient, it's also easy to make a HUGEly detailed SVG that is
much larger than the rasterized PNGs. Or a fairly simple small file may
still render slowly due to use of filters.
So we probably want to provide good tools to help our editors and image
authors optimize their files: show the renderings and the bandwidth
balance versus rasterization; maybe provide an in-wiki implementation of
svgo or other lossy optimizer tools; warn about files that are large or
render slowly; maybe provide a switch to always run particular files
through rasterization.
And we'll almost certainly want to strip comments and whitespace to save
bandwidth on page views, while retaining them in the source file for
download and re-editing.
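A crude sketch of the kind of stripping meant here (deliberately naive; a real implementation should use an XML parser, since regexes can mangle CDATA sections and attribute values):

```python
import re

def strip_svg_cruft(svg: str) -> str:
    """Remove XML comments and collapse whitespace between tags.
    Illustrative only: a production version should parse the XML."""
    svg = re.sub(r"<!--.*?-->", "", svg, flags=re.DOTALL)  # drop comments
    svg = re.sub(r">\s+<", "><", svg)  # collapse inter-tag whitespace
    return svg.strip()
```

The point is that the stripped copy is what gets served, while the original upload keeps its comments and formatting for re-editing.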
Feature parity also needs more work. Localized text in SVGs is supported
by our server-side rendering, but this won't be reliable on the client,
which means we'll want a server-side transformation that creates
per-language "thumbnail" SVGs. Fonts for internationalized text are a big
deal and may require similar transformations if we want to serve them...
which may mean additional complications and bandwidth usage.
And then there are the long-term goals of taking more advantage of SVG's
dynamic nature -- making things animated or interactive. That's a much
bigger question and has implementation and security issues!
-- brion