Hi,
we are considering a policy for REST API endpoint result format
versioning and negotiation. The background and considerations are
spelled out in a task and a mediawiki.org page:
https://phabricator.wikimedia.org/T124365
https://www.mediawiki.org/wiki/Talk:API_versioning
Based on the discussion so far, we have come up with the following
candidate solution:
1) Clearly advise clients to explicitly request the expected mime type
with an Accept header. Support older mime types (with on-the-fly
transformations) until usage has fallen below a very low percentage,
with an explicit sunset announcement.
2) Always return the latest content type if no explicit Accept header
was specified.
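The two rules above can be sketched as a small negotiation routine. This is a minimal sketch only: the version strings, the `SUPPORTED` table, and the `negotiate` function are hypothetical placeholders, not the actual API.

```python
# Minimal sketch of the proposed policy; version strings are hypothetical.
SUPPORTED = ['1.1.0', '1.2.0']  # 1.1.0 kept alive via on-the-fly transforms
LATEST = '1.2.0'

def negotiate(accept_header):
    """Pick the format version to serve for a request."""
    if not accept_header:
        # Rule 2: no explicit Accept header -> always the latest format.
        return LATEST
    for version in SUPPORTED:
        # Rule 1: honour an explicitly requested, still-supported version.
        if version in accept_header:
            return version
    # Unknown or already-sunset version: fall back to the latest format.
    return LATEST
```

Under this sketch, a client pinning an older version keeps getting it (via transformation) until the sunset, while clients that send no Accept header silently track the latest format.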
We are interested in hearing your thoughts on this.
Once we have reached rough consensus on the way forward, we intend to
apply the newly minted policy to an evolution of the Parsoid HTML
format, which will move the data-mw attribute to a separate metadata
blob.
Gabriel Wicke
Hey all,
TLDR: The ORES extension [1], which integrates the ORES service [2]
with Wikipedia to make fighting vandalism easier and more efficient, is
in the process of being deployed. You can test it at
https://mw-revscoring.wmflabs.org (enable it in your preferences first).
You probably know ORES. It's an API service that gives the probability
of an edit being vandalism; it also does other AI-related work such as
estimating the quality of Wikipedia articles. We have a nice post on the
Wikimedia Blog [3], and the media have paid some attention to it [4].
Thanks to Aaron Halfaker and others [5] for their work in building this
service. There are several tools using ORES to highlight probable
vandalism: Huggle, gadgets like ScoredRevisions, etc. But an extension
does this job much more efficiently.
The extension, which is being developed by Adam Wight, Kunal Mehta, and
me, highlights unpatrolled edits in recent changes, watchlists, related
changes, and, in the future, user contributions, whenever the ORES score
of an edit passes a certain threshold. The GUI design is by May
Galloway. The ORES API (ores.wmflabs.org) only gives you a score between
0 and 1: zero means the edit is definitely not vandalism, and one means
it is vandalism for sure. You can try its simple GUI at
https://ores.wmflabs.org/ui/. You can change the threshold in your
preferences, in the recent changes tab (you are given named options
instead of numbers because we thought raw numbers are not very intuitive).
Also, we enabled it on a test wiki so you can try it:
https://mw-revscoring.wmflabs.org. You need to make an account (use a
dummy password) and then enable it in the beta features tab. Note that
building an AI tool to detect vandalism in a test wiki sounds a little
bit silly ;) so we set up a dummy model in which the probability of an
edit being vandalism is the last two digits of the diff id, reversed
(e.g. diff id 12345 = score 54%). On a more technical note, we store
these scores in the ores_classification table, so we can do a lot more
analysis with them once the extension is deployed: fun use cases such as
the average score of a certain page, or of the contributions of a user
or the members of a category, etc.
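Since the test wiki has no real vandalism to learn from, the stand-in model described above is purely deterministic. A minimal sketch of that rule (the function name is hypothetical; the real extension computes scores server-side):

```python
def dummy_score(diff_id):
    """Dummy 'vandalism probability' used on the test wiki: the last
    two digits of the diff id, reversed, read as a percentage."""
    last_two = '%02d' % (diff_id % 100)   # e.g. 12345 -> '45'
    return int(last_two[::-1]) / 100.0    # '45' reversed -> '54' -> 0.54
```

So diff id 12345 scores 0.54, matching the example above.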
We have passed security review and we have consensus to enable it on
Persian Wikipedia; we are only blocked on ORES moving from Labs to
production (T106867 [6]). The next wiki is Wikidata: we are good to go
once the community finishes labeling edits so we can build the
"damaging" model. We can enable it on Portuguese and Turkish Wikipedia
after March, because s2 and s3 have database storage issues right now.
For other wikis, you need to check whether ORES supports the wiki and
whether the community has finished labeling edits for ORES (check out
the table at [2]).
If you want to report bugs or request features, you can do so here [7].
[1]: https://www.mediawiki.org/wiki/Extension:ORES
[2]: https://meta.wikimedia.org/wiki/Objective_Revision_Evaluation_Service
[3]:
https://blog.wikimedia.org/2015/11/30/artificial-intelligence-x-ray-specs/
[4]:
https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service/Media
[5]:
https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service#Team
[6]: https://phabricator.wikimedia.org/T106867
[7]: https://phabricator.wikimedia.org/tag/mediawiki-extensions-ores/
Best
My apologies for the short notice. Normally we announce these more than one
hour in advance, but I forgot.
In today's RFC meeting, we will discuss the following RFC:
* Standardise on how to access/register JavaScript interfaces
<https://phabricator.wikimedia.org/T108655>
<https://phabricator.wikimedia.org/E144>
<https://phabricator.wikimedia.org/E140>
The meeting will be on the IRC channel #wikimedia-office on
chat.freenode.net at the following time:
* UTC: Wednesday 22:00
* US PST: Wednesday 14:00
* Europe CET: Wednesday 23:00
* Australia AEDT: Thursday 09:00
Roan
https://www.mediawiki.org/wiki/Scrum_of_scrums/2016-02-24
= 2016-02-24 =
== Technology ==
=== Analytics ===
* '''Blocking''': (nobody we know)
* '''Blocked''': (on nothing)
* '''Updates''':
** Upgraded to CDH 5.5, which comes with lots of improvements for those
using the Hadoop cluster:
http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_rn_new_i…
** Internally released data that estimates the number of Unique Devices
hitting each of our domains, using the Last Access cookie. This is a major
release, and it's available in the wmf database in hive, in the
last_access_uniques_daily table.
** Fixed handling of uri-encoded page titles in the pageview API
=== Architecture ===
* '''Blocking''':
** ???
* '''Blocked''':
** ???
* '''Updates''':
** ???
=== Performance ===
* '''Blocking''':
** ???
* '''Blocked''':
** ???
* '''Updates''':
** ???
=== Release Engineering ===
* '''Blocking''':
** https://phabricator.wikimedia.org/T111259
** Update train email as to why stalled
* '''Blocked''':
** None
* '''Updates''':
** AQS deployed via Scap3 (hooray \o/ +1); ready for new services with
the new version
** Phabricator updates happened, puppet work continues
** Train (still) not running wmf.14 on testwiki and that's all
=== Research ===
* '''Blocking''':
** nothing we know of
* '''Blocked''':
** blocked on ops for ORES in production
*** also blocks deployment of ORES extension to fawiki and wikidata
*** halfak would like to engage with ops - could someone contact him?
* '''Updates''':
** none
=== Security ===
* '''Blocking''':
** ???
* '''Blocked''':
** ???
* '''Updates''':
** Working through lots of security bugs
** PageViewInfo review in progress
=== Services ===
* '''Blocking''':
** ???
* '''Blocked''':
** ???
* '''Updates''':
** ???
=== Technical Operations ===
* '''Blocking''':
** ???
* '''Blocked''':
** ???
* '''Updates''':
** ???
== Product ==
=== Community Tech ===
* '''Blocking''':
** ???
* '''Blocked''':
** ???
* '''Updates''':
** ???
=== Discovery ===
* '''Blocking''':
** none afaik
* '''Blocked''':
** Would like Ops input for https://phabricator.wikimedia.org/T126730 (caching
model for WDQS)
** Would like Sec review on SVG sanitizer JS lib
** Would like Sec review on Schema validator php lib
* '''Updates''':
** Preparing to switch completion suggester into production (March)
** A number of new interesting graphs at http://discovery.wmflabs.org/ e.g.
http://discovery.wmflabs.org/metrics/#failure_langproj,
http://discovery.wmflabs.org/portal/#browser_breakdown
** Not much new, mostly bugfixes, tweaks and maintenance
==== Graphs ====
** Pageview API graphs getting popular
=== Editing ===
==== Collaboration ====
* '''Blocking''':
** External Store - In progress. Will soon enable External Store on Beta
Cluster as a pre-requisite for this. If you want to look/give feedback on
the Beta change, see https://phabricator.wikimedia.org/T95871
* '''Blocked''':
** Flow dumps on dumps.wikimedia.org:
https://phabricator.wikimedia.org/T119511
** Schema change to make a column NOT NULL in production:
https://phabricator.wikimedia.org/T122111#2050844
* '''Updates''':
** Enabled Echo cross-wiki notifications feature on initial wave of wikis.
Good feedback so far.
** Working on some issues with Flow board moves.
** Also, not a Collaboration team thing, but we've asked for feedback on
some Code of Conduct proposed changes:
https://www.mediawiki.org/wiki/Talk:Code_of_Conduct/Draft#Suggested_changes
==== Language ====
* '''Blocking''':
** None
* '''Blocked''':
** None
* '''Updates''':
**
==== Multimedia ====
* '''Blocking''':
** ???
* '''Blocked''':
** ???
* '''Updates''':
** ???
==== Parsing ====
* '''Blocking''':
** None?
* '''Blocked''':
** None
* '''Updates''':
** TemplateData-based serialization being deployed today (
https://phabricator.wikimedia.org/T111674 and
https://phabricator.wikimedia.org/T104599 )
** Kunal and Ori have been investigating
https://phabricator.wikimedia.org/T124356 ... Ori might have made some
headway there.
*** Filed https://phabricator.wikimedia.org/T127757 to fix getText()
semantics and prevent this kind of sneaky bug in the future.
** Heads up for release engineering:
https://phabricator.wikimedia.org/T111259 bit us once more recently.
==== VisualEditor ====
* '''Blocking''':
** None known.
* '''Blocked''':
** Waiting on Design Research availability for user testing of Single Edit
Tab integration
* '''Updates''':
** Single Edit Tab went to Hungarian Wikipedia yesterday; now waiting on
user feedback.
** Some improvements to OOUI; note the breaking change for wmf.15+ (no
known issues in gerrit master code).
** Last week we said we'd update on assessing the performance impact of
OOUI on all read pages; this is not firm yet, but appears to be a trivial
additional cost.
=== Fundraising Tech ===
* No blockers/blocking
* Investigating anomalies
* Improving CiviCRM reporting
* Testing backup processor improvements
* Further Latin America processor work
=== Reading ===
==== Android ====
* '''Updates''':
** Nothing to report.
==== Reading Infrastructure ====
* We've mostly been chasing performance issues that people found in code
that SessionManager also touched.
* In the not-too-distant future, load.php is going to start enforcing
that it must not depend on the session or the request data.
** See https://phabricator.wikimedia.org/T127233 and subtasks.
** Check your ResourceLoaderModule subclasses and your
'ResourceLoaderGetConfigVars' hook functions to make sure you're not using
$wgUser or $wgLang (or their equivalents via RequestContext). You'll
generally want to use the user and language from the ResourceLoaderContext
or use the 'MakeGlobalVariablesScript' hook instead. Remember that Message
objects will use $wgLang by default.
** Check your parser hooks to make sure you're using
$parserOptions->getUser() or $parser->getTargetLanguage() instead of
$wgUser or $wgLang (or their equivalents via RequestContext), respectively.
Otherwise you're liable to blow things up if your hook gets used in a
message somewhere.
** Timo says he'll send an announcement of some sort once details are
settled.
Hello,
As we know, wiki (mainly Wikipedia) articles go into a lot of detail
about their subject. They often tend to become verbose; sometimes
individual sections become as long as articles.
The information about a topic is split across various pages which are
linked from the article, so we have to open several such links to get a
good understanding of it.
Navigation popups/Hovercards make this a bit simpler, but the info they
provide is often out of context: they give an introduction to the linked
article rather than explaining its connection to the page at hand, which
makes them feel disconnected and muddled. They help a reader gauge the
importance of a linked page, but not its relevance.
As a GSoC project, I was thinking of building a summarization tool that
could automatically create a comprehensive summary of an article. The
links, categories, infoboxes, and other unique wiki features make this
quite different from, and more interesting than, plain text
summarization: they make it easier to gauge the context and relevance of
articles, and the linked structure makes it possible to crawl to related
pages (as Hovercards do). Finally, by combining only the important and
relevant information from all sections, we can form a coherent and lucid
summary for the reader. The intro paragraphs just provide an
introduction to the article, whereas the tool would provide a gist of
the entire article (and hence would be longer in most cases).
Though there has been some independent research
<http://lms.comp.nus.edu.sg/sites/default/files/publication.../acl09-yesr.pdf>
done on it, the possibility of such a tool has never been discussed at
length on Wikimedia.
So, I want to ask everyone's opinion on such a tool, in the above or
some other form. Also, does it seem like something that could be done as
a GSoC project (MVP)? Would any mentors be interested?
Hi all,
A request has come up (https://phabricator.wikimedia.org/T126832) to
re-create pt.wikimedia.org on the Wikimedia cluster. Unfortunately the
wiki was previously hosted there, so the 'ptwikimedia' database name is
already taken.
Since renaming the database does not really appear to be an option, does
anyone object to using 'pt2wikimedia' (or similar; suggestions welcome)
for the new wiki instead? I know this doesn't fit the existing pattern,
so I'm unsure about just going ahead without asking for input from a
wider audience.
Alex
I'm trying to make images from an external source, provided by a parser
function, work with VisualEditor and Parsoid. As a very simplified
illustration, I added the following to the bottom of LocalSettings.php:
```
$wgExtensionMessagesFiles['myPfTest'] = "$IP/myPfTest.php";
$wgHooks['ParserFirstCallInit'][] = function ( &$parser ) {
    $parser->setFunctionHook(
        'test_func',
        // With SFH_OBJECT_ARGS the callback receives the frame and args too
        function ( $parser, $frame, $args ) {
            // Output a wiki image and an external image
            $output = "[[File:Test.png|frameless|300px]]" .
                "<br />" .
                "<img src='http://goo.gl/fh3yKh' />";
            return array(
                $output,
                "noparse" => true,
                "isHTML" => true,
            );
        },
        SFH_OBJECT_ARGS
    );
};
$wgAllowExternalImages = false;
$wgAllowImageTag = false;
```
To satisfy the $wgExtensionMessagesFiles requirement, I also added a file
at $IP/myPfTest.php with the contents:
```
<?php
$magicWords = array();
$magicWords['en'] = array(
'test_func' => array( 0, 'test_func' ),
);
```
When I add a page with wikitext `{{#test_func: }}`, the page displays the
raw wikitext of the File:Test.png and shows the actual image for the
external image. This is as expected.
However, when I click "edit" for VisualEditor, File:Test.png DOES
display (not expected), but the external image does not (also not
expected); instead the raw text "<img src='http://goo.gl/fh3yKh' />" is
shown.
Adjusting $wgAllowExternalImages and $wgAllowImageTag to all possible
combinations has no effect. Adjusting "noparse" to false makes
File:Test.png display in view mode (expected) as well as in VE
(expected), but has no effect on the external image. "isHTML" appears to
have no effect.
Is there some way to make external images play nicely with VE and/or
Parsoid?
I've tested this on two configurations:
MW 1.25.5
PHP 5.6.14
VE be84313 (head of the REL1_25 branch)
Parsoid ba26a55 (latest commit I've been able to get working with REL1_25)
MW 1.26.2
PHP 5.6.14
VE 34a21d8 (head of the REL1_26 branch)
Parsoid 8976ab9 (latest commit as of a week or so ago, I think)
Thanks,
James
Hello,
I am Mehul Sawarkar, a Computer Science student from India. I want to be
part of Wikimedia in GSoC 2016. Please help me get started. Is there any
small bug that I can fix?
Thanks,
Mehul Sawarkar.