Hi all,
I'd like to plead for advice on testing Recitation-bot to demonstrate we
have resolved a bug that resulted in a ban for splitting image pages into
two separate incomplete pages last month.
The log of attempts to test the fix is in this bug
<https://github.com/wpoa/recitation-bot/issues/69>.
Here are the key points / open questions:
- Each wiki I have tried to test on, including the
production wikis, test.wikipedia.org, and testwiki.wiki, seems to
redirect the bot to a page with information about permissions / blocks,
something I had never seen prior to the block.
- What is the appropriate place to test? Seems to be www.thetestwiki.org
- Could we appeal the block on the strength of the apparent correctness
of the edit to fix the bug, at least temporarily so as to be able to
demonstrate the fix on a wiki we were running successfully on in the recent
past? Who would be best to approach with such a request?
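One way to see what a wiki itself reports about the bot account's block status is to ask the MediaWiki Action API for meta=userinfo with uiprop=blockinfo. The sketch below is illustrative only: it does no network I/O, the helper names are hypothetical, and the sample response is made up to show the shape rather than taken from a real wiki.

```python
from typing import Optional


def blockinfo_params() -> dict:
    """Query parameters for an Action API request about the current user's block."""
    return {
        "action": "query",
        "meta": "userinfo",
        "uiprop": "blockinfo",
        "format": "json",
    }


def summarize_block(response: dict) -> Optional[str]:
    """Return a one-line block summary, or None if no block is reported."""
    info = response.get("query", {}).get("userinfo", {})
    if "blockid" not in info:
        return None
    return "blocked by {} ({})".format(
        info.get("blockedby", "unknown"),
        info.get("blockreason", "no reason given"),
    )


# Made-up sample response, for illustration only.
sample = {
    "query": {
        "userinfo": {
            "id": 1,
            "name": "ExampleBot",
            "blockid": 42,
            "blockedby": "ExampleAdmin",
            "blockreason": "Bot malfunction",
        }
    }
}
print(summarize_block(sample))
```

Running the same check against each candidate test wiki would show whether the redirect comes from a block on that wiki or from something else, such as missing permissions.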
Thanks to any and all for any advice you can offer.
Anthony
Hi all!
Here are the minutes from this week's ArchCom meeting. You can also find the
minutes at <https://www.mediawiki.org/wiki/Architecture_committee/2017-03-15>.
See also the ArchCom status page at
<https://www.mediawiki.org/wiki/Architecture_committee/Status> and the RFC board
<https://phabricator.wikimedia.org/tag/mediawiki-rfcs/>.
Here are the minutes, for your convenience:
Notable activity:
* "Disabling LocalisationUpdate on WMF wikis" is being worked on; after some
discussion, it looks like it’ll be rewritten rather than removed.
<https://phabricator.wikimedia.org/T158360>
* CURL to become an optional dependency of MediaWiki, managed by composer
* Template styling RFC has reached agreement.
<https://phabricator.wikimedia.org/T155813>
* Thumbnail API RFC has seen no activity last week.
<https://phabricator.wikimedia.org/T66214>
* Brion is working with Jaime on a tool to check for DB schema inconsistencies
between servers.
Public meeting on Mobile Frontend Requirements:
* Overview:
<https://docs.google.com/document/d/1jlBl_qAIrGPF7zqOmK77Db8y4InO1j3qX1qqipz…>
* Outcome: Collected feedback, will reiterate. Log:
<https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.20…>
Using Hangout was an experiment to see whether this is feasible and useful. We
do not currently plan to make Hangout the default meeting mode. Observations:
* We failed to get the YouTube stream to work.
* We did not come close to the 25 person limit for Hangout.
* Using IRC for logging is useful. Relaying questions and notes between IRC and
Hangout works, but takes effort.
* Please give feedback about using Hangout for this kind of meeting!
No public meeting planned for next week (March 22)
--
Daniel Kinzler
Principal Platform Engineer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
Forwarded as received.
Dear Wikimedians,
The Wikimedia Foundation is pleased to announce a small new program called
the Hardware Donation Program. In a word, it is a program designed to
donate depreciated (but fully working) hardware from the WMF office to
community members who would put it to good use.
The program, including instructions on how to apply, is described on Meta,
here:
https://meta.wikimedia.org/wiki/Hardware_donation_program
Please read the information carefully. I especially encourage you to pay
attention to the program's design considerations, which determine most of
the decisions we'll be making.
We currently have approximately 20 laptops ready to be donated.
Applications are welcome.
The upcoming Wikimedia Conference in Berlin (in about two weeks) would be
an excellent opportunity to deliver some of those laptops in person to
approved applicants, so if you think you might be interested, I'd
encourage you to apply as soon as possible.
Please also help spread the word about this program, by forwarding this
e-mail to other Wikimedia lists you're on, and posting the link to the
program page on village pumps and *community* (not public) social media
channels or other communication forms you use.
Special thanks to User:Anntinomy from Wikimedia Ukraine, who had the idea
of asking about possible donation of older machines from WMF, and inspired
this program.
Mini-FAQ:
Q: Why are you doing this?
A: WMF's Office IT determines a lifetime for work machines, and regularly
replaces older machines. This creates a stock of older, working machines,
that are available for donation. We can donate them locally to San
Francisco charities, but figure that if we can find low-cost ways to
deliver them to our own community members, that's so much better.
Q: Am I eligible?
A: Read the fine program documentation.
Q: If I'm eligible, am I guaranteed a donated laptop?
A: No.
Q: Once these 20 laptops are donated, will there be others?
A: Yes, eventually.
Q: How can you ensure people would use the machines for Wikimedia purposes?
A: We can't. We'll be making a good-effort assessment of the likelihood of
Wikimedia use, and make a decision to donate (or not) the equipment. Once
donated, the equipment no longer belongs to WMF. We encourage, but can't
enforce, reporting on impact achieved using the equipment.
Q: I need a few laptops for my event in two weeks! Can I get them through
this program?
A: No. Read the fine program documentation.
Q: I'm really happy about this!
A: So are we! :)
Q: I'm really angry about this!
A: So it goes.
Q: I have more questions!
A: Hit 'Reply'. :)
Cheers,
Asaf
Hello,
After several months of hard work by the Discovery Search
<https://www.mediawiki.org/wiki/Wikimedia_Discovery/Search> team, we're
happy to announce that the CirrusSearch
<https://www.mediawiki.org/wiki/Extension:CirrusSearch> backend has been
upgraded to ElasticSearch
<https://www.elastic.co/products/elasticsearch> version
5. This update was a bit tedious and difficult for the team; you can browse
the various tickets <https://phabricator.wikimedia.org/T154501> in
Phabricator if you'd like to know more.
This upgrade to ES5 is great, because we now have a new Ukrainian analyzer
to experiment with, a new reindexing API and soon, we’ll have an updated
completion suggester - all of which make our backend more standard and
reduce a significant amount of technical debt.
Cheers from the Discovery Search Team!
--
deb tankersley
irc: debt
Product Manager, Discovery
Wikimedia Foundation
Hey folks!
This combines the 32nd through 41st weekly updates from the revision scoring
team to this mailing list. We've been busy, but our reporting fell
behind. So here I am getting us caught up! This is going to be a long
one. Bear with me.
One major thing we've done in the past few weeks is drafted and presented a
proposal to increase the resourcing for the ORES project in the 2017 Fiscal
Year. Currently, we're just one fully funded staff member (halfak) and one
partially funded contractor (Amir1) working with a bunch of volunteers.
We're proposing to staff the team with full-time engineers, a liaison and a
tech writer. See a full draft of our proposal and pitch deck here:
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Scoring_Platform_team
*New development:*
We've expanded support for our "editquality" models to more wikis and
improved the performance of some of the models.
- We scaled up the number of observations for Indonesian Wikipedia to
100k[1]
- We added language support for Romanian[2] and built the basic
"reverted" model[3]
- We trained and tested "damaging" and "goodfaith" models for Czech
Wikipedia[4]
- We implemented some params in our training utilities to control memory
usage[5]
- We deployed all of the above to Wikimedia Labs[6]. A production
deployment is coming soon.
Prompted by the 2016 community wishlist[7], we've implemented a
"draftquality" model for evaluating new page creations.
- We researched deletion reasons on English Wikipedia[8] and created a
labeled dataset using the deletion log.
- We engineered a set of features to predict the quality of new
articles[9] and built a model[10]
- We generated a set of datasets[11,12,13] to make it easier for
volunteers and external researchers to help us audit the performance of the
model.
- We deployed the model on WMFLabs[14] and announced its presence to a
few interested patrollers in English Wikipedia
- We've started the process of deploying the model in production[15,16]
We completed a project exploring the use of advanced natural-language
processing strategies to extract new signal about vandalism, article
quality and problematic new articles. Regretfully, memory issues prevent
us from trivially putting this into production[17], so we're looking into
alternative strategies[18].
- We implemented a strategy for extracting sentences from Wikitext[19]
- We built sentence banks for personal attacks[20], vandalism[21],
spam[22], and Featured Articles[23].
- We built PCFG-based models[24] and analyzed their ability to
differentiate[25]
We've been working with the Collaboration Team[26] on their Edit Review
Improvements project[27]
- We defined and implemented a set of new precision-based test
statistics that will inform thresholds used in their new user interface[28]
- But we also decided to continue to report recall-based test statistics
as well[29]
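To illustrate what a precision-based threshold statistic means here, the following is a minimal sketch and not the team's implementation: given classifier scores and true labels, it finds the lowest score threshold at which precision reaches a target and reports the recall at that point. The function name and the toy data are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def threshold_at_precision(
    scores: List[float], labels: List[bool], target: float
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for the lowest threshold whose
    precision is >= target, or None if no threshold reaches it."""
    total_pos = sum(labels)
    for t in sorted(set(scores)):
        # Labels of every observation that would be flagged at threshold t.
        flagged = [label for s, label in zip(scores, labels) if s >= t]
        tp = sum(flagged)
        precision = tp / len(flagged)  # never empty: t is one of the scores
        if precision >= target:
            recall = tp / total_pos if total_pos else 0.0
            return t, precision, recall
    return None


# Toy data, invented for illustration.
scores = [0.1, 0.4, 0.6, 0.8, 0.9]
labels = [False, False, True, False, True]
print(threshold_at_precision(scores, labels, 0.6))
print(threshold_at_precision(scores, labels, 1.0))
```

Reporting the recall alongside each precision target shows what a stricter threshold costs in missed edits, which is why the two kinds of statistic are useful together.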
Based on advice from engineers on the Collaboration Team, we've begun the
process of converting Wiki labels[30] to a stand-alone tool in labs.
- We generalized the gadget interface so that it can handle all
languages/wikis[31]
- We implemented a means to auto-configure wikis based on the
dbname[32,33] and that allowed us to simplify configuration[34]
- We also implemented some performance improvements with minification
and bundling[35]
*Labeling:*
In the past few weeks, we've set up labeling campaigns for a few wikis.
- We deployed an edit types campaign for Catalan Wikipedia[36]
- We deployed an edit quality campaign for Chinese[37] and Romanian[38]
Wikipedias
- We deployed a new type of campaign for English Wikipedia --
"discussion quality" asks editors to label talk posts as "toxic" or not[39]
*Maintenance and robustness:*
We've solved a large set of problems with logging issues, compatibility
with wikibase, and we've made minor improvements to performance.
- We addressed a few bugs in the ORES Review Tool[40,44]
- We quieted some errors from our logging in ORES[41,45]
- We updated our code to work with a wikibase schema change[42]
- We fixed a language fallback pattern in Wiki labels[43]
- We set up monitoring on ORES database disk sizes[46]
- We fixed some issues with scap, phabricator's diffusion and other
supporting systems so that we can continue deploying to beta labs[47]
- We split our assets repo so that we can let our WMFLabs deploy get
ahead of the Production deployment[48]
- ORES can now minify its JSON responses[49]
- We identified a bug in flask-assets and worked around it in our local
installation of Wiki labels[50]
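As a rough illustration of what minifying a JSON response saves, here is a small sketch using Python's standard json module. It is not ORES's code, and the sample score document is invented for the example.

```python
import json

# Invented sample document, loosely shaped like a score response.
doc = {
    "scores": {
        "enwiki": {
            "damaging": {
                "prediction": False,
                "probability": {"false": 0.9, "true": 0.1},
            }
        }
    }
}

# Human-readable form: indentation and spaces after separators.
pretty = json.dumps(doc, indent=2)

# Minified form: no indentation and no whitespace around separators.
minified = json.dumps(doc, separators=(",", ":"))

print(len(pretty), len(minified))
```

Both strings parse back to the same data; only the whitespace differs.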
*Communications and outreach:*
We had a big presence at the Wikimedia Developer summit, we've drafted a
resourcing proposal, and we've made some announcements about upcoming plans
for the ORES Review tool.
- We facilitated the "Artificial Intelligence to build and navigate
content" track[51]
- We ran a session for building an AI wishlist[52] and captured notes
about more than 20 new AI proposals on a new tag in phabricator[53]
- We also ran a session discussing the ethics and dangers of advanced
algorithms mediating our processes[54]
- We helped facilitate a session about where to surface current AIs in
Wikimedia Projects[55]
- We held a discussion with Legal about licensing labeled data that
comes out of Wiki labels[56] and updated the interface to state the CC0
license clearly[57]
- We worked with the Reading Infrastructure team to analyze the
consumption of "oresscores" through the MediaWiki API[58]
- We drafted a pitch for increasing the resources for our team[59]
- We worked with the Collaboration team to announce that they'll be
experimenting with a new RecentChanges filtering strategy in the ORES
Review Tool[60,61]
1. https://phabricator.wikimedia.org/T147107 -- Scale up the number of
observations for idwiki to 100k
2. https://phabricator.wikimedia.org/T152482 -- Add language support for
Romanian
3. https://phabricator.wikimedia.org/T156504 -- Build reverted model for
Romanian Wikipedia
4. https://phabricator.wikimedia.org/T156492 -- Train and test
damaging/goodfaith models for Czech Wikipedia
5. https://phabricator.wikimedia.org/T156645 -- Add '--workers' param to
cv_train utility
6. https://phabricator.wikimedia.org/T154856 -- Clean up dependencies and
deploy newest ORES & Models in labs
7.
https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey/Categories/M…
8.
https://meta.wikimedia.org/wiki/Research:Automated_classification_of_draft_…
9. https://phabricator.wikimedia.org/T148580 -- Build feature set for draft
quality model
10. https://phabricator.wikimedia.org/T148038 -- [Epic] Build draft quality
model (spam, vandalism, attack, or OK)
11. https://phabricator.wikimedia.org/T148581 -- Extract features for
deleted page (draft quality model)
12. https://phabricator.wikimedia.org/T156642 -- Generate scored dataset
for 2016-08 - 2017-01
13. https://phabricator.wikimedia.org/T156643 -- Generate extracted
features for 2016-08 - 2017-01
14. https://phabricator.wikimedia.org/T155576 -- Deploy draftquality models
to WMFLabs
15. https://phabricator.wikimedia.org/T156835 -- Create package stuff for
draftquality
16. https://phabricator.wikimedia.org/T157049 -- Create new repo:
research-ores-draftquality
17. https://phabricator.wikimedia.org/T148867#2816566 -- Memory footprint
is enormous!
18. https://phabricator.wikimedia.org/T155111 -- [Spike] Investigate use of
Apertium LTtoolbox API in labs/production
19. https://phabricator.wikimedia.org/T148867 -- Implement sentences
datascources
20. https://phabricator.wikimedia.org/T148035 -- Sentence bank for personal
attacks
21. https://phabricator.wikimedia.org/T148034 -- Sentence bank for vandalism
22. https://phabricator.wikimedia.org/T148032 -- Sentence bank for spam
23. https://phabricator.wikimedia.org/T148033 -- Sentence bank for Featured
Articles
24. https://phabricator.wikimedia.org/T148037 -- Generate PCFG sentence
models
25. https://phabricator.wikimedia.org/T151819 -- Analyze differentiation of
FA, Spam, Vandalism, and Attack models/sentences.
26. https://www.mediawiki.org/wiki/Collaboration
27. https://www.mediawiki.org/wiki/Edit_Review_Improvements
28. https://phabricator.wikimedia.org/T151970 -- Implement new
precision-based test stats for editquality models
29. https://phabricator.wikimedia.org/T156644 -- Restore
recall-threshold-based metrics for editquality models.
30. https://meta.wikimedia.org/wiki/Wiki_labels
31. https://phabricator.wikimedia.org/T151120 -- Generalize standalone
gadget interface
32. https://phabricator.wikimedia.org/T154433 -- Auto config wikilabels
using dbnames
33. https://phabricator.wikimedia.org/T155439 -- Use module loader to load
JS/CSS from wikis
34. https://phabricator.wikimedia.org/T154693 -- Remove host from
wikilabels config -- infer from request
35. https://phabricator.wikimedia.org/T154122 -- Minification and bundling
for wikilabels assets
36. https://phabricator.wikimedia.org/T152965 -- Deploy cawiki edit types
campaign
37. https://phabricator.wikimedia.org/T152561 -- Deploy zhwiki edit quality
campaign
38. https://phabricator.wikimedia.org/T156357 -- Deploy edit quality
campaign for Romanian Wikipedia
39. https://phabricator.wikimedia.org/T156303 -- Deploy "Discussion
quality" campaign in wikilabels
40. https://phabricator.wikimedia.org/T152542 -- Undefined method
ORES\Hooks::getDamagingThreshold()
41. https://phabricator.wikimedia.org/T146681 -- Quiet TimeoutError in
celery logging
42. https://phabricator.wikimedia.org/T154168 -- Quantity changes broke ORES
43. https://phabricator.wikimedia.org/T154897 -- Chinese translations are
not being loaded
44. https://phabricator.wikimedia.org/T155500 -- Fatal exception of type
"DBQueryError" on sorting ORES contributions
45. https://phabricator.wikimedia.org/T157078 -- ores logspam: Model
contains an error
46. https://phabricator.wikimedia.org/T155482 -- Set up monitoring for ORES
redis database
47. https://phabricator.wikimedia.org/T157135 -- Fix broken beta-labs deploy
48. https://phabricator.wikimedia.org/T154436 -- Split wheels repo into
Prod/WMFLabs branches and maintain independence
49. https://phabricator.wikimedia.org/T155931 -- Minify json responses
50. https://phabricator.wikimedia.org/T154865 -- assets url return empty
string
51. https://phabricator.wikimedia.org/T147708 -- Artificial Intelligence to
build and navigate content
52. https://phabricator.wikimedia.org/T147710 -- What should an AI do you
for you? Building an AI Wishlist.
53. https://phabricator.wikimedia.org/tag/artificial-intelligence/
54. https://phabricator.wikimedia.org/T147929 -- Algorithmic dangers and
transparency -- Best practices
55. https://phabricator.wikimedia.org/T148690 -- Where to surface AI in
Wikimedia Projects
56. https://phabricator.wikimedia.org/T145024 -- Licensing of labeled data
57. https://phabricator.wikimedia.org/T156052 -- Add notice of CC0 status
of Wikilabels data to UI & Docs
58. https://phabricator.wikimedia.org/T156273 -- Identify baseline api.php
Action API consumption
59. https://phabricator.wikimedia.org/T157470 -- Draft proposal/pitch for
ORES resourcing
60. https://phabricator.wikimedia.org/T150855 -- Gather assets for post
about ORES review tool including ERI filters
61. https://phabricator.wikimedia.org/T150858 -- Post about ORES review
tool including ERI filters
Sincerely,
Aaron from the Revision Scoring team
The 'master' branch of MediaWiki-Vagrant will now provision and
maintain Debian Jessie based VMs. The next time you fetch
mediawiki/vagrant.git changes to your laptop or Labs VM and try to run
`vagrant up` or `vagrant provision` it will complain that your Vagrant
managed VM is not running the correct base operating system.
There are two ways to deal with this:
1) Follow the instructions given to delete and recreate your VM. This
is the most awesome long term thing to do, but may be annoying in the
short term. If you have heavily customized the wikis running in your
VM it is up to you to figure out how to back things up before you
destroy your current VM and then restore the changes after you build a
new Jessie-based VM.
2) Switch your git checkout to the 'trusty-compat' branch of
mediawiki/vagrant.git. This trades short term efficiency for long term
pain. The trusty-compat branch is not going away any time soon, but it
will drift out of sync with Puppet changes on the master branch.
See <https://phabricator.wikimedia.org/T136429> for known issues with
the Jessie conversion. The only two I'm aware of at this time are
related to fundraising (T154264) and an NFS permissions mapping
problem when installing ChangeProp on a VM with OSX as the host
operating system and NFS shares enabled for Vagrant (T158617).
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Sr Software Engineer Boise, ID USA
irc: bd808 v:415.839.6885 x6855