Hi,
On Tue, Mar 1, 2016 at 3:36 PM, David Strine <dstrine(a)wikimedia.org> wrote:
> We will be holding this brownbag in 25 minutes. The Bluejeans link has
> changed:
>
> https://bluejeans.com/396234560
I'm not familiar with Bluejeans and may have missed a transition
because I wasn't paying enough attention. Is this some kind of
experiment? Have all meetings transitioned to this service?
Anyway, my immediate question at the moment is: how do you join without
sharing your microphone and camera?
Am I correct in thinking that this is an entirely proprietary stack
that's neither gratis nor libre and has no on-premise (not cloud)
hosting option? Are we paying for this?
-Jeremy
As of 950cf6016c, the mediawiki/core repo was updated to use DB_REPLICA
instead of DB_SLAVE, with the old constant left as an alias. This is part
of a string of commits that cleaned up the mixed use of "replica" and
"slave" by sticking to the former. Extensions have not been mass
converted. Please use the new constant in any new code.
The word "replica" is a bit more indicative of a broader range of DB
setups*, is used by a range of large companies**, and is more neutral in
connotations.
Drupal and Django made similar updates (even replacing the word "master"):
* https://www.drupal.org/node/2275877
* https://github.com/django/django/pull/2692/files &
https://github.com/django/django/commit/beec05686ccc3bee8461f9a5a02c607a023…
I don't plan on doing anything to DB_MASTER, since it seems fine by itself,
like "master copy", "master tape" or "master key". This is analogous to a
master RDBMS database. Even multi-master RDBMS systems tend to have
stronger consistency than classic RDBMS slave servers, and present
themselves as one logical "master" or "authoritative" copy. Even in its
personified form, a "master" database can readily be thought of as
analogous to "controller", "governor", "ruler", lead "officer", or such.***
* clusters using two-phase commit, Galera using certification-based
replication, multi-master circular replication, etc.
**
https://en.wikipedia.org/wiki/Master/slave_(technology)#Appropriateness_of_…
***
http://www.merriam-webster.com/dictionary/master?utm_campaign=sd&utm_medium…
--
-Aaron
Hi all!
I invite you to try out "Project Ruprecht"[1][2], a tool that measures the
"tangledness" of PHP code, and provides you with a "naughty list" of things to fix.
For now, you will have to install this locally. I hope however to soon have this
run automatically against core on a regular basis, perhaps by integrating it
with SonarQube. Maybe some day we can also integrate it with CI, to generate a
warning when a new cyclic dependency is about to be introduced.
So that was the tl;dr. Now for some context, history, and shout-outs. And some
actual real world science, too!
For a while now, I have been talking about how much of a problem cyclic
dependencies are in MediaWiki core: When two components (classes, namespaces,
libraries, whatever) depend on each other, directly or indirectly, this means
that one cannot be used without the other, nor can it be tested, understood, or
modified without also considering the other. So, in effect, they behave as *one*
component, not two. Applied to MW core, this means that roughly half of our 1600
classes effectively behave like a single giant class. This makes the code rather
hard to deal with.
To fix this, I have been looking for tools that let me identify "tangles" of
classes that depend on each other, and metrics to measure the progress of
"untangling" the code. However, the classic code quality metrics focus on
"local" properties of the code, so they can't tell us much about the progress of
untangling. And the tools I found that would detect cyclic dependencies in PHP
code would all choke on MediaWiki core: they would try to list all detected
cycles - which, by the super-exponential nature of possible paths through a
graph, would be millions and millions. So, the tools would choke and die. That
approach isn't practical for us.
Two discoveries allowed me to come up with a working solution:
First, I decided to leave the PHP world and turned towards graph analysis tools
built for large data sets. Python's graph-tool did the trick. It's built on top
of boost and numpy, and it's *fast*. It crunched through the 7500 or so class
dependencies in MW core in a split second, and told me that we have 14 "tangles"
(non-trivial strongly connected components), and that 43% of our classes are in
these tangles, with 40% being part of one big tangle that is essentially our
monolith manifest. So now I had a metric to work with: the number of classes in
tangles.
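
To make the metric concrete, here is a minimal sketch of counting classes in
tangles. It is not Ruprecht's code and does not use graph-tool: it is a
plain-Python stand-in (Kosaraju's algorithm) so it runs without extra
packages. The class names and edges are made up for illustration, and a
"tangle" is assumed to mean a strongly connected component with more than
one class.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Return the SCCs of a directed graph given as (source, target) pairs."""
    graph = defaultdict(list)
    reverse = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: iterative DFS, recording nodes in order of completion.
    visited = set()
    order = []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours handled: this node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: walk the reversed graph in reverse completion order.
    assigned = set()
    components = []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component = []
        todo = [start]
        while todo:
            node = todo.pop()
            component.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(sorted(component))
    return components


def tangle_report(edges):
    """Return (tangles, number of classes in tangles, total number of classes)."""
    components = strongly_connected_components(edges)
    tangles = sorted(c for c in components if len(c) > 1)
    tangled = sum(len(c) for c in tangles)
    total = sum(len(c) for c in components)
    return tangles, tangled, total


# Hypothetical dependency edges: (class, class it depends on).
EXAMPLE_EDGES = [
    ("A", "B"), ("B", "C"), ("C", "A"),  # a three-class cycle
    ("C", "D"), ("D", "E"), ("E", "D"),  # a two-class cycle
    ("E", "F"),                          # F is not part of any cycle
]

tangles, tangled, total = tangle_report(EXAMPLE_EDGES)
print(f"{len(tangles)} tangles, {tangled} of {total} classes tangled")
```

Counting components rather than listing individual cycles is what keeps this
cheap: the number of components is bounded by the number of classes, while
the number of distinct cycles is not.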
That was great, but still didn't tell me where to start. Graph-tool was still
not fast enough to deal with millions of cycles, and even if it had been, that
data wouldn't be very useful. I needed some smart heuristics. Luckily, I
(totally unintentionally, promise!) nerd sniped[5] Amir Sarabadani one evening
at the WMDE office by telling him about this problem. The next day, he told me
that he had been digging into the problem all night, and he had found a paper
that sounded relevant, and it also came with working code: "Breaking Cycles in
Noisy Hierarchies"[3] by J. Sun, D. Ajwani, P.K. Nicholson, A. Sala, and S.
Parthasarathy. I played with the code a bit, and yes! It spat out a list of 290
or so dependencies[4] that it thought were bad - and I agree for a good number
of them. It's not a clean working list, but it gives a very good idea of where
to start looking.
I find it quite fascinating that this works so well for cleaning up a codebase.
After all, the heuristic wasn't designed for this - it was designed for fixing
messy ontologies. Indeed, one of their test data sets was (English language)
Wikipedia's category system! I'd love to see what it does with Wikidata's
subclass hierarchy :)
But I suppose it makes sense - dependencies in software are conceptually a lot
like an ontology, and the same strategies of stratification and abstraction
apply. And the same difficulties, too - it's easy enough to spot a problematic
cycle, but often hard to say where it should be cut. And how to cut it - often,
the solution is not to just remove the dependency, but to introduce a new
abstraction that allows the relationship to exist without a cycle. I'd love to
see the research continue in that direction!
So, a big shout out to the researchers, and to Amir who found the paper!
I hope my ramblings have made you curious to play with Ruprecht, and see what it
has to say about other code bases. There's also another feature to play with
which I haven't discussed here: detection of risky classes using the PageRank
algorithm. Fun!
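
For the curious, a rough sketch of what PageRank on a dependency graph can
look like. This is not taken from Ruprecht; it is a textbook power iteration,
and it assumes an edge points from a class to the class it depends on, so
that classes many others depend on end up with a high score. The damping
factor, iteration count, and class names are illustrative assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over (source, target) edge pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical edges: three classes that all depend on "Hub".
EXAMPLE_DEPENDENCIES = [("A", "Hub"), ("B", "Hub"), ("C", "Hub")]

scores = pagerank(EXAMPLE_DEPENDENCIES)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.3f}")
```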
Cheers,
Daniel
[1] https://phabricator.wikimedia.org/diffusion/MTDA/repository/master/
[2]
https://gerrit.wikimedia.org/r/admin/projects/mediawiki/tools/dependency-an…
[3] https://github.com/zhenv5/breaking_cycles_in_noisy_hierarchies
[4] https://phabricator.wikimedia.org/P8513
[5] https://xkcd.com/356/
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
Hello,
The committee has finished selecting new members and the new committee
candidates are (In alphabetical order):
- Amir Sarabadani
- Lucie-Aimée Kaffee
- MusikAnimal
- Tonina Zhelyazkova
- Tony Thomas
And auxiliary members will be (In alphabetical order):
- Huji
- Matanya
- Nuria Ruiz
- Rosalie Perside
- Tpt
You can read more about the members in [0]
The changes are:
* Nuria and Rosalie are moving from main members to auxiliary members
* MusikAnimal is moving from auxiliary member to main
* Tonina Zhelyazkova is joining the main members
This is not the final structure. According to the CoC [1], the current
committee publishes the new members and calls for public feedback for *six
weeks* and after that, the current committee might apply changes to the
structure based on public feedback.
Please let the committee know if you have any concerns regarding the members
and the structure by *19 June 2019*. After that, the new committee
will be in effect and will serve for a year.
[0]:
https://www.mediawiki.org/wiki/Code_of_Conduct/Committee/Members/Candidates
[1]:
https://www.mediawiki.org/wiki/Code_of_Conduct/Committee#Selection_of_new_m…
Best,
Amir, on behalf of the Code of Conduct committee
Hi,
I would like to highlight a few notes from SoS Meeting Bookkeeping:
* We're still looking for a backup facilitator. 👀
* We're still looking for feedback on whether this meeting is useful. If it is, is
there anything we could do to make it more useful?
** SoS is really useful to Release Engineering because we are frequently
blocked by various teams because of train. We also frequently block other
teams because we're in charge of continuous integration (CI).
** Should we make the meeting notes as short as possible, so interested
people could read all of it? One step in that direction could be removing
teams that did not leave any updates.
Željko
--
= 2019-05-22 =
== Callouts ==
* Train blocked
** Growth - operand type was used: expects array(s) or collection(s) in
/srv/mediawiki/wmf-config/flaggedrevs.php on line 182
https://phabricator.wikimedia.org/T224116
** Growth - Special:ProblemChanges on several Wiktionary sites show raw
message IDs instead of translated strings
https://phabricator.wikimedia.org/T224124
* Introducing the codehealth pipeline beta (by Kosta Harlan)
https://phabricator.wikimedia.org/phame/post/view/160/introducing_the_codeh…
* Language -> SRE:
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/506043/
== Audiences ==
=== Contributors ===
==== Community Tech ====
* Blocked by:
* Blocking:
* Updates:
==== Anti-Harassment Tools ====
* Blocked by:
* Blocking:
* Updates:
==== Editing ====
* Blocked by:
* Blocking:
* Updates:
** git #b6704010 - Automatically add a template when chosen from the
autocomplete list (Hackathon project)
** git #3285b7db - Initialize $restbaseHeaders to null (T223281)
** Working on a model corruption issue related to selections in VE:
(T202719)
==== Growth ====
* Blocked by:
* Blocking:
** Release Engineering - Train blocked
*** operand type was used: expects array(s) or collection(s) in
/srv/mediawiki/wmf-config/flaggedrevs.php on line 182
https://phabricator.wikimedia.org/T224116
*** Special:ProblemChanges on several Wiktionary sites show raw message IDs
instead of translated strings https://phabricator.wikimedia.org/T224124
* Updates:
** Introducing the codehealth pipeline beta (by Kosta Harlan)
https://phabricator.wikimedia.org/phame/post/view/160/introducing_the_codeh…
==== Language ====
* Blocked by:
** SRE (BBlack):
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/506043/
* Blocking:
* Updates:
=== Readers ===
==== iOS native app ====
* Blocked by:
* Blocking:
* Updates:
** Continuing development on v6.3 - beginning user testing soon
*** Talk page MCS endpoint - https://phabricator.wikimedia.org/T221148
*** Talk page native work - https://phabricator.wikimedia.org/T215928
*** Link wizard in Editor - https://phabricator.wikimedia.org/T213979
*** Media wizard in Editor - https://phabricator.wikimedia.org/T209398
==== Android native app ====
* Blocked by:
* Blocking:
* Updates:
** New Beta release with minor bug fixes and enhancements (and with updated
logout behavior).
** Continuing to build on the Suggested Edits feature (beginning work on
editing structured captions on Commons).
==== Readers Web ====
* Blocked by:
* Blocking:
* Updates:
==== Readers Infrastructure ====
* Blocked by:
* Blocking:
* Updates:
** Maps: Node.js 10 migration was finished a couple of weeks ago and remains
stable
==== Multimedia ====
* Blocked by:
* Blocking:
* Updates:
** qualifiers for depicts statements now on test-commons, should reach
commons itself later in the week
** next release - other statements, within the next few weeks
==== Parsing ====
* Blocked by:
* Blocking:
* Updates:
==== UI Standardization ====
* Blocked by:
* Blocking:
* Updates:
== Technology ==
=== Analytics ===
* Blocked by:
* Blocking:
* Updates:
** Team is at offsite next week, May 27 through May 31st
** Time selector updated on Wikistats 2 along with other improvements, on
track for beta release by the end of the quarter
** Looking into Apache Kylin for big data OLAP cube creation
** Working on moving reportupdater queries from repositories like
limn-language-data to the analytics-reportupdater-queries repository to
centralize reportupdater usage so we can more easily maintain and help with
queries going forward
=== Cloud Services ===
* Blocked by:
* Blocking:
* Updates:
=== Fundraising Tech ===
* Blocked by:
* Blocking:
* Updates:
** Looking at Perf team's patches to reduce CentralNotice overhead
** CentralNotice banner monitoring
** Dedupe improvements for core CiviCRM
** Anti-fraud improvements
** Better-styled card entry form for main card processor
=== Core Platform ===
* Blocked by:
* Blocking:
* Updates:
** REST routing RFC to TechCom https://phabricator.wikimedia.org/T221177
** Evan, Cindy, Daniel at Wikimedia Hackathon 2019.
** Determining Session TTLs for new session storage
https://phabricator.wikimedia.org/T222990
** RESTBagOStuff for sessions ready for review
https://phabricator.wikimedia.org/T215533
** First action API integration tests in Phester
** RESTBase split, dropping old parsoid cache tables next week
=== Performance ===
* Blocked by:
* Blocking:
* Updates:
** (Peter) Trying out different FPS settings for synthetic performance
analysis in WebPageTest. – https://phabricator.wikimedia.org/T223502
** (Aaron) Improve APCu serialisation performance in preparation of PHP 7
roll out (avoid regression from HHVM).
** Gilles attended TheWebConf 2019 (SF), presenting the results from our
user-perceived performance research. –
https://twitter.com/fullstackjerk/status/1129549644789235712
** Timo attended Wikimedia Hackathon 2019 (Prague).
** Finished perf review for the new "Ajax log out" user interface (for
Audiences Design).
** Continuing to reduce startup cost for page views (collab with
Fundraising, Analytics, WMDE). https://phabricator.wikimedia.org/T209699,
https://phabricator.wikimedia.org/T203696
** Finishing page view perf analysis report from March 2019. Two sub tasks
remaining (collab with Analytics and Growth). –
https://phabricator.wikimedia.org/T219342
=== Release Engineering ===
* Blocked by:
** Growth - operand type was used: expects array(s) or collection(s) in
/srv/mediawiki/wmf-config/flaggedrevs.php on line 182
https://phabricator.wikimedia.org/T224116
* Blocking:
* Updates:
** Introducing the codehealth pipeline beta (by Kosta Harlan)
https://phabricator.wikimedia.org/phame/post/view/160/introducing_the_codeh…
** Train Health
*** Last week: 1.34.0-wmf.5 - https://phabricator.wikimedia.org/T220730
*** This week: 1.34.0-wmf.6 - https://phabricator.wikimedia.org/T220731
**** We're expecting slightly more problems this week; more commits than
usual were created last week because of the Wikimedia Hackathon.
**** Growth - operand type was used: expects array(s) or collection(s) in
/srv/mediawiki/wmf-config/flaggedrevs.php on line 182
https://phabricator.wikimedia.org/T224116
**** Growth - Special:ProblemChanges on several Wiktionary sites show raw
message IDs instead of translated strings
https://phabricator.wikimedia.org/T224124
*** Next week: 1.34.0-wmf.7 - https://phabricator.wikimedia.org/T220732
=== Research ===
* Blocked by:
* Blocking:
* Updates:
=== Scoring Platform ===
* Blocked by:
* Blocking:
* Updates:
=== Search Platform ===
* Blocked by:
* Blocking:
* Updates:
=== Security ===
* Blocked by:
* Blocking:
* Updates:
=== Services ===
* Blocked by:
* Blocking:
* Updates:
=== Site Reliability Engineering ===
* Blocked by:
* Blocking:
** Language: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/506043/
* Updates:
== TechComm ==
* Blocked by:
* Blocking:
* Updates:
** Hosted Office Hour at Hackathon
** No RFC IRC meeting this week
== Wikidata ==
* Blocked by:
* Blocking:
* Updates:
** Many interesting sessions, meetings and loads of hacking time at
Wikimedia Hackathon in Prague
** Working with SRE on deploying SSR service to kubernetes:
https://phabricator.wikimedia.org/T220402
** EntitySchema a.k.a. Shape Expressions live on beta, going live on
wikidata.org next week
== German Technical Wishlist ==
* Blocked by:
* Blocking:
* Updates:
== SoS Meeting Bookkeeping ==
* Blocked by:
* Blocking:
* Updates:
** We're still looking for a backup facilitator. 👀
** Please keep in mind that this page is copy/pasted to a wiki page. Text
on this page will be rendered as wikitext. See this page for some common
formatting errors.
https://www.mediawiki.org/w/index.php?diff=3240805&oldid=3240802&title=Scru…
** If your team is blocking/blocked, please make sure you copy/paste the
note from your team section to the other team's section.
** We're still looking for feedback on whether this meeting is useful. If it is, is
there anything we could do to make it more useful?
*** SoS is really useful to Release Engineering because we are frequently
blocked by various teams because of train. We also frequently block other
teams because we're in charge of continuous integration (CI).
*** Should we make the meeting notes as short as possible, so interested
people could read all of it? One step in that direction could be removing
teams that did not leave any updates.
Greetings,
This is the weekly update from the Search Platform team for the week
starting 2019-05-20.
As always, feedback and questions are welcome.
== Highlights ==
* Most of the team attended a three-day offsite in Prague last week,
and Deb, Erik, Stas, and Trey also attended the Wikimedia Hackathon.
[0]
== Discussions ==
=== Search ===
* At the Hackathon, we hosted a session on "Advanced search syntax for
newbies" [1]—and we had a few in-depth discussions with volunteers
about search, our APIs, etc., and talked more in-depth about Arabic
and Slovak.
** As a result of our discussion, Trey opened a ticket to investigate
the effects of searching without diacritics in Slovak. [2]
* Trey completed a change to the Arabic-language completion suggester
(upper left search box) to make Eastern Arabic Numerals and Western
Arabic Numerals equivalent. [3] It will still take a little while for
the change to be seen on-wiki.
* Stas made a set of preliminary patches to convert the CirrusSearch
extension to extension.json registration (merged); the final conversion
patch is still in review [4]
* David worked on several tasks to create a fallback method based on a
generic index [5]; making fallback methods configurable [6]; and
allowing the FallbackMethod to create their own SearchQuery [7]
* We noticed that multiple Elasticsearch nodes were getting overloaded
in eqiad in April - Erik patched it and found a few things that might
have caused the issues [8]
* When enabling cross cluster search to support multi-instance we had
to run custom scripts to update cluster settings -- and discovered
that the puppet repo was not aware of this; it's fixed now [9]
* Erik did a smorgasbord of fixes: fixed "missing replica" error messages
in production logs by uniquely identifying connections in the connection
pool [10]; created archive indices, deleted archive docs from general
indices, and ignored ancient logging rows with log_page = null [11];
fixed a condition where we received a cirrusSearchElasticaWrite job for
an unwritable cluster, cloudelastic [12]; and documented the
CirrusSearch schema [13].
* During the Hackathon, Erik also exposed CloudElastic to the WMF Cloud [14]
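
To illustrate the numeral equivalence mentioned above for the Arabic
completion suggester, here is a small sketch of folding Eastern Arabic
digits to Western Arabic digits before comparing two strings. This is not
the CirrusSearch change itself, which is not shown in this update; the
function names are hypothetical and only show the general idea.

```python
# Eastern Arabic digits and their Western Arabic counterparts, in order 0-9.
EASTERN_ARABIC_DIGITS = "٠١٢٣٤٥٦٧٨٩"
WESTERN_ARABIC_DIGITS = "0123456789"

# Translation table mapping each Eastern digit to its Western equivalent.
_TO_WESTERN = str.maketrans(EASTERN_ARABIC_DIGITS, WESTERN_ARABIC_DIGITS)


def fold_digits(text):
    """Replace Eastern Arabic digits with Western ones; leave the rest as-is."""
    return text.translate(_TO_WESTERN)


def same_query(a, b):
    """Treat two queries as equal if they differ only in digit style."""
    return fold_digits(a) == fold_digits(b)


print(fold_digits("١٢٣"))
print(same_query("wiki ٢٠١٩", "wiki 2019"))
```

Folding both the indexed text and the query to one digit style means a
search typed with either set of numerals can match the same suggestions.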
=== Wikidata Query Service ===
* At the Hackathon, with the help of Krinkle, the bug with URL
shortener widget being hard to use was fixed [15]
* WDQS bug with label service clauses nested in subqueries being
processed incorrectly was fixed [16]
* Stas fixed breakage in LDF server JSON-LD format [17]
== Did you know? ==
'''Naming Things is Hard, Volume 187:''' The Phab ticket mentioned
above to equate different numeral systems for Arabic-language wikis
uses the names Eastern Arabic Numerals (١٢٣...) and Western Arabic
Numerals (123…). In English, the numerals we usually use (123...) are
often called “Arabic numerals” [18] because they came to Europe from
Arabic sources. In Arabic, the Eastern Arabic Numerals are called
“Indian numerals” [19] because they came from Indian sources. In
English, “Indian numerals” refer to the numerals used in India
(१२३...) but they are just called “Devanagari numerals” in Hindi, for
example. [20] Some have tried to make the subtle distinction in
English that “arabic numerals” are the numerals that came from Arabic
sources (123...), while “Arabic numerals” are the ones that are used
by Arabic speakers (١٢٣...).
It’s also interesting to look at a table of the various related
numeral systems [21] and see the similarities and “false friends”—note
that your fonts may vary: Devanagari 7 looks like a 6 (“७”), Arabic 6
looks like a 7 (“٦”), Gujarati 5 looks like a 4 (“૫”), Bengali 4 looks
like an 8 (“৪”), Gurmukhi 1 looks like a 9 (“੧”), etc. But any of
those systems are MMMDCCXXIV times better than Roman numerals! [22]
[0] https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2019
[1] https://phabricator.wikimedia.org/T216740
[2] https://phabricator.wikimedia.org/T223787
[3] https://phabricator.wikimedia.org/T117217
[4] https://phabricator.wikimedia.org/T87892
[5] https://phabricator.wikimedia.org/T222652
[6] https://phabricator.wikimedia.org/T222152
[7] https://phabricator.wikimedia.org/T221621
[8] https://phabricator.wikimedia.org/T220901
[9] https://phabricator.wikimedia.org/T218932
[10] https://phabricator.wikimedia.org/T222819
[11] https://phabricator.wikimedia.org/T222641
[12] https://phabricator.wikimedia.org/T222307
[13] https://phabricator.wikimedia.org/T220547
[14] https://phabricator.wikimedia.org/T223519
[15] https://phabricator.wikimedia.org/T221127
[16] https://phabricator.wikimedia.org/T153353
[17] https://phabricator.wikimedia.org/T222471
[18] https://en.wikipedia.org/wiki/Arabic_numerals
[19] https://ar.wikipedia.org/wiki/أرقام_هندية
[20] https://hi.wikipedia.org/wiki/देवनागरी_अंक
[21] https://en.wikipedia.org/wiki/Hindu–Arabic_numeral_system#Glyph_comparison
[22] https://en.wikipedia.org/wiki/Roman_numerals
----
Subscribe to receive on-wiki (or opt-in email) notifications of the
Discovery weekly update.
https://www.mediawiki.org/wiki/Newsletter:Discovery_Weekly
The archive of all past updates can be found on MediaWiki.org:
https://www.mediawiki.org/wiki/Discovery/Status_updates
Interested in getting involved? See tasks marked as "Easy" or
"Volunteer needed" in Phabricator.
[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R
Yours,
Chris Koerner (he/him)
Community Relations Specialist
Wikimedia Foundation