I assumed the same, but it is better to be explicit about those assumptions :)
On Thu, Sep 22, 2016 at 4:27 PM, Alex Monk <amonk(a)wikimedia.org> wrote:
> I had been assuming that puppetised crons were not really relevant...
>
> On 22 September 2016 at 15:19, Guillaume Lederrey <glederrey(a)wikimedia.org>
> wrote:
>>
>> Hello!
>>
>> Increasing visibility sounds like a great idea! How far do we want to
>> go in that direction? In particular, I'm thinking of a few of the
>> crons we have for Cirrus. For example, we do have daily crons on
>> terbium that re-generate the suggester indices. Those can run for >
>> 1h.
>>
>> My understanding is that those kind of crons should not be considered
>> scripts, but standard working parts of the system. Adding them will
>> probably generate more noise than useful information. Is this a
>> reasonable understanding?
>>
>> Thanks!
>>
>> Guillaume
>>
>>
>>
>> On Wed, Sep 21, 2016 at 12:29 AM, Greg Grossmeier <greg(a)wikimedia.org>
>> wrote:
>> > In an effort to reduce surprises and potential mishaps it is now
>> > required to include any long running tasks in the deployment
>> > calendar[0].
>> >
>> > "Long running tasks" include any script that is run on production 'work
>> > machines' such as terbium that last for longer than ~1 hour. Think:
>> > migration and maintenance scripts.
>> >
>> > This was discussed and proposed in T144661[1].
>> >
>> > Best,
>> >
>> > Greg
>> >
>> > [0] https://wikitech.wikimedia.org/wiki/Deployments
>> > Relevant diff:
>> > https://wikitech.wikimedia.org/w/index.php?diff=850923&oldid=850244
>> > [1] https://phabricator.wikimedia.org/T144661
>> >
>> > --
>> > | Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E |
>> > | Release Team Manager A18D 1138 8E47 FAC8 1C7D |
>> >
>> > _______________________________________________
>> > Engineering mailing list
>> > Engineering(a)lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/engineering
>> >
>>
>>
>>
>> --
>> Guillaume Lederrey
>> Operations Engineer, Discovery
>> Wikimedia Foundation
>> UTC+2 / CEST
>>
>> _______________________________________________
>> Wikitech-l mailing list
>> Wikitech-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>
>
>
> --
> Alex Monk
> VisualEditor/Editing team
> https://wikimediafoundation.org/wiki/User:Krenair_(WMF)
>
> _______________________________________________
> Engineering mailing list
> Engineering(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/engineering
>
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
UTC+2 / CEST
Let me clarify the reasoning for the idea:
We realized that some schema changes (which used to be scheduled like other
deployments) no longer take 1 hour (they can take 1 month, running
continuously like https://phabricator.wikimedia.org/T139090 , because it
affects 3 of our largest tables). Also, they no longer require read-only
mode or affect code in any way (unless they are a prerequisite).
On the other hand, a schema change, combined with high read or write load
from long-running maintenance jobs, like those of the updateCollation
script, or any other (those were just examples), could potentially make
lag a worse problem: a single transaction has to store pending changes
during its lifetime, and long-running reads can block and create pileups due
to metadata locking. We want to avoid those, which have certainly caused
infrastructure issues in the past.
So, in summary, regular deployments are mutually exclusive.
Long-running maintenance tasks could affect each other. This is a way for me
(and others) to have visibility of those potential negative interactions,
and make sure we can coordinate: "You are doing work on enwiki? No problem,
we will just run this task for commons". "You need to do an emergency data
recovery? I will postpone this other task that can wait". Even if only
DBAs use it, it is already useful for not performing incompatible changes at
the same time. But it will be even more useful if everybody uses it!
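To make the coordination idea concrete, here is a minimal sketch of how calendar entries could be checked for potentially conflicting work. It is only an illustration: the `Task` record, the `conflicts` helper, and the dates are all hypothetical, and the deployment calendar itself is a wiki page, not a data structure like this one.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations


@dataclass(frozen=True)
class Task:
    """Hypothetical calendar entry: what runs, on which wiki, and when."""
    name: str
    wiki: str
    start: datetime
    end: datetime


def conflicts(tasks):
    """Return name pairs of tasks on the same wiki whose time windows overlap."""
    found = []
    for a, b in combinations(tasks, 2):
        same_wiki = a.wiki == b.wiki
        # Two half-open intervals overlap when each starts before the other ends.
        overlap = a.start < b.end and b.start < a.end
        if same_wiki and overlap:
            found.append((a.name, b.name))
    return found


# Illustrative entries only; the dates are made up.
tasks = [
    Task("schema change", "enwiki", datetime(2016, 9, 1), datetime(2016, 10, 1)),
    Task("updateCollation", "enwiki", datetime(2016, 9, 20), datetime(2016, 9, 21)),
    Task("data recovery", "commonswiki", datetime(2016, 9, 20), datetime(2016, 9, 22)),
]

print(conflicts(tasks))  # [('schema change', 'updateCollation')]
```

In this sketch the task on commons is not flagged, which matches the "we will just run this task for commons" kind of coordination described above.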
--
Jaime Crespo
<http://wikimedia.org>
In an effort to reduce surprises and potential mishaps it is now
required to include any long running tasks in the deployment
calendar[0].
"Long running tasks" include any script that is run on production 'work
machines' such as terbium that last for longer than ~1 hour. Think:
migration and maintenance scripts.
This was discussed and proposed in T144661[1].
Best,
Greg
[0] https://wikitech.wikimedia.org/wiki/Deployments
Relevant diff:
https://wikitech.wikimedia.org/w/index.php?diff=850923&oldid=850244
[1] https://phabricator.wikimedia.org/T144661
--
| Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager A18D 1138 8E47 FAC8 1C7D |
Due to the Wikimedia Technical Operations Team having their team offsite
that week and generally being less available than usual, there will be
no non-emergency deploys the week of September 26th (aka: next week).
See also:
https://wikitech.wikimedia.org/wiki/Deployments#Week_of_September_26th
Normal schedule will resume the following week.
Teaser: The week of October 17th will be a "no train deploys" week as
the Wikimedia Release Engineering Team will be at their offsite that
week. I'll send a reminder about this as well closer to that date.
Best,
Greg
--
| Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager A18D 1138 8E47 FAC8 1C7D |
One of the applications of ORES [1] is determining article quality, for
example, what the best assessment of an article at a given revision would be.
Users in WikiProjects use ORES data to check whether articles need
re-assessment, e.g. if an article is at "Start" level and is now good
enough to be a "B" article.
As part of Q4 goals, we made a dataset of article quality scores of all
articles in English Wikipedia [2] (here's the link to download the dataset
[3]), and we are publishing it on figshare as something you can cite [4].
We are also working on publishing monthly data so researchers can track
how article quality changes over time [5].
As a pet project of mine, I always wanted to put these data in a database
so we can query it and get much more useful data, for example the
quality of articles in the category 'History_of_Essex' [6] [7]. The weighted
sum is a measure of quality: a decimal number between 0 (really a stub)
and 5 (definitely a featured article). We also have a prediction column, which
is a number in this map [8]; for example, if the prediction is 5, it means ORES
thinks it should be a featured article.
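As a rough sketch of how the two columns relate, the weighted sum can be read as the probability of each quality class multiplied by that class's number, then added up. The class-to-number map below is an assumption based on the English Wikipedia assessment scale (Stub=0 up to FA=5), not copied from [8], and the probabilities are invented for illustration.

```python
# Assumed class order; the authoritative map is the one linked in [8].
WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}


def weighted_sum(probabilities):
    """Sum of class probability times class number, between 0 and 5."""
    return sum(WEIGHTS[label] * p for label, p in probabilities.items())


def prediction(probabilities):
    """Number of the most probable class."""
    best = max(probabilities, key=probabilities.get)
    return WEIGHTS[best]


# Made-up probabilities for one article revision.
example = {"Stub": 0.1, "Start": 0.2, "C": 0.4, "B": 0.2, "GA": 0.05, "FA": 0.05}

print(round(weighted_sum(example), 2))
print(prediction(example))
```

The weighted sum is finer-grained than the prediction: two articles can both be predicted as "C" while one sits much closer to "B" than the other.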
I leave more use cases to your imagination :)
I'm looking for a more permanent place to put these data; please tell me if
it's useful for you.
[1] ORES is not an anti-vandalism tool; it's an infrastructure for using AI in
Wikipedia.
[2] https://phabricator.wikimedia.org/T135684
[3] (117 MBs)
https://datasets.wikimedia.org/public-datasets/enwiki/article_quality/wp10-…
[4] https://phabricator.wikimedia.org/T145332
[5] https://phabricator.wikimedia.org/T145655
[6] https://quarry.wmflabs.org/query/12647
[7] https://quarry.wmflabs.org/query/12662
[8]
https://github.com/wiki-ai/wikiclass/blob/3ff2f6c44c52905c7202515c5c8b525fb…
Have fun!
Amir
Hi,
As the subject line hints, we're doing something a little different with
this release. There was a lot of work done on MW-CS over the summer by
our GSoC student, Lethexie.
There are a significant number of changes, especially to do with
documentation comments. So we are releasing this as alpha quality to ask
for more feedback on the changes to the ruleset. You can upgrade in the
same manner as normal - by updating the version number in the
composer.json file.
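For anyone who prefers to script the upgrade, here is a small sketch that rewrites the version constraint in a composer.json document. It assumes the package is listed under "require-dev" as mediawiki/mediawiki-codesniffer; the helper name and the version strings are placeholders, not the actual release numbers.

```python
import json


def set_dev_requirement(composer_text, package, version):
    """Return composer.json text with `package` pinned to `version` in require-dev."""
    data = json.loads(composer_text)
    data.setdefault("require-dev", {})[package] = version
    return json.dumps(data, indent=4) + "\n"


# Placeholder versions, for illustration only.
before = '{"require-dev": {"mediawiki/mediawiki-codesniffer": "0.0.1"}}'
after = set_dev_requirement(before, "mediawiki/mediawiki-codesniffer", "0.0.2-alpha")
print(after)
```

After changing the file, composer still has to be run to actually fetch the new version.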
Feel free to provide feedback by responding to this thread or by filing
a bug in the MediaWiki-Codesniffer Phabricator project. We would like to
release 0.8.0 by mid-October.
Thanks!
-- Legoktm
Hello,
We are holding the deployment of MediaWiki version 1.28.0-wmf.19 due to
a couple of bugs that have surfaced.
The first one is that renaming a user was blocked [T145596]. Reported by
K6ka, triaged by MarcoAurelio. The issue is fixed now thanks to Aaron
Schulz and Kunal Mehta.
The second blocker is way nastier. I pushed the upgrade to group1.
matanya (a long-time volunteer with a lot of technical patches) immediately
reported that the infoboxes on the Hebrew Wiki were on the wrong side, which
prompted a rollback [T145673].
Timo Tijhof and Kunal Mehta have found the root cause. I can speak for
them as to who/when we will get a solution.
For now, I am holding the train. Will reassess tomorrow and ideally
push group1 at 19:00 UTC then follow with group2 at 20:00 UTC.
Stay tuned!
Up-to-date status:
https://tools.wmflabs.org/versions/
MW-1.28.0-wmf.19 deployment blockers
https://phabricator.wikimedia.org/T143328
[T145596] Renames getting stuck on mediawiki.org (Sept 13, 2016)
https://phabricator.wikimedia.org/T145596
[T145673] 1.28.0-wmf.19 broke template RTL placement
https://phabricator.wikimedia.org/T145673
--
Antoine "hashar" Musso
In 2015, a phabricator task [0] and RfC discussion on meta [1] were
started to create a process for determining when a tool has been
abandoned by its original maintainer(s) and how to hand control of
the tool over to interested volunteers. The process stalled out
without resolution.
Our on-wiki communities are still highly dependent on volunteer
developed tools and vulnerable to disruption when the original
developers move on. I have drafted two straw dog [2] policies that
attempt to define fair and workable solutions to the general problem.
The proposals take two different but compatible approaches to solving
the problem of abandonment. The Tool Labs developer community could
choose to adopt either or both policies as protection for the
communities that they serve.
The first policy describes a *right to fork* for all Tool Labs hosted
software. This policy clarifies the existing Tool Labs Open Source and
Open Data requirements and defines a process for requesting access to
code and data that are not already published publicly.
The second policy is a more aggressive *abandoned tool policy* that
describes a process for adding new maintainers to a tool account
(adoption) with a future possibility of removing the original
maintainers (usurpation). This policy is based primarily on the
discussions that happened on Meta in 2015.
Both policies propose creating a new committee of volunteers to
evaluate requests and perform cleanup of sensitive data in the tool
before providing the source code or direct access to the tool account.
This provision is key to actually implementing both proposals. Paid
administration and management does not scale any better than paid
editing. To continue to grow and thrive, the Tool Labs developer
community needs to become more active in enforcing and expanding their
own policies. Membership in the committees would require signing the
Wikimedia Foundation's Volunteer NDA [3] to ensure that sensitive data
is handled appropriately. If both policies are adopted the two
committees should be collapsed into a single group with authority to
handle both types of requests.
The straw dog policies are posted on Wikitech:
* https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Right_to_fork_policy
* https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Abandoned_tool_policy
Discussion of the particulars of each proposal should happen on their
associated talk pages. As an example it would be appropriate to debate
whether the 14-day non-functional waiting period is too short or too
long on the Abandoned tool policy talk page. Discussion of the process
in general can happen on Meta [1].
I would like discussion to remain open through *2016-10-12* (3 weeks
from date of posting). Following the discussion period I hope to call
for an approval vote of some sort to make the policies official.
Wikitech and Tool Labs do not currently have well defined policies for
establishing consensus, but I'm sure we can collectively come up with
something reasonable.
[0]: https://phabricator.wikimedia.org/T87730
[1]: https://meta.wikimedia.org/wiki/Requests_for_comment/Abandoned_Labs_tools
[2]: https://en.wikipedia.org/wiki/Straw_man_proposal
[3]: https://wikitech.wikimedia.org/wiki/Volunteer_NDA
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Sr Software Engineer Boise, ID USA
irc: bd808 v:415.839.6885 x6855
Scott suggested the following as one of three suggested topic ideas
for WikiDev17. The three ideas:
1) Collaboration
2) Wikitext Maintenance
3) Machine Translation
More inline about "1) Collaboration" below:
On Tue, Sep 20, 2016 at 10:05 AM, C. Scott Ananian
<cananian(a)wikimedia.org> wrote:
> *1. *(A unified vision for) *Collaboration*
>
> - Real-time collaboration (not just editing, but chatting, curation,
> patrolling)
> - WikiProject enhancements: User groups, finding people to work with,
> making these first class DB concepts
> - Civility/diversity/inclusiveness, mechanisms to handle/prevent
> harassment, vandalism, trolling while working together
> - Real-time reading -- watching edits occur in real time
> - Integration with WikiEdu
> - Broadening notion of "an edit" in DB -- multiple contributors,
> possibly multiple levels of granularity
> - Tip-toeing toward "draft"/"merge" models of editing
> - Better diff tools: refreshed non-wikitext UX, timelines, authorship
> maps, etc.
I've copied this wholesale into the "Collaboration" area on
[[WikiDev17/Topic ideas]], and quoted it directly here:
<https://www.mediawiki.org/wiki/Topic:Tbypptt9myumu7q7>
Let's use this thread to focus on this part of Scott's proposal. A
lot of these seem to be in scope for the Wikimedia Collaboration team.
Does the scope that you're thinking of align with what the team has
published on their page:
<https://www.mediawiki.org/wiki/Collaboration>
Rob
(p.s. please feel free to start separate threads with the other parts
of Scott's proposal)