Hi all!
This is a quick reminder that tonight, at the TechCom IRC hour, we will be
talking about the job queue. There have been several issues with it lately, and
we want to make sure that we have all relevant aspects on the radar.
As always, the discussion will take place in the IRC channel
#wikimedia-office on Wednesday 21:00 UTC (2pm PDT, 23:00 CEST).
This is not an RFC meeting, as there is no concrete proposal. Rather, it's an
opportunity to further our understanding of the problems at hand, and to float
ideas for possible improvements.
I have prepared a quick brain dump of my current understanding of the job queue
issues <https://www.mediawiki.org/wiki/User:Daniel_Kinzler_(WMDE)/Job_Queue>.
Here's a copy for your convenience, but please comment directly in the document.
Observations:
* Latest instance of the JQ exploding: https://phabricator.wikimedia.org/T173710
* With 600k jobs in the backlog of commonswiki, only 7k got processed in a day.
* For wikis with just a few thousand pages, we sometimes see millions of
UpdateHtmlCache jobs sitting in the queue.
* Jobs that were triggered months ago were found to continue failing and re-trying.
Issues and considerations:
* Jobs re-trying indefinitely
* Deduplication
** mechanism is obscure/undocumented. Some rely on rootJob parameters, some use
custom logic.
** Batching prevents deduplication. When and how should jobs do batch
operations? Can we automatically break up small batches?
** Delaying jobs may improve deduplication, but support for delayed jobs is
limited/obscure.
** Custom coalescing could improve the chance for deduplication.
* Scope and purpose of some jobs is unclear. E.g. UpdateHtmlCache invalidates
the parser cache, and RefreshLinks re-parses the page - but does not trigger an
UpdateHtmlCache, which it probably should.
* The throttling mechanism does not take into account the nature and run-time of
different job types.
* Scaling is achieved by running more cron jobs.
* Kafka-based JQ is being tested by Services. Generally saner. Should improve
ability to track causality (which job got triggered by which other job). T157088
* No support for recurrent jobs. Should we keep using cron?
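To make the deduplication and retry points above a bit more concrete, here is a minimal sketch in Python. It is not how the MediaWiki job queue is implemented; the names `DedupQueue`, `job_signature` and `MAX_ATTEMPTS` are hypothetical and exist only for illustration. It assumes that two jobs are duplicates when their type and parameters are identical, and that a job is dropped after a fixed number of failed attempts instead of re-trying indefinitely.

```python
import hashlib
import json
from collections import OrderedDict

MAX_ATTEMPTS = 3  # hypothetical cap; the real queue's policy may differ


def job_signature(job_type, params):
    """Build a stable hash from job type and parameters (assumed dedup key)."""
    payload = json.dumps({"type": job_type, "params": params}, sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


class DedupQueue:
    """In-memory FIFO queue that ignores jobs identical to a pending one."""

    def __init__(self):
        self._pending = OrderedDict()  # signature -> job record

    def push(self, job_type, params):
        """Enqueue a job; return False if an identical job is already pending."""
        sig = job_signature(job_type, params)
        if sig in self._pending:
            return False
        self._pending[sig] = {"type": job_type, "params": params, "attempts": 0}
        return True

    def run_all(self, handler):
        """Run pending jobs; re-queue failures until MAX_ATTEMPTS is reached.

        Returns the list of jobs that were given up on.
        """
        dropped = []
        while self._pending:
            sig, job = self._pending.popitem(last=False)  # oldest first
            job["attempts"] += 1
            try:
                handler(job["type"], job["params"])
            except Exception:
                if job["attempts"] < MAX_ATTEMPTS:
                    self._pending[sig] = job  # back of the queue
                else:
                    dropped.append(job)  # stop re-trying, report instead
        return dropped

    def __len__(self):
        return len(self._pending)
```

Note that a batched job (one job carrying many pages) would get a different signature than the single-page jobs it overlaps with, which is one way to see why batching gets in the way of deduplication.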
--
Daniel Kinzler
Principal Platform Engineer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
Sorry for cross-posting!
Reminder: Technical Advice IRC meeting again **today 3-4 pm UTC** on
#wikimedia-tech.
The Technical Advice IRC meeting is open for all volunteer developers,
topics and questions. This can be anything from "how to get started" and
"who would be the best contact for X" to specific questions on your project.
If you know already what you would like to discuss or ask, please add your
topic to the next meeting:
https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
This meeting is an offer by WMDE’s tech team. Hosts of today's meeting are:
@addshore & @Tobi_WMDE_SW.
Hope to see you there!
Michi (for WMDE’s tech team)
--
Michael F. Schönitzer
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
http://wikimedia.de
Imagine a world in which every single human being can freely share in the
sum of all knowledge. Help us make it happen!
http://spenden.wikimedia.de/
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hi there,
I've got questions about Wiki Farm setup. I'm not sure if this is the right
place to ask; if there is a better channel, feel free to say so.
The Wiki family page [1] says:
> It is recommended to use a different DB for each wiki (By setting a
> different $wgDBname
> <https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:$wgDBname> for
> each wiki). However if you are limited to a single database, you can use a
> different prefix ($wgDBprefix
> <https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:$wgDBprefix>)
> to separate the different installs.
>
But it doesn't explain why this is recommended. I'd like to know if there is
any downside to using the prefix strategy (performance, upgrades, etc.).
Regards,
Manuel
[1] https://www.mediawiki.org/wiki/Manual:Wiki_family
Hi,
Occasionally someone will unintentionally create a new file or modify an
existing one to set the executable bit (+x). This is almost always an
accident, and usually someone will come along later and fix them en
masse[1][2].
With help from Anomie, I've written a tool, MinusX[3], that will search
for executable files that shouldn't be, and optionally fix them.
It's written in PHP and should run as part of "composer test", but
operates on all types of files, not just PHP.
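For readers unfamiliar with the idea, the check can be sketched roughly as follows. This is an illustrative Python sketch, not MinusX itself (which is written in PHP), and the function names are hypothetical. It assumes that a file is considered legitimately executable when it starts with a shebang (`#!`), and that `.git` directories are skipped; the real tool's rules and configuration may differ.

```python
import os
import stat

# Executable bits for user, group and others.
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def has_shebang(path):
    """Return True if the file starts with '#!' (assumed sign of a real script)."""
    with open(path, "rb") as f:
        return f.read(2) == b"#!"


def find_unexpected_executables(root):
    """List files under root that are marked +x but do not look like scripts."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            mode = os.stat(path).st_mode
            if mode & EXEC_BITS and not has_shebang(path):
                hits.append(path)
    return sorted(hits)


def fix(paths):
    """Clear the executable bits on the given files, keeping other permissions."""
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode & ~EXEC_BITS)
```

Calling `fix(find_unexpected_executables("."))` on a POSIX system would clear the stray bits in the current directory tree; running only the finder gives a report without changing anything.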
I've proposed adding this tool to all repositories[4] in an automated
manner.
If you have any suggestions/feature requests/bugs, feel free to create a
ticket in Phabricator or reply here.
[1] https://phabricator.wikimedia.org/T168659
[2] https://phabricator.wikimedia.org/P5913
[3] https://www.mediawiki.org/wiki/MinusX
[4] https://phabricator.wikimedia.org/T175794
Thanks,
-- Legoktm
Hi all!
Find below the minutes of the last meeting of the Technical Committee.
* PDF generation for Extension:Collection to be exposed as a stand-alone service
<https://phabricator.wikimedia.org/T171965>
* Quiddity et al. have laid out best practices for Special Interest Groups
(SIGs) <https://www.mediawiki.org/wiki/Special_Interest_Groups> (also see the
discussion on the talk page). TechCom sees no direct connection to its current
operation, but is looking forward to exploring different modes of collaboration
with and between such groups.
* Next week’s IRC discussion: Job Queue - the good, the bad, the ugly.
<https://www.mediawiki.org/wiki/User:Daniel_Kinzler_(WMDE)/Job_Queue>. There are
several interlocking issues to discuss. We hope the conversation will surface
the most pressing issues, and perhaps produce new ideas for solutions.
As always, the discussion will take place in the IRC channel
#wikimedia-office on Wednesday 21:00 UTC (2pm PDT, 23:00 CEST).
You can also find our meeting minutes at
<https://www.mediawiki.org/wiki/Wikimedia_Technical_Committee/Minutes>
See also the TechCom RFC board
<https://phabricator.wikimedia.org/tag/mediawiki-rfcs/>.
--
Daniel Kinzler
Principal Platform Engineer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
A blog post about the Wikimedia Cloud Services team and the products
they help maintain is live on the Wikimedia blog:
<https://blog.wikimedia.org/2017/09/11/introducing-wikimedia-cloud-services/>
The post talks a bit about why we formed the Wikimedia Cloud Services
team and what the purpose of the product rebranding we have been
working on is. It also gives a shout out to a very small number of the
Toolforge tools and Cloud VPS projects that the Wikimedia technical
community make. I wish I could have named them all, but there are just
too many!
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services Boise, ID USA
irc: bd808 v:415.839.6885 x6855
Event: Wikimedia Developer Summit 2018
Dates: January 22 & 23, 2018
Location: San Francisco, California, USA
Details: The Wikimedia Developer Summit 2018 (WMDS 18) is an invitation
only event hosted by the Wikimedia Foundation. The format of the WMDS 18
will be discussions and conversations with stated goals and actionable
outcomes regarding the future Wikimedia Tech strategy. Participants will
submit position statements to the event organizers
<https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit#Organizing_team>,
which will be anonymised and ranked by our Program Committee
<https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit#Program_Committee>.
Invitations will be extended to the top 40 position statements; the 10
remaining invitations will be chosen by the program committee to ensure
diversity at the event in terms of technical background, organizations /
projects represented, and regional representation.
The Wikimedia Foundation has been hosting invitation-only, strategic,
technical events yearly beginning in 2012 starting with the Architecture
Summit. This event has changed names and grown over time until the
Wikimedia Developer Summit 2017. We are experimenting with a new format
this year, and will collect feedback and make changes for the future based
on that feedback.
Next steps for interested prospective attendees:
1. Review the Wikimedia Developer Summit 2018 event page
<https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit>
2. Review Thematic Overview
<https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit#Thematic_Overview>
3. Spend some time thinking about how you believe Wikimedia Tech should
move forward into the future
4. Submit Position Statement
<https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit#Position_Papers_.…>
*Position Statements can be submitted between Friday, September 8th and
Friday, September 29th.*
We are looking forward to reviewing your statements!
On Fri, Aug 25, 2017 at 3:59 PM, Victoria Coleman <vcoleman(a)wikimedia.org>
wrote:
> Hi everyone,
>
> I wanted to give you all a heads up about the upcoming Dev Summit. This
> year the Summit will be held in San Francisco on January 22nd and 23rd,
> 2018. We are still finalizing the details and will be sending out the call
> for participation soon. But meantime, we wanted to share a preview of the
> game plan with you so that you can hold the dates and begin to think about
> ways of participating.
>
> This has been a year of strategy making for the Foundation and our
> communities. As the way forward becomes clearer, we, the technology
> community entrusted with delivering the products and infrastructure for
> supporting the community vision, need to reflect on what the movement
> strategy means for us and how to best prepare, plan and execute that
> support. This year, the Developer Summit is dedicated to this reflection.
> We invite technologists, managers and users to study, reflect and propose
> ways to support the strategic vision we are committed to. We would like you
> to capture your thoughts in a short position statement and join the
> conversation.
>
> Specifically, we invite you to think about ways of imagining, creating,
> planning, building and maintaining the technology foundation needed to
> enable the key tenets of our strategy:
>
> The infrastructure for open: We will empower individuals and institutions
> to participate and share, through open standards, platforms, and datasets.
> We will host, broker, share, and exchange free knowledge across
> institutions and communities. We will be a leading advocate and partner for
> increasing the creation, curation, and dissemination in free and open
> knowledge.
>
> An encyclopedia, and so much more: We will adapt to our changing world to
> offer knowledge in the most effective ways, across digital formats,
> devices, and experiences. We will adapt our communities and technology to
> the needs of the people we serve. As we include other forms of free
> knowledge, we will aim for these projects to be as successful as Wikipedia.
>
> Reliable, relevant information: We will continue our commitment to
> providing useful information that is reliable, accurate, and relevant to
> users. We will integrate technologies that support accuracy at scale and
> enable greater insight into how knowledge is produced and shared. We will
> embrace the effort of increasing the quality, depth, breadth, and diversity
> of free knowledge, in all forms.
> This direction poses key questions for our technical community. Here are
> some example topics we would welcome ideas and discussion in:
>
>
> - How do we maintain and grow the technical community and ready it for
> the mission ahead?
> - What should the role of open source be in the next 15 years of the
> movement? How does it help or hinder? How do we promote it or adapt it? How
> do we leverage it?
> - What are the foundational building blocks for the language
> technologies we will need in order to be present everywhere where there are
> people?
> - Scaling. What tools do we need as the movement and the community
> grow?
> - What are the implications of the strategic direction for our
> infrastructure? Do we have any key gaps in this infrastructure? How ready
> is our infrastructure for what is to come?
> - How should MediaWiki evolve to support the mission?
> - What technologies are necessary for embracing mobility?
> - We operate in parts of the world where access to free knowledge is
> blocked, hindered or plain dangerous. What tools do we need to support
> these at-risk communities?
> - How and with whom should we partner to create the technologies
> needed to support the mission?
> - How can we leverage machine learning and analytics to support the
> mission and our communities?
> - What are emerging trends in technology that will impact our mission
> in the next 5-10-15 years?
>
>
> These conversations will be invaluable input to the next phase of the
> strategy process as we shift from exploration to definition to execution.
> We are energized, excited and hopeful for a great set of thoughtful,
> impactful conversations.
>
> As we embark on this journey we want to have an open but focused dialog so
> we are aiming for a smaller participant cohort than previous Dev Summits.
> We want to encourage everyone to consider these questions and put forward
> ideas in the form of a short position paper or abstract. We are selecting a
> Program Committee that consists of respected technologists and best
> represents the diversity of our communities. The Program Committee will
> screen and evaluate the position papers in a blind review process and will
> select those that best fit the strategic intent of this Summit. The authors
> will then be invited to participate. We hope to attract those within our
> community who are passionate about the future, hold a point of view and
> have concrete ideas for how we best use technology to support the
> objectives of the movement through 2030. We will bring the ideas and
> learnings from the Summit to the broader technology community during the
> upcoming hackathons and related events in the tech calendar.
>
> So stay tuned for the Call for Participation! Looking forward to seeing
> you in San Francisco!
>
>
>
> Victoria & the TechCom
--
Rachel Farrand
Events Program Manager
Technical Collaboration Team
Wikimedia Foundation