On 05/01/2014 06:36 AM, Brad Jorsch (Anomie) wrote:
On Wed, Apr 30, 2014 at 3:44 PM, Dan Garry <dgarry@wikimedia.org mailto:dgarry@wikimedia.org> wrote:
The meeting will be on Tuesday 6th May at 4pm.
I'm assuming that's SF time? ;)
Yeah. I won't be able to attend then as I'll be in Europe.
Regarding the job queue, another item of note is that there's a general perception on enwiki that the job queue is either unreliable or so slow as to be useless when it comes to updating the links tables (e.g. categorylinks) when time-related parser functions are used (e.g. using #if to test whether the current date is past a threshold). So people run bots to do forcelinkupdate purges.
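For context, a forcelinkupdate purge goes through the MediaWiki action API (action=purge). A minimal sketch of what such a bot's request parameters might look like; the page titles here are illustrative, not from this thread:

```python
def build_purge_request(titles):
    """Build POST parameters for action=purge with forcelinkupdate,
    which re-renders the given pages and rebuilds their links tables
    (categorylinks etc.) instead of waiting for the job queue."""
    return {
        "action": "purge",
        "forcelinkupdate": "1",  # also refresh the links tables
        "titles": "|".join(titles),  # pipe-separated, per API convention
        "format": "json",
    }

params = build_purge_request(["Template:Example", "Some article"])
```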
There are also occasional complaints that links tables aren't being updated in a timely manner after edits to templates: pages still aren't updated weeks later, and people reply that the job queue is just really slow. Then usually someone does null edits on the pages transcluding the template.
I don't know if either of those issues are related or otherwise in-scope, but I mention them Just In Case.
I don't think either is directly the job queue's fault. There is some logic to skip large template updates (more than 200k uses, IIRC), which might create the impression that the job queue is unreliable. This logic is only there because processing such large template updates (up to 8 million uses on enwiki) is very slow.
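The guard described above amounts to a simple threshold check on the transclusion count. A hedged sketch, with the function name and exact semantics being assumptions on my part:

```python
SKIP_THRESHOLD = 200_000  # "more than 200k uses IIRC"

def should_enqueue_refresh(transclusion_count, threshold=SKIP_THRESHOLD):
    """Return True if a template edit should enqueue link-refresh jobs
    for all transcluding pages. False means the update is skipped and
    the links tables silently go stale, which is why the job queue
    looks 'unreliable' from the outside."""
    return transclusion_count <= threshold
```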
The reason is that a template update currently triggers a re-render of each entire page that uses it. With information about which templates were involved in the expansion of a given transclusion fragment, re-rendering just those fragments & updating the relevant link tables would be much more efficient and thus faster.
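The fragment-level dependency tracking described above can be sketched as an inverted index from template to the fragments that expanded it; all names here are illustrative, not MediaWiki internals:

```python
from collections import defaultdict

class FragmentIndex:
    """Records which templates each rendered fragment depends on, so a
    template edit re-renders only the affected fragments rather than
    every full page that transcludes the template."""

    def __init__(self):
        # template name -> set of (page, fragment_id) pairs
        self.deps = defaultdict(set)

    def record(self, page, fragment_id, templates):
        """Called after rendering a fragment, with the templates that
        were expanded inside it."""
        for t in templates:
            self.deps[t].add((page, fragment_id))

    def fragments_to_rerender(self, template):
        """On a template edit, return only the fragments that used it."""
        return sorted(self.deps[template])

idx = FragmentIndex()
idx.record("PageA", 1, ["Template:T1", "Template:T2"])
idx.record("PageB", 2, ["Template:T1"])
affected = idx.fragments_to_rerender("Template:T1")
```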
Re the missing time-based invalidation: fragment TTLs are currently not tracked in MediaWiki. The only option for dynamic content is to disable (or drastically shorten) caching for the entire page. AFAIK we don't do this for performance reasons, so time-based templates go stale & need to be manually purged. We have plans to track TTLs per transclusion and extension in Parsoid, but haven't implemented this yet. That will let us re-render only the timed-out transclusion/extension, which is much more efficient than re-rendering the entire page from scratch.
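Per-transclusion TTL tracking as described above boils down to finding the fragments whose TTL has elapsed. A minimal sketch, assuming each fragment stores its render time and an optional TTL (the data layout is my assumption, not Parsoid's design):

```python
import time

def expired_fragments(fragments, now=None):
    """fragments: dict of fragment_id -> (rendered_at, ttl_seconds or None).
    A TTL of None means 'cache indefinitely'. Returns the ids whose TTL
    has elapsed; only these need re-rendering, not the whole page."""
    now = time.time() if now is None else now
    return [
        fid for fid, (rendered_at, ttl) in fragments.items()
        if ttl is not None and now - rendered_at >= ttl
    ]

frags = {"intro": (0, 10), "infobox": (0, None), "footer": (5, 100)}
stale = expired_fragments(frags, now=20)
```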
So I'd propose to focus on tweaking the queue length monitoring for now. The main job processor issues are already on the roadmap of other teams (Parsoid for example), and replacing the job queue itself with something more reliable & scalable (Kafka?) is probably not the highest priority at this point.
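One way to tweak the queue length monitoring so it tolerates normal bursts but flags a genuine backlog is to alert only on a sustained breach rather than a single spike. A hedged sketch; the thresholds and window are illustrative, not an actual Wikimedia alerting rule:

```python
def backlog_alert(samples, threshold, window):
    """samples: recent queue-length readings, oldest first.
    Alert only if the last `window` samples are all above `threshold`,
    i.e. the backlog is sustained rather than a transient spike."""
    if len(samples) < window:
        return False
    return all(s > threshold for s in samples[-window:])
```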
Gabriel