I've implemented a new feature, which allows users to automatically fix double redirects which are created when they move a page. It's live now on Wikimedia projects.
On the page move form, a checkbox is shown, checked by default, labelled "update any redirects which point to the original title". Jobs are inserted into the job queue in order to perform the edits. The edits are performed by a user called "Redirect fixer", or the equivalent localised by [[MediaWiki:double-redirect-fixer]]. This user is created in the user table if it doesn't already exist, so it has a valid user ID. Its edits do not show up in Recent Changes.
Before each edit is performed, the redirect fixer will follow the chain of redirects to find the current final destination. It will then edit the page to point to that final destination. If there is a circular reference or invalid redirect, no action is taken. If the page is no longer a double redirect (say because the move was reverted), then no action is taken.
If the page has changed since the move was performed, the edit is not done.
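The chain-following behaviour described above could be sketched roughly as follows. This is an illustrative model only, not the MediaWiki implementation: the wiki is reduced to a hypothetical dictionary mapping each redirect's title to its target, the function names are made up, and the "invalid redirect" and "page changed since the move" checks are left out.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the wiki: redirect title -> target title.
# Titles absent from the map are ordinary (non-redirect) pages.
Redirects = Dict[str, str]


def final_target(title: str, redirects: Redirects) -> Optional[str]:
    """Follow the redirect chain from `title`; return None on a loop."""
    seen = {title}
    current = title
    while current in redirects:
        current = redirects[current]
        if current in seen:
            return None  # circular reference
        seen.add(current)
    return current


def fix_double_redirect(title: str, redirects: Redirects) -> bool:
    """Point `title` at the end of its chain. Returns True if it was edited."""
    if title not in redirects:
        return False  # no longer a redirect at all
    if redirects[title] not in redirects:
        return False  # single redirect, e.g. the move was reverted
    target = final_target(title, redirects)
    if target is None:
        return False  # circular chain: take no action
    redirects[title] = target
    return True
```

With `{"A": "B", "B": "C"}`, fixing `"A"` retargets it at `"C"`; a loop such as `{"X": "Y", "Y": "X"}` is left untouched.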
The number of job queue threads has been increased.
This feature was inspired by a meeting with White Cat at Wikimania, seeing the terrible conditions his bots are forced to work under, shoulder to shoulder in 15 tiled command prompt windows, fixing double redirects all day and all night, working their poor little fingers to the bone. It was very sad.
Comments would be appreciated.
-- Tim Starling
On Thu, Jul 24, 2008 at 01:54:57AM +1000, Tim Starling wrote:
On the page move form, a checkbox is shown, checked by default, labelled "update any redirects which point to the original title". Jobs are inserted into the job queue in order to perform the edits. The edits are performed by a user called "Redirect fixer", or the equivalent localised by [[MediaWiki:double-redirect-fixer]]. This user is created in the user table if it doesn't already exist, so it has a valid user ID. Its edits do not show up in Recent Changes.
Presumably the stop-table which I know has to exist in CreateUser was updated to forbid smartasses jumping out and registering that name?
Cheers, -- jra
Jay R. Ashworth wrote:
On Thu, Jul 24, 2008 at 01:54:57AM +1000, Tim Starling wrote:
On the page move form, a checkbox is shown, checked by default, labelled "update any redirects which point to the original title". Jobs are inserted into the job queue in order to perform the edits. The edits are performed by a user called "Redirect fixer", or the equivalent localised by [[MediaWiki:double-redirect-fixer]]. This user is created in the user table if it doesn't already exist, so it has a valid user ID. Its edits do not show up in Recent Changes.
Presumably the stop-table which I know has to exist in CreateUser was updated to forbid smartasses jumping out and registering that name?
Yes, which is why I sent my announcement after it was deployed, not before. Note that the name is localisable, which increases the potential for mischief somewhat, but if there's a problem, a bureaucrat can usurp the name. No special privileges will be granted if a user manages to get logged in as the reserved name; it's just confusing.
-- Tim Starling
2008/7/23 Tim Starling tstarling@wikimedia.org:
This feature was inspired by a meeting with White Cat at Wikimania, seeing the terrible conditions his bots are forced to work under, shoulder to shoulder in 15 tiled command prompt windows, fixing double redirects all day and all night, working their poor little fingers to the bone. It was very sad.
Not to mention that these bots were forced to run in cmd.exe shells. Sad, indeed :(
If not running 24/7 scanning Special:Doubleredirects, bot owners are used to running pywikipedia's double redirect fixer on each new XML dump of their project (real-time bots were processing fewer double redirects than were being created, leaving quite a large number of double redirects to fix afterwards). I'm looking forward to the data dumped after the activation of this feature on the WM sites: the number of "surviving" double redirects should be greatly reduced.
On Wed, Jul 23, 2008 at 11:54 AM, Tim Starling tstarling@wikimedia.org wrote:
On the page move form, a checkbox is shown, checked by default, labelled "update any redirects which point to the original title".
Is the typical user going to understand what this means and be able to make an intelligent decision about whether to use it? Is there any good reason this shouldn't always be done, with no option to skip it?
Also, what's the maximum number of redirects we're expecting here, and how long does each one take to fix? It would be nice if this could be done synchronously rather than on the job queue, so you wouldn't have to wait for possibly days for everything to be updated.
Simetrical wrote:
On Wed, Jul 23, 2008 at 11:54 AM, Tim Starling tstarling@wikimedia.org wrote:
On the page move form, a checkbox is shown, checked by default, labelled "update any redirects which point to the original title".
Is the typical user going to understand what this means and be able to make an intelligent decision about whether to use it? Is there any good reason this shouldn't always be done, with no option to skip it?
I don't know, someone asked for it at Wikimania. Michael maybe?
Also, what's the maximum number of redirects we're expecting here, and how long does each one take to fix? It would be nice if this could be done synchronously rather than on the job queue, so you wouldn't have to wait for possibly days for everything to be updated.
The number of redirects is unlimited; the time per redirect is probably on the order of 100ms. If there's a job queue lag of days, then it's broken and should be fixed. Someone else mentioned this problem before I deployed it, and indeed there was a high job queue number on en.wikipedia.org, which is why I quadrupled the number of threads. It's now near-zero.
-- Tim Starling
On Wed, Jul 23, 2008 at 2:09 PM, Tim Starling tstarling@wikimedia.org wrote:
Simetrical wrote:
Is the typical user going to understand what this means and be able to make an intelligent decision about whether to use it? Is there any good reason this shouldn't always be done, with no option to skip it?
I don't know, someone asked for it at Wikimania. Michael maybe?
It seems rather cluttery to me . . .
Simetrical wrote:
On Wed, Jul 23, 2008 at 11:54 AM, Tim Starling tstarling@wikimedia.org wrote:
On the page move form, a checkbox is shown, checked by default, labelled "update any redirects which point to the original title".
Is the typical user going to understand what this means and be able to make an intelligent decision about whether to use it? Is there any good reason this shouldn't always be done, with no option to skip it?
For some pages on some projects it is common to archive the page by moving "page X" to "page X/Archive" and recreate "page X" from the resulting redirect. But pre-existing redirects on "page X" should still point to "page X" after the move. Therefore it is indeed useful to have it optionally.
Also, what's the maximum number of redirects we're expecting here, and how long does each one take to fix? It would be nice if this could be done synchronously rather than on the job queue, so you wouldn't have to wait for possibly days for everything to be updated.
It's clearly an extreme and dubious example, but the highest number of redirects I ever came upon was on ksh.wikipedia. The article http://ksh.wikipedia.org/wiki/K%C3%B6lle%2C_Rejierungsbezirk for example has 16619 redirects. On wikis with bigger communities who can take corrective actions against single dominant users the numbers most likely will be much lower ;-) I think there are few realistic cases where numbers of more than perhaps 20 redirects are really needed.
Marcus Buck Slomox
Is the typical user going to understand what this means and be able to make an intelligent decision about whether to use it? Is there any good reason this shouldn't always be done, with no option to skip it?
At the moment I can think of no reason, from my own experience, why this option should exist. But programmers and users, before they really see the program, tend to like things flexible, just in case. :-)
Ting
On Wed, Jul 23, 2008 at 7:59 PM, Simetrical Simetrical+wikilist@gmail.com wrote:
Is the typical user going to understand what this means and be able to make an intelligent decision about whether to use it? Is there any good reason this shouldn't always be done, with no option to skip it?
I don't think it's happening any more, but if I recall correctly in the past both the Dutch and the Spanish Wikipedias were creating their month pages by moving the "Current events" each month, then creating a new one.
Graoully from fr.wp just commented that when fixing copyright violations (move page to page/copyvio; fix page/copyvio; move back page/copyvio to page), the redirect fixer can work back and forth if you leave the box checked.
2008/7/24 Andre Engels andreengels@gmail.com:
I don't think it's happening any more, but if I recall correctly in the past both the Dutch and the Spanish Wikipedias were creating their month pages by moving the "Current events" each month, then creating a new one.
-- André Engels, andreengels@gmail.com
Simetrical wrote:
On Wed, Jul 23, 2008 at 11:54 AM, Tim Starling wrote:
On the page move form, a checkbox is shown, checked by default, labelled "update any redirects which point to the original title".
Is there any good reason this shouldn't always be done, with no option to skip it?
See this merge scenario: You do a deleting move of A to B. All redirects to A are changed to B. You restore B. You do a move_redir from B to A. All redirects to B are changed to A, *as well as all redirects to A which were already updated to point to B*.
As for why you wouldn't start by moving B to A: doing so replaces an edit operation (not always a rollback) with a move.
(and yes, if you can delete and restore, you should be able to do this in one action, any reason for not having Special:MergeHistory enabled?)
Tim, I suppose that once the redirect fixer has edited a redirect-only page, normal users aren't allowed to move over it any more?
Hoi, There are situations where what was a redirect is changed into a full article. In such cases it makes sense to revisit all the articles linking to the redirect. It is not a good idea to prevent the creation of the article. Thanks, GerardM
Thanks Tim for this nice feature. I have changed the user name (also added the word bot in it) but it isn't registered, I suppose it will be registered on the next page move with double redirects?
WhiteCat should be relieved now :)
On Wed, Jul 23, 2008 at 10:50 PM, Mohamed Magdy mohamed.m.k@gmail.com wrote:
Thanks Tim for this nice feature. I have changed the user name (also added the word bot in it) but it isn't registered, I suppose it will be registered on the next page move with double redirects?
Actually the account was created (without a user creation log entry) before you edited [[MediaWiki:Double-redirect-fixer]]
http://ar.wikipedia.org/w/index.php?title=Special:Contributions&limit=50...
I am not sure if the account will be recreated with the new name.
--User:Meno25
Comments would be appreciated.
This is a very nice feature, should reduce the double redirects in the future. Thanx
Ting
On Wed, Jul 23, 2008 at 5:54 PM, Tim Starling tstarling@wikimedia.org wrote:
This feature was inspired by a meeting with White Cat at Wikimania, seeing the terrible conditions his bots are forced to work under, shoulder to shoulder in 15 tiled command prompt windows, fixing double redirects all day and all night, working their poor little fingers to the bone. It was very sad.
Comments would be appreciated.
In general, I think this is a very handy feature which relieves a lot of work that is currently done either manually or by bot. There are two minor issues at the moment though:
* When a user moves a page to the wrong target and moves it again to fix that, the redirect fixer won't notice and still try to fix double redirects according to the first move. This happens quite often actually just by inserting a typo or using the wrong lettercase on part of the target.
* Once a page is moved that shouldn't have been, there's no way to stop the system from updating all the double redirects. This opens a new form of vandalism which we have no way of stopping while it's occurring because the job queue is not editable (or viewable in fact). The pseudo-user redirect fixer does not pay attention to blocks so blocking it has no effect (tested yesterday on dewiki). It would also be good if local admins could kill individual jobs in the job queue, especially if there are future features that may make use of the job queue for automatic edits.
Sebastian
Another dev commented that giving sysops access to the job queue reduces the simplicity that it was meant to have. Perhaps the double redirect fixer should verify that everything is in order before it does a fix. At first thought, I think something as simple as making sure the page it is told to redirect to is not a redirect itself should suffice.
~Daniel Friesen(Dantman, Nadir-Seen-Fire) of: -The Nadir-Point Group (http://nadir-point.com) --It's Wiki-Tools subgroup (http://wiki-tools.com) --The ElectronicMe project (http://electronic-me.org) --Games-G.P.S. (http://ggps.org) -And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) --Animepedia (http://anime.wikia.com) --Narutopedia (http://naruto.wikia.com)
Sebastian Moleski wrote:
In general, I think this is a very handy feature which relieves a lot of work that is currently done either manually or by bot. There are two minor issues at the moment though:
- When a user moves a page to the wrong target and moves it again to fix that, the redirect fixer won't notice and still try to fix double redirects according to the first move. This happens quite often actually just by inserting a typo or using the wrong lettercase on part of the target.
Are you sure? It doesn't even record the destination of the move in the job queue, it just follows the redirect chain (using the master) before each edit, to make sure the job still has to be done.
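A toy model may make this point clearer. In the sketch below, which assumes a dictionary-based wiki and hypothetical helper names rather than anything from the real code, each queued job names only the redirect that needs fixing; the target is resolved when the job runs, so a second, corrective move is picked up automatically.

```python
from collections import deque
from typing import Deque, Dict, Optional


def resolve(title: str, redirects: Dict[str, str]) -> Optional[str]:
    """Return the end of the redirect chain, or None if it loops."""
    seen = {title}
    while title in redirects:
        title = redirects[title]
        if title in seen:
            return None
        seen.add(title)
    return title


def move_page(old: str, new: str, redirects: Dict[str, str],
              queue: Deque[str]) -> None:
    """Move `old` to `new`, leave a redirect behind, and queue fix jobs."""
    # Redirects pointing at the old title are about to become double redirects.
    incoming = [src for src, dst in redirects.items() if dst == old]
    redirects.pop(new, None)  # assumption: moving over a redirect replaces it
    redirects[old] = new
    # A job records only which redirect to fix, never the move destination.
    queue.extend(incoming)


def run_jobs(redirects: Dict[str, str], queue: Deque[str]) -> None:
    """Process queued jobs, resolving each target at run time."""
    while queue:
        src = queue.popleft()
        target = resolve(src, redirects)
        if src in redirects and target is not None and redirects[src] != target:
            redirects[src] = target
```

If "Foo" is moved to the mistyped "Baar" and then on to "Bar" before the queue runs, a redirect that originally pointed at "Foo" ends up pointing at "Bar", not "Baar".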
- Once a page is moved that shouldn't have been, there's no way to stop the system from updating all the double redirects. This opens a new form of vandalism which we have no way of stopping while it's occurring because the job queue is not editable (or viewable in fact). The pseudo-user redirect fixer does not pay attention to blocks so blocking it has no effect (tested yesterday on dewiki). It would also be good if local admins could kill individual jobs in the job queue, especially if there are future features that may make use of the job queue for automatic edits.
If you just move the page back, the creation of incorrect redirects will instantly stop, and new jobs will be queued up to fix the incorrect redirects. That's how it's meant to work anyway.
-- Tim Starling
On Thu, Jul 24, 2008 at 8:36 AM, Tim Starling tstarling@wikimedia.org wrote:
If you just move the page back, the creation of incorrect redirects will instantly stop, and new jobs will be queued up to fix the incorrect redirects. That's how it's meant to work anyway.
Would it make sense to delete all the obsolete job queue entries right off in this case? We seem to have an index on (job_cmd, job_namespace, job_title) that could be used.
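As a rough illustration of what such a clean-up might look like, here is a sketch against an in-memory SQLite table. Only the three column names come from the message above; the table layout, the index name, the job command string and the helper names are assumptions made for the example, and the real schema and database engine will differ.

```python
import sqlite3
from typing import Iterable, Tuple

FIX_CMD = "fixDoubleRedirect"  # assumed command name for redirect-fix jobs


def create_job_table(conn: sqlite3.Connection) -> None:
    """Create a simplified, hypothetical job table with the three-column index."""
    conn.execute(
        """CREATE TABLE job (
               job_id INTEGER PRIMARY KEY,
               job_cmd TEXT NOT NULL,
               job_namespace INTEGER NOT NULL,
               job_title TEXT NOT NULL
           )"""
    )
    conn.execute(
        "CREATE INDEX job_cmd_ns_title ON job (job_cmd, job_namespace, job_title)"
    )


def drop_jobs(conn: sqlite3.Connection, cmd: str,
              pages: Iterable[Tuple[int, str]]) -> int:
    """Delete queued jobs of type `cmd` for the given (namespace, title) pairs."""
    deleted = 0
    for namespace, title in pages:
        cur = conn.execute(
            "DELETE FROM job"
            " WHERE job_cmd = ? AND job_namespace = ? AND job_title = ?",
            (cmd, namespace, title),
        )
        deleted += cur.rowcount
    return deleted
```

Each delete filters on exactly the indexed columns, so the lookup should not need a full scan of the queue; jobs of other types for the same titles are left alone.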
Sounds nice (didn't try it yet). Could this be triggered on certain view events as well? So, someone tries to view [[A]] which redirects to [[B]] which redirects to [[C]]. Currently, he'll get "[[B]] (redirected from [[A]])". If this occurs (redirected to a redirect page), could a double redirect fix request be added to the queue? That would clear all dr's over time, as we have unlimited eyeballs in the long run :-)
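The idea could be sketched like this, assuming a hypothetical view hook, a dictionary of redirects and a plain list as the queue; none of these names exist in MediaWiki as far as this example is concerned.

```python
from typing import Dict, List


def on_view(title: str, redirects: Dict[str, str], queue: List[str]) -> str:
    """Return the page actually shown; queue a fix if it is itself a redirect."""
    target = redirects.get(title)
    if target is None:
        return title  # ordinary page, shown as-is
    # Only one redirect is followed on view, so a reader of A sees B.
    if target in redirects and title not in queue:
        queue.append(title)  # A -> B -> C: ask for A to be fixed later
    return target
```

Viewing "A" in `{"A": "B", "B": "C"}` returns "B" and queues "A" once; repeated views do not add duplicate entries.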
Magnus
Magnus Manske wrote:
Sounds nice (didn't try it yet). Could this be triggered on certain view events as well? So, someone tries to view [[A]] which redirects to [[B]] which redirects to [[C]]. Currently, he'll get "[[B]] (redirected from [[A]])". If this occurs (redirected to a redirect page), could a double redirect fix request be added to the queue? That would clear all dr's over time, as we have unlimited eyeballs in the long run :-)
Would this not circumvent the "update any redirects which point to the original title" checkbox - to work successfully that value would have to be stored in the database, adding an additional overhead?
MinuteElectron.
Assuming it was left over from a pagemove, yes. Otherwise, I think it would work.
-Chad