Hello wikipedians all around the world!
I am Christian, aka SirJective, from the German WP, and I want to spread some knowledge about a database flaw that sysops can work around.
Have you ever asked yourselves why the "What links here" page of some articles lists articles that don't link there? Or found articles listed on Special:Shortpages that are not short (at least not as short as that page claims)? Or articles listed on Special:Lonelypages that are not orphaned?
One possible explanation is that these articles have duplicates: entries in the database tables with the exact same title. Only one of these duplicates is visible, but all of them contribute to automatically generated article lists.
These doubled entries are probably created when a user saves a new article and clicks "Save page" several times. Instead of ignoring the repeated requests or saving them as different revisions of one article, the server creates new articles. The technical details are being discussed on wikitech-l under the same subject.
Sysops can remove the doubled database entries by deleting and then undeleting the affected articles. All but the newest copy will then be marked as older revisions of one article. On the German WP this worked nearly perfectly; some redirects were missed, probably because the redirect target was deleted rather than the redirect page itself.
I (and some users before me) have created lists of double database entries for different languages, and every database I checked has such entries. I uploaded lists for the en:, de: and nl: Wikipedias. See http://meta.wikipedia.org/wiki/Multiple_titles for more information and links to the lists.
You could run SQL queries on your live WP to get lists of double entries, but I suggest downloading a dump (see http://download.wikimedia.org) and running the query on the dump instead. I would be happy to create these lists for other WPs as soon as a sysop from that WP is willing to work on this problem.
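For anyone who wants to try this on a dump, here is a minimal sketch of such a query. It is an illustration, not the exact query behind the uploaded lists: the table name `cur` and the columns `cur_namespace` and `cur_title` are assumed here, the sample rows are made up, and an in-memory SQLite database stands in for the imported dump. The idea is simply to group by namespace and title and keep the groups with more than one row.

```python
import sqlite3

# In-memory database standing in for an imported dump (assumption:
# articles live in a table `cur` with a namespace and a title column;
# the real table has many more columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cur ("
    "cur_id INTEGER PRIMARY KEY, cur_namespace INTEGER, cur_title TEXT)"
)

# Made-up sample rows: "Berlin" and "Bonn" are duplicated in namespace 0.
conn.executemany(
    "INSERT INTO cur (cur_namespace, cur_title) VALUES (?, ?)",
    [
        (0, "Berlin"), (0, "Berlin"),
        (0, "Hamburg"),
        (1, "Berlin"),
        (0, "Bonn"), (0, "Bonn"), (0, "Bonn"),
    ],
)


def find_duplicate_titles(connection):
    """Return (namespace, title, copies) for every title stored more than once."""
    return connection.execute(
        """
        SELECT cur_namespace, cur_title, COUNT(*) AS copies
        FROM cur
        GROUP BY cur_namespace, cur_title
        HAVING COUNT(*) > 1
        ORDER BY cur_namespace, cur_title
        """
    ).fetchall()


duplicates = find_duplicate_titles(conn)
for namespace, title, copies in duplicates:
    print(namespace, title, copies)
```

The same title in a different namespace (the talk page "Berlin" in the sample data) is not a duplicate, which is why the query groups by both columns.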
Christian aka SirJective
I've been working on these kinds of problems recently. The cause usually seems to be a SELECT performed without a FOR UPDATE option. Unfortunately, using more locking will cause more deadlocks and lock wait timeout errors, but hopefully, with appropriate use of retry loops and other measures, these problems can be kept to a minimum.
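To illustrate the retry-loop idea, here is a minimal sketch. It is not the MediaWiki code: `TransientLockError`, `run_with_retries` and `flaky_save` are hypothetical names, and the simulated failure stands in for a deadlock or lock wait timeout raised by a transaction that reads with SELECT ... FOR UPDATE and then writes.

```python
import random
import time


class TransientLockError(Exception):
    """Hypothetical stand-in for a deadlock or lock wait timeout error."""


def run_with_retries(transaction, max_attempts=5, base_delay=0.01):
    """Run `transaction`, retrying when it fails with a transient lock error.

    The delay doubles after each failed attempt and is randomized slightly,
    so that two competing transactions do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TransientLockError:
            if attempt == max_attempts:
                raise  # give up and report the error to the caller
            delay = base_delay * (2 ** (attempt - 1))
            time.sleep(delay * random.uniform(0.5, 1.5))


# Simulated save that hits a "deadlock" twice before succeeding.
attempts = []


def flaky_save():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientLockError("simulated deadlock")
    return "saved"


result = run_with_retries(flaky_save)
print(result, "after", len(attempts), "attempts")
```

The whole transaction is retried, not just the failing statement, because a deadlock rolls back everything the transaction did so far.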
-- Tim Starling
Hello Tim,
Thank you for looking for ways to prevent these problems in the future. Do you have any ideas or plans for getting rid of the existing duplicates? I would prefer an automated solution to having sysops delete and undelete these entries manually.
Today I uploaded a list to ja: and updated the nl: and en: lists. The de: list (about 100 entries) has been cleared by Fristu. (All are reachable from [[meta:Multiple titles]].)
Christian de:Benutzer:SirJective
wikipedia-l@lists.wikimedia.org