[Wikipedia-l] Double database entries

Christian Semrau sirjective at gmx.de
Tue Aug 17 16:59:17 UTC 2004

Hello Wikipedians all around the world!

I am Christian, aka SirJective from the German WP, and I want to spread some
knowledge about a database flaw whose effects sysops can repair.

Have you ever wondered why the "What links here" page of some articles
lists articles that don't link there?
Or found articles listed on Special:Shortpages that are not short (at least not
as short as that page claims)?
Or articles listed on Special:Lonelypages that are not orphaned?

One possible explanation is that these articles have duplicates: entries in the
database tables with exactly the same title. Only one of these duplicates is
visible, but all of them contribute to the automatically generated article lists.

These duplicate entries are probably created when a user saves a new article
and clicks "Save page" several times. Instead of ignoring the repeated requests
or saving them as different revisions of one article, the server creates a new
article for each request. The technical details are being discussed on
wikitech-l under the same subject.
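
To illustrate the suspected mechanism, here is a minimal sketch in Python using
an in-memory SQLite database. It assumes that nothing in the database enforces
unique titles, which this post does not confirm; the table `article`, its
columns, and the helper `save_new_article` are hypothetical and are not the
real MediaWiki schema or code. It shows that two identical "save new article"
requests produce two rows unless a uniqueness constraint rejects the second.

```python
import sqlite3


def save_new_article(conn, namespace, title):
    """Insert a new article row; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO article (namespace, title) VALUES (?, ?)",
            (namespace, title),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised only when a uniqueness constraint exists and is violated.
        return False


def count_rows(conn):
    """Number of rows currently stored in the hypothetical article table."""
    return conn.execute("SELECT COUNT(*) FROM article").fetchone()[0]


# Case 1 (assumed to match the flaw): no uniqueness constraint on the title.
plain = sqlite3.connect(":memory:")
plain.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, namespace INTEGER, title TEXT)"
)
# Two clicks on "Save page" arrive as two identical requests.
plain_results = [save_new_article(plain, 0, "Berlin") for _ in range(2)]

# Case 2: the same requests against a table with a unique (namespace, title).
guarded = sqlite3.connect(":memory:")
guarded.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, namespace INTEGER, "
    "title TEXT, UNIQUE (namespace, title))"
)
guarded_results = [save_new_article(guarded, 0, "Berlin") for _ in range(2)]

print("without constraint:", plain_results, count_rows(plain))
print("with constraint:   ", guarded_results, count_rows(guarded))
```

Whether the real fix should be such a constraint or a check in the
application is a question for the wikitech-l thread.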

Sysops can remove the duplicate database entries by deleting and then
undeleting the affected articles. All but the newest copy will then be marked
as older revisions of one article. On the German WP this worked nearly
perfectly; some redirects were missed, probably because the redirect target was
deleted rather than the redirect page itself.
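
The effect of the delete/undelete cycle can be pictured with a toy model in
Python. This is only a sketch of the outcome described above, not MediaWiki's
actual code; the `Entry` class, its fields, and `merge_duplicates` are
hypothetical names invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    timestamp: str  # sortable string, e.g. "20040817165917"
    text: str


def merge_duplicates(entries):
    """Split same-title entries into (current revision, older revisions).

    The newest entry becomes the current article; all other copies are
    kept as its history, oldest first.
    """
    if not entries:
        raise ValueError("need at least one entry")
    if len({e.title for e in entries}) != 1:
        raise ValueError("all entries must share one title")
    ordered = sorted(entries, key=lambda e: e.timestamp)
    return ordered[-1], ordered[:-1]


# Made-up example: three copies of one article saved seconds apart.
copies = [
    Entry("Berlin", "20040817165903", "second save"),
    Entry("Berlin", "20040817165901", "first save"),
    Entry("Berlin", "20040817165905", "third save"),
]
current, history = merge_duplicates(copies)
print("current:", current.text)
print("history:", [e.text for e in history])
```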

I (and some users before me) created lists of duplicate database entries for
different languages, and every database I checked has such entries. I uploaded
lists for the en:, de: and nl: Wikipedias. See
http://meta.wikipedia.org/wiki/Multiple_titles for more information and
links to the lists.

You could run SQL queries on your live WP to get lists of duplicate entries,
but I suggest downloading a dump (see http://download.wikimedia.org) and
running the query on that dump instead. I would be happy to create these lists
for other WPs as soon as a sysop from that WP is willing to work on this
problem.
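
As a rough idea of what such a query could look like, here is a minimal sketch
in Python. It uses an in-memory SQLite table as a stand-in for an imported
dump; the table name `cur` and the columns `cur_namespace` and `cur_title` are
modelled on the MediaWiki schema as I understand it and should be treated as
assumptions, and the rows are made up. The query groups entries by namespace
and title and keeps only the groups with more than one row.

```python
import sqlite3


def find_duplicate_titles(conn):
    """Return (namespace, title, copies) for every title stored more than once."""
    query = """
        SELECT cur_namespace, cur_title, COUNT(*) AS copies
        FROM cur
        GROUP BY cur_namespace, cur_title
        HAVING COUNT(*) > 1
        ORDER BY cur_namespace, cur_title
    """
    return conn.execute(query).fetchall()


# Tiny stand-in for a downloaded dump (assumed column names, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cur (cur_id INTEGER PRIMARY KEY, "
    "cur_namespace INTEGER, cur_title TEXT)"
)
conn.executemany(
    "INSERT INTO cur (cur_namespace, cur_title) VALUES (?, ?)",
    [
        (0, "Berlin"),
        (0, "Berlin"),   # duplicate of the row above
        (0, "Hamburg"),
        (1, "Berlin"),   # same title, different namespace: not a duplicate
        (0, "Munich"),
        (0, "Munich"),
        (0, "Munich"),
    ],
)

duplicates = find_duplicate_titles(conn)
for namespace, title, copies in duplicates:
    print(namespace, title, copies)
```

Grouping by namespace as well as title keeps an article and its talk page from
being reported as duplicates of each other.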

Christian aka SirJective
