On [[en:Wikipedia:Protected page]], I find a number of links shown as broken links, even though the pages they link to exist. I tried (quasi-zero-)edits on both the page itself and on one of the linked pages ([[en:Silesia]], with apologies for editing a protected page), and neither solved the problem.
Does anyone know what the problem is, and how to solve it? I find this especially bad, because with the move these problems should have been solved; instead they seem to have been added.
Andre Engels
On Dec 3, 2003, at 03:14, Andre Engels wrote:
> Does anyone know what the problem is, and how to solve it? I find this especially bad, because with the move these problems should have been solved; instead they seem to have been added.
I'm running a quick script to take out the bad brokenlinks entries (and clear linkscc). Obviously some problems in rebuildlinks remain...
-- brion vibber (brion @ pobox.com)
On Dec 3, 2003, at 04:07, Brion Vibber wrote:
> I'm running a quick script to take out the bad brokenlinks entries (and clear linkscc). Obviously some problems in rebuildlinks remain...
'remove-brokenlinks.php' in maintenance subdirectory in unstable cvs.
Output (I forgot to add headers saying which database is which, but you can pretty much tell): http://download.wikipedia.org/archives/brokenlinks.out.gz
Altogether 13528 bad brokenlinks items turned up. The script didn't check which were or weren't also in the links table...
A link that's missing from the links table is an annoyance, but it doesn't harm the appearance of links and will be safely re-added when the page is next saved. Bogus entries in brokenlinks cause false edit links and generally don't seem to go away when the page is saved, so they're more of a problem.
-- brion vibber (brion @ pobox.com)
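
As a rough illustration of the cleanup described in the message above (not the actual remove-brokenlinks.php), the sketch below deletes brokenlinks rows whose target page exists. It uses an in-memory SQLite database; the `page` table, its `full_title` column, and the function names are hypothetical simplifications, and only the `brokenlinks` table name comes from the thread.

```python
import sqlite3


def remove_bogus_brokenlinks(conn):
    """Delete brokenlinks rows whose target page exists; return the deleted rows."""
    # A brokenlinks entry is bogus when its target title matches an existing page.
    bogus = conn.execute(
        "SELECT bl_from, bl_to FROM brokenlinks "
        "WHERE bl_to IN (SELECT full_title FROM page) "
        "ORDER BY bl_from, bl_to"
    ).fetchall()
    conn.execute(
        "DELETE FROM brokenlinks WHERE bl_to IN (SELECT full_title FROM page)"
    )
    conn.commit()
    return bogus


def demo():
    """Build a tiny hypothetical schema and run the cleanup once."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, full_title TEXT UNIQUE);
        CREATE TABLE brokenlinks (bl_from INTEGER, bl_to TEXT);
        INSERT INTO page VALUES (1, 'Wikipedia:Protected_page');
        INSERT INTO page VALUES (2, 'Silesia');
        INSERT INTO brokenlinks VALUES (1, 'Silesia');
        INSERT INTO brokenlinks VALUES (1, 'No_such_page');
        """
    )
    removed = remove_bogus_brokenlinks(conn)
    remaining = conn.execute("SELECT bl_from, bl_to FROM brokenlinks").fetchall()
    return removed, remaining
```

Here the entry pointing at 'Silesia' is bogus because that page exists, while the entry for 'No_such_page' is a genuine broken link and is left alone.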
On Dec 3, 2003, at 04:29, Brion Vibber wrote:
> On Dec 3, 2003, at 04:07, Brion Vibber wrote:
> > I'm running a quick script to take out the bad brokenlinks entries (and clear linkscc). Obviously some problems in rebuildlinks remain...
> 'remove-brokenlinks.php' in maintenance subdirectory in unstable cvs.
> Output (I forgot to add headers saying which database is which, but you can pretty much tell): http://download.wikipedia.org/archives/brokenlinks.out.gz
> Altogether 13528 bad brokenlinks items turned up. The script didn't check which were or weren't also in the links table...
I've gone over rebuildlinks.inc and found what I think is the problem. It goes through each page looking for links, and for each page checks all the links it hasn't seen before to see if the target pages exist. That check had a bug: it discarded the namespace data on the rows returned from the database. This could both put incorrect l_to target IDs into the link tables and skip legitimate links, leaving them to get stuffed into the brokenlinks table.
I've fixed this in CVS to take the namespace into account. If people (in particular E23) could look it over and confirm it looks correct, we can rebuild the live tables again.
-- brion vibber (brion @ pobox.com)
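
The effect of dropping the namespace can be shown with a toy model. This is not the rebuildlinks.inc code: the `NAMESPACE_PREFIX` mapping, the `classify_links` function, and the page IDs are all hypothetical, and the exact mechanism in the real script may differ. The sketch only assumes that existence results are matched against link titles, with or without the namespace prefix.

```python
# Hypothetical subset of namespace numbers and their title prefixes.
NAMESPACE_PREFIX = {0: "", 1: "Talk:", 4: "Wikipedia:"}


def classify_links(link_titles, db_rows, keep_namespace):
    """Split link_titles into (good, broken).

    db_rows holds (page_id, namespace, title) tuples, standing in for the
    rows a page-existence query might return.
    """
    known = {}
    for page_id, namespace, title in db_rows:
        # The buggy variant discards the namespace and keeps the bare title.
        prefix = NAMESPACE_PREFIX[namespace] if keep_namespace else ""
        known[prefix + title] = page_id
    good = {t: known[t] for t in link_titles if t in known}
    broken = [t for t in link_titles if t not in known]
    return good, broken


def demo():
    """Compare the namespace-aware lookup with the namespace-dropping one."""
    db_rows = [(10, 4, "Protected_page"), (20, 0, "Silesia")]
    links = ["Wikipedia:Protected_page", "Silesia", "Protected_page"]
    fixed = classify_links(links, db_rows, keep_namespace=True)
    buggy = classify_links(links, db_rows, keep_namespace=False)
    return fixed, buggy
```

In the buggy variant the existing page 'Wikipedia:Protected_page' lands in the broken list, while the nonexistent main-namespace 'Protected_page' is given the project page's ID, which matches the two symptoms described above.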
On Dec 6, 2003, at 18:03, Brion Vibber wrote:
> I've fixed this in CVS to take the namespace into account. If people (in particular E23) could look it over and confirm it looks correct, we can rebuild the live tables again.
I ran it on the test wiki. I can't confirm that _everything_ is correct, but a run of remove-brokenlinks.php after running the old rebuild script found 32 incorrect brokenlinks, while another run after newly rebuilding the tables found none. Progress!
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org