Hi.
Is there a policy or guideline about the level to which Wikimedia wikis care about data integrity? There are a few specific cases I'm talking about:
* edits or other logged actions with a wrong timestamp;
* incomplete user renames (contributions split between two accounts);
* weird entries in various *links tables (categorylinks, pagelinks, etc.);
* weird entries in various non-links tables (page, user, etc.); and
* revisions with weird user_id (page import bug, I think?).
This has come up in the context of some old Bugzilla bugs about edits with the wrong timestamp. There's been some recent activity to mark at least some of these old bugs as "wontfix". And perhaps this makes sense, given that some of them are very old and it may be quite likely that nobody will ever go back and tweak the old revision timestamps.
But I'm left wondering if there's anything concrete in this area to guide the other Bugzilla bugs, such as the bugs about botched user renames or *links tables having orphaned or invalid entries. And to some degree, there's a human component too, I suppose. An edit with a bad timestamp isn't a big deal; breaking a user's account simply because they have a lot of edits is a much bigger deal. So there's some level of triaging to be done.
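To make "orphaned entries" concrete, here is a minimal sketch of one such check. It is an illustration under assumptions, not how Wikimedia actually audits its databases. The `page` and `categorylinks` tables are heavily simplified, keeping only the MediaWiki columns `page_id`, `page_title`, `cl_from`, and `cl_to`. SQLite stands in for the production database, and the helper names `find_orphaned_categorylinks` and `make_demo_db` are hypothetical.

```python
import sqlite3


def find_orphaned_categorylinks(conn):
    """Return the distinct cl_from values that have no matching page row."""
    rows = conn.execute(
        """
        SELECT DISTINCT cl.cl_from
        FROM categorylinks AS cl
        LEFT JOIN page AS p ON p.page_id = cl.cl_from
        WHERE p.page_id IS NULL
        ORDER BY cl.cl_from
        """
    ).fetchall()
    return [row[0] for row in rows]


def make_demo_db():
    """Build a tiny in-memory database (hypothetical data) with one orphan."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
        INSERT INTO page VALUES (1, 'Main_Page');
        INSERT INTO page VALUES (2, 'Sandbox');
        INSERT INTO categorylinks VALUES (1, 'Foo');
        INSERT INTO categorylinks VALUES (2, 'Foo');
        -- Page 99 does not exist, so these two rows are orphaned.
        INSERT INTO categorylinks VALUES (99, 'Foo');
        INSERT INTO categorylinks VALUES (99, 'Bar');
        """
    )
    return conn


if __name__ == "__main__":
    print(find_orphaned_categorylinks(make_demo_db()))
```

The same LEFT JOIN pattern would apply to the other *links tables. Finding such rows is the easy part; deciding whether anyone will clean them up is the triage question raised above.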
I see a lot of these issues as falling under general "database integrity". Is there a page on MediaWiki.org or Meta-Wiki that discusses this particular issue?
I believe https://bugzilla.wikimedia.org/show_bug.cgi?id=16660 ("Database table cleanup (tracking)") is the relevant tracking bug for most of what I'm describing.
MZMcBride
On Sun, 2012-11-11 at 09:22 -0500, MZMcBride wrote:
> Is there a policy or guideline about the level to which Wikimedia wikis care about data integrity? There are a few specific cases I'm talking about:
> - edits or other logged actions with a wrong timestamp;
> - incomplete user renames (contributions split between two accounts);
> - weird entries in various *links tables (categorylinks, pagelinks, etc.);
> - weird entries in various non-links tables (page, user, etc.); and
> - revisions with weird user_id (page import bug, I think?).
> I believe https://bugzilla.wikimedia.org/show_bug.cgi?id=16660 ("Database table cleanup (tracking)") is the relevant tracking bug
It is. It boils down to people with shell access & time to fix reports. https://www.mediawiki.org/wiki/Shell_requests/status#2012-10-monthly provides some statistics that anybody can use to draw conclusions. :)
The rest of this email covers the Bugzilla aspect of this case & my take on closing tickets as "WONTFIX". Skip this if you're not interested!
> This has come up in the context of some old Bugzilla bugs about edits with the wrong timestamp. There's been some recent activity to mark at least some of these old bugs as "wontfix". And perhaps this makes sense, given that some of them are very old and it may be quite likely that nobody will ever go back and tweak the old revision timestamps.
It was me who WONTFIXed two or three tickets. In this specific case there are two aspects:
1) The underlying code problem that triggers occurrences: bug 2930.
2) Reports about occurrences (e.g. bug 6871), collected in bug 16660.
Obviously I'm rather new to Wikimedia Bugzilla so MZMcBride was kind enough to share the impression on IRC that "You might not fix the bug, but we don't close bugs based on that, usually."
Closing a report as WONTFIX means "I am against fixing this problem". I had to realize that I also use WONTFIX for "It is extremely unlikely that anybody will ever fix this; this is lower than lowest priority."
I like to communicate realistic expectations ("this won't get fixed because nobody cares, only providing patches might help") even if this disappoints people / the bug reporter. I prefer this to not communicating anything and letting a report silently rot (and making the Bugzilla search results noisy). This also disappoints the reporter - "First I haven't heard anything back for ages and now you simply close this without fixing it". Obviously I hurt or "punish" reporters in both cases, but I don't see us getting all bugs fixed quickly, since we don't have unlimited manpower. :)
So in case there is a culture in Wikimedia Bugzilla to not use WONTFIX too often, it could be interesting to discuss its use and the reasons.
andre
Andre, the answer is in this one-line comment by Brion: https://bugzilla.wikimedia.org/show_bug.cgi?id=16614#c3
None of these bugs should be closed unless both the reported occurrence and the underlying problem are fixed. The alternative is that it's proved to be an occasional problem which won't repeat itself ("servers were busy looking at Hurricane Sandy clouds"), or that the bug is closed LATER.
As for the noisy search results, the problem is https://bugzilla.wikimedia.org/show_bug.cgi?id=39644 ;-)
Nemo
Hi,
On Sun, 2012-11-11 at 16:51 +0100, Federico Leva (Nemo) wrote:
> Andre, the answer is in this one-line comment by Brion: https://bugzilla.wikimedia.org/show_bug.cgi?id=16614#c3
> None of these bugs should be closed unless both the reported occurrence and the underlying problem are fixed.
In that case Brion might have changed his opinion, as https://bugzilla.wikimedia.org/show_bug.cgi?id=2930#c13 to #c16 are all occurrences closed as duplicates of the underlying code problem, but these occurrences have not been fixed. That's the same as what I did in comments 23 to 25. :)
> As for the noisy search results, the problem is https://bugzilla.wikimedia.org/show_bug.cgi?id=39644
Fixed now.
andre
On 11/11/2012 10:32 AM, Andre Klapper wrote:
> So in case there is a culture in Wikimedia Bugzilla to not use WONTFIX too often, it could be interesting to discuss its use and the reasons.
Before I left WMF, I had come to use the lowest priority to indicate what you're using WONTFIX for.
I did this because using lowest priority instead of WONTFIX properly communicated the amount of attention the issue is going to get from developers and the WMF. It says "Sure, this is a problem, but we aren't going to spend any time on it. If you can find a way to fix this, we might use it."
The last bit -- "we might use it" -- is what WONTFIX explicitly denies. I felt that WONTFIX should be used for things that the developers clearly thought were bad ideas -- "Support Facebook logins and 'Like this page' in MediaWiki core" would be an example of that.
Of course, this is just how I came to use it. You may decide that this isn't the way you want to do things.
Mark.
On Sun, Nov 11, 2012 at 9:15 AM, Mark A. Hershberger <mah@everybody.org> wrote:
> On 11/11/2012 10:32 AM, Andre Klapper wrote:
>> So in case there is a culture in Wikimedia Bugzilla to not use WONTFIX too often, it could be interesting to discuss its use and the reasons.
> Before I left WMF, I had come to use the lowest priority to indicate what you're using WONTFIX for.
> I did this because using lowest priority instead of WONTFIX properly communicated the amount of attention the issue is going to get from developers and the WMF. It says "Sure, this is a problem, but we aren't going to spend any time on it. If you can find a way to fix this, we might use it."
> The last bit -- "we might use it" -- is what WONTFIX explicitly denies. I felt that WONTFIX should be used for things that the developers clearly thought were bad ideas -- "Support Facebook logins and 'Like this page' in MediaWiki core" would be an example of that.
> Of course, this is just how I came to use it. You may decide that this isn't the way you want to do things.
I agree with this usage of WONTFIX. WONTFIX (to me) means we won't fix something ever--not "we'll fix it, just no commitments on when."
-Chad
On Sun, 2012-11-11 at 11:08 -0800, Chad wrote:
> On Sun, Nov 11, 2012 at 9:15 AM, Mark A. Hershberger <mah@everybody.org> wrote:
>> I felt that WONTFIX should be used for things that the developers clearly thought were bad ideas -- "Support Facebook logins and 'Like this page' in MediaWiki core" would be an example of that.
> I agree with this usage of WONTFIX. WONTFIX (to me) means we won't fix something ever--not "we'll fix it, just no commitments on when."
Thanks for sharing, you both have very good points! I'll stick to "WONTFIX = we are against fixing this request" now.
andre
On 11/11/2012 05:17 PM, Andre Klapper wrote:
> I'll stick to "WONTFIX = we are against fixing this request" now.
FWIW, I also found it helpful to exclude the "lowest" and "unprioritized" priorities from my queries. This keeps the "noise" out and helps you focus on the things that really matter.
Mark.
On Commons there are a bunch of broken, corrupt, or missing files (mostly old versions of the same file).
On Sun, 2012-11-11 at 17:26 +0100, emijrp wrote:
> On Commons there are a bunch of broken, corrupt, or missing files (mostly old versions of the same file).
For rather specific issues, could you please file a bug report with enough info in bugzilla.wikimedia.org, if it doesn't exist yet? Thanks a lot in advance!
andre
wikitech-l@lists.wikimedia.org