-------- Original Message --------
Subject: Revision tagging: use cases needed
Date: Tue, 14 Feb 2012 14:18:49 -0800
From: Dario Taraborelli dtaraborelli@wikimedia.org

We're getting to a point where we need to be able to flag specific revisions as generated via specific tools. For example, if we generate edits via AFT calls to action, we want to measure:
• their volume (compared to regular edits)
• their survival/revert rate
The same request is now emerging from the Article Creation Workflow team, and having talked to many of you, it sounds like community, mobile and other engineering teams would benefit from the ability to say:
"revision N was created with tool X [version Y]"
I started capturing some use cases on this etherpad:
http://etherpad.wikimedia.org/RevisionTags
I'd like to have your input to start building requirements and evaluating possible solutions. Let me know off list if you have any questions or concerns.
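To make the request concrete, here is a minimal sketch of what a "revision N was created with tool X [version Y]" record and the two measurements (volume and revert rate) could look like. The `RevisionTag` class, the function names, and the idea that reverted revision IDs are available as a set are all assumptions for illustration; nothing here is an existing MediaWiki API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set


@dataclass(frozen=True)
class RevisionTag:
    """Hypothetical record: revision `rev_id` was created with `tool` [`version`]."""
    rev_id: int
    tool: str
    version: Optional[str] = None


def volume_share(tags: Sequence[RevisionTag], tool: str, total_revisions: int) -> float:
    """Fraction of all revisions in the sample that were created with `tool`."""
    if total_revisions <= 0:
        raise ValueError("total_revisions must be positive")
    tagged = {t.rev_id for t in tags if t.tool == tool}
    return len(tagged) / total_revisions


def revert_rate(tags: Sequence[RevisionTag], tool: str, reverted_rev_ids: Set[int]) -> float:
    """Fraction of revisions created with `tool` that were later reverted."""
    tagged = {t.rev_id for t in tags if t.tool == tool}
    if not tagged:
        return 0.0
    return len(tagged & reverted_rev_ids) / len(tagged)
```

How reverts are detected is left open here; the sketch only assumes that some other process supplies the set of reverted revision IDs.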
Dario
_______________________________________
change_tag table?
Seems straightforward. The only thing is that we may not want to show some of those automatic tags by default, so we would have to introduce a new concept of a 'hidden' tag. There are several ways to accomplish that: a list in the configuration, adding a new column, storing it in ct_params, or just using a convention in the tag name for hidden ones.
+1 to adding to a modified version of change_tag, or something like it. I'm unfamiliar with the current tagging interface(s), but the content of ct_tag seems arbitrary ("possible movie studio tagger" appears 4 times in enwiki.change_tag.ct_tag out of >2mil rows), and it probably makes sense to keep machine tags added automatically at the time of an edit distinct from the apparent post-edit human/bot annotation use of ct_tag.
Re: information on which automatic tags to hide, I don't think that should be stored with every row. Keeping that in configuration (where configuration options may consist of patterns to match) seems more appropriate.
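As a rough illustration of keeping the hide list in configuration as patterns, the sketch below matches tag names against glob patterns. The setting name `HIDDEN_TAG_PATTERNS`, the example patterns, and the example tag names are hypothetical; the thread does not specify a pattern syntax, and shell-style globs are just one option.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration value: glob patterns for tags hidden by default.
HIDDEN_TAG_PATTERNS = ["tool:*", "mobile-*"]


def is_hidden(tag, patterns=HIDDEN_TAG_PATTERNS):
    """Return True if the tag name matches any configured hide pattern."""
    return any(fnmatchcase(tag, pattern) for pattern in patterns)


def visible_tags(tags, patterns=HIDDEN_TAG_PATTERNS):
    """Keep only the tags that should be shown by default, preserving order."""
    return [tag for tag in tags if not is_hidden(tag, patterns)]
```

Because the decision is made from the tag name alone, no per-row column or ct_params entry is needed, which is the point of the suggestion above.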
The primary use cases for this feature appear to be around offline analysis, and I'd like to see the design take into account the possibility of this table existing in a separate database from the revision table at some point in the future.
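One way to read this constraint: the tag table should refer to revisions by ID only, so consumers can combine the two data sets in application code rather than relying on a SQL JOIN. The sketch below uses two in-memory SQLite databases as stand-ins; the `revision_tool` table and its `rt_*` columns are hypothetical, and the `revision` table is reduced to two columns for illustration.

```python
import sqlite3

# Two independent databases stand in for the main and the analytics stores.
rev_db = sqlite3.connect(":memory:")
tag_db = sqlite3.connect(":memory:")

rev_db.execute("CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_user_text TEXT)")
rev_db.executemany("INSERT INTO revision VALUES (?, ?)",
                   [(101, "Alice"), (102, "Bob"), (103, "Carol")])

# Hypothetical tag table: it stores the revision ID only, with no foreign key.
tag_db.execute("CREATE TABLE revision_tool (rt_rev_id INTEGER, rt_tool TEXT, rt_version TEXT)")
tag_db.executemany("INSERT INTO revision_tool VALUES (?, ?, ?)",
                   [(101, "aft", "5"), (103, "aft", "5")])


def revisions_for_tool(tool):
    """Look up tagged IDs in one database, then fetch those revisions from the other."""
    ids = [row[0] for row in tag_db.execute(
        "SELECT rt_rev_id FROM revision_tool WHERE rt_tool = ? ORDER BY rt_rev_id", (tool,))]
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    query = ("SELECT rev_id, rev_user_text FROM revision "
             f"WHERE rev_id IN ({placeholders}) ORDER BY rev_id")
    return rev_db.execute(query, ids).fetchall()
```

For large result sets the ID list would need batching; the sketch ignores that.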
-A
On Wed, Feb 15, 2012 at 10:27 AM, Platonides Platonides@gmail.com wrote:
change_tag table?
Seems straightforward. The only thing is that we may not want to show some of those automatic tags by default, so we would have to introduce a new concept of a 'hidden' tag. There are several ways to accomplish that: a list in the configuration, adding a new column, storing it in ct_params, or just using a convention in the tag name for hidden ones.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Wed, Feb 15, 2012 at 12:05 PM, Asher Feldman afeldman@wikimedia.org wrote:
+1 to adding to a modified version of change_tag, or something like it. I'm unfamiliar with the current tagging interface(s), but the content of ct_tag seems arbitrary ("possible movie studio tagger" appears 4 times in enwiki.change_tag.ct_tag out of >2mil rows), and it probably makes sense to keep machine tags added automatically at the time of an edit distinct from the apparent post-edit human/bot annotation use of ct_tag.
I'm going to jump in here and explain what change_tags actually is.
In 2009, while developing the Abuse Filter, I wanted a way to mark suspicious edits for human or bot review on the basis of abuse filter heuristics and rules.
I ended up developing the change_tags infrastructure, hoping to use it as a *generic* framework for marking edits in various ways. Currently you can filter Recent Changes, Contributions and Logs by their tags, and the tags appear on those logs, RC and contributions, generally in parentheses.
In the three years since I introduced the feature, AbuseFilter has been the only user of that functionality, and because the community could add arbitrary tags to filters, all the tags are currently community-added AbuseFilter tags.
I have some hopes that we could use change_tags for things other than AbuseFilter, but my understanding is that last time we tried this the community felt like the infrastructure was being "intruded on". Perhaps some modifications to the infrastructure could allow abuse filter and other tags to coexist in the ecosystem.
—Andrew
Andrew Garrett wrote:
I have some hopes that we could use change_tags for things other than AbuseFilter, but my understanding is that last time we tried this the community felt like the infrastructure was being "intruded on". Perhaps some modifications to the infrastructure could allow abuse filter and other tags to coexist in the ecosystem.
Uh, I think the community was mostly upset that you didn't provide any tag management interface. At all.
It's been a while, but the last time I looked, there was no way to add tags, remove tags, or modify tags. I think there are still a number of revisions on the English Wikipedia that have been tagged by the AbuseFilter extension with text like "potentially libelous addition" or other incendiary comments of that nature, with no means of removal for false positives.
I don't know if it was intentional (and I imagine it wasn't), but your reply about the community's feelings toward whatever infrastructure you're referring to reads like a bit of a slap in the face. The scare quotes didn't help.
The tagging system was poorly implemented. That's why it's been under-utilized.
MZMcBride
On Wed, Feb 15, 2012 at 3:24 PM, MZMcBride z@mzmcbride.com wrote:
Andrew Garrett wrote:
I have some hopes that we could use change_tags for things other than AbuseFilter, but my understanding is that last time we tried this the community felt like the infrastructure was being "intruded on". Perhaps some modifications to the infrastructure could allow abuse filter and other tags to coexist in the ecosystem.
Uh, I think the community was mostly upset that you didn't provide any tag management interface. At all.
It's been a while, but the last time I looked, there was no way to add tags, remove tags, or modify tags. I think there are still a number of revisions on the English Wikipedia that have been tagged by the AbuseFilter extension with text like "potentially libelous addition" or other incendiary comments of that nature, with no means of removal for false positives.
I don't know if it was intentional (and I imagine it wasn't), but your reply about the community's feelings toward whatever infrastructure you're referring to reads like a bit of a slap in the face. The scare quotes didn't help.
The tagging system was poorly implemented. That's why it's been under-utilized.
I caught Max online to talk about this.
I want to clarify that I was talking specifically about the possibility of using the AbuseFilter tagging interface in other extensions (for example, to tag changes made using particular tools or features), and was not upset that the community had not used the tagging interface as much as I might have hoped. There are several technical shortcomings which make other uses of change tagging likely to "intrude" (scare quotes because I'm not sure that I have the right word) on the current community use of the tagging feature. Among them are my failure to secure namespacing for change tags, and the fact that tags are unconditionally displayed in some way or other after items in logs. The lack of a tag management interface is another, but it's a work-intensive problem that requires design work and does not address the idea of using tags in other software contexts – though it does open up new (and possibly helpful) uses of change tagging to the community as well as allowing some cleanup work to take place.
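As a sketch of how namespacing could address both shortcomings, the code below treats a `namespace:name` prefix as the namespace and displays only selected namespaces. The prefix convention, the `DEFAULT_NAMESPACE` value, and the example tag names are hypothetical; the existing tagging code has no such concept, and an existing community tag that already contains a colon would be misread under this scheme.

```python
# Hypothetical: tags without a prefix are treated as existing AbuseFilter tags.
DEFAULT_NAMESPACE = "abusefilter"


def split_tag(tag):
    """Split 'namespace:name' into its parts; un-prefixed tags get the default namespace."""
    namespace, sep, name = tag.partition(":")
    if not sep or not namespace or not name:
        return DEFAULT_NAMESPACE, tag
    return namespace, name


def tags_to_display(tags, shown_namespaces=frozenset({DEFAULT_NAMESPACE})):
    """Keep only tags whose namespace is meant to be shown, preserving order."""
    return [tag for tag in tags if split_tag(tag)[0] in shown_namespaces]
```

With this convention, tool-generated tags could share the table with community tags while staying out of Recent Changes, Contributions and Logs unless their namespace is explicitly shown.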
I also want to make sure that I reinforce my qualification on the comment that the community felt that change tagging was being "intruded upon". It's something that I heard somewhere and isn't intended to mean that we need to "work on" the community. I'm intending to say that some further work is needed to genericise the feature so that Abuse Filter and other tagging infrastructure can coexist – my point about the community objecting wasn't intended to imply that these hypothetical/mythological objectors were being unreasonable.
Hope this makes more sense to you than it does to me. :-)
—Andrew