Brian wrote:
> I just wanted to be really clear about what I mean as a specific
> counter-example to this just being an example of "reconstructing that
> rule set." Suppose you use the AbuseFilter rules on the entire history
> of the wiki in order to generate a dataset of positive and negative
> examples of vandalism edits. You should then *throw the rules away*
> and attempt to discover features that separate the vandalism into
> classes correctly, more or less in the blind.
That's precisely the case where you're attempting to reconstruct the
original rule set (or some work-alike). If you had positive and negative
examples that were actually "known good" examples of edits that really
are vandalism, and really aren't vandalism, then yes you could turn
loose an algorithm to generalize over them to discover a discriminator
between the "is vandalism" and "isn't vandalism" classes. But if your
labels are from the output of the existing AbuseFilter, then your
training classes are really "is flagged by the AbuseFilter" and "is not
flagged by the AbuseFilter", and any machine-learning algorithm will try
to generalize the examples in a way that discriminates *those* classes.
To the extent the AbuseFilter actually does flag vandalism accurately,
you'll learn a concept approximating that of vandalism. But to the
extent it doesn't (e.g. if it systematically mis-labels certain kinds of
edits), you'll learn the same flaws.
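The point can be sketched in a few lines. Everything here is hypothetical (the edits, the "rule", the trivial word-count learner); it just shows that a classifier trained on a flawed rule's output reproduces the rule's systematic mistake rather than ground truth:

```python
from collections import Counter

# A hypothetical "AbuseFilter" rule: it correctly flags edits containing
# "spam", but also systematically mis-flags harmless edits about citations.
def rule_flags(edit):
    return "spam" in edit or "citation" in edit

edits = [
    "added spam links",          # real vandalism
    "spam spam spam",            # real vandalism
    "fixed a citation format",   # good edit, mis-flagged by the rule
    "added a citation needed",   # good edit, mis-flagged by the rule
    "corrected a typo",          # good edit
    "reworded the intro",        # good edit
]

# The training labels come from the rule, not from ground truth.
labels = [rule_flags(e) for e in edits]

# A trivial learner: score each word by how often it appears in flagged
# versus unflagged training examples.
flagged_words, clean_words = Counter(), Counter()
for edit, flagged in zip(edits, labels):
    (flagged_words if flagged else clean_words).update(edit.split())

def predict(edit):
    score = sum(flagged_words[w] - clean_words[w] for w in edit.split())
    return score > 0

# The learner inherits the rule's systematic error: an unseen, harmless
# edit mentioning "citation" is still classified as vandalism.
print(predict("improved the citation style"))  # True: the flaw survives
```

No amount of training on these labels will correct the "citation" error, because nothing in the data distinguishes the rule's mistakes from its correct decisions.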
That might not be useless; you might recover a more concise rule set
that replicates the original's performance. But if your training data is
the output of the previous rule set, you aren't going to be able to
*improve* on its performance without some additional information (or a
built-in inductive bias).
-Mark