Good point.
I did not look for false negatives for lack of time. That would have required me to
read the whole content of the pages and look for items that I thought were questionable
even though they weren't highlighted by the system.
I agree with you that false negatives may be just as important as false positives.
It's just more work to evaluate that metric.
Alain
From: wiki-research-l-bounces(a)lists.wikimedia.org
[mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Ward Cunningham
Sent: December 20, 2007 11:01 AM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Wikipedia colored according to trust
Alain -- Is it true that although you've seen 3x to 15x false positives, you did
not see any false negatives? By false negative I would mean a questionable item that was
not highlighted. Maybe you weren't looking for these? Best regards. -- Ward
__________________
Ward Cunningham
503-432-5682
On Dec 20, 2007, at 6:12 AM, Desilets, Alain wrote:
Here is my feedback based on looking at a few pages on topics that I know very well.
Agile Software Development
· http://wiki-trust.cse.ucsc.edu/index.php/Agile_software_development
· Not bad. I counted 13 highlighted items, 5 of which I would say are questionable.
Usability
· http://wiki-trust.cse.ucsc.edu/index.php/Usability
· Not as good. 14 highlighted items, 3 of which I would say are questionable.
Open Source Software
· http://wiki-trust.cse.ucsc.edu/index.php/Open_source_software
· Not so good either. 23 highlighted items, 3 of which I would say are questionable.
This is a very small sample, but it's all I have time to do. It will be interesting to
see how other people rate the precision of the highlightings on a wider set of topics.
Based on these three examples, it's not entirely clear to me that this system would
help me identify questionable items in topics that I am not so familiar with.
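To make the counts above concrete, here is a minimal sketch of the precision arithmetic (questionable items divided by highlighted items). The counts come from the three pages listed above; the names `pages` and `precision` are only illustrative, and the sample is far too small to support firm conclusions.

```python
# Counts from the three pages reviewed above: (highlighted, judged questionable).
pages = {
    "Agile software development": (13, 5),
    "Usability": (14, 3),
    "Open source software": (23, 3),
}

def precision(highlighted, questionable):
    """Fraction of highlighted items that the reviewer judged questionable."""
    return questionable / highlighted

per_page = {name: precision(h, q) for name, (h, q) in pages.items()}

# Pooled precision over all three pages (11 questionable out of 50 highlighted).
total_highlighted = sum(h for h, _ in pages.values())
total_questionable = sum(q for _, q in pages.values())
pooled = total_questionable / total_highlighted

for name, p in per_page.items():
    print(f"{name}: {p:.2f}")   # 0.38, 0.21, 0.13
print(f"pooled: {pooled:.2f}")  # 0.22
```

Note that this only measures precision. Nothing here says anything about false negatives, since unhighlighted text was not reviewed.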
Are you planning to do a larger scale evaluation with human judges? An issue in that kind
of study is to avoid favourable or unfavourable bias on the part of the judges. Also, you
have to make sure that your algorithm is doing better than random guessing (in other
words, there may be so many questionable phrases in a wiki page that random guessing would
be bound to guess right once in every, say, 5 times). One way to avoid these issues
would be to produce pages where half of the highlightings are produced by your system, and
the other half are highlighting a randomly selected contiguous contribution by a single
author.
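The blinded comparison suggested above could be organised roughly as follows. This is only a sketch under stated assumptions: the function names `build_blinded_sheet` and `precision_by_source` are hypothetical, the spans are passed in as plain strings, and selecting the random contiguous single-author contributions is left to the caller.

```python
import random

def build_blinded_sheet(system_spans, random_spans, seed=0):
    """Mix system and random highlights and shuffle them, so that judges
    cannot tell which source produced a given highlight."""
    rng = random.Random(seed)
    items = [("system", s) for s in system_spans]
    items += [("random", s) for s in random_spans]
    rng.shuffle(items)
    # Judges only see (item_id, span); the key stays with the experimenter.
    sheet = [(i, span) for i, (_, span) in enumerate(items)]
    key = {i: source for i, (source, _) in enumerate(items)}
    return sheet, key

def precision_by_source(key, judgments):
    """judgments maps item_id -> True if the judge found the span questionable.
    Returns the fraction judged questionable for each source."""
    hits = {"system": 0, "random": 0}
    totals = {"system": 0, "random": 0}
    for item_id, source in key.items():
        totals[source] += 1
        hits[source] += bool(judgments[item_id])
    return {s: hits[s] / totals[s] for s in totals if totals[s]}
```

The system is only useful if its precision is clearly higher than that of the random highlights, judged on the same pages by judges who do not know which is which.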
I think this is really interesting work worth doing, btw. I just don't know how useful
it is in its current state.
Cheers,
Alain Désilets