Having dabbled in this initiative a couple of years back, when it first started to gain some traction, I'll make some comments.
Yes, CorenSearchBot (CSB) did/does(?) operate in this space. It basically took the title of a new article, searched for that term via the Yahoo! Search API, and looked for nearly-exact text matches among the first results (using an edit distance metric).
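To make the matching step concrete, here is a rough sketch of "nearly-exact match via edit distance." This is not CSB's actual code; the function names, the sentence-level comparison, and the 0.9 threshold are all my assumptions for illustration, and the search results are passed in as already-fetched text rather than pulled from any real search API.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize edit distance to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def sentences(text: str) -> list:
    """Naive sentence split, lowercased with whitespace collapsed."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(p.lower().split()) for p in parts if p.strip()]


def near_matches(article_text: str, results: dict, threshold: float = 0.9) -> list:
    """Return (url, article_sentence, score) for near-exact sentence matches.

    `results` is a hypothetical mapping of URL -> page text, standing in
    for the first few hits of a title search. The threshold is an assumed
    cutoff, not a tuned value.
    """
    hits = []
    for url, page_text in results.items():
        page_sentences = sentences(page_text)
        for s in sentences(article_text):
            for t in page_sentences:
                score = similarity(s, t)
                if score >= threshold:
                    hits.append((url, s, score))
                    break  # one match per article sentence per page
    return hits
```

The comparison is quadratic in the number of sentences, which is tolerable for a single new article against a handful of result pages but would need shingling or hashing to scale.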
Through the hard work of Jake Orlowitz and others, we got free access to the TurnItIn API (academic plagiarism detection). Their tool is much more sophisticated in its text matching and has access to material behind many paywalls.
In terms of Jane's concern, we are (rather, "we imagine being") primarily limited to finding violations originating at new article creation or massive text insertions, because content already on WP has been scraped and re-copied so many times.
*I want to emphasize this is a gift-wrapped academic research project*. Jake, User:Madman, and I even began amassing ground-truth to evaluate our approach. This was nearly a chapter in my dissertation. I would be very pleased for someone to come along, build a practical tool, and also get themselves a WikiSym/CSCW paper in the process. I don't have the free cycles to do low-level coding, but I'd be happy to advise, comment, etc. to whatever degree someone would desire. Thanks, -AW
--
Andrew G. West, PhD
Research Scientist
Verisign Labs - Reston, VA
Website: http://www.andrew-g-west.com