The new and improved version of the copy-and-paste detection bot that we at [[WP:MED]] have been using for nearly a year [https://en.wikipedia.org/wiki/User:EranBot/Copyright here] is nearly ready to be expanded to other topic areas.
It can be found here [https://en.wikipedia.org/wiki/User:EranBot/Copyright/rc]. If you install the common.js code, it will give you buttons to click to indicate follow-up of concerns. Additionally, one can sort the edits in question by WikiProject. We are working to set up auto-archiving so that once concerns are dealt with they will be removed from the main list.
We also want to have automatic compilation of data such as the frequency of true positives and false positives generated by the bot. A blacklist of sites that are known mirrors of Wikipedia is here [https://en.wikipedia.org/wiki/User:EranBot/Copyright/Blacklist]. As this list is improved and expanded, the accuracy of the bot will improve. Many thanks to [[User:ערן]] for his amazing work.
The bot also has the potential to work in other languages.
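To illustrate why a better mirror blacklist should raise accuracy, and what a compiled true-positive figure might look like, here is a minimal sketch. It is not EranBot's code (the source is not shown in this thread); the domain names, the `source_url` field, and the `"TP"`/`"FP"` verdict labels are all assumptions made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical mirror domains; the real list lives on the Blacklist page.
MIRROR_DOMAINS = {"mirror.example", "wikicopy.example"}


def is_mirror(url, mirror_domains=MIRROR_DOMAINS):
    """Return True if the URL's host is a listed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in mirror_domains)


def drop_mirror_matches(matches, mirror_domains=MIRROR_DOMAINS):
    """Keep only matches whose source URL is not a known Wikipedia mirror."""
    return [m for m in matches if not is_mirror(m["source_url"], mirror_domains)]


def precision(verdicts):
    """Share of reviewed reports marked true positive; None if none reviewed."""
    reviewed = [v for v in verdicts if v in ("TP", "FP")]
    if not reviewed:
        return None
    return reviewed.count("TP") / len(reviewed)
```

Each mirror added to the list removes a class of matches where the outside site copied Wikipedia rather than the reverse, which is one way false positives would drop as the list grows.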
Hi, James.
Is the source code available anywhere? If you want to try your bot in other languages, I could help you with testing on Russian Wikipedia :)
Best regards. rubin16
2015-04-03 12:07 GMT+03:00 James Heilman jmh649@gmail.com:
-- James Heilman MD, CCFP-EM, Wikipedian
The Wikipedia Open Textbook of Medicine
www.opentextbookofmedicine.com
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
Hi James
I often suspect copy-paste and find exact matches of the text elsewhere. However, while one can painstakingly ascertain when text was entered into an article (unless there is a trick that I am not aware of), it is not always possible to know when the other text first appeared on the internet, and so to know for sure who copied whom. From my limited knowledge, I believe that some trace of the upload date must be retained somewhere in the code. Will this bot be able to pick up on that and provide a date?
Thanks and congratulations to all involved and for sharing.
Regards,
Rui
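On the first half of the question (when a passage entered an article), the manual search through page history can in principle be automated. The sketch below assumes the revision history has already been fetched as a list of (timestamp, text) pairs, oldest first, for example through the MediaWiki API; the function name and data shape are illustrative only, and nothing here is claimed to be what the bot does. It does not solve the second half: the date an external page first appeared still has to come from somewhere else, and an archive's earliest capture only shows the page existed by that date, not when it was first published.

```python
from datetime import datetime


def first_revision_containing(revisions, phrase):
    """Return the timestamp of the oldest revision whose text contains phrase.

    revisions: list of (timestamp, text) tuples, ordered oldest first.
    Returns None if the phrase never appears.
    """
    for timestamp, text in revisions:
        if phrase in text:
            return timestamp
    return None


# Made-up history for illustration.
history = [
    (datetime(2014, 1, 5), "Aspirin is a medication."),
    (datetime(2014, 3, 9), "Aspirin is a medication used to treat pain."),
    (datetime(2014, 6, 1), "Aspirin is a medication used to treat pain and fever."),
]

added_on = first_revision_containing(history, "used to treat pain")
```

A linear scan is used rather than a binary search because text can be removed and re-added, so "contains the phrase" is not guaranteed to switch from false to true only once across the history.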