Patches item #2070033, was opened at 2008-08-23 18:37
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2070033...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: create library for diffs, expand userlib
Initial Comment:
Add a new library, editlib, for parsing diffs. It parses the diff HTML page, extracting information such as the editor, time, and edit summary, and builds lists of the lines and words marked as added and deleted. More details can be found in the docstrings.
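To make the idea concrete, here is a rough sketch of pulling added and deleted lines out of a diff page. It is not the code in the patch: the names `DiffCellParser` and `parse_diff_lines` are made up for illustration, and it assumes the diff table marks its cells with the CSS classes `diff-addedline` and `diff-deletedline`.

```python
from html.parser import HTMLParser


class DiffCellParser(HTMLParser):
    """Collect the text of table cells marked as added or deleted."""

    # Assumption: these class names mark changed cells in the diff table.
    CLASS_TO_KIND = {"diff-addedline": "added", "diff-deletedline": "deleted"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.added = []
        self.deleted = []
        self._kind = None  # "added", "deleted", or None outside a marked cell
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag != "td":
            return
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.CLASS_TO_KIND:
                self._kind = self.CLASS_TO_KIND[name]
                self._buf = []

    def handle_data(self, data):
        # Text inside nested tags (div, span, ins, del) is collected too.
        if self._kind is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._kind is not None:
            text = "".join(self._buf).strip()
            if text:
                getattr(self, self._kind).append(text)
            self._kind = None


def parse_diff_lines(html):
    """Return (added_lines, deleted_lines) found in a diff page's HTML."""
    parser = DiffCellParser()
    parser.feed(html)
    parser.close()
    return parser.added, parser.deleted
```

Word lists could then be derived from each line with something as simple as `line.split()`, though the patch may well tokenize differently.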
For use with editlib, some new functions are also added to userlib. These will get the user's registration date, the number of edits the user has made, and the number of vandalism warnings on the user's talk page.
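As an illustration of the warning count, a minimal sketch follows. It is not the userlib code from the patch: `count_vandalism_warnings` is a hypothetical name, and it assumes that substituted warning templates leave a marker comment of the form `<!-- Template:uw-vandalism2 -->` in the talk page wikitext.

```python
import re

# Assumption: each substituted vandalism warning leaves a comment such as
# "<!-- Template:uw-vandalism1 -->" or "<!-- Template:uw-vandalism4im -->".
WARNING_RE = re.compile(
    r"<!--\s*Template:uw-vandalism\d(?:im)?\s*-->", re.IGNORECASE
)


def count_vandalism_warnings(talk_wikitext):
    """Count vandalism warning markers in a user talk page's wikitext."""
    return sum(1 for _ in WARNING_RE.finditer(talk_wikitext))
```

A pattern like this is tied to the template names used on one wiki, so it would need adjusting for any other project.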
Only Wikipedia:en is supported - some of the regular expressions rely on English text. Within that scope, I've done reasonably thorough testing on the editlib functions, and somewhat less thorough testing on the new functions for userlib.
All of this is designed for use with automated vandalism detection. I am working on a bot to do just that - User:Kalbot - but it's still in an early experimental stage. I am User:DKalkin on the English Wikipedia, and can be reached at dkalkin@gmail.com.
This is the first time I've submitted a patch to any SourceForge project. Apologies if I've screwed anything up.
----------------------------------------------------------------------
pywikipedia-l@lists.wikimedia.org