Please do not reply to this email, it was automatically generated. If you would like to comment on the issue, visit this URL:
https://jira.toolserver.org/browse/PYWP-9
create library for diffs, expand userlib
----------------------------------------
Key: PYWP-9
URL: https://jira.toolserver.org/browse/PYWP-9
Project: pywikipedia
Issue Type: New Feature
Environment: Wikipedia:en (for now)
Reporter: David
Assignee: David
Priority: Minor
Attachments: kal.patch
Add a new library, editlib, for parsing diffs. It parses the diff HTML page, extracting information such as the editor, time, and edit summary, and builds lists of the lines and words marked as added and deleted. More details can be found in the docstrings.
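To illustrate the kind of parsing described above, here is a minimal sketch, not the code in kal.patch. It assumes the diff page marks changed lines with table cells of class "diff-addedline" and "diff-deletedline"; the class names, DiffLineParser, and parse_diff_lines are assumptions made for this example, and the actual patch may use regular expressions instead.

```python
from html.parser import HTMLParser


class DiffLineParser(HTMLParser):
    """Collect the text of table cells marked as added or deleted."""

    # Assumed CSS class names; the real diff markup may differ.
    ADDED = "diff-addedline"
    DELETED = "diff-deletedline"

    def __init__(self):
        super().__init__()
        self.added = []
        self.deleted = []
        self._target = None  # list currently receiving text, or None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "td":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.ADDED in classes:
            self._target = self.added
        elif self.DELETED in classes:
            self._target = self.deleted
        else:
            return
        self._buffer = []

    def handle_data(self, data):
        # Only keep text that sits inside a marked cell.
        if self._target is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._target is not None:
            self._target.append("".join(self._buffer).strip())
            self._target = None


def parse_diff_lines(html):
    """Return (added_lines, deleted_lines) found in a diff HTML fragment."""
    parser = DiffLineParser()
    parser.feed(html)
    parser.close()
    return parser.added, parser.deleted
```

Splitting each returned line on whitespace would give a rough word list; the words actually highlighted as changed would need the inner markup as well.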
For use with editlib, some new functions are also added to userlib. These will get the user's registration date, the number of edits the user has made, and the number of vandalism warnings on the user's talk page.
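As a rough idea of how the warning count could work, here is a sketch that is not taken from the patch. It assumes that substituted warning templates leave a comment such as "<!-- Template:uw-vandalism2 -->" in the talk page wikitext; that marker, the pattern, and the name count_vandalism_warnings are assumptions for illustration only.

```python
import re

# Assumed marker left behind by a substituted warning template,
# e.g. "<!-- Template:uw-vandalism2 -->". English Wikipedia only.
WARNING_RE = re.compile(
    r"<!--\s*Template:uw-vandalism(?:4im|\d)?\s*-->",
    re.IGNORECASE,
)


def count_vandalism_warnings(talk_wikitext):
    """Count vandalism-warning markers in a user talk page's wikitext."""
    return sum(1 for _ in WARNING_RE.finditer(talk_wikitext))
```

The function takes the wikitext as a string, so fetching the talk page is left to the caller.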
Only Wikipedia:en is supported - some of the regular expressions rely on English text. Within that scope, I've done reasonably thorough testing on the editlib functions, and somewhat less thorough testing on the new functions for userlib.
All of this is designed for use with automated vandalism detection. I am working on a bot to do just that - User:Kalbot - but it's still in an early experimental stage. I am User:DKalkin on the English Wikipedia.
David updated PYWP-9:
---------------------
Attachment: kal.patch
Patch for review.