Revision: 3973
Author: wikipedian
Date: 2007-08-06 16:25:44 +0000 (Mon, 06 Aug 2007)
Log Message:
-----------
extended docu
Modified Paths:
--------------
trunk/pywikipedia/replace.py
Modified: trunk/pywikipedia/replace.py
===================================================================
--- trunk/pywikipedia/replace.py 2007-08-06 16:13:28 UTC (rev 3972)
+++ trunk/pywikipedia/replace.py 2007-08-06 16:25:44 UTC (rev 3973)
@@ -4,64 +4,52 @@
which pages might need changes either from an XML dump or a text file, or only
change a single page.
-You can run the bot with the following commandline parameters:
+These command line parameters can be used to specify which pages to work on:
--xml - Retrieve information from a local XML dump (pages_current, see
- http://download.wikimedia.org).
- Argument can also be given as "-xml:filename".
--file - Work on all pages given in a local text file.
- Will read any [[wiki link]] and use these articles.
- Argument can also be given as "-file:filename".
--cat - Work on all pages which are in a specific category.
- Argument can also be given as "-cat:categoryname".
--page - Only edit a specific page.
- Argument can also be given as "-page:pagetitle". You can give this
- parameter multiple times to edit multiple pages.
--ref - Work on all pages that link to a certain page.
- Argument can also be given as "-ref:referredpagetitle".
--linksearch - Retrieve all the results using Special:Linksearch.
- Argument can also be given as "-linksearch:url".
--filelinks - Works on all pages that link to a certain image.
- Argument can also be given as "-filelinks:ImageName".
--links - Work on all pages that are linked to from a certain page.
- Argument can also be given as "-links:linkingpagetitle".
--start - Work on all pages in the wiki, starting at a given page. Choose
- "-start:!" to start at the beginning.
- NOTE: You are advised to use -xml instead of this option; this is
- meant for cases where there is no recent XML dump.
--regex - Make replacements using regular expressions. If this argument
- isn't given, the bot will make simple text replacements.
--except:XYZ - Ignore pages which contain XYZ. If the -regex argument is given,
- XYZ will be regarded as a regular expression.
--summary:XYZ - Set the summary message text for the edit to XYZ, bypassing the
- predefined message texts with original and replacements inserted.
--fix:XYZ - Perform one of the predefined replacements tasks, which are given
- in the dictionary 'fixes' defined inside the file fixes.py.
- The -regex argument and given replacements will be ignored if
- you use -fix.
- Currently available predefined fixes are:
- * HTML - convert HTML tags to wiki syntax, and fix XHTML
- * syntax - try to fix bad wiki markup.
- * case-de - fix upper/lower case errors in German
- * grammar-de - fix grammar and typography in German
--namespace:n - Number of namespace to process. The parameter can be used
- multiple times. It works in combination with all other
- parameters, except for the -start parameter. If you e.g. want to
- iterate over all user pages starting at User:M, use
- -start:User:M.
--always - Don't prompt you for each replacement
--recursive - Recurse replacement until possible.
--nocase - Use case insensitive regular expressions.
--allowoverlap - When occurences of the pattern overlap, replace all of them.
- Warning! Don't use this option if you don't know what you're
- doing, because it might easily lead to infinite loops then.
-other: - First argument is the old text, second argument is the new text.
- If the -regex argument is given, the first argument will be
- regarded as a regular expression, and the second argument might
- contain expressions like \\1 or \g<name>.
+&params;
-NOTE: Only use either -xml or -file or -page, but don't mix them.
+ -xml Retrieve information from a local XML dump (pages-articles
+ or pages-meta-current, see http://download.wikimedia.org).
+ Argument can also be given as "-xml:filename".
+ -page Only edit a specific page.
+ Argument can also be given as "-page:pagetitle". You can
+ give this parameter multiple times to edit multiple pages.
+Furthermore, the following command line parameters are supported:
+
+ -regex Make replacements using regular expressions. If this argument
+ isn't given, the bot will make simple text replacements.
+ -except:XYZ Ignore pages which contain XYZ. If the -regex argument is
+ given, XYZ will be regarded as a regular expression.
+ -summary:XYZ Set the summary message text for the edit to XYZ, bypassing
+ the predefined message texts with original and replacements
+ inserted.
+ -fix:XYZ Perform one of the predefined replacements tasks, which are
+ given in the dictionary 'fixes' defined inside the file
+ fixes.py.
+ The -regex argument and given replacements will be ignored if
+ you use -fix.
+ Currently available predefined fixes are:
+ * HTML - convert HTML tags to wiki syntax, and fix XHTML
+ * syntax - try to fix bad wiki markup.
+ * case-de - fix upper/lower case errors in German
+ * grammar-de - fix grammar and typography in German
+ -namespace:n Number of namespace to process. The parameter can be used
+ multiple times. It works in combination with all other
+ parameters, except for the -start parameter. If you e.g.
+ want to iterate over all categories starting at M, use
+ -start:Category:M.
+ -always Don't prompt you for each replacement
+ -recursive Recurse replacement until possible. Be careful, this might
+ lead to an infinite loop.
+ -nocase Use case insensitive regular expressions.
+ -allowoverlap When occurrences of the pattern overlap, replace all of them.
+ Be careful, this might lead to an infinite loop.
+ other: First argument is the old text, second argument is the new text.
+ If the -regex argument is given, the first argument will be
+ regarded as a regular expression, and the second argument might
+ contain expressions like \\1 or \g<name>.
+
Examples:
If you want to change templates from the old syntax, e.g. {{msg:Stub}}, to the
@@ -70,10 +58,10 @@
python replace.py -xml -regex "{{msg:(.*?)}}" "{{\\1}}"
-If you have a dump called foobar.xml and want to fix typos, e.g.
+If you have a dump called foobar.xml and want to fix typos in articles, e.g.
Errror -> Error, use this:
- python replace.py -xml:foobar.xml "Errror" "Error"
+ python replace.py -xml:foobar.xml "Errror" "Error" -namespace:0
If you have a page called 'John Doe' and want to convert HTML tags to wiki
syntax, use:
@@ -90,6 +78,12 @@
import sys, re
import wikipedia, pagegenerators,catlib, config
+# This is required for the text that is shown when you run this script
+# with the parameter -help.
+docuReplacements = {
+ '&params;': pagegenerators.parameterHelp
+}
+
# Imports predefined replacements tasks from fixes.py
from fixes import fixes
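
The docuReplacements dictionary added in this revision maps a placeholder in the module docstring (the &params; entity) to the shared help text from pagegenerators. A minimal sketch of how such a substitution could work is shown below. It is an illustration only: PARAMETER_HELP, MODULE_DOC and expand_help are hypothetical stand-ins, and the framework's actual -help routine may do this differently.

```python
# Hypothetical stand-in for pagegenerators.parameterHelp (shortened).
PARAMETER_HELP = """\
-cat              Work on all pages which are in a specific category.
-file             Work on all pages given in a local text file.
"""

# Hypothetical stand-in for the docstring of replace.py (shortened).
MODULE_DOC = """\
These command line parameters can be used to specify which pages to work on:

&params;

Furthermore, the following command line parameters are supported:
"""

# Same shape as the dictionary added in this revision.
docuReplacements = {"&params;": PARAMETER_HELP}


def expand_help(doc, replacements):
    """Return doc with every placeholder key replaced by its help text."""
    for placeholder, text in replacements.items():
        # Strip surrounding newlines so the inserted text fits the layout.
        doc = doc.replace(placeholder, text.strip("\n"))
    return doc


if __name__ == "__main__":
    print(expand_help(MODULE_DOC, docuReplacements))
```

Keeping the shared parameter descriptions in one place means every script that uses the same page generators can show identical help text without repeating it in each docstring.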