2012/1/15 Eric K <ek79501@yahoo.com>
Hi guys,
I just installed the pywikipedia bot on my wiki yesterday. I'm new to Python, but I can try to learn it since I'm familiar with PHP. It would take me a while to write this first bot, though, since I'm new to the language. The tasks are pretty straightforward. I would like the bot to run without any user input and do all of this by itself:
1. For every page on the wiki, check whether its title contains any of these three characters: ( ) : . Any page whose title contains one of them (round brackets or a colon) will be moved to a new title. The original title is var_1.
2. For the new title, brackets are simply deleted, and the : (colon) is replaced with a " - " (a dash with a space on each side). The new title generated is var_2.
3. Insert this text at the top of the page: {{page_rename|var_1}}, and save the page.
4. Find any existing links on the site to this page which would be in the format of [[var_1]], and change them to [[var_2|var_1]].
I don't need any menus or other functionality. Is this something pretty straightforward to make? I would appreciate any tips or help, and if it's something that can be made pretty easily, I would be really thankful if someone could do this for me or give me a good start.
I've looked at some of the existing pywikipedia bot scripts (basic.py, movepages.py), but none of them would work for me as they are. Being new to Python, it would take me a long time to do what I need, but in any case I will learn a lot from this first attempt.
thanks
Eric
_______________________________________________
Pywikipedia-l mailing list
Pywikipedia-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
This is how I would do it. It is probably a hacky solution, and there may be better/more efficient ways of doing it, but it should work.
Step 1: Getting list of pages to change
Run this line:
python replace.py -regex -requiretitle:"\(|\)|:" "[A-Za-z0-9]" "test" -save:Pagestoberenamed.txt -start:!
Press "a" when it prompts.
This will not change anything; it only saves a list of all pages that
need to be renamed. The command assumes that every page that needs to be
changed contains at least one letter or number.
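If you want to check which titles that filter will pick up before running anything against the wiki, the same regex can be tried in plain Python. This is only a sketch: needs_rename and the sample titles are made up for illustration, and nothing here calls pywikipedia.

```python
import re

# Same pattern as the one passed to -requiretitle: "(" or ")" or ":".
TITLE_PATTERN = re.compile(r"\(|\)|:")


def needs_rename(title):
    """Return True if the title contains "(", ")" or ":"."""
    return TITLE_PATTERN.search(title) is not None


if __name__ == "__main__":
    # Hypothetical titles, just to see the filter at work.
    for title in ["Main Page", "Foo (bar)", "Help:Intro"]:
        print(title, needs_rename(title))
```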
Step 2: Put that template on top of the pages
Run this line:
python add_text.py -up -text:"{{page_rename|{{subst:PAGENAME}}}}" -file:Pagestoberenamed.txt
Step 3: Creating list for renaming files
Open the file "Pagestoberenamed.txt" in a regex-supporting text editor and use the following regex replacements:
Replace:
#\[\[([^:\]]*):([^\]]*)\]\]
with
[[\1:\2]] [[\1 - \2]]
and replace
#\[\[([^\(\]]*)\(([^\)\]]*)\)([^\]]*)\]\]
with
[[\1(\2)\3]] [[\1\2\3]]
I don't actually have a text editor that supports regex, so instead I
copy-pasted the contents of that file into a sandbox page and ran the
following line:
python replace.py -page:SANDBOX -regex "#\[\[([^:\]]*):([^\]]*)\]\]" "[[\1:\2]] [[\1 - \2]]" "#\[\[([^\(\]]*)\(([^\)\]]*)\)([^\]]*)\]\]" "[[\1(\2)\3]] [[\1\2\3]]"
Save the text as Pagerenaming.txt
Hacky solution, but it should work.
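If you would rather not go through an editor or a sandbox page, the same conversion can be sketched as a small standalone Python 3 script. It assumes every line of the list looks like "#[[Title]]" (that is what the regexes above expect), and the names new_title, make_pairs and convert_file are just made up for this example. As far as I can tell, the two regex replacements only deal with the first colon or one pair of brackets per title, and a title with both a colon and brackets only gets the colon fixed; this version strips every bracket and converts every colon.

```python
import re

# Assumption: each line of the list file looks like "#[[Title]]".
LINE_PATTERN = re.compile(r"^#\[\[([^\]]+)\]\]\s*$")


def new_title(old_title):
    """Drop every "(" and ")" and turn every ":" into " - "."""
    return old_title.replace("(", "").replace(")", "").replace(":", " - ")


def make_pairs(lines):
    """Turn "#[[Old]]" lines into "[[Old]] [[New]]" lines, skipping anything else."""
    pairs = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # not a "#[[Title]]" line
        old = match.group(1)
        new = new_title(old)
        if new != old:
            pairs.append("[[%s]] [[%s]]" % (old, new))
    return pairs


def convert_file(src="Pagestoberenamed.txt", dst="Pagerenaming.txt"):
    """Read the list from step 1 and write the pairs file used in step 4."""
    with open(src, encoding="utf-8") as infile:
        pairs = make_pairs(infile)
    with open(dst, "w", encoding="utf-8") as outfile:
        outfile.write("\n".join(pairs) + "\n")
```

Calling convert_file() in the directory that holds Pagestoberenamed.txt should write Pagerenaming.txt next to it. Have a look at the result before feeding it to movepages.py.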
Step 4: Moving the pages
Run this line:
python movepages.py -pairs:Pagerenaming.txt
It will not prompt you; it will move the pages as specified in Pagerenaming.txt.
If you do not want to have redirects from the old page names, use
-noredirect as an additional argument. Whether this works may depend on
how your wiki is set up; I know the Wikipedias didn't have this option
until relatively recently (and maybe it is only for administrators now).
Step 5: Fixing links
Links can be fixed using this line:
python replace.py -regex "\[\[([^:\[\]|]*):([^\[\]|]*)\]\]" "[[\1 - \2|\1:\2]]" "\[\[([^\(\[\]|]*)\(([^\)\[\]|]*)\)([^\[\]|]*)\]\]" "[[\1\2\3|\1(\2)\3]]" -start:!
If you think it is too slow, you can append -pt:1 to that.
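With this last one you should be careful, and approve quite a few changes manually first (pressing "y" and not "a"), in case something is fishy with the regex. Note that links into other namespaces, such as [[Category:Foo]], also contain a colon and will match the first pattern.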
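Before letting the bot loose, it may also be worth trying the link patterns on a piece of sample wikitext in plain Python 3. The sketch below does not use pywikipedia. Its patterns stop at "[", "]" and "|" so that a match cannot run from one link into the next or touch links that are already piped; treat that as my assumption and swap in whatever you actually pass to replace.py. fix_links and the sample text are made up for illustration.

```python
import re

# Assumption: titles never contain "[", "]" or "|", so the patterns stop at
# those characters and cannot run from one link into the next.
COLON_LINK = re.compile(r"\[\[([^:\[\]|]*):([^\[\]|]*)\]\]")
PAREN_LINK = re.compile(r"\[\[([^(\[\]|]*)\(([^)\[\]|]*)\)([^\[\]|]*)\]\]")


def fix_links(text):
    """Rewrite [[Old]] links to [[New|Old]] for titles with ":" or "(...)"."""
    text = COLON_LINK.sub(r"[[\1 - \2|\1:\2]]", text)
    text = PAREN_LINK.sub(r"[[\1\2\3|\1(\2)\3]]", text)
    return text


sample = "See [[Plain]], [[Foo:Bar]], [[Foo (bar)]] and [[Category:Stuff]]."
print(fix_links(sample))
```

The [[Category:Stuff]] link in the sample gets rewritten as well, which is exactly the kind of thing to watch for when approving changes by hand.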
Hope this helps.
--
Best regards