Bináris,
On Thu, Nov 26, 2009 at 18:27, Bináris wikiposta@gmail.com wrote:
Hi!
2009/11/23 Chris Watkins chriswaterguy@appropedia.org
I'm using replace.py to create wikilinks. Usually I want to select only the
first occurrence of the search string, and my command works fine for this.
I don't understand that. How do you select only the first one? For me, replace.py either changes every instance within a page or changes nothing.
In the command I use, look at the end of the search and replace strings:
python replace.py -regex "(?si)\b((?:CCAT|Campus Center for Appropriate Technology))\b(.*$)" "[[\1]]\2" -exceptinsidetag:link -exceptinsidetag:hyperlink -exceptinsidetag:header -exceptinsidetag:nowiki -exceptinsidetag:ref -excepttext:"(?si)[[((?:CCAT|Campus Center for Appropriate Technology)[|]])" -namespace:0 -namespace:102 -namespace:4 -summary:"[[Appropedia:Wikilink bot]] adding double square brackets to: CCAT|Campus Center for Appropriate Technology." -log -xml:currentdump.xml
Notice that the -regex parameter is used, and the search text ends with (.*$). Because of the (?s) flag, the dot also matches newlines, so this group swallows the entire rest of the article. Thus that text is not searched again. In the replacement string it is put back by \2, which is a backreference to the second capture group of the search pattern.
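To show the effect outside the bot, here is a minimal sketch using only Python's standard re module. It is not replace.py's own code; the function name link_first and the sample text are made up for illustration, and the pattern is the same one as in my command.

```python
import re

# Same search pattern as in the command above: group 1 is the term,
# group 2 is (.*$), which with (?s) consumes everything up to the end.
PATTERN = re.compile(
    r"(?si)\b((?:CCAT|Campus Center for Appropriate Technology))\b(.*$)"
)


def link_first(text):
    """Wikilink only the first occurrence of the term (hypothetical helper)."""
    # \1 is the matched term, \2 is the untouched rest of the text.
    # Since the single match already reaches the end of the text,
    # there is nothing left for the regex engine to match a second time.
    return PATTERN.sub(r"[[\1]]\2", text)


sample = "CCAT is a live-in lab.\nVisit ccat today. CCAT rocks."
linked = link_first(sample)
print(linked)

# For comparison, plain re.sub can be limited with count=1. This works in a
# standalone script, but I don't know of a way to pass it through replace.py.
linked_with_count = re.sub(r"(?i)\b(CCAT)\b", r"[[\1]]", sample, count=1)
```

Both approaches link only the first "CCAT" in the sample and leave the later ones alone.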
I heard this tip from this mailing list over a year ago, and also from #regex on freenode - irc://irc.freenode.net/regex , which is an active and very helpful place to get regex help.
As far as I understand, at this point replace.py passes the work on to wikipedia.py:
    new_text = wikipedia.replaceExcept(new_text, old, new, exceptions,
                                       allowoverlap=self.allowoverlap)
So the solution should be in wikipedia.py.
Cool. Anyone have an idea what we can do with wikipedia.py?
Thanks,
Chris