On Thu, Nov 26, 2009 at 18:27, Bináris <wikiposta(a)gmail.com> wrote:
2009/11/23 Chris Watkins <chriswaterguy(a)appropedia.org>
I'm using replace.py to create wikilinks. Usually I want to select only the
first occurrence of the search string, and my command works fine for this.
I don't understand that. How do you select only the first one? For me,
replace.py either changes each instance within a page, or nothing.
In the command I use, look at the end of the search and replace strings:
python replace.py -regex "(?si)\b((?:CCAT|Campus Center for Appropriate
Technology))\b(.*$)" "[[\\1]]\\2" -exceptinsidetag:link
-exceptinsidetag:hyperlink -exceptinsidetag:header -exceptinsidetag:nowiki
-exceptinsidetag:ref -excepttext:"(?si)\[\[((?:CCAT|Campus Center for
Appropriate Technology)[\|\]])" -namespace:0 -namespace:102 -namespace:4
-summary:"[[Appropedia:Wikilink bot]] adding double square brackets to:
CCAT|Campus Center for Appropriate Technology." -log -xml:currentdump.xml
Notice that the -regex parameter is used, and that the search text ends with
(.*$), which matches the entire rest of the article (the s flag in (?si) lets
. match newlines). That text is therefore consumed by the first match and is
not searched again. The replace string puts it back via \\2, which refers to
the second capture group of the search pattern.
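
To see the trick in isolation, here is a minimal sketch using only Python's
standard re module. It does not touch replace.py or wikipedia.py; the
function name link_first and the sample text are made up for illustration,
and the pattern is the one from the command above without the shell escaping.

```python
import re

# Group 1 is the term to link. Group 2, (.*$), swallows the rest of the
# text, so the regex engine has nothing left to scan after the first match.
PATTERN = re.compile(
    r"(?si)\b((?:CCAT|Campus Center for Appropriate Technology))\b(.*$)"
)


def link_first(text):
    """Wrap only the first occurrence of the term in [[...]] (hypothetical helper)."""
    # \1 is the matched term, \2 puts the swallowed remainder back unchanged.
    return PATTERN.sub(r"[[\1]]\2", text)


sample = "CCAT is a project.\nVisit CCAT today."
print(link_first(sample))
```

Only the first CCAT ends up in brackets; the second one sits inside group 2
and is copied back verbatim. In plain Python, re.sub also accepts count=1,
which would have the same effect without the (.*$) tail, but this sketch
makes no assumption about whether replace.py exposes that.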
I heard this tip from this mailing list over a year ago, and also from
#regex on freenode (irc://irc.freenode.net/regex), which is an active and
very helpful place to get regex help.
As far as I understand, at this point replace.py passes the work on to:
new_text = wikipedia.replaceExcept(new_text, old, new,
So the solution should be in wikipedia.py.
Cool. Anyone have an idea what we can do with wikipedia.py?
- Sharing knowledge to build rich, sustainable lives.