bump.

Would it be possible to do it this way?
  1. Check whether the term is already wikilinked anywhere on the page, and if so, skip that page.
  2. Otherwise, perform the search-and-replace operation.
If so, how could I do that? Do I need a different bot from replace.py?
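
To make the two steps concrete, here is a rough sketch of the logic in plain Python, using only the standard re module. It is not replace.py code: the function name link_first_occurrence is made up for illustration, and treating [[Sustainability]] and [[sustainability|some label]] as "already linked" is my assumption. It also ignores the -exceptinsidetag handling for headers and hyperlinks.

```python
import re


def link_first_occurrence(text, term):
    """Wikilink the first occurrence of term, unless the page already links it.

    Hypothetical helper for illustration only; not part of replace.py.
    """
    # Step 1: look for [[term]] or [[term|label]] anywhere in the page text.
    # Ignoring case here is an assumption, so that [[Sustainability]] counts.
    already_linked = re.compile(
        r"\[\[\s*" + re.escape(term) + r"\s*(?:\|[^\]]*)?\]\]",
        re.IGNORECASE,
    )
    if already_linked.search(text):
        return text  # term is already wikilinked: skip this page

    # Step 2: no existing link, so wrap only the first plain occurrence.
    return re.sub(
        re.escape(term),
        lambda match: "[[" + match.group(0) + "]]",
        text,
        count=1,
    )


if __name__ == "__main__":
    print(link_first_occurrence("We value sustainability. More on sustainability.", "sustainability"))
    print(link_first_occurrence("See [[sustainability]]. More on sustainability.", "sustainability"))
```

The first call links only the first occurrence; the second returns the page text unchanged because a link already exists.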

Thanks,
Chris

On Fri, Aug 15, 2008 at 20:12, Chris Watkins <chriswaterguy@appropedia.org> wrote:
Context:

Thanks to the helpful people on this list, I've now got a replace.py bot which successfully adds wikilinks to key terms. E.g. to wikilink "sustainability", the command I'm using is (from the CLI, within the Pywikipediabot directory):

python replace.py -regex "(?s)sustainability(.*$)" "[[sustainability]]\\1" -xml:currentdump.xml -exceptinsidetag:link -exceptinsidetag:hyperlink -exceptinsidetag:header -namespace:0 -namespace:4 -namespace:102

This code finds the first occurrence of the term sustainability that is not wikilinked, and replaces it with [[sustainability]]. (I don't understand the regex stuff, but I can copy and paste.)
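
For anyone reading along, the behaviour of that pattern can be checked in plain Python with the standard re module. This is only a sketch of the regex itself, not of replace.py, and the sample text is made up. As far as I can tell, (?s) lets "." match newlines, so (.*$) swallows everything after the first match up to the end of the page, and nothing is left for a second match.

```python
import re

# Same pattern and replacement as in the command above
# (the shell turns "\\1" into the backreference \1).
pattern = r"(?s)sustainability(.*$)"
replacement = r"[[sustainability]]\1"

# Made-up sample page text with two occurrences on different lines.
sample = "Notes on sustainability.\nMore about sustainability here."

result = re.sub(pattern, replacement, sample)
print(result)
```

Only the first occurrence ends up inside [[ ]]; the one on the second line is left alone.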

Question:

If the first occurrence of the term sustainability is already wikilinked, the bot goes on to wikilink the second occurrence. I only want the first occurrence linked, so I would prefer that it skip the page in this case.

Any ideas?

Thanks!


--
Chris Watkins (a.k.a. Chriswaterguy)

My email inbox is oh so full, so don't be offended if my emails are short & to the point :-).


Appropedia.org - Sharing knowledge to build rich, sustainable lives.

Blog: chriswaterguy.livejournal.com/

Buying at Amazon, eBay etc? Start at http://appropedia.maatiam.com and a percentage of your purchase supports Appropedia - at no extra cost.


