Context:

Thanks to the helpful people on this list, I've now got a replace.py bot which successfully adds wikilinks to key terms. For example, to wikilink "sustainability", I run this command from the CLI, within the Pywikipediabot directory:

python replace.py -regex "(?s)sustainability(.*$)" "[[sustainability]]\\1" -xml:currentdump.xml -exceptinsidetag:link -exceptinsidetag:hyperlink -exceptinsidetag:header -namespace:0 -namespace:4 -namespace:102

This code finds the first occurrence of the term sustainability that is not wikilinked, and replaces it with [[sustainability]]. (I don't understand the regex stuff, but I can copy and paste.)
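For anyone else copying the command, here is a rough sketch of what the regex part does on its own, using plain Python `re.sub` rather than replace.py itself. It assumes the shell passes the replacement through as `\1`, and the sample page text is made up. `(?s)` lets `.` match newlines, so `(.*$)` swallows the rest of the page after the first match and `\1` puts it back unchanged; that is why only one occurrence gets linked. The `-exceptinsidetag` options are handled by the bot and are not modelled here.

```python
import re

# Pattern and replacement from the command above (assuming the shell
# delivers the replacement as \1).
PATTERN = r"(?s)sustainability(.*$)"
REPLACEMENT = r"[[sustainability]]\1"

# Made-up sample page text, for illustration only.
page = "Intro about sustainability.\nMore on sustainability here.\n"

# The first "sustainability" is matched, and group 1 captures everything
# after it up to the end of the page, so no second match is possible.
result = re.sub(PATTERN, REPLACEMENT, page)
print(result)
```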

Question:

If the first occurrence of "sustainability" on a page is already wikilinked, the bot goes on to wikilink the second occurrence. I only want the first occurrence linked, so in this case I would prefer that the bot skip the page.
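To make the behaviour I'm after concrete, here is a minimal standalone sketch in plain Python. It is not a replace.py option or part of Pywikipediabot: `link_first_occurrence` and `WIKILINK` are hypothetical names made up for illustration. It only checks for simple, non-nested `[[...]]` links and ignores headers and external hyperlinks, which the `-exceptinsidetag` options cover in the real command.

```python
import re

# Matches a simple (non-nested) wikilink such as [[target]] or [[target|label]].
WIKILINK = re.compile(r"\[\[.*?\]\]", re.DOTALL)


def link_first_occurrence(text, term):
    """Hypothetical helper: wikilink the first occurrence of term, or
    return text unchanged if that occurrence is already inside a wikilink."""
    first = re.search(re.escape(term), text)
    if first is None:
        return text  # term does not appear on the page
    for link in WIKILINK.finditer(text):
        if link.start() <= first.start() < link.end():
            return text  # first occurrence is already linked: skip the page
    # First occurrence is plain text, so wrap it in [[ ]].
    return text[:first.start()] + "[[" + term + "]]" + text[first.end():]
```

With this sketch, a page whose first "sustainability" is plain text gets that one occurrence linked, while a page that starts with `[[sustainability]]` or `[[green living|sustainability]]` comes back unchanged.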

Any ideas?

Thanks!


--
Chris Watkins (a.k.a. Chriswaterguy)
