I'm using replace.py to create wikilinks. Usually I want to select only the
first occurrence of the search string, and my command works fine for this.
But sometimes, the first hit is not suitable (e.g. it's part of a book or
course title, so I don't want to add the wikilink). If I choose n for no,
the bot goes to the next page.
Is there a way I can skip to the next occurrence in the same page? I'm
guessing it will need a modified version of replace.py, so that it gives an
extra option besides ([y]es, [N]o, [e]dit, open in [b]rowser, [a]ll,
[q]uit)
The actual command I'm using is:
python replace.py -regex "(?si)\b((?:FOO1|FOO2))\b(.*$)
" "[[\\1]]\\2" -exceptinsidetag:link -exceptinsidetag:hyperlink
-exceptinsidetag:header -exceptinsidetag:nowiki -exceptinsidetag:ref
-excepttext:"(?si)\[\[((?:FOO1|FOO2)[\|\]])" -namespace:0 -namespace:102
-namespace:4 -summary:"[[Appropedia:Wikilink bot]] adding double square
brackets to: FOO1|FOO2." -log -xml:currentdump.xml
Many thanks!
--
Chris Watkins
Appropedia.org - Sharing knowledge to build rich, sustainable lives.
blogs.appropedia.org
identi.ca/appropedia
twitter.com/appropedia
Hi!
Do you have any idea why, using replace.py on some large dumps, I get
this error message:
C:\pywikipedia>replace.py -xml:enwiki-20091128-pages-articles.xml
Please enter the text that should be replaced: impossibletofindword
Please enter the new text: found
Please enter another text that should be replaced, or press Enter to start:
The summary message will default to: Robot: Automated text
replacement (-impossibletofindword +found
)
Press Enter to use this default message, or enter a description of the
changes your bot will make: test
Reading XML dump...
Traceback (most recent call last):
File "C:\pywikipedia\pagegenerators.py", line 847, in __iter__
for page in self.wrapped_gen:
File "C:\pywikipedia\pagegenerators.py", line 779, in
DuplicateFilterPageGenerator
for page in generator:
File "C:\pywikipedia\replace.py", line 218, in __iter__
for entry in self.parser:
File "C:\pywikipedia\xmlreader.py", line 295, in new_parse
for rev in self._parse(event, elem):
File "C:\pywikipedia\xmlreader.py", line 304, in _parse_only_latest
yield self._create_revision(revision)
File "C:\pywikipedia\xmlreader.py", line 341, in _create_revision
redirect=self.isredirect
File "C:\pywikipedia\xmlreader.py", line 64, in __init__
self.username = username.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
'NoneType' object has no attribute 'strip'
I updated pywikipedia to the last revision with no results.
As you can see it does not seem to be user-fixes.py or regex-related.
Thanks in advance!
Davide Bolsi
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
I wrote on the talk page, but wanted to ensure the most people were
aware as possible. I've recently done a bit of work to get the
pywikipediabot documentation into a more textbook-like format. I'd like
now to move it to the English Wikibooks, where it can be integrated and
linked with similar texts, like [[b:Linux Guide]] or the python books
([[b:Subject:Python programming language]]).
If this were done, I think it would help readers by providing textbooks
covering "co-requisites" in the same environment. Textbooks in other
languages could be developed on other Wikibooks language editions.
Thanks,
- -Mike
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iEYEARECAAYFAktZwN8ACgkQst0AR/DaKHvnWgCgqa6ObbIuVkDK/NJ8iq8lDn5f
0FEAn3AnHszjvfDNmpg2C/fue5TuB6kr
=LfDx
-----END PGP SIGNATURE-----
Hi,
Could someone apply the changes in the attached patch for the Hungarian
configuration of noreferences.py?
The changes:
add to msg = 'hu':u'Hiányzó {{Források}} pótlása',
add to referencesSubstitute = 'hu': u'{{Források}}',
and change referencesTemplates= 'hu': [u'reflist',u'források'],
Thanks,
Bence Damokos