I'm using replace.py to create wikilinks. Usually I want to select only the
first occurrence of the search string, and my command works fine for this.
But sometimes, the first hit is not suitable (e.g. it's part of a book or
course title, so I don't want to add the wikilink). If I choose n for no,
the bot goes to the next page.
Is there a way I can skip to the next occurrence in the same page? I'm
guessing it will need a modified version of replace.py, so that it gives an
extra option besides ([y]es, [N]o, [e]dit, open in [b]rowser, [a]ll,
[q]uit).
The actual command I'm using (entered as one line, wrapped here for
readability) is:
python replace.py -regex "(?si)\b((?:FOO1|FOO2))\b(.*$)" "[[\\1]]\\2"
  -exceptinsidetag:link -exceptinsidetag:hyperlink
  -exceptinsidetag:header -exceptinsidetag:nowiki -exceptinsidetag:ref
  -excepttext:"(?si)\[\[((?:FOO1|FOO2)[\|\]])" -namespace:0 -namespace:102 -namespace:4
  -summary:"[[Appropedia:Wikilink bot]] adding double square brackets to: FOO1|FOO2."
  -log -xml:currentdump.xml
Many thanks!
--
Chris Watkins
Appropedia.org - Sharing knowledge to build rich, sustainable lives.
blogs.appropedia.org
identi.ca/appropedia
twitter.com/appropedia
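To illustrate the "skip to the next occurrence in the same page" idea, here is a minimal standalone sketch. It is not part of replace.py: the function names `link_nth_occurrence` and `link_interactively` and the y/n/q prompt are assumptions made for this example, and it ignores the -exceptinsidetag and -excepttext exceptions that the real command applies.

```python
import re


def link_nth_occurrence(text, pattern, n):
    """Wrap the n-th (0-based) match of pattern in [[...]]; leave the rest alone."""
    matches = list(re.finditer(pattern, text))
    if n >= len(matches):
        return text  # no such occurrence, nothing to change
    m = matches[n]
    return text[:m.start()] + "[[" + m.group(0) + "]]" + text[m.end():]


def link_interactively(text, pattern, ask=input):
    """Offer each occurrence in turn: 'y' links it, 'n' moves on, 'q' gives up."""
    for i, m in enumerate(re.finditer(pattern, text)):
        # Show a little context around the hit so the operator can judge it.
        context = text[max(0, m.start() - 30):m.end() + 30]
        answer = ask("Link this occurrence? ...%s... [y/n/q] " % context)
        if answer == "y":
            return link_nth_occurrence(text, pattern, i)
        if answer == "q":
            break
    return text  # unchanged if every occurrence was declined
```

With this shape, answering "n" stays on the same page and simply offers the next hit, which is the behaviour asked about above.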
I want to generate a list of matches for a search, but not do anything to
the page.
E.g. I want to list all pages that contain "redirect[[:Category", but I
don't want to modify the pages.
I guess that it's possible to modify redirect.py (I don't speak Python, but
it shouldn't be hard) and run it with -log. But maybe there's a simpler way?
Thanks in advance.
--
Chris Watkins
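One read-only approach is to scan an XML dump directly and print the titles of matching pages, without touching the wiki at all. The sketch below is standalone and uses only the Python standard library. The function name `titles_matching` and the tiny sample dump are assumptions for illustration, and it assumes a "current revisions only" dump, so each page has a single text element.

```python
import io
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def titles_matching(dump, pattern):
    """Return titles of pages whose wikitext matches pattern. Nothing is edited."""
    regex = re.compile(pattern)
    hits = []
    title = None
    for _event, elem in ET.iterparse(dump, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            if regex.search(elem.text or ""):
                hits.append(title)
        elif name == "page":
            elem.clear()  # free memory for pages already processed
    return hits


# Tiny in-memory stand-in for a real dump file (hypothetical content).
SAMPLE = (b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">'
          b'<page><title>A</title><revision>'
          b'<text>#redirect[[:Category:Water]]</text></revision></page>'
          b'<page><title>B</title><revision>'
          b'<text>plain text</text></revision></page></mediawiki>')

if __name__ == "__main__":
    for page_title in titles_matching(io.BytesIO(SAMPLE), r"(?i)redirect\[\[:Category"):
        print(page_title)
```

For a real dump, pass the file name (for example currentdump.xml) in place of the in-memory sample.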
Hi!
Do you have any idea why, using replace.py on some large dumps, I get
this error message:
C:\pywikipedia>replace.py -xml:enwiki-20091128-pages-articles.xml
Please enter the text that should be replaced: impossibletofindword
Please enter the new text: found
Please enter another text that should be replaced, or press Enter to start:
The summary message will default to: Robot: Automated text replacement (-impossibletofindword +found)
Press Enter to use this default message, or enter a description of the
changes your bot will make: test
Reading XML dump...
Traceback (most recent call last):
  File "C:\pywikipedia\pagegenerators.py", line 847, in __iter__
    for page in self.wrapped_gen:
  File "C:\pywikipedia\pagegenerators.py", line 779, in DuplicateFilterPageGenerator
    for page in generator:
  File "C:\pywikipedia\replace.py", line 218, in __iter__
    for entry in self.parser:
  File "C:\pywikipedia\xmlreader.py", line 295, in new_parse
    for rev in self._parse(event, elem):
  File "C:\pywikipedia\xmlreader.py", line 304, in _parse_only_latest
    yield self._create_revision(revision)
  File "C:\pywikipedia\xmlreader.py", line 341, in _create_revision
    redirect=self.isredirect
  File "C:\pywikipedia\xmlreader.py", line 64, in __init__
    self.username = username.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
'NoneType' object has no attribute 'strip'
I updated pywikipedia to the latest revision, but the error persists.
As you can see, it does not seem to be related to user-fixes.py or to regexes.
Thanks in advance!
Davide Bolsi
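The traceback shows that `username` is None when `.strip()` is called, presumably because some revision in the dump carries no username (this cause is a guess, not confirmed from the dump). A minimal sketch of a None-tolerant constructor follows. `DumpEntry` is a hypothetical stand-in for the class in xmlreader.py, not its actual code.

```python
class DumpEntry(object):
    """Hypothetical stand-in for a dump entry; only the fields needed here."""

    def __init__(self, title, username, text):
        self.title = title
        # A missing username arrives as None; fall back to an empty string
        # so that .strip() cannot raise AttributeError.
        self.username = (username or "").strip()
        self.text = text


entry = DumpEntry("Some page", None, "wikitext")
print(repr(entry.username))
```

Whether an empty string is the right placeholder for a missing username depends on how the rest of the bot uses that field.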
Hi Russell,
the main reason I have not joined the rewrite branch is that I have not gotten it running yet. I get an ImportError for simplejson, and I have no idea how to set PYTHONPATH when working with IDLE. The trunk, by contrast, is easy to use: install Python, download the bot, unpack it, and run it. This is the usability I would expect.
Most of the scripts are out of date, since they have been modified in trunk but not updated in rewrite. I guess both forks will have to be developed in parallel for a while until all (main) scripts are merged. I could support the rewrite development, but since I cannot test that code, I would rather not.
However, I have reservations about development for older MediaWiki versions being cut.
Regards
----- Original Message -----
From: Russell Blau <russblau(a)imapmail.org>
To: Pywikipedia discussion list <pywikipedia-l(a)lists.wikimedia.org>
Date: 30.03.2010 16:18
Subject: [Pywikipedia-l] Request for feedback on rewrite branch
> I am at a point where it would be helpful to have some feedback from other
> Pywikipedia users about the future of the rewrite branch. As those who
> watch the SVN commits know, I have not had as much time to work on this
> lately, and have to prioritize what time I do spend on it.
>
> For those who have used the rewrite branch, what (if anything) needs to be
> done to it to get you to use it exclusively and retire the old wikipedia.py
> system? What is missing? What is broken? What is present but could be
> improved?
>
> For those who have chosen not to use the rewrite branch, why not? What
> might lead you to take another look?
>
> And then, I'm sure there are many whose reaction to this post has been,
> "What's the rewrite branch?" I don't know what to ask you, so feel free to
> move on to the next message.
>
> Most critically, is there any reason to continue development of the trunk
> once the rewrite branch is at a point where most users are ready to switch
> to it?
>
> -- Russ
I need some help sorting out a couple of issues with pywikipedia.
Everything is on an XP machine: Python 2.7, with the latest build of
pywikipedia from 8/30/2010. From all appearances everything is set up correctly.
I can log in and run some scripts without a problem. I ran catall.py
without any issues and changed categories on pages.
However, for some reason I keep getting this error in many of the scripts
(category.py, weblinkchecker.py, interwiki.py, etc.):
"Received incomplete XML data", and then the sleep count starts.
I am at a loss as to what is happening. I can even edit with the GUI
interface without a problem.
Can someone help me? What am I missing?
Tom
xqt, or anybody,
A few minutes ago I uploaded the version I would like to use in
wikipedia.py, it may need only cosmetic changes. This version returns the
size and the tags of edit filters with the page history.
We at Hungarian Wikipedia use a bot that is run by 2 or 3 users
simultaneously, and runs every 10-30 minutes. This bot now needs the
sizes of the page versions because we want to know whether an article was
significantly shortened relative to its largest size. I solved this with
fullVersionHistory and len(), which slows the script down. That's why I
wrote this modification: MediaWiki already stores the size, so we don't
need to compute it again or download all the complete versions. Tags are
also useful; we plan to use them together with the edit filter.
It would be great to have this modification in wikipedia.py so that all of
us can use the same version. May I hope you accept my contribution?
Thank you!
2010/8/30 SourceForge.net <noreply(a)sourceforge.net>
> You can respond by visiting:
>
> https://sourceforge.net/tracker/?func=detail&atid=603141&aid=3054755&group_…
>
>
> Initial Comment:
> Would it be possible to have the labels made by the edit filters in
> Page.getVersionHistoryTable, Page.getVersionHistory and
> Page.fullVersionHistory? There may be several of them for one edit.
>
> Comment By: Bináris (binbot)
> Date: 2010-08-30 23:19
>
> I attached a possible solution, please use it. Credits to Tgr and Hunyadym
> from huwiki.
>
>
--
Bináris
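The size check described above can be sketched as follows, once the per-revision sizes are available from the page history. This is a minimal illustration, not the huwiki bot's code: the function names and the 0.5 threshold are assumptions made for the example.

```python
def shrink_ratio(sizes):
    """Current size divided by the largest size the page ever had.

    sizes: revision sizes in bytes, oldest first.
    """
    if not sizes:
        raise ValueError("page history is empty")
    biggest = max(sizes)
    if biggest == 0:
        return 1.0  # the page was always empty, so it never shrank
    return float(sizes[-1]) / biggest


def was_significantly_shortened(sizes, threshold=0.5):
    """True if the latest revision is below threshold times the largest size."""
    return shrink_ratio(sizes) < threshold


print(was_significantly_shortened([1000, 5000, 1200]))
```

Because only integers are compared, this avoids downloading every full revision just to call len() on its text.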
Today I ran add_text.py to add a new section to user talk pages.
After a few minutes I noticed that add_text.py does not respect my
"cosmetic_changes = False" setting in config.py and user-config.py.
add_text.py did what it should, but it also moved the interwiki links to the
end of the page.
Please fix that :) I want to decide whether to use cosmetic changes or not.
Patrol
I've set the default behavior of featured.py to -top and introduced the new option -side for placing the templates next to their iw-links. I did this because interwiki.py isn't able to handle the second option yet, and I heard of some trouble when a bot was placing these templates beside the links.
It's probably not a good idea to place these templates according to the wishes of the bot operator. The behavior should be described in the family file. But is there any community that wants the template beside the link?
Greetings
xqt
----- Original Message -----
From: info(a)gno.de
To: jhsoby(a)gmail.com
Date: 24.08.2010 17:32
Subject: Re: Re: [Pywikipedia-l] interwiki.py not cooperating with featured.py
> Maybe we should deactivate this option and make -top the default, since
> iw-bots can't handle it yet. Once this is fixed, the goal should be to
> determine the rules for placing the templates on top or beside the
> iw-links for each wiki in the family file, instead of giving each operator
> the choice.
>