Hi!
My old problem is that replace.py can't write the pages to work on into a file on my disk. For years I have used a modified version that makes no changes but writes the titles of the involved pages to a subpage on Wikipedia in automated mode, and then I can make the replacements from that page much more quickly than directly from the dump or the live Wikipedia. This is slow and generates plenty of dummy edits.
In other words, replace.py has a tool to get the titles from a file (-file) or from a wiki page (-links), but it has no tool to generate this file.
Now I am ready to rewrite it. This way we can start the bot, and it will find all the possible articles to work on and save the titles without editing Wikipedia (and without the artificial delay); meanwhile we can have lunch, run a marathon or sleep. Then we make the replacements from this file with -file.
My idea is that replace.py should have two new parameters:
-save writes the results into a new file instead of editing the articles. It overwrites an existing file without notice.
-saveappend writes into a new file or appends to an existing one.
OR:
-save writes and appends (primary mode)
-savenew writes and overwrites
The help is here:
http://docs.python.org/howto/unicode.html#reading-and-writing-unicode-data
So we have to import codecs.
My script is:
articles = codecs.open('cikkek.txt', 'a', encoding='utf-8')
...
tutuzuzu = u'# %s\n' % page.aslink()  # needs rewriting to the new syntax
articles.write(unicode(tutuzuzu))     # needs further testing: is unicode() really needed here?
articles.flush()
It works fine, except that '\n' is a Unix-style newline that has to be converted by lfcr.py in order to make the file readable with notepad.exe.
For now this uses a constant filename; that should be developed so the name can be given on the command line.
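To make the idea concrete, here is a minimal sketch, not the final replace.py change (the -save:FILENAME / -savenew:FILENAME syntax is just one possible form): the file name comes from the command line, and writing '\r\n' keeps the list readable in notepad.exe without lfcr.py.

import codecs
import sys

def open_title_file(argv):
    # Hypothetical parameter parsing: -save:FILENAME appends, -savenew:FILENAME overwrites.
    filename = 'cikkek.txt'
    mode = 'a'
    for arg in argv:
        if arg.startswith('-save:'):
            filename = arg[len('-save:'):]
        elif arg.startswith('-savenew:'):
            filename = arg[len('-savenew:'):]
            mode = 'w'
    return codecs.open(filename, mode, encoding='utf-8')

def save_title(articles, page):
    # '\r\n' instead of '\n' so notepad.exe shows one title per line.
    articles.write(u'# %s\r\n' % page.aslink())
    articles.flush()

articles = open_title_file(sys.argv[1:])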
Your opinions before I begin?
--
Bináris
I want to read a special page with Page.get(). The message is:
File "C:\Program Files\Pywikipedia\wikipedia.py", line 601, in get
raise NoPage('%s is in the Special namespace!' % self.aslink())
pywikibot.exceptions.NoPage
What is the solution?
--
Bináris
I'm using replace.py to create wikilinks. Usually I want to select only the
first occurrence of the search string, and my command works fine for this.
But sometimes, the first hit is not suitable (e.g. it's part of a book or
course title, so I don't want to add the wikilink). If I choose n for no,
the bot goes to the next page.
Is there a way to skip to the next occurrence on the same page? I'm guessing it will need a modified version of replace.py, so that it offers an extra option besides [y]es, [N]o, [e]dit, open in [b]rowser, [a]ll and [q]uit (a rough sketch of the idea follows the command below).
The actual command I'm using is:
python replace.py -regex "(?si)\b((?:FOO1|FOO2))\b(.*$)
" "[[\\1]]\\2" -exceptinsidetag:link -exceptinsidetag:hyperlink
-exceptinsidetag:header -exceptinsidetag:nowiki -exceptinsidetag:ref
-excepttext:"(?si)\[\[((?:FOO1|FOO2)[\|\]])" -namespace:0 -namespace:102
-namespace:4 -summary:"[[Appropedia:Wikilink bot]] adding double square
brackets to: FOO1|FOO2." -log -xml:currentdump.xml
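For what it's worth, the underlying idea can be sketched independently of replace.py, with plain re only (this is not an existing replace.py option, just an illustration): replace only the n-th occurrence of the pattern, so "skip" simply means trying occurrence n+1 on the same page.

import re

def replace_nth(text, pattern, repl, n, flags=re.UNICODE):
    # Replace only the n-th (1-based) occurrence of pattern in text.
    matches = list(re.finditer(pattern, text, flags))
    if len(matches) < n:
        return text  # not enough occurrences: leave the page unchanged
    m = matches[n - 1]
    return text[:m.start()] + m.expand(repl) + text[m.end():]

# e.g. the first hit is a book title, so link only the second occurrence:
# new_text = replace_nth(old_text, r'\b(FOO1|FOO2)\b', r'[[\1]]', 2)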
Many thanks!
--
Chris Watkins
Appropedia.org - Sharing knowledge to build rich, sustainable lives.
blogs.appropedia.org
identi.ca/appropedia
twitter.com/appropedia
I want to generate a list of matches for a search, but not do anything to
the page.
E.g. I want to list all pages that contain "redirect[[:Category", but I
don't want to modify the pages.
I guess that it's possible to modify redirect.py (I don't speak python, but
it shouldn't be hard) and run it with -log. But maybe there's a simpler way?
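If you have an XML dump of the wiki at hand, one simple approach is to scan it and only print the matching titles, without touching the wiki. This is just a rough sketch assuming the old xmlreader API (XmlDump(...).parse() yielding entries with .title and .text); the file name and the exact pattern are only examples.

import re
import xmlreader

pattern = re.compile(re.escape(u'redirect[[:Category'), re.IGNORECASE)

dump = xmlreader.XmlDump('currentdump.xml')
for entry in dump.parse():
    if entry.text and pattern.search(entry.text):
        # Print instead of edit: this only produces the list of page titles.
        print entry.title.encode('utf-8')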
Thanks in advance.
--
Chris Watkins
Appropedia.org - Sharing knowledge to build rich, sustainable lives.
blogs.appropedia.org
community.livejournal.com/appropedia
identi.ca/appropedia
twitter.com/appropedia
Hi all;
I think that there is an error in xmlreader.py. When parsing a full-revision XML (in this case [1]) with this code [2] (look at the try-except; it writes the record whenever it fails), I correctly get the username, timestamp and revision id, but sometimes the page title and the page id are None or an empty string.
The first error is:
['', None, 'QuartierLatin1968', '2004-10-10T04:24:14Z', '4267']
But if we do:
7za e -bd -so kwwiki-20100926-pages-meta-history.xml.7z 2>/dev/null | egrep -i '2004-10-10T04:24:14Z' -C20
We get this [3], which is OK: the page title and the page id are available in the XML, but they are not correctly parsed. And this is not the only page title and page id that fail.
Perhaps I have missed something, because I'm just learning to parse XML. Sorry in that case.
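For comparison, here is a minimal streaming-parser sketch (not the pywikipedia code) of the property the parser has to respect: in a full-history dump each <page> carries one <title> and one <id> followed by many <revision> elements, so the page-level fields must only be reset when a new <page> starts, never per revision. The namespace URI is my assumption for the 2010 export format and may need adjusting.

import xml.etree.cElementTree as ET

NS = '{http://www.mediawiki.org/xml/export-0.4/}'  # assumed namespace, check the dump header

def revisions(filename):
    title = pageid = None
    for event, elem in ET.iterparse(filename, events=('start', 'end')):
        if event == 'start' and elem.tag == NS + 'page':
            title = pageid = None          # reset only when a new <page> starts
        elif event == 'end':
            if elem.tag == NS + 'title':
                title = elem.text
            elif elem.tag == NS + 'id' and pageid is None:
                pageid = elem.text         # the first <id> below <page> is the page id
            elif elem.tag == NS + 'revision':
                yield title, pageid        # title/id stay valid for every revision of the page
                elem.clear()               # free memory, these dumps are huge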
Regards,
emijrp
[1]
http://download.wikimedia.org/kwwiki/20100926/kwwiki-20100926-pages-meta-hi…
[2] http://pastebin.ca/1951930
[3] http://pastebin.ca/1951937
Hi everyone,
Currently, translation of the messages output to the wikis is done in the source code. This is quite tedious to maintain. A good step would be to move the translations to http://www.translatewiki.org so it is much easier to get things translated. But how? We have several options.
1. Gettext: http://en.wikipedia.org/wiki/GNU_gettext Supported by translatewiki, with built-in support in Python (http://docs.python.org/library/gettext.html), but it doesn't seem to be really targeted at working with multiple languages at the same time, as we do in, for example, interwiki.py.
2. Big Python dictionary: Easy to implement in pywikipedia, but it isn't really a standard, so translatewiki would need to be modified. You can see a test in r8684, and a rough sketch after this list.
3. Properties files: http://en.wikipedia.org/wiki/.properties These files are used in Java programs to store translations. Supported at translatewiki. We could probably parse them with something like http://docs.python.org/library/configparser.html . The downside is that these files are not UTF-8 encoded.
4. YAML: http://en.wikipedia.org/wiki/YAML . Supported by translatewiki, but we would need a library to use it in pywikipedia.
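To give an impression of option 2, here is a rough sketch (not the actual r8684 code; the message key, the texts and the helper name are made up): one dictionary per script, keyed by message name and then by language code, plus a helper that walks a fallback chain the way interwiki.py needs.

# Hypothetical message table for one script.
msg = {
    'category-was-moved': {
        'en': u'Robot: Category was moved to [[:Category:%(newcat)s]]',
        'nl': u'Robot: Categorie is hernoemd naar [[:Category:%(newcat)s]]',
    },
}

def get_message(lang, key, params=None, fallback=('en',)):
    # Try the wiki's language first, then the fallback chain.
    for code in (lang,) + tuple(fallback):
        if code in msg[key]:
            text = msg[key][code]
            return text % params if params else text
    raise KeyError('no translation found for %r' % key)

# get_message('nl', 'category-was-moved', {'newcat': u'Voorbeeld'})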
So what do you guys think? What should we do? Anyone willing to help
implement this?
We've talked about this several times. I hope we manage to implement
something this time :-)
Maarten
Now the script works, but I still don't understand!
I was advised to get the site and a page and use its _getActionUser('block') method. The bot ran as BinBott, and at _this point_ it asked for my sysop password. I typed it in. After this I ran my blocking script again, and it worked fine!!! But it didn't ask me for the password itself. I'll copy the relevant lines of my script here:
tag = u'Teszteszter'
user = userlib.User(site, tag)
if user.block(expiry='infinite', reason=blockmessage):
    pywikibot.output(u'%s blokkolva van.' % user.name())  # "%s is blocked."
else:
    pywikibot.output(u'%s blokkolása nem sikerült.' % user.name())  # "Blocking %s failed."
2010/10/23 <klaus.seiler(a)arcor.de>
> If you are logged in as BinBot it wouldn't asked for the sysop password.
> Either log in as sysop or create a new directory and use one for the bot and
> one for sysop operations.
>
> Greetings
> xqt
> ----------
> But my config has:
> usernames['wikipedia']['hu'] = u'BinBott'
> sysopnames['wikipedia']['hu'] = u'Bináris'
> Once it is not logged in, it should ask for a password, shouldn't it?
>
--
Bináris
Sorry, I don't understand this. :-( The bot HAD worked previously with the same config, and something has gone wrong in the past few weeks. Isn't this exactly what sysopnames was created for?
2010/10/23 <klaus.seiler(a)arcor.de>
> If you are logged in as BinBot it wouldn't asked for the sysop password.
> Either log in as sysop or create a new directory and use one for the bot and
> one for sysop operations.
>
> Greetings
> xqt
> ----------
> But my config has:
> usernames['wikipedia']['hu'] = u'BinBott'
> sysopnames['wikipedia']['hu'] = u'Bináris'
> Once it is not logged in, it should ask for a password, shouldn't it?
>
> 2010/10/21 <info(a)gno.de>
>
> > ...seems to me the bot is not logged in as sysop
> >
> >
> --
> Bináris
> _______________________________________________
> Pywikipedia-l mailing list
> Pywikipedia-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>
--
Bináris
FYI
-------- Original Message --------
Subject: [Wikitech-l] Migrating SVN from mayflower to formey
Date: Sat, 23 Oct 2010 09:47:53 -0500
From: Ryan Lane <rlane32(a)gmail.com>
Reply-To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
In the next hour or two we'll be migrating SVN to a new server.
Nothing is changing from the usage perspective. During this time SVN
may be inaccessible for a few minutes. Let me know if you are having
an access issue, and I'll fix it for you.
Respectfully,
Ryan Lane
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Please compare this to my previously uploaded solution:
https://sourceforge.net/tracker/?func=detail&aid=3054514&group_id=93107&ati…
Wouldn't it be nicer?
Bináris
2010/10/21 <amir(a)svn.wikimedia.org>
> Revision: 8679
> Author: amir
> Date: 2010-10-21 13:11:09 +0000 (Thu, 21 Oct 2010)
>
> Log Message:
> -----------
> If summary is more than 200 characters will be created copyright violation
>
> Modified Paths:
> --------------
> trunk/pywikipedia/catlib.py
>
> Modified: trunk/pywikipedia/catlib.py
> ===================================================================
> --- trunk/pywikipedia/catlib.py 2010-10-21 13:03:29 UTC (rev 8678)
> +++ trunk/pywikipedia/catlib.py 2010-10-21 13:11:09 UTC (rev 8679)
> @@ -458,7 +458,11 @@
>          wikipedia.output('Moving text from %s to %s.' % (self.title(), targetCat.title()))
>          authors = ', '.join(self.contributingUsers())
>          creationSummary = wikipedia.translate(wikipedia.getSite(), msg_created_for_renaming) % (self.title(), authors)
> -        targetCat.put(self.get(), creationSummary)
> +        #Maybe sometimes length of summary is more than 200 characters and thus will not be shown.so bot must listify authors in another place
> +        if len(creationSummary)>200:
> +            targetCat.put(self.get()+u"/n/nAuthors: %s" % authors, creationSummary)
> +        else:
> +            targetCat.put(self.get(), creationSummary)
>          return True
>
>      #Like copyTo above, except this removes a list of templates (like deletion templates) that appear in
>
>
>
> _______________________________________________
> Pywikipedia-svn mailing list
> Pywikipedia-svn(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-svn
>