Hi all,
I have just installed the rewrite branch, and I have adapted the misspelling.py script for it. As I'm waiting for my new SSH public key to get added to the SVN server, could someone else please commit it to the rewrite branch for me?
I tested it on de: and en:, but it probably wouldn't be a bad idea to check my changes. After all, it's my first adaptation to the rewrite branch, and I haven't done PyWikiBot development for quite some time.
Daniel
node:pywikipedia kudu$ python replace.py -transcludes:sometemplate -regex 'someregex' 'somereplacement'
unicode test: triggers problem #3081100
Getting references to [[Template:sometemplate]] via API...
Getting 1 pages from wiki:wiki...
Traceback (most recent call last):
  File "replace.py", line 825, in <module>
    main()
  File "replace.py", line 816, in main
    bot.run()
  File "replace.py", line 399, in run
    new_text = self.doReplacements(new_text)
  File "replace.py", line 342, in doReplacements
    allowoverlap=self.allowoverlap)
  File "~/pywikipedia/pywikibot/textlib.py", line 175, in replaceExcept
    match.group(groupID) + \
TypeError: coercing to Unicode: need string or buffer, NoneType found
~K
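The traceback is consistent with a group reference (such as \1) in the replacement string pointing at a group that did not take part in the match: match.group() then returns None, and concatenating None to a string raises a TypeError. Below is a minimal sketch of that situation and one possible guard. The helper expand_groups() is hypothetical and is not the actual code of textlib.replaceExcept; the sketch is written for Python 3, where the error message differs from the Python 2 one above.

```python
import re


def expand_groups(replacement, match):
    """Hypothetical helper: expand \\1, \\2, ... using the groups of `match`.

    A group that did not participate in the match is returned as None by
    match.group(); it is replaced by an empty string here instead of being
    concatenated directly.
    """
    def group_text(ref):
        value = match.group(int(ref.group(1)))
        return value if value is not None else ""

    # A function passed to re.sub() has its return value inserted literally.
    return re.sub(r"\\(\d+)", group_text, replacement)


# Only the second alternative matches, so group(1) is None.
match = re.search(r"(a)|(b)", "b")
result = expand_groups(r"[\1\2]", match)
print(result)
```

Whether an empty string is the right substitute for an unmatched group is a design decision; raising a clearer error would be another option.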
Hi Merlijn,
I saw it during a discussion with Nikerabbit on tw.net (see the first LQ thread about pywikipedia), and I am working on a Pythonic version of plural-gettext.txt. I am using a lambda function for each rule, like this one (for mk-wiki):
plural = lambda n: 0 if (n == 1 or n%10 == 1) else 1
I think this is the easiest way.
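As a rough sketch of how such lambdas could be collected per language: the table name plural_rules and the helper plural_index() below are made up for illustration, and only the mk rule is taken from this mail. The en and fr entries follow the usual gettext rules (n != 1 and n > 1). Python's conditional expression "b if a else c" plays the role of the C-style "a ? b : c".

```python
# Hypothetical table: language code -> function returning the plural index.
plural_rules = {
    "en": lambda n: 0 if n == 1 else 1,
    "fr": lambda n: 0 if n <= 1 else 1,
    # Rule quoted in this mail for mk-wiki.
    "mk": lambda n: 0 if (n == 1 or n % 10 == 1) else 1,
}


def plural_index(code, n):
    """Return the plural index for `n`, falling back to the English rule
    when no rule is known for `code` (an assumption of this sketch)."""
    rule = plural_rules.get(code, plural_rules["en"])
    return rule(n)


for number in (1, 2, 21):
    print(number, plural_index("en", number), plural_index("mk", number))
```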
One problem remains, found in table2wiki. We currently have three messages: one is used when there is no warning, and the other two are used for a single warning and for several warnings, respectively. I guess we could keep the first message, "table2wiki-no-warning", and merge "table2wiki-one-warning" and "table2wiki-warnings" into one message using a {{Plural:}} tag. Alternatively, we could always use the first message and append the warning message when needed.
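To illustrate the merge, here is a minimal sketch of expanding a single {{PLURAL:}} tag with a two-form rule. The function expand_plural(), the regex and the message text are assumptions made for this example; they are not the existing i18n.py code or the real table2wiki messages.

```python
import re

# Matches {{PLURAL:<number>|variant|variant...}} after the number has
# already been substituted into the message.
PLURAL_TAG = re.compile(r"\{\{PLURAL:(\d+)\|(.*?)\}\}", re.IGNORECASE)


def expand_plural(message, plural_func=lambda n: 0 if n == 1 else 1):
    """Replace each PLURAL tag with the variant selected by plural_func."""
    def pick(match):
        number = int(match.group(1))
        variants = match.group(2).split("|")
        # Clamp the index in case a translation supplies fewer variants.
        index = min(plural_func(number), len(variants) - 1)
        return variants[index]

    return PLURAL_TAG.sub(pick, message)


# Hypothetical merged message replacing the one-warning/warnings pair.
template = "There were %(count)d {{PLURAL:%(count)d|warning|warnings}}."
print(expand_plural(template % {"count": 1}))
print(expand_plural(template % {"count": 3}))
```

With this approach the "no warning" case can still use its own message, as proposed above.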
Greetings
----- Original Message ----
From: Merlijn van Deen <valhallasw(a)arctus.nl>
To: pywikipedia-l(a)lists.wikimedia.org
Date: 25.08.2011 19:46
Subject: Re: [Pywikipedia-l] [Pywikipedia-svn] SVN: [9448] trunk/pywikipedia/pywikibot/i18n.py
> Hello xqt,
>
> On 21 August 2011 15:49, <xqt(a)svn.wikimedia.org> wrote:
>
> > + value. At the moment, we have only one plural_func = x: x!= 1 yet.
> > Multiple
> > + PLURAL tags are not supported (yet).
> >
>
> Good to see you're working on plural support. Have you seen
> http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/Translate/data/plural-gettext.txt?view=markup ,
> which lists the plural possibilities for each language. Unfortunately,
> python does not support the a ? b : c notation, so it's not possible to
> copy the code 1-on-1.
>
> Alternatively, there is
> http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/Translate/data/plural-cldr.yaml?view=markup ,
> but that is fairly unreadable IMO.
>
> Good luck!
>
> Best,
> Merlijn
>
>
> --------------------------------
>
> _______________________________________________
> Pywikipedia-l mailing list
> Pywikipedia-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
If this is also an issue with section detection within pages,
you could (if you like) also consider using the code given
in 'getSections' [1]...
[1]
https://fisheye.toolserver.org/browse/drtrigon/pywikipedia/dtbext/dtbext_wi…
Greetings
DrTrigon
On 03.09.2011 13:58, xqt(a)svn.wikimedia.org wrote:
> http://www.mediawiki.org/wiki/Special:Code/pywikipedia/9494
>
> Revision: 9494
> Author:   xqt
> Date:     2011-09-03 11:58:48 +0000 (Sat, 03 Sep 2011)
> Log Message:
> -----------
> reverrevert r3147 due to bug #2989218; check for italic code in headings. TODO: use a better regex to find it.
>
> Modified Paths:
> --------------
>     trunk/pywikipedia/wikipedia.py
>
> Modified: trunk/pywikipedia/wikipedia.py
> ===================================================================
> --- trunk/pywikipedia/wikipedia.py 2011-09-03 11:17:47 UTC (rev 9493)
> +++ trunk/pywikipedia/wikipedia.py 2011-09-03 11:58:48 UTC (rev 9494)
> @@ -66,7 +66,6 @@
>          within a non-wiki-markup section of text
>      decodeEsperantoX: decode Esperanto text using the x convention.
>      encodeEsperantoX: convert wikitext to the Esperanto x-encoding.
> -    sectionencode: encode text for use as a section title in wiki-links.
>      findmarker(text, startwith, append): return a string which is not part of text
>      expandmarker(text, marker, separator): return marker string expanded
> @@ -654,7 +653,7 @@
>              self._contents = contents
>          hn = self.section()
>          if hn:
> -            m = re.search("=+ *%s *=+" % hn, self._contents)
> +            m = re.search("=+[ ']*%s[ ']*=+" % hn, self._contents)
>              if verbose and not m:
>                  output(u"WARNING: Section does not exist: %s" % self.aslink(forceInterwiki = True))
>          # Store any exceptions for later reference
> @@ -779,8 +778,8 @@
>          else:
>              raise IsRedirectPage(redirtarget)
>          if self.section():
> -            # TODO: What the hell is this? Docu please.
> -            m = re.search("\.3D\_*(\.27\.27+)?(\.5B\.5B)?\_*%s\_*(\.5B\.5B)?(\.27\.27+)?\_*\.3D" % re.escape(self.section()), sectionencode(pageInfo['revisions'][0]['*'],self.site().encoding()))
> +            m = re.search("=+[ ']*%s[ ']*=+" % re.escape(self.section()),
> +                          pageInfo['revisions'][0]['*'])
>              if not m:
>                  try:
>                      self._getexception
> @@ -920,8 +919,8 @@
>          else:
>              raise IsRedirectPage(redirtarget)
>          if self.section():
> -            # TODO: What the hell is this? Docu please.
> -            m = re.search("\.3D\_*(\.27\.27+)?(\.5B\.5B)?\_*%s\_*(\.5B\.5B)?(\.27\.27+)?\_*\.3D" % re.escape(self.section()), sectionencode(text,self.site().encoding()))
> +            m = re.search("=+[ ']*%s[ ']*=+" % re.escape(self.section()),
> +                          text)
>              if not m:
>                  try:
>                      self._getexception
> @@ -4140,8 +4139,7 @@
>          page2._startTime = time.strftime('%Y%m%d%H%M%S', time.gmtime())
>          if section:
> -            m = re.search("\.3D\_*(\.27\.27+)?(\.5B\.5B)?\_*%s\_*(\.5B\.5B)?(\.27\.27+)?\_*\.3D"
> -                          % re.escape(section), sectionencode(text,page2.site().encoding()))
> +            m = re.search("=+[ ']*%s[ ']*=+" % re.escape(section), text)
>              if not m:
>                  try:
>                      page2._getexception
> @@ -4302,7 +4300,7 @@
>          # Use the data loading time.
>          page2._startTime = time.strftime('%Y%m%d%H%M%S', time.gmtime())
>          if section:
> -            m = re.search("\.3D\_*(\.27\.27+)?(\.5B\.5B)?\_*%s\_*(\.5B\.5B)?(\.27\.27+)?\_*\.3D" % re.escape(section), sectionencode(text,page2.site().encoding()))
> +            m = re.search("=+[ ']*%s[ ']*=+" % re.escape(section), text)
>              if not m:
>                  try:
>                      page2._getexception
> @@ -4531,10 +4529,6 @@
>              break
>      return text
>
> -def sectionencode(text, encoding):
> -    """Encode text so that it can be used as a section title in wiki-links."""
> -    return urllib.quote(text.replace(" ","_").encode(encoding)).replace("%",".")
> -
>  ######## Unicode library functions ########
>
>  def UnicodeToAsciiHtml(s):
>
>
> _______________________________________________ Pywikipedia-svn
> mailing list Pywikipedia-svn(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-svn
>
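For reference, a small sketch of what the new pattern from r9494 changes compared with the older "=+ *%s *=+" pattern: it also accepts apostrophes (italic or bold markup) around the heading text. The helper name find_section() and the sample wikitext are made up for this illustration; only the two regexes come from the diff above.

```python
import re


def find_section(text, section):
    """Search `text` for a heading line containing `section`, allowing
    spaces and apostrophes (italic/bold markup) around the title."""
    pattern = "=+[ ']*%s[ ']*=+" % re.escape(section)
    return re.search(pattern, text)


wikitext = "Intro\n== ''History'' ==\nSome text\n"

new_match = find_section(wikitext, "History")
# The older pattern only tolerates spaces, so the italic heading is missed.
old_match = re.search("=+ *%s *=+" % re.escape("History"), wikitext)

print(new_match.group(0) if new_match else None)
print(old_match)
```

As the commit message notes, this is still a loose check: it does not verify that the markup is balanced or that the heading sits on a line of its own.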
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk5ijR8ACgkQAXWvBxzBrDBNNQCgve2/z/SUa3bUNd625ibUKG/G
sEMAn2/LtRfr9kvdV1UX+aVKL9MQZwl8
=9anJ
-----END PGP SIGNATURE-----
Hi DrTrigon,
"wiki" is reserved by Family.known_families. As you know, you get the database name of a site simply via Site.dbName(), which returns "dewiki_p" for "wikipedia:de". For historical reasons, the Wikipedia databases are called xxwiki instead of xxwikipedia.
Your dbname2wikilink() conversion relies on side effects, and I am sure you can adjust the result with stuff.replace("wiki:", "wikipedia:") if needed. In other words (as you wrote to me): "There should be one-- and preferably only one --__obvious__ way to do it", and by the way, "Although that way may not be obvious at first unless you're Dutch" (again: PEP 20, The Zen of Python).
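A minimal sketch of both ideas, under the assumption that only language Wikipedias are handled: an explicit mapping from a database name such as "dewiki_p" to "wikipedia:de", and the string replacement suggested above. The function dbname_to_link_prefix() is hypothetical and is not DrTrigon's dbname2wikilink().

```python
def dbname_to_link_prefix(dbname):
    """Hypothetical helper: map 'dewiki_p' (or 'dewiki') to 'wikipedia:de'.

    Assumption: the name belongs to a language Wikipedia. Special wikis
    whose database name also ends in 'wiki' are not handled here.
    """
    base = dbname[:-2] if dbname.endswith("_p") else dbname
    if not base.endswith("wiki"):
        raise ValueError("not a Wikipedia database name: %r" % dbname)
    return "wikipedia:" + base[:-len("wiki")]


print(dbname_to_link_prefix("dewiki_p"))

# The replacement suggested in this mail, applied to an already built prefix.
# "wikipedia:" does not contain "wiki:", so a correct prefix is left alone.
print("wiki:de".replace("wiki:", "wikipedia:"))
print("wikipedia:de".replace("wiki:", "wikipedia:"))
```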
Sali ;)
xqt
----- Original Message ----
From: "Dr. Trigon" <dr.trigon(a)surfeu.ch>
To: pywikipedia-l(a)lists.wikimedia.org
Date: 30.08.2011 19:02
Subject: Re: [Pywikipedia-l] 'wiki' as synonym for 'wikipedia' in family?
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> > "wiki" is being use as the interwiki link to the original wiki,
> > WikiWikiWeb at http://c2.com/cgi/wiki, so you shouldn't use that
> > for anything related to Wikipedia. You are getting it returned
> > since Wikipedias can use it as an interwiki link to the
> > WikiWikiWeb, i.e. [[:wiki:WelcomeVisitors]] becomes
> > http://c2.com/cgi/wiki?WelcomeVisitors
>
> So this means essentially 'wiki' in toolserver DB (like e.g. 'dewiki')
> does not refer to the same as 'wiki' in pywikipediabot and interwiki...
> (strange...)
>
> But would also explain this 'inconsistency'... ;)
>
> Thanks and Greetings
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk5dF68ACgkQAXWvBxzBrDCi8ACg0YV5W6uEDDY61xcgxB9qsKq8
> 8KsAn0kHTE6IwqZZxZI8Lb/9Dk6E7ciP
> =dKzI
> -----END PGP SIGNATURE-----