Expressions in Cosmetic changes


Jan Dudík

18 Aug 2011

2:53 a.m.

In my cosmetic_changes.py I have the following lines:

    text = pywikibot.replaceExcept(text, u' m. n. m.', u' m n. m.', ['comment', 'math', 'nowiki', 'pre'])
    text = pywikibot.replaceExcept(text, u' viz.', u' viz', ['comment', 'math', 'nowiki', 'pre'])

I think that r at the beginning means a regular expression, while u means plain text. But the bot treats the dot as any character, so I got, for example:

    revize -> reviz
    má na mále -> m na mále

http://cs.wikipedia.org/w/index.php?title=Ferrari_FF&action=historysubmi...

How do I change it to get the correct output?

JAnD

-- -- Ing. Jan Dudík


Ivaylo Balabanov

18 Aug

6:37 a.m.

Hi,

I don't have these lines. Where do I have to put them?

Do you know of a way to upload many images to our local MediaWiki site? I have tried those Python scripts, Extension:SpecialUploadLocal, Extension:SpecialMultiUploadViaZip, and the Java application Commonist (which, by the way, gave the same error as the Python scripts), as well as the Word add-in Word2MediaWikiPlus_Installer, and nothing worked. I am almost desperate. I am about to write a .NET application for this, but I think it is going to be very difficult.

Regards,
Ivo

_______________________________________________ Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l

Alaric Favier

6:45 a.m.

When I run

    $ python cosmetic_changes.py -lang:cs -page:"Ferrari FF"

it returns "No changes were necessary in Ferrari FF".

Are you sure your pywikipedia is up to date?

-- Alaric FAVIER


Jan Dudík

7:22 a.m.

These are *my own* lines (Czech spelling) in one part:

    def validXhtml(self, text):
        text = pywikibot.replaceExcept(text, r'(?i)<br[ /]*>', r' ', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, r' ', r' ', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, r' ', r' ', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, r'', r' ', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, r'<\br>', r' ', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, r'{{DEFAULTSORT: ', r'{{DEFAULTSORT:', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u' m. n. m.', u' m n. m.', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'ceřinn', u'ceřin', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'newyork', u'newyor', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'vikings', u'vikins', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'vikingš', u'vikinš', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'motlit', u'modlit', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'terciální', u'terciární', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'vymítit', u'vymýtit', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u' viz.', u' viz', ['comment', 'math', 'nowiki', 'pre'])
        text = pywikibot.replaceExcept(text, u'<references/>', u'<references />', ['comment', 'math', 'nowiki', 'pre'])
        return text

All lines work OK, but the two with dots are problematic.


-- -- Ing. Jan Dudík

Bináris

10:25 a.m.

2011/8/18 Jan Dudík jan.dudik@gmail.com

...

These are *my own* lines (czech spelling) in one part:

Some errors (I don't promise to find all)

...

   text = pywikibot.replaceExcept(text, r'<br\>', r'<br />',

This is equivalent to r'<br>', r'<br />'. If you want to find <br\>, write r'<br\\>' for a literal backslash.
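The point can be checked with Python's standard `re` module, used here as a stand-in for whatever replaceExcept does with its pattern (that it compiles the pattern as a regex is an assumption, not something shown in this thread). In a regex, `\>` is just an escaped `>`, while `\\` matches one literal backslash:

```python
import re

# In a regex, \> is simply an escaped '>', so this pattern matches "<br>".
escaped_gt = re.compile(r'<br\>')
# A doubled backslash matches one literal backslash, i.e. the text <br\>.
literal_backslash = re.compile(r'<br\\>')

samples = ['<br>', '<br\\>']  # the second sample is the 5-character text <br\>
matches_escaped_gt = [bool(escaped_gt.fullmatch(s)) for s in samples]
matches_literal_backslash = [bool(literal_backslash.fullmatch(s)) for s in samples]

print(matches_escaped_gt)         # [True, False]
print(matches_literal_backslash)  # [False, True]
```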

...

   text = pywikibot.replaceExcept(text, r'<\br>', r' ',

\b matches a word boundary (the start or end of a word). I don't think this is what you meant. If you are looking for <\br> strings, write r'<\\br>'.

...

   text = pywikibot.replaceExcept(text, u' m. n. m.', u' m n. m.',

Write \. instead of . in the pattern. This caused your error in Ferrari.

...

   text = pywikibot.replaceExcept(text, u' viz.', u' viz',

Again, write \. This is the other error. Whether or not the bot handles your expression as a regex is not sotred within the expression itself, but rather somewhere else among the settings; I have never used cc, though. Read the description of replaceExcept for a regex switch (it is in textlib.py in the pywikipedia directory).
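The effect of the unescaped dot can be reproduced with plain `re.sub`. This is only a sketch: it assumes that replaceExcept treats a string pattern as a regular expression, which is consistent with the behaviour reported in this thread, and the sample text is invented. `re.escape` is one way to avoid escaping by hand when the search text is meant literally.

```python
import re

text = 'A viz. B, C vize D'

# Unescaped: '.' matches any character, so " vize" is shortened as well.
unescaped = re.sub(r' viz.', ' viz', text)
# Escaped: '\.' matches only a literal dot.
escaped = re.sub(r' viz\.', ' viz', text)
# re.escape() builds the escaped pattern from plain text automatically.
auto_escaped = re.sub(re.escape(' viz.'), ' viz', text)

print(unescaped)     # A viz B, C viz D
print(escaped)       # A viz B, C vize D
print(auto_escaped)  # A viz B, C vize D
```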

-- Bináris

Bináris

10:28 a.m.

Jan, one more word: try to use replace.py with user-fixes.py or fixes.py for spelling; it is MUCH more flexible. Cc is not designed for spelling corrections. As you go on and find more and more errors to correct, you will feel the difference more and more.
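For orientation, a spelling fix in user-fixes.py might look roughly like the sketch below. The fix name `cs-spelling`, the edit summary, and the helper `apply_fix` are hypothetical, and the exact keys expected by a given pywikipedia version should be checked against the examples in fixes.py. The helper only imitates, with plain `re.sub`, what replace.py would do with the list.

```python
import re

# Stand-in so the snippet runs on its own; in a real user-fixes.py the
# `fixes` dictionary is expected to be provided by the framework.
fixes = {}

fixes['cs-spelling'] = {      # hypothetical fix name
    'regex': True,            # treat the patterns as regular expressions
    'msg': {'cs': 'Robot: oprava překlepů'},  # hypothetical edit summary
    'replacements': [
        (r' viz\.', ' viz'),
        (r'terciální', 'terciární'),
        (r'vymítit', 'vymýtit'),
    ],
}


def apply_fix(name, text):
    """Apply every (pattern, replacement) pair of one fix, in order."""
    for pattern, replacement in fixes[name]['replacements']:
        text = re.sub(pattern, replacement, text)
    return text


fixed = apply_fix('cs-spelling', 'terciální sektor, viz. vize')
print(fixed)  # terciární sektor, viz vize
```

Such a fix would then be selected with the `-fix:` option of replace.py, for example `-fix:cs-spelling` for the hypothetical name used here.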

-- Bináris

Jan Dudík

19 Aug

8:11 a.m.

Thanks, I fixed the bugs and now it works as expected.

replace.py is useful if I want to replace all misspellings at once. But some of them are very common, so I keep them in cosmetic changes for a quick fix while working on interwiki.

JAnD


-- -- Ing. Jan Dudík

Bináris

18 Aug

10:31 a.m.

2011/8/18 Bináris wikiposta@gmail.com

...

Whether or not the bot handles your expression as regex, is not *sotred* within the expression itself,

stored, of course

-- Bináris

Bináris

10:15 a.m.

2011/8/18 Jan Dudík jan.dudik@gmail.com

...

in my cosmetic_changes.py I have following lines:

I think, that r in the beginning means regular expression, but u means simple text.

Not really. r means raw text, which means you don't have to double your backslashes. For example, you write r'\.' for a literal dot (since '.' means any character), but without r you would have to write '\\.'

u means unicode; you must use it when there are non-ASCII characters in the expression. I like to use it always, it is safer. See fixes.py for examples in which these are combined: ur'blahblah'.
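A short check of the prefix behaviour, written for Python 3 (an assumption; the thread dates from the Python 2 era). In Python 3 every string is already Unicode, the u prefix is optional, and the combined ur'...' prefix is no longer accepted, so only r is shown:

```python
import re

# A raw string keeps backslashes as typed; a normal string needs them doubled.
raw_pattern = r'\.'
plain_pattern = '\\.'
same_pattern = raw_pattern == plain_pattern  # both are backslash + dot

# The escaped pattern matches a literal dot only, not "any character".
literal_dot_hits = re.findall(raw_pattern, 'a.b,c')
any_char_hits = re.findall('.', 'a.b,c')

print(same_pattern)      # True
print(literal_dot_hits)  # ['.']
print(any_char_hits)     # ['a', '.', 'b', ',', 'c']
```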

-- Bináris
