Bugs item #3488657, was opened at 2012-02-17 11:16
Message generated for change (Tracker Item Submitted) made by
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3488657...

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: other
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Yevhen Movsesov ()
Assigned to: Nobody/Anonymous (nobody)
Summary: Cosmetic: Sign "_" replaced with space for [[http://links]]
Initial Comment:
1. Some articles contain incorrectly formatted hyperlinks, like this one: [[http://google.com/some_page]]
2. In this case cosmetic_changes.py rewrites the link as [[http://google.com/some page]] (a space instead of the underscore).
3. I think this is incorrect, even if the article contains incorrectly formatted hyperlinks.
4. An example: the following command produced the edit linked below it.
python cosmetic_changes.py -lang:ru -always -page:"Микаелян, Сергей Абгарович"
http://ru.wikipedia.org/w/index.php?title=%D0%9C%D0%B8%D0%BA%D0%B0%D0%B5%D0%...
5. It looks like this can be avoided simply by moving the line text = self.cleanUpLinks(text) below text = self.fixSyntaxSave(text).
6. So it looks like the correct order of calls should be:
    text = self.fixSelfInterwiki(text)
    text = self.standardizePageFooter(text)
    text = self.cleanUpSectionHeaders(text)
    text = self.putSpacesInLists(text)
    text = self.translateAndCapitalizeNamespaces(text)
    text = self.replaceDeprecatedTemplates(text)
    text = self.resolveHtmlEntities(text)
    text = self.validXhtml(text)
    text = self.removeUselessSpaces(text)
    text = self.removeNonBreakingSpaceBeforePercent(text)
    text = self.fixSyntaxSave(text)
    text = self.cleanUpLinks(text)
    text = self.fixHtml(text)
    text = self.fixStyle(text)
    text = self.fixTypo(text)
    text = self.fixArabicLetters(text)
7. Version information:
Pywikipedia [http] trunk/pywikipedia (r9901, 2012/02/16, 22:44:36)
Python 2.6.7 (r267:88850, Sep 19 2011, 13:25:28) [GCC 4.5.2]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
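
The ordering problem described in items 5 and 6 can be illustrated with a small stand-alone sketch. The two functions below are simplified, hypothetical stand-ins, not the real cosmetic_changes.py methods: fix_external_in_brackets assumes that fixSyntaxSave is the step that turns [[http://...]] into [http://...], and clean_up_links assumes that cleanUpLinks replaces underscores with spaces inside [[...]] targets. Under those assumptions, only the call order decides whether the URL is damaged.

```python
import re


def fix_external_in_brackets(text):
    # Hypothetical stand-in for the relevant part of fixSyntaxSave:
    # an external link wrongly written as [[http://...]] becomes [http://...].
    return re.sub(r'\[\[(https?://[^\]|]+?)\]\]', r'[\1]', text)


def clean_up_links(text):
    # Hypothetical stand-in for cleanUpLinks: inside a [[...]] wikilink,
    # underscores in the target are replaced with spaces.
    def repl(match):
        return '[[' + match.group(1).replace('_', ' ') + ']]'
    return re.sub(r'\[\[([^\]|]+?)\]\]', repl, text)


text = 'See [[http://google.com/some_page]] and [[Some_article]].'

# Reported order: links are cleaned first, so the URL is still treated
# as a wikilink target and loses its underscore.
buggy = fix_external_in_brackets(clean_up_links(text))

# Proposed order: the syntax fix runs first, so the URL is already a
# single-bracket external link when the wikilink cleanup runs.
fixed = clean_up_links(fix_external_in_brackets(text))

print(buggy)
print(fixed)
```

With the reported order the sketch prints the URL as http://google.com/some page, while the proposed order keeps http://google.com/some_page and still normalizes [[Some_article]] to [[Some article]].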