jayvdb created this task. jayvdb added subscribers: pywikipedia-bugs, jayvdb, XZise. jayvdb added a project: pywikibot-core.
TASK DESCRIPTION Generating family files can break site.interwiki.
```
$ python pwb.py generate_family_file.py http://wiki-commons.genealogy.net/Hauptseite genealogy2
Generating family file from http://wiki-commons.genealogy.net/Hauptseite

==================================
api url: http://wiki-commons.genealogy.net/w/api.php
MediaWiki version: 1.14.1
==================================

Determining other languages...de en nl

There are 4 languages available.
Do you want to generate interwiki links? This might take a long time. ([y]es/[N]o/[e]dit)y
Loading wikis...
  * de... 'utf8' codec can't decode byte 0xfc in position 26478: invalid start byte
  * en... downloaded
  * nl... downloaded
  * de... in cache
Writing pywikibot/families/genealogy2_family.py...
pywikibot/families/genealogy2_family.py already exists. Overwrite? (y/n)y
[jayvdb@localhost new]$ cat pywikibot/families/genealogy2_family.py
# -*- coding: utf-8 -*-
"""
This family file was auto-generated by $Id: 2dd21e4aaf7a93cf8749be841552881a80684b52 $
Configuration parameters:
  url = http://wiki-commons.genealogy.net/Hauptseite
  name = genealogy2

Please do not commit this to the Git repository!
"""

from pywikibot import family

class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'genealogy2'
        self.langs = {
            'nl': 'wiki-nl.genealogy.net',
            'de': 'wiki-commons.genealogy.net',
            'en': 'wiki-en.genealogy.net',
        }

    def scriptpath(self, code):
        return {
            'nl': '/w',
            'de': '/w',
            'en': '/w',
        }[code]

    def version(self, code):
        return {
            'nl': u'1.14.1',
            'de': u'1.14.1',
            'en': u'1.14.1',
        }[code]
```
That family has three different hostnames, and the language keys differ from the subdomains. That might be relevant.
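To make the mismatch concrete, here is a minimal sketch that compares each language key with the first label of its hostname. The `langs` dict is copied from the generated family file; the `subdomain` helper is hypothetical and only for illustration, not part of pywikibot.

```python
# Hostnames copied from the generated genealogy2 family file.
langs = {
    'nl': 'wiki-nl.genealogy.net',
    'de': 'wiki-commons.genealogy.net',
    'en': 'wiki-en.genealogy.net',
}


def subdomain(host):
    """Return the first dot-separated label of a hostname (hypothetical helper)."""
    return host.split('.', 1)[0]


# Collect every language key that is not identical to its subdomain.
mismatched = {
    code: subdomain(host)
    for code, host in langs.items()
    if subdomain(host) != code
}

for code, sub in sorted(mismatched.items()):
    print('{0}: subdomain is {1}'.format(code, sub))
```

None of the three keys equals its subdomain, and 'de' does not appear in its hostname at all.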
When I alter APISite._cache_interwikimap to re-raise the Error it catches, we see:
```
$ python -m unittest tests.link_tests.TestFullyQualifiedNoLangFamilyImplicitLinkParser.test_fully_qualified_NS1_family
max_retries reduced from 25 to 1 for tests
======================================================================
ERROR: test_fully_qualified_NS1_family (tests.link_tests.TestFullyQualifiedNoLangFamilyImplicitLinkParser)
Test 'wikidata:testwiki:Talk:Q6' on enwp is namespace 1.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/link_tests.py", line 813, in test_fully_qualified_NS1_family
    link.parse()
  File "pywikibot/page.py", line 4189, in parse
    newsite = self._site.interwiki(prefix)
  File "pywikibot/site.py", line 692, in interwiki
    self._cache_interwikimap()
  File "pywikibot/site.py", line 676, in _cache_interwikimap
    site = (pywikibot.Site(url=iw['url']), 'local' in iw)
  File "pywikibot/__init__.py", line 564, in Site
    code = family.from_url(url)
  File "pywikibot/family.py", line 1076, in from_url
    '$1'.format(self._get_path_regex()), url)
  File "pywikibot/family.py", line 1058, in _get_path_regex
    'family.'.format(self.name))
Error: Pywikibot is unable to generate an automatic path regex for the family genealogy2. It is recommended to overwrite "_get_path_regex" in that family.

----------------------------------------------------------------------
Ran 1 test in 2.645s

FAILED (errors=1)
```
TASK DETAIL https://phabricator.wikimedia.org/T85658
REPLY HANDLER ACTIONS Reply to comment or attach files, or !close, !claim, !unsubscribe or !assign <username>.
EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: jayvdb Cc: Aklapper, jayvdb, XZise, pywikipedia-bugs
gerritbot added a project: Patch-For-Review. gerritbot added a comment.
Change 182406 had a related patch set uploaded (by John Vandenberg): Fix Family._get_path_regex
https://gerrit.wikimedia.org/r/182406
https://phabricator.wikimedia.org/tag/patch-for-review/
XZise added a comment.
Ehm what is exactly broken? If the family file is not well defined interwiki links to __that__ family will cause an exception. But I don't see how it would usually cause an exception in that test.
Omegat added a subscriber: Omegat.
jayvdb triaged this task as "High" priority. jayvdb added a comment.
In https://phabricator.wikimedia.org/T85658#952041, @XZise wrote:
> Ehm what is exactly broken? If the family file is not well defined interwiki links to __that__ family will cause an exception. But I don't see how it would usually cause an exception in that test.
The biggest problem is that `Site()` in `pywikibot/__init__.py` calls `code = family.from_url(url)` without a try block. As a result, as soon as any generated family causes from_url to fail, Site() fails too, instead of continuing to try from_url with the other families.
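The failure mode can be sketched without pywikibot itself. Everything below is a hypothetical stand-in (`Error`, `FakeFamily`, `find_site`), not the real pywikibot code and not necessarily what the patch under review does. It only shows how wrapping the `from_url` call in a try block lets the lookup move on to the next family.

```python
class Error(Exception):
    """Stand-in for the Error raised in the traceback (assumption)."""


class FakeFamily:
    """Hypothetical stand-in for a family; not the real pywikibot class."""

    def __init__(self, name, hosts, broken=False):
        self.name = name
        self.hosts = hosts  # maps language code -> hostname
        self.broken = broken

    def from_url(self, url):
        """Return the code whose hostname appears in url, else None."""
        if self.broken:
            # Mimics a generated family that cannot build a path regex.
            raise Error('unable to generate an automatic path regex '
                        'for the family {0}'.format(self.name))
        for code, host in self.hosts.items():
            if '//' + host + '/' in url:
                return code
        return None


def find_site(url, families):
    """Return (family name, code) for url, skipping families that raise."""
    for fam in families:
        try:
            code = fam.from_url(url)
        except Error:
            continue  # a misconfigured family must not abort the search
        if code is not None:
            return fam.name, code
    raise Error('Unknown URL {0!r}'.format(url))


families = [
    FakeFamily('genealogy2', {'de': 'wiki-commons.genealogy.net'}, broken=True),
    FakeFamily('wikidata', {'test': 'test.wikidata.org'}),
]
result = find_site('https://test.wikidata.org/wiki/Talk:Q6', families)
print(result)
```

Without the try block, the exception from the first family would propagate and the second family would never be consulted, which matches the traceback above.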
gerritbot added a subscriber: gerritbot. gerritbot added a comment.
Change 182406 merged by jenkins-bot: Add Family.from_url support for generated families
https://gerrit.wikimedia.org/r/182406
pywikipedia-bugs@lists.wikimedia.org