https://bugzilla.wikimedia.org/show_bug.cgi?id=73534
Bug ID: 73534
Summary: missing language in family causes exception in Page.langlinks
Product: Pywikibot
Version: core (2.0)
Hardware: All
OS: All
Status: NEW
Severity: critical
Priority: Unprioritized
Component: General
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: jayvdb@gmail.com
Web browser: ---
Mobile Platform: ---
On en.wowwiki:
ERROR: testLinks (tests.page_tests.TestPageObject)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/page_tests.py", line 469, in testLinks
    for p in mainpage.langlinks():
  File "pywikibot/page.py", line 1189, in langlinks
    self._langlinks = list(self.iterlanglinks(include_obsolete=True))
  File "pywikibot/site.py", line 2987, in pagelanglinks
    source=self)
  File "pywikibot/page.py", line 4386, in langlinkUnsafe
    link._site = pywikibot.Site(lang, source.family.name)
  File "pywikibot/__init__.py", line 573, in Site
    _sites[key] = interface(code=code, fam=fam, user=user, sysop=sysop)
  File "pywikibot/site.py", line 1422, in __init__
    BaseSite.__init__(self, code, fam, user, sysop)
  File "pywikibot/site.py", line 451, in __init__
    % (self.__code, self.__family.name))
UnknownSite: Language nn does not exist in family wowwiki
John Mark Vandenberg jayvdb@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |70936
--- Comment #1 from John Mark Vandenberg jayvdb@gmail.com ---
On en.vikidia:
ERROR: testLinks (tests.page_tests.TestPageObject)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/page_tests.py", line 469, in testLinks
    for p in mainpage.langlinks():
  File "pywikibot/page.py", line 1185, in langlinks
    self._langlinks = list(self.iterlanglinks(include_obsolete=True))
  File "pywikibot/site.py", line 2983, in pagelanglinks
    source=self)
  File "pywikibot/page.py", line 4382, in langlinkUnsafe
    link._site = pywikibot.Site(lang, source.family.name)
  File "pywikibot/__init__.py", line 573, in Site
    _sites[key] = interface(code=code, fam=fam, user=user, sysop=sysop)
  File "pywikibot/site.py", line 1422, in __init__
    BaseSite.__init__(self, code, fam, user, sysop)
  File "pywikibot/site.py", line 451, in __init__
    % (self.__code, self.__family.name))
UnknownSite: Language de does not exist in family vikidia
Fabian CommodoreFabianus@gmx.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |CommodoreFabianus@gmx.de
--- Comment #2 from Fabian CommodoreFabianus@gmx.de ---
Well, in the case of vikidia it assumes that the language code 'de' is a part of the family, but the Special:Interwiki page reveals that it's actually another wiki (http://grundschulwiki.zum.de/wiki/Hauptseite , although I don't know if they are affiliated): http://en.vikidia.org/wiki/Special:Interwiki
And the mainpage of the English Vikidia shows an interwiki link to 'de:': http://en.vikidia.org/w/index.php?title=Main_Page&action=edit
I'm not sure what the solution there is, because it seems like a method which analyses it without any prior checking. The most flexible way would be to use 'APISite.interwiki()' which would tell that there is no family for grundschulwiki.zum.de.
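The idea of consulting the interwiki table before assuming a prefix is a language of the same family can be sketched roughly as follows. This is not the pywikibot implementation and does not call 'APISite.interwiki()'; the map contents, the set of known hosts, and the name resolve_prefix are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical interwiki map, shaped like the rows Special:Interwiki lists.
# Only the 'de' target is taken from this report; 'fr' is illustrative.
INTERWIKI_MAP = {
    "fr": "http://fr.vikidia.org/wiki/$1",
    "de": "http://grundschulwiki.zum.de/wiki/$1",
}

# Hypothetical set of hosts that the local family file knows about.
KNOWN_FAMILY_HOSTS = {"en.vikidia.org", "fr.vikidia.org"}


def resolve_prefix(prefix, interwiki_map=INTERWIKI_MAP,
                   known_hosts=KNOWN_FAMILY_HOSTS):
    """Return (host, in_family) for an interwiki prefix.

    Raises KeyError if the prefix is not in the interwiki map at all.
    """
    host = urlsplit(interwiki_map[prefix]).hostname
    return host, host in known_hosts
```

With this data, resolve_prefix("de") reports grundschulwiki.zum.de as a host that the family file does not cover, so the caller can decide what to do instead of hitting UnknownSite.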
--- Comment #3 from John Mark Vandenberg jayvdb@gmail.com ---
(In reply to Fabian from comment #2)
> Well in the case of vikidia it assumes that the language code 'de' is a part of the family, but the Special:Interwiki page reveals that it's actually another wiki (http://grundschulwiki.zum.de/wiki/Hauptseite , although I don't know if they are affiliated): http://en.vikidia.org/wiki/Special:Interwiki
They are affiliated.
Wowwiki is the same: the 'family' is split across several domains - see our family file for wowwiki.
--- Comment #4 from Fabian CommodoreFabianus@gmx.de ---
Okay, I'm checking the source code for the 'from_url' support. It currently supports all sites that use the "<code>.url" scheme, which vikidia does not follow.
But otherwise it should be possible to just add an entry 'de' to the langs dictionary of vikidia.
The wowwiki family also needs to be checked for whether it supports 'from_url'.
--- Comment #5 from Fabian CommodoreFabianus@gmx.de ---
Okay, never mind, from_url is flexible enough for that, so are there any caveats to simply adding the missing languages?
--- Comment #6 from John Mark Vandenberg jayvdb@gmail.com ---
The problem is that adding missing languages isn't possible after the library is released on PyPI. Options include:
1. automatically find new subdomains in the Family layer (e.g. https://gerrit.wikimedia.org/r/#/c/171616/), or
2. load Site objects which are not in Family.langs (https://gerrit.wikimedia.org/r/#/c/170931/), or
3. don't package family files with the library on PyPI. They could be a separate package (this sounds like it should be done anyway).
--- Comment #7 from Fabian CommodoreFabianus@gmx.de ---
Well, we (or pywikibot for that matter) can't know if a site is a part of a family. How could we know that the site linked with 'de' on en.vikidia is in the same family as vikidia itself?
We could assume that ISO language codes are part of the family, as those are shown on the sidebar by default. But that assumption does not cover everything: e.g. test.wikidata, which is in the wikidata family, has no ISO language code.
--- Comment #8 from John Mark Vandenberg jayvdb@gmail.com ---
Assuming ISO language codes are part of the family would be quite a sophisticated strategy. That type of logic would be easily applied when creating the Link object; e.g. creating a Link object even if no Site object can be created.
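A minimal sketch of "creating a Link object even if no Site object can be created", assuming stand-in names throughout: make_site, LangLink, and KNOWN_SITES are hypothetical and only mimic the failure seen in the traceback; they are not pywikibot classes.

```python
class UnknownSite(Exception):
    """Stand-in for the exception raised in the traceback above."""


# Hypothetical registry of (family, code) pairs the family file defines.
KNOWN_SITES = {("vikidia", "en"), ("vikidia", "fr")}


def make_site(code, family):
    """Stand-in for a Site factory that rejects unknown codes."""
    if (family, code) not in KNOWN_SITES:
        raise UnknownSite(
            "Language %s does not exist in family %s" % (code, family))
    return "%s:%s" % (family, code)


class LangLink:
    """Hypothetical link that survives a missing Site."""

    def __init__(self, code, family, title):
        self.code = code
        self.title = title
        try:
            self.site = make_site(code, family)
        except UnknownSite:
            # Keep the link; callers must handle site being None.
            self.site = None
```

In this sketch a langlinks-style loop could still list the 'de' link on en.vikidia, with site set to None, rather than aborting on the first unknown code.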
A dumber version of that is for the family to register multiple domains / regexes in a class variable; the family class then assumes any matching domain name is a member of the family, and creates Site objects accordingly.
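The domain-matching variant might look roughly like the sketch below. The class name DomainPatternFamily, its attributes, and the patterns themselves are assumptions chosen for illustration; the real Family class in pywikibot is structured differently.

```python
import re


class DomainPatternFamily:
    """Hypothetical family that accepts any host matching its patterns."""

    name = "vikidia"

    # Assumed patterns: language subdomains plus one affiliated domain.
    domain_patterns = [
        re.compile(r"^(?P<code>[a-z]{2,3})\.vikidia\.org$"),
        re.compile(r"^grundschulwiki\.zum\.de$"),
    ]

    # Codes for hosts whose name does not carry a language code.
    fixed_codes = {"grundschulwiki.zum.de": "de"}

    @classmethod
    def code_for_host(cls, host):
        """Return the site code for host, or None if it is not a member."""
        host = host.lower()
        for pattern in cls.domain_patterns:
            match = pattern.match(host)
            if match:
                # Use the captured code if the pattern has one,
                # otherwise fall back to the fixed mapping.
                return (match.groupdict().get("code")
                        or cls.fixed_codes.get(host))
        return None
```

The trade-off is the one named above: any host that matches is treated as a member, so the patterns need to be tight enough not to claim unrelated wikis.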
I expect we want to add a few classes to help us group types of families and the functionality they contain. The most distinct type of family is the one with ISO codes for different languages of the project (MultilangFamily / ISOLangFamily?). Those families usually include a non-ISO-code project, e.g. meta.anarchopedia.org, beta.wikiversity.org, and www.wikisource.org; however, the last two could/should be given the 'mul' ISO language code, and treated differently of course: mul.wikiversity.org (doesn't work) and mul.wikisource.org (redirector only). See bug 41807, bug 62717, etc.
pywikipedia-bugs@lists.wikimedia.org