Hi Daniel,

Changing the loop to the one below shows that the first problematic pageid is 28644448, whose title is the single character \x85.

>>> for each_article in cat.articles(namespaces=(0)):
...     try:
...         print(each_article.title(withNamespace=True), each_article.pageid)
...     except pywikibot.exceptions.InvalidTitle:
...         print(each_article.pageid)
...         raise

str.strip() removes this character, resulting in an empty string, so the exception is raised (page.py#L5666-L5670).
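
In case it helps, the stripping behaviour can be reproduced without pywikibot. This is only a minimal sketch in plain Python 3; the variable names are mine, and it does not reproduce the actual parsing code in page.py, just the str.strip() step that I believe triggers the error:

```python
# The problematic title: a single U+0085 (NEXT LINE) character.
title = "\x85"

# Python 3 classifies U+0085 as whitespace...
is_whitespace = title.isspace()

# ...so strip() with no arguments removes it, leaving an empty string.
stripped = title.strip()

print(is_whitespace)
print(repr(stripped))
```

Since the stripped title is empty, there is no page title left to parse, which would match the "The link does not contain a page title" message.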


On Mon, Jun 18, 2018 at 1:23 PM Daniel Glus <danielhglus@gmail.com> wrote:
Hi all,

I'm getting a strange InvalidTitle error while iterating through each of the articles in the English Wikipedia's "Unprintworthy redirects" category using the .articles() function.

In particular, if you run this code:

import pywikibot
site = pywikibot.Site("en", "wikipedia"); site.login()
cat = pywikibot.Category(site, "Category:Unprintworthy redirects")
for each_article in cat.articles(namespaces=(0)):
    print(each_article.title(withNamespace=True), each_article.pageid)

Then it'll run for a while, printing out a bunch of titles and page IDs, and then crash:

Traceback (most recent call last):
  File "/data/project/apersonbot/test-redir-bann.py", line 5, in <module>
    print(each_article.title(withNamespace=True), each_article.pageid)
  File "/shared/pywikipedia/core/pywikibot/tools/__init__.py", line 1446, in wrapper
    return obj(*__args, **__kw)
  File "/shared/pywikipedia/core/pywikibot/page.py", line 322, in title
    title = self._link.canonical_title()
  File "/shared/pywikipedia/core/pywikibot/page.py", line 5737, in canonical_title
    if self.namespace != Namespace.MAIN:
  File "/shared/pywikipedia/core/pywikibot/page.py", line 5698, in namespace
  File "/shared/pywikipedia/core/pywikibot/page.py", line 5669, in parse
    raise pywikibot.InvalidTitle("The link does not contain a page "
pywikibot.exceptions.InvalidTitle: The link does not contain a page title
CRITICAL: Closing network session.

Any ideas? I don't think this is expected behavior, but I could be wrong.

- Daniel
pywikibot mailing list