I have submitted a patch that clarifies what \uFFFD is (U+FFFD, the Unicode replacement character): https://gerrit.wikimedia.org/r/80865

Apart from that, I think this is a site-specific problem, so I consider this issue solved :-)


[bugs:#1658] “Title contains illegal char (\uFFFD)” with existing page

Status: closed-wont-fix
Labels: character encoding
Created: Sun Aug 25, 2013 08:00 AM UTC by Adrián Chaves Fernández
Last Updated: Sun Aug 25, 2013 03:30 PM UTC
Owner: nobody

This is happening with the following existing page: http://techbase.kde.org/Localization/fy/Fryske_kompjûterwurden

Traceback (most recent call last):
  File "maintenance.py", line 81, in <module>
    main()
  File "maintenance.py", line 77, in main
    bot.run()
  File "/home/gallaecio/fontes/rodela/scripts/replace.py", line 326, in run
    for page in self.generator:
  File "/home/gallaecio/fontes/rodela/pywikibot/pagegenerators.py", line 799, in PreloadingGenerator
    for page in generator:
  File "/home/gallaecio/fontes/rodela/pywikibot/pagegenerators.py", line 749, in DuplicateFilterPageGenerator
    for page in generator:
  File "/home/gallaecio/fontes/rodela/pywikibot/data/api.py", line 706, in __iter__
    yield self.result(item)
  File "/home/gallaecio/fontes/rodela/pywikibot/data/api.py", line 780, in result
    p = pywikibot.Page(self.site, pagedata['title'], pagedata['ns'])
  File "/home/gallaecio/fontes/rodela/pywikibot/__init__.py", line 249, in wrapper
    return method(args, **kw)
  File "/home/gallaecio/fontes/rodela/pywikibot/__init__.py", line 249, in wrapper
    return method(args, **kw)
  File "/home/gallaecio/fontes/rodela/pywikibot/page.py", line 77, in __init__
    self._link = Link(title, source=source, defaultNamespace=ns)
  File "/home/gallaecio/fontes/rodela/pywikibot/page.py", line 2958, in __init__
    raise pywikibot.Error("Title contains illegal char (\uFFFD)")
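
For context, here is a minimal sketch of one way U+FFFD can end up in a title. It assumes (this is not confirmed for techbase.kde.org) that the title bytes arrive in Latin-1 but are decoded as UTF-8 with replacement of invalid bytes. The names decode_title and has_replacement_char are hypothetical helpers for illustration, not part of pywikibot.

```python
# U+FFFD is the Unicode replacement character, emitted by decoders
# in place of byte sequences they cannot interpret.
REPLACEMENT_CHAR = "\ufffd"


def decode_title(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid bytes (hypothetical helper)."""
    return raw.decode("utf-8", errors="replace")


def has_replacement_char(title: str) -> bool:
    """Return True if the title contains U+FFFD (hypothetical helper)."""
    return REPLACEMENT_CHAR in title


title = "Fryske_kompjûterwurden"

# Correctly encoded UTF-8 round-trips without loss.
good = decode_title(title.encode("utf-8"))

# Assumed failure mode: Latin-1 bytes misread as UTF-8.
# The byte 0xFB ("û" in Latin-1) is never valid in UTF-8, so it is replaced.
bad = decode_title(title.encode("latin-1"))

print(has_replacement_char(good))  # False
print(has_replacement_char(bad))   # True
```

If something like this is what happens on the site, the original character is already lost by the time the title reaches Link, which would fit the site-specific diagnosis above.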

