Daviskr created this task. Daviskr added a subscriber: Daviskr. Daviskr added a project: Pywikibot-pagegenerators.
TASK DESCRIPTION When `-titleregex` is called with a namespace other than 0 (with GeneratorFactory), no pages are returned. Only pages with namespace 0 are ever returned.
In GeneratorFactory, `-titleregex` calls `RegexFilterPageGenerator` (which is really `RegexFilter.titlefilter`) with an argument of `site.allpages()`. `allpages` defaults to namespace 0. When `getCombinedGenerator()` is called, the generator (already filled with namespace-0 pages) is not considered a `pywikibot.data.api.QueryGenerator`, so it is filtered with `NamespaceFilterPageGenerator` for the requested namespaces, which yields either no results or only namespace-0 pages.
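The interaction described above can be reproduced without pywikibot. The following is only a minimal sketch: `Page`, `WIKI`, `all_pages`, `title_regex_filter` and `namespace_filter` are hypothetical stand-ins invented for illustration, not the pywikibot functions, and passing the namespace to the source is just one possible remedy, not necessarily the one the project adopts.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a wiki page; not the pywikibot Page class.
Page = namedtuple("Page", ["namespace", "title"])

# Tiny in-memory "wiki" with pages in namespaces 0 and 4.
WIKI = [
    Page(0, "Python"),
    Page(0, "Pywikibot"),
    Page(4, "Pywikibot help"),
    Page(4, "Village pump"),
]


def all_pages(namespace=0):
    """Yield pages from one namespace; defaults to 0, as allpages does."""
    return (page for page in WIKI if page.namespace == namespace)


def title_regex_filter(pages, pattern):
    """Yield only the pages whose title matches the pattern."""
    regex = re.compile(pattern)
    return (page for page in pages if regex.search(page.title))


def namespace_filter(pages, namespaces):
    """Yield only the pages that belong to one of the given namespaces."""
    return (page for page in pages if page.namespace in namespaces)


# Reported behaviour: the source is built with the default namespace,
# and the namespace filter is applied only afterwards.
buggy = list(namespace_filter(title_regex_filter(all_pages(), "Pywiki"), {4}))

# One possible remedy: hand the requested namespace to the source itself.
fixed = list(title_regex_filter(all_pages(namespace=4), "Pywiki"))

print(buggy)
print(fixed)
```

In the first pipeline the namespace filter only ever sees namespace-0 pages, so asking for namespace 4 leaves nothing; in the second the source is restricted to the requested namespace before the regex is applied.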
TASK DETAIL https://phabricator.wikimedia.org/T85389
REPLY HANDLER ACTIONS Reply to comment or attach files, or !close, !claim, !unsubscribe or !assign <username>.
EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Daviskr Cc: Aklapper, Daviskr, pywikipedia-bugs
Daviskr added a project: pywikibot-core. Daviskr set Security to none.
To: Daviskr Cc: Aklapper, Daviskr, jayvdb, pywikipedia-bugs
Daviskr added a comment.
Another side effect of the current setup is that it fetches all pages before it applies `limit`. This causes extreme slowdown as seen in this build https://travis-ci.org/wikimedia/pywikibot-core/jobs/45256940#L1119.
To: Daviskr Cc: Aklapper, Daviskr, jayvdb, pywikipedia-bugs
Ricordisamoa added a subscriber: Ricordisamoa.
To: Ricordisamoa Cc: Aklapper, Daviskr, Ricordisamoa, jayvdb, pywikipedia-bugs
Mpaa added a subscriber: Mpaa. Mpaa added a comment.
In https://phabricator.wikimedia.org/T85389#945688, @Daviskr wrote:
Another side effect of the current setup is that it fetches all pages (that match the regex) before it applies `limit`. This causes extreme slowdown as seen in this build https://travis-ci.org/wikimedia/pywikibot-core/jobs/45256940#L1119.
One clarification: this happens only if a namespace other than 0 is specified, as is done in the test. For the reason explained above, no pages are yielded at all, so the test ends only once all the pages in the specified ns have been fetched by allpages().
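A small simulation may make the slowdown easier to see. This is a sketch under assumptions: `all_pages`, `first_matches` and the `fetched` counter are hypothetical names made up for illustration, and `itertools.islice` stands in for whatever applies `limit` in pywikibot.

```python
from itertools import islice

fetched = 0  # counts how many pages the simulated source produced


def all_pages(total):
    """Hypothetical source that yields page numbers and counts each fetch."""
    global fetched
    for number in range(total):
        fetched += 1
        yield number


def first_matches(pages, predicate, limit):
    """Filter lazily, then stop after `limit` matching pages."""
    return list(islice(filter(predicate, pages), limit))


# Every page passes the filter: the source stops as soon as the limit is hit.
fetched = 0
some = first_matches(all_pages(1000), lambda n: True, 2)
fetched_when_matching = fetched

# No page passes the filter: the limit is never reached,
# so the whole source is consumed before the run can end.
fetched = 0
none = first_matches(all_pages(1000), lambda n: False, 2)
fetched_when_empty = fetched

print(fetched_when_matching, fetched_when_empty)
```

When the filter rejects everything, the limit gives no protection, which matches the observation that the test finishes only after the entire listing has been read.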
To: Mpaa Cc: Aklapper, Daviskr, Ricordisamoa, Mpaa, jayvdb, pywikipedia-bugs
Mpaa added a comment.
https://gerrit.wikimedia.org/r/#/c/181993/
To: Mpaa Cc: Aklapper, Daviskr, Ricordisamoa, Mpaa, jayvdb, pywikipedia-bugs
gerritbot added a project: Patch-For-Review. gerritbot added a comment.
Change 181993 had a related patch set uploaded (by Mpaa): Pagegenerators.py: ns handling for titleregex option
https://gerrit.wikimedia.org/r/181993
https://phabricator.wikimedia.org/tag/patch-for-review/
To: gerritbot Cc: Aklapper, Daviskr, Ricordisamoa, Mpaa, jayvdb, pywikipedia-bugs
Ricordisamoa added subscribers: Ladsgroup, Legoktm. Ricordisamoa merged a task: T57226: -titleregex only searches mainspace.
To: Ricordisamoa Cc: Aklapper, Daviskr, Ricordisamoa, Mpaa, Ladsgroup, Legoktm, jayvdb, pywikipedia-bugs
jayvdb added a subscriber: jayvdb. jayvdb added a comment.
After https://phabricator.wikimedia.org/T57226 is merged, we can re-purpose this task to track the underlying problem; I suspect we want to wait until argparse has landed before fixing the real problem, as it should be much simpler then.
To: jayvdb Cc: Aklapper, Daviskr, Ricordisamoa, Mpaa, Ladsgroup, Legoktm, jayvdb, pywikipedia-bugs
pywikipedia-bugs@lists.wikimedia.org