jayvdb created this task.
jayvdb added a subscriber: jayvdb.
jayvdb added projects: pywikibot-core, Possible-Tech-Projects.
Restricted Application added subscribers: Aklapper, pywikipedia-bugs.
TASK DESCRIPTION
There are many features of MediaWiki that are not directly supported in #pywikibot-core.
A list of deployed extensions can be found at [[https://www.mediawiki.org/wiki/Category:Extensions_used_on_Wikimedia|Extensions used on Wikimedia]]. Some of these extensions provide functionality which is mission-critical to some of the projects and cannot yet be accessed via Pywikibot.
Flow is a new technology which is currently only deployed in trials, but it will be such a large, critical component that Pywikibot needs to begin implementing support for it now in order to be ready once it is deployed more widely. See T67119.
Other examples:
* T85656 : Abuse Filter
* Liquid Threads
* T57081 : Flagged Revs
* Proofread Page
* Translate
* PageTriage
* Checkuser
* ULS
* Parsoid
However, before implementing functionality in Pywikibot, it is important to gather requirements for how each component might be used in an automated manner if it were available via Pywikibot.
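As a starting point for such requirements gathering, extension API modules that lack dedicated Pywikibot support can usually already be exercised through a raw API request. A minimal sketch for the AbuseFilter log, assuming the extension is enabled on the target wiki; the `abuselog` module, the `afl*` parameters and the response fields belong to the AbuseFilter extension's API, not to Pywikibot:
```
import pywikibot
from pywikibot.data import api

# Query the AbuseFilter log through a plain API request; no dedicated
# Pywikibot wrapper is involved.
site = pywikibot.Site('en', 'wikipedia')
request = api.Request(site=site, action='query',
                      list='abuselog', afllimit='10')
data = request.submit()
for entry in data['query']['abuselog']:
    # 'timestamp' and 'title' are part of the module's default output.
    print('{timestamp} {title}'.format(**entry))
```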
Project goals:
* Talk to the project communities to identify which tasks they could automate if pywikibot supported additional components/extensions.
* Add support for large MediaWiki component/extension.
* Write a bot script which automates a task for a Wikimedia community.
* Skills: Python
* Suggested micro-task:
* Possible mentors: @jayvdb
TASK DETAIL
https://phabricator.wikimedia.org/T89067
XZise created this task.
XZise added a subscriber: XZise.
XZise added a project: pywikibot-core.
Restricted Application added subscribers: Aklapper, pywikipedia-bugs.
TASK DESCRIPTION
In Python 3.4.2, when `pywikibot.input(…, password=True)` is called, it doesn't output the prompt text immediately but only after the user has entered something:
```
>>> import pywikibot
>>> x = pywikibot.input('AA', password=True)
AA >>> x
'Ha'
>>>
```
In comparison Python 2.7.8:
```
>>> import pywikibot
>>> x =pywikibot.input('AA', password=True)
AA
>>> x
u'Ha'
```
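A plausible explanation is that the prompt is written to a buffered output stream while the password itself is read separately, so on Python 3 nothing is flushed before the read. The following is only a general sketch of that pattern (not Pywikibot's actual implementation), showing how flushing explicitly before the no-echo read avoids the symptom:
```
import getpass
import sys


def prompt_password(prompt):
    """Read a password without echo, making sure the prompt is shown first."""
    # Write and flush the prompt ourselves so buffered output cannot delay it.
    sys.stdout.write(prompt + ' ')
    sys.stdout.flush()
    # getpass only handles the no-echo read; it gets an empty prompt here.
    return getpass.getpass('')
```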
TASK DETAIL
https://phabricator.wikimedia.org/T90338
jayvdb created this task.
jayvdb added subscribers: pywikipedia-bugs, jayvdb, XZise.
jayvdb added a project: pywikibot-core.
TASK DESCRIPTION
Generating a family file can break `site.interwiki`.
```
$ python pwb.py generate_family_file.py http://wiki-commons.genealogy.net/Hauptseite genealogy2
Generating family file from http://wiki-commons.genealogy.net/Hauptseite
==================================
api url: http://wiki-commons.genealogy.net/w/api.php
MediaWiki version: 1.14.1
==================================
Determining other languages...de en nl
There are 4 languages available.
Do you want to generate interwiki links? This might take a long time. ([y]es/[N]o/[e]dit)y
Loading wikis...
* de... 'utf8' codec can't decode byte 0xfc in position 26478: invalid start byte
* en... downloaded
* nl... downloaded
* de... in cache
Writing pywikibot/families/genealogy2_family.py...
pywikibot/families/genealogy2_family.py already exists. Overwrite? (y/n)y
[jayvdb@localhost new]$ cat pywikibot/families/genealogy2_family.py
# -*- coding: utf-8 -*-
"""
This family file was auto-generated by $Id: 2dd21e4aaf7a93cf8749be841552881a80684b52 $
Configuration parameters:
url = http://wiki-commons.genealogy.net/Hauptseite
name = genealogy2
Please do not commit this to the Git repository!
"""
from pywikibot import family
class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'genealogy2'
        self.langs = {
            'nl': 'wiki-nl.genealogy.net',
            'de': 'wiki-commons.genealogy.net',
            'en': 'wiki-en.genealogy.net',
        }

    def scriptpath(self, code):
        return {
            'nl': '/w',
            'de': '/w',
            'en': '/w',
        }[code]

    def version(self, code):
        return {
            'nl': u'1.14.1',
            'de': u'1.14.1',
            'en': u'1.14.1',
        }[code]
```
That family has three different hostnames, and the language keys differ from the subdomains. That might be relevant.
When I alter `APISite._cache_interwikimap` to re-raise the Error it catches, we see:
```
$ python -m unittest tests.link_tests.TestFullyQualifiedNoLangFamilyImplicitLinkParser.test_fully_qualified_NS1_family
max_retries reduced from 25 to 1 for tests
======================================================================
ERROR: test_fully_qualified_NS1_family (tests.link_tests.TestFullyQualifiedNoLangFamilyImplicitLinkParser)
Test 'wikidata:testwiki:Talk:Q6' on enwp is namespace 1.
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/link_tests.py", line 813, in test_fully_qualified_NS1_family
link.parse()
File "pywikibot/page.py", line 4189, in parse
newsite = self._site.interwiki(prefix)
File "pywikibot/site.py", line 692, in interwiki
self._cache_interwikimap()
File "pywikibot/site.py", line 676, in _cache_interwikimap
site = (pywikibot.Site(url=iw['url']), 'local' in iw)
File "pywikibot/__init__.py", line 564, in Site
code = family.from_url(url)
File "pywikibot/family.py", line 1076, in from_url
'\$1'.format(self._get_path_regex()), url)
File "pywikibot/family.py", line 1058, in _get_path_regex
'family.'.format(self.name))
Error: Pywikibot is unable to generate an automatic path regex for the family genealogy2. It is recommended to overwrite "_get_path_regex" in that family.
----------------------------------------------------------------------
Ran 1 test in 2.645s
FAILED (errors=1)
```
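In normal operation `APISite._cache_interwikimap` catches that Error, so the failure is not visible as a traceback; the affected interwiki prefixes simply stop resolving, which is how `site.interwiki` breaks. A rough reproduction sketch once the generated family file is installed (the site and prefix below are only examples; any lookup that goes through `Family.from_url()` takes this path):
```
import pywikibot

# With the generated genealogy2 family file present, resolving an interwiki
# prefix iterates over all known families via Family.from_url().  When it
# reaches genealogy2, the automatic path regex cannot be built and an Error
# is raised; by default _cache_interwikimap() swallows it instead of showing
# the traceback above.
site = pywikibot.Site('en', 'wikipedia')
site.interwiki('wikidata')  # example prefix; triggers the interwiki map load
```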
TASK DETAIL
https://phabricator.wikimedia.org/T85658
XZise created this task.
XZise added a subscriber: XZise.
XZise added a project: pywikibot-core.
Restricted Application added subscribers: Aklapper, pywikipedia-bugs.
TASK DESCRIPTION
As the overall trend should be towards using AutoFamily, I want to list here everything which looks unnecessary because it can be fetched via the API.
The following could be replaced by API calls:
* `namespacesWithSubpage`: This should already be possible via the `Namespace` class, as the namespace siteinfo [[https://www.mediawiki.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces]] returns it as `subpages=""`. Maybe the `Namespace` class should get properties like `has_subpages` which make it easier to use (see the siteinfo sketch after this list).
* `linktrails` and `linktrail()`: At least newer wikis report it via the API [[https://www.mediawiki.org/w/api.php?action=query&meta=siteinfo&siprop=general]], although I'm not sure to what extent a “MediaWiki:Linktrail” page changes/overrides it. The main problem is parsing it into a Python regex (see also [[https://gerrit.wikimedia.org/r/184216/|Gerrit 184216]] and the sketch after this list).
* `known_families` and `get_known_families()`: Could be replaced by using the interwiki map. There is only one usage in the library, which could easily be replaced.
* `nocapitalize`: This is namespace specific and already represented in the `Namespace` class (see `Link.parse`). Its primary use is to avoid capitalizing the username when creating an APISite instance, but according to [[https://www.mediawiki.org/wiki/Manual:$wgCapitalLinkOverrides|Manual:$wgCapitalLinkOverrides]] the User namespace is never affected by that setting (and thus `nocapitalize` is always False there).
* `interwiki_forward` and `interwiki_forwarded_from`: This can be determined via the API, i.e. to which project the `en` prefix redirects (on Commons, for example, to Wikipedia).
* `obsolete`: This is an odd beast with an ambiguous definition. There is a patch to make it obsolete [[https://gerrit.wikimedia.org/r/187358/|Gerrit 187358]].
* `languages_by_size`: There is a patch, but it only works efficiently for some families. There is also a patch to determine it manually, which would work for any family but is relatively slow as it needs to contact the site of every code.
* `protocol()`: AutoFamily defines it automatically. Maybe there should be a simpler approach which just reads a `use_https` boolean attribute, so whenever someone needs a normal Family class they can set `use_https = True`. Alternatively, `generate_family_file.py` should always add that attribute (set according to the protocol of the given URL) so the user easily sees what needs to be configured.
* `ignore_certificate_error()`: Should be handled similarly when a normal Family class is used (a boolean attribute which `generate_family_file` adds with the correct value).
* `scriptpath()`: It is in the siteinfo (like the link trail), but obviously it needs to be defined in order to reach the API in the first place. AutoFamily (given the complete URL) already supplies it.
* `versionnumber()` and `version()`: These are already deprecated, and if the version needs to be configured, `force_version()` should be used.
* `shared_image_repository()`: There is a patch ([[https://gerrit.wikimedia.org/r/181416/|Gerrit 181416]]) to make it more dynamic, but unfortunately it doesn't always work, so some dynamic configuration is still needed.
* `shared_data_repository()`: There is already a bug report for this (TODO: get ID), and it depends on how multiple repositories are represented in the future.
* `server_time()`: Already deprecated in favour of a site method.
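For illustration, the first two items above (`namespacesWithSubpage` and the link trail) can already be read from siteinfo today. A minimal sketch using a raw siteinfo query; the `subpages` flag per namespace and the `linktrail` key in the general info are what current MediaWiki versions return, but field names may differ on older wikis:
```
import pywikibot
from pywikibot.data import api

site = pywikibot.Site('en', 'wikipedia')
data = api.Request(site=site, action='query', meta='siteinfo',
                   siprop='namespaces|general').submit()['query']

# Namespaces that allow subpages carry a 'subpages' flag in siteinfo;
# this is the information namespacesWithSubpage currently hardcodes.
with_subpages = sorted(int(ns_id) for ns_id, ns in data['namespaces'].items()
                       if 'subpages' in ns)

# Newer wikis also report the link trail here; it still needs to be turned
# from a MediaWiki regex fragment into a Python regex.
linktrail = data['general'].get('linktrail')
print(with_subpages, linktrail)
```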
There are also some configuration variables. These should be moved into config2.py with a “global default” and a possibility to overwrite it for each family with a specific setting (see the sketch after the following list). One problem could be when they need to be dynamic, executable code.
* `interwiki_attop`
* `interwiki_on_one_line`
* `interwiki_text_separator`
* `category_attop`
* `category_on_one_line`
* `category_text_separator`
* `categories_last`
* `interwiki_putfirst`
* `interwiki_putfirst_doubled`
* `ssl_pathprefix()`: Although it depends on how the siteinfo then changes, it could be retrieved from there (same problem as `scriptpath()`).
* `nicepath()`
* `rcstream_host()`
* `_get_path_regex(self)`: That needs to change, especially if a site is accessible via multiple hostnames, or it should never be defined at all.
* `maximum_GET_length()`
* `force_version()`
* `code2encoding()` and `encoding()`: It depends on what encoding is meant. The communication with the server at the HTTP level? If so, shouldn't the server declare the encoding in its response, which we could then simply use? If it is really required (and not UTF-8), we could still implement it via a configuration variable.
* `post_get_convert()` and `pre_put_convert()`: These should probably be rewritten into a list of converters, and then individual converters could be enabled via configuration.
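A possible shape for the proposed “global default plus per-family override” layout, purely as an illustration (the attribute name is taken from the list above; the override values and the helper are invented):
```
# Hypothetical config2.py layout: one global default per setting plus an
# optional per-family override, so family files no longer need to carry
# these values themselves.
interwiki_attop = False                           # global default
interwiki_attop_by_family = {'examplefamily': True}  # per-family override


def family_setting(family_name, default, overrides):
    """Return the per-family value, falling back to the global default."""
    return overrides.get(family_name, default)


# e.g. family_setting('examplefamily', interwiki_attop, interwiki_attop_by_family)
```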
Some of the methods are static and don't need to be changed/overwritten and thus don't need to be removed:
* `language_groups`: Although this could probably be defined statically, as it doesn't change between families.
* `hostname()` and `ssl_hostname()`: Those are set correctly in AutoFamily, and the question is whether they need to be overridden in normal Family instances.
* `path()`, `querypath()`, `apipath()`, `nice_get_address()`: Those probably never change and are always relative to `scriptpath()`/`nicepath()`
* `from_url()`
I'm not sure about these however:
* `category_redirect_templates`, `category_redirects()`, `get_cr_templates()`
* `use_hard_category_redirects`
* `disambiguation_templates`, `disambig()`
* `cross_projects`
* `cross_projects_cookies`
* `cross_projects_cookie_username`
* `cross_allowed`
* `disambcatname`
* `ldapDomain`
* `crossnamespace`
* `iw_keys()`: This basically lists all codes plus the codes from `interwiki_forward`. It could probably be replaced by a better interwiki map implementation which allows getting the complete mapping (instead of the current way of getting only one definition at a time); see the sketch after this list.
* `_addlang()`
* `dbName()`
* `code2encodings()` and `encodings()`: Those two are somewhat strange, because by default they return the same value as the singular variants (not even wrapped in a list). And even then, why would multiple encodings need to be defined?
* `isPublic()`
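Regarding the `iw_keys()` item, fetching the whole interwiki map in one call is already possible via siteinfo; a minimal sketch (field names follow the MediaWiki siteinfo API):
```
import pywikibot
from pywikibot.data import api

# Fetch the complete interwiki map in a single request instead of resolving
# one prefix definition at a time.
site = pywikibot.Site('en', 'wikipedia')
data = api.Request(site=site, action='query', meta='siteinfo',
                   siprop='interwikimap').submit()
iw_map = dict((entry['prefix'], entry['url'])
              for entry in data['query']['interwikimap'])
print(len(iw_map), iw_map.get('wikidata'))
```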
TASK DETAIL
https://phabricator.wikimedia.org/T89451
jayvdb created this task.
jayvdb claimed this task.
jayvdb added a subscriber: jayvdb.
jayvdb added a project: Pywikibot-tests.
TASK DESCRIPTION
aspects.py provides similar functionality to the package testscenarios. https://pypi.python.org/pypi/testscenarios
Features of aspects.py that are missing from testscenarios should be added there, so that we could adopt it by essentially renaming `TestCase.sites` to `TestCase.scenarios`, plus keeping any parts of aspects.py that may not be easily generalised.
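For reference, this is roughly what the testscenarios API looks like; the scenario names and attributes below are invented examples of what a `TestCase.sites` definition might map onto:
```
import unittest

import testscenarios


class TestSiteScenarios(testscenarios.TestWithScenarios):

    """Each scenario is a (name, attribute dict) pair applied to the test."""

    scenarios = [
        ('enwiki', {'family': 'wikipedia', 'code': 'en'}),
        ('wikidata', {'family': 'wikidata', 'code': 'wikidata'}),
    ]

    def test_has_family_and_code(self):
        # The attributes of the active scenario are set on the instance.
        self.assertTrue(self.family)
        self.assertTrue(self.code)


if __name__ == '__main__':
    unittest.main()
```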
TASK DETAIL
https://phabricator.wikimedia.org/T85899
jayvdb created this task.
jayvdb added subscribers: Evanontario, jayvdb, pywikipedia-bugs, Jsalsman, Halfak.
jayvdb added a project: pywikibot-core.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
wikiwho currently depends on https://bitbucket.org/halfak/wikimedia-utilities , which is great at XML dump processing but has limited API support.
It would be useful to integrate wikiwho with Pywikibot so it can work on live revisions from the wiki.
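A sketch of the Pywikibot side of such an integration: fetching live revision texts that a wikiwho-style authorship analysis could consume instead of XML dumps (the page and the number of revisions are just examples):
```
import pywikibot

site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, 'Pywikibot')  # example article

# Request the most recent revisions including their wikitext.
for rev in page.revisions(content=True, total=20):
    # Each Revision exposes revid, user, timestamp and (with content=True)
    # text, which is the raw material an authorship algorithm needs.
    print('{0} by {1}: {2} bytes'.format(rev.revid, rev.user,
                                         len(rev.text or '')))
```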
TASK DETAIL
https://phabricator.wikimedia.org/T89763
jayvdb created this task.
jayvdb claimed this task.
jayvdb added subscribers: pywikipedia-bugs, valhallasw, siebrand, Nemo_bis, jayvdb.
jayvdb added projects: Pywikibot-i18n, i18n.
TASK DESCRIPTION
JSON files have been added to pywikibot/i18n, which now has Python and JSON files containing the same messages. The JSON files are not used yet, as the code changes to enable JSON have exposed packaging problems that are the subject of the RFC https://www.mediawiki.org/wiki/Requests_for_comment/pywikibot_2.0_packaging
We need syntax validation of these JSON files for gerrit submissions, as message changes typically need to be approved quickly (and without errors) so they can be merged and the core & compat i18n submodule updated, before the messages can be used in code changes.
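A minimal sketch of the kind of check such a CI job could run: parse every JSON file under the i18n directory and fail on the first syntax error. The directory name is taken from the description above; everything else is illustrative and not the actual jenkins job:
```
import json
import os
import sys

errors = 0
for dirpath, _dirnames, filenames in os.walk('pywikibot/i18n'):
    for name in filenames:
        if not name.endswith('.json'):
            continue
        path = os.path.join(dirpath, name)
        try:
            with open(path) as f:
                json.load(f)
        except ValueError as exc:
            errors += 1
            print('{0}: {1}'.format(path, exc))

sys.exit(1 if errors else 0)
```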
TASK DETAIL
https://phabricator.wikimedia.org/T85335
jayvdb created this task.
jayvdb claimed this task.
jayvdb added subscribers: jayvdb, Legoktm, hashar.
jayvdb added projects: Continuous-Integration, pywikibot-core.
Restricted Application added subscribers: Aklapper, pywikipedia-bugs.
TASK DESCRIPTION
As jenkins will no longer automatically run rules in tox.ini, we need to hardwire pep8 and pep257 into jenkins, like the mediawiki lint tests.
pep8 runs without arguments, using the config in tox.ini for the exclude and ignore list.
pep257 doesn't have an exclude parameter (yet), but it looks like they are interested in that feature:
https://github.com/GreenSteam/pep257/pull/22#issuecomment-70471875
While waiting for feedback on adding exclude support to pep257, it would be nice if we didn't need an exclude list at all .. ;-)
date.py is being ignored by flake8, but is processed by pep257, so I've tackled that.
https://gerrit.wikimedia.org/r/#/c/185815/
pep257 is validating ez_setup.py, which I've tried to fix at the source of the problem:
https://bitbucket.org/pypa/setuptools/pull-request/117/pep8-and-pep257-comp…
pep257 doesn't ignore items marked with '# noqa', so it is emitting errors for interwiki.py (easy) and pywikibot/exceptions.py (messy).
Also, if the job is hardwired into jenkins and we can only maintain one match list in tox.ini, we can't reimplement the 'mandatory docstring' list without some serious cleanup first.
TASK DETAIL
https://phabricator.wikimedia.org/T87169
XZise created this task.
XZise added a subscriber: XZise.
XZise added a project: pywikibot-core.
TASK DESCRIPTION
Similar to T74847, the wikibase settings are hardcoded into the family files. There is [[https://en.wikipedia.org/w/api.php?action=query&meta=wikibase|`action=query&meta=wikibase`]] which would allow determining them dynamically, if the API plays along. In T74847 a very nasty problem surfaced which made it at least very hard to determine the Site object parameters (basically which family and code it uses) based on the result (see T85153).
So part of this task is also to determine whether this API call has the same problem, so other wikibase installations in the wild should be queried (or more exactly: installations which use a different wikibase repository). At least from the result on the English Wikipedia it should be easy to get a Site object, because it's possible to construct a URL similar to the interwiki map URL, for which we already have an implementation.
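A sketch of querying the wikibase settings dynamically; the nesting of the response (query.wikibase.repo.url) follows the English Wikipedia result linked above, and other installations may differ, which is exactly what this task needs to find out:
```
import pywikibot
from pywikibot.data import api

# Read the wikibase repository settings from the client wiki instead of
# hardcoding them in the family file.
site = pywikibot.Site('en', 'wikipedia')
data = api.Request(site=site, action='query', meta='wikibase').submit()
repo = data['query']['wikibase']['repo']['url']
print(repo)  # base URL / script path / article path of the repository
# From these values a full page URL can be built and passed to
# pywikibot.Site(url=...), the same approach already used for URLs from the
# interwiki map.
```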
TASK DETAIL
https://phabricator.wikimedia.org/T85331