Happy Monday,
There are strange people who make such links (kind of URL-encoded?):
[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban
.28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]
So the section title must have been copied from the URL.
Do we have a ready tool to fix these?
--
Bináris
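A minimal sketch of how such anchors could be decoded, assuming they use MediaWiki's legacy anchor encoding, where each UTF-8 byte is written as a dot followed by two uppercase hex digits. The name decode_anchor is hypothetical and not part of pywikibot. A title that genuinely contains text such as ".28" would be decoded wrongly, so the result should be compared with the page's real section titles before any link is changed.

```python
import re
from urllib.parse import unquote

# A dot followed by two uppercase hex digits, e.g. ".C3" or ".28".
_ANCHOR_ESCAPE = re.compile(r'\.([0-9A-F]{2})')


def decode_anchor(anchor):
    """Turn 'Szic.C3.ADli.C3.A1ban' back into 'Szicíliában'.

    Returns the anchor unchanged if the escaped bytes are not valid UTF-8,
    which suggests the dots were ordinary text and not escapes.
    """
    # Protect literal percent signs, then rewrite ".XX" as "%XX".
    percent_encoded = _ANCHOR_ESCAPE.sub(r'%\1', anchor.replace('%', '%25'))
    try:
        decoded = unquote(percent_encoded, encoding='utf-8', errors='strict')
    except UnicodeDecodeError:
        return anchor
    # Anchors copied from a URL use underscores where the title has spaces.
    return decoded.replace('_', ' ')


if __name__ == '__main__':
    broken = ('Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban '
              '.28Huskey hadm.C5.B1velet.29')
    print(decode_anchor(broken))
```

Applied to the link above, this yields the section title "Partraszállás Szicíliában (Huskey hadművelet)".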
Hello all
From one of my assignments as a bot operator I have some code which
does template parsing and general text parsing (e.g. Image/File tags).
It does not use regexes and is therefore able to correctly parse nested
templates and other such nasty things. I have written those as library
classes and written tests for them which cover almost all of the code.
I would now really like to contribute that code back to the community.
Would you be interested in adding this code to the pywikibot
framework? If yes, can I send the code to someone for code review or
how do you usually operate?
Greetings
Hannes
PS: wiki userpage is http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
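To illustrate why a non-regex approach copes with nesting, here is a minimal sketch of a stack-based scanner. It is not Hannes' code; the name find_templates is hypothetical, and the sketch deliberately ignores triple-brace parameters, nowiki sections and comments.

```python
def find_templates(text):
    """Return sorted (start, end, depth) spans for every {{...}} in text.

    depth is 0 for a top-level template and grows by one per nesting level.
    """
    spans = []
    open_positions = []  # stack of indexes where a '{{' was seen
    i = 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == '{{':
            open_positions.append(i)
            i += 2
        elif pair == '}}' and open_positions:
            start = open_positions.pop()
            # After the pop, the stack size is the nesting depth.
            spans.append((start, i + 2, len(open_positions)))
            i += 2
        else:
            i += 1
    return sorted(spans)


if __name__ == '__main__':
    sample = 'a {{outer|x={{inner|1}}|y}} b'
    for start, end, depth in find_templates(sample):
        print(depth, sample[start:end])
```

A single regular expression cannot match an arbitrary nesting depth, whereas the stack pairs each closing "}}" with the most recent unmatched "{{".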
Hi,
I just pulled the latest pwb from upstream and this commit is giving
me some headaches. The associated bug says "Add script integration
tests", and the description says "Miscellaneous pwb improvements".
However, when running with python 2.7.8, all I get is a bunch of
warnings (see below). Have you guys tested this with python 2? Have
you considered that some users might not use folders as modules, but
rather have related scripts in the same folder?
Thanks,
Strainu
Parent module monumente not found: No module named monumente
./monumente/parse_monument_article.py:10: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import sys
./monumente/parse_monument_article.py:11: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import time, datetime
./monumente/parse_monument_article.py:12: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import warnings
./monumente/parse_monument_article.py:13: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import json
./monumente/parse_monument_article.py:14: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import string
./monumente/parse_monument_article.py:15: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import cProfile
./monumente/parse_monument_article.py:16: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import re
./monumente/parse_monument_article.py:18: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import pywikibot
./monumente/parse_monument_article.py:19: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
from pywikibot import pagegenerators
./monumente/parse_monument_article.py:20: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
from pywikibot import config as user
./monumente/parse_monument_article.py:21: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
from pywikibot import catlib
./monumente/parse_monument_article.py:24: RuntimeWarning: Parent
module 'monumente' not found while handling absolute import
import strainu_functions as strainu
Hi,
I'm trying to run a robot that calls page.coordinates on every article
it processes and about 10% of the requests end in a timeout. I don't
think this is a connection issue, since all the other API calls are
successful.
Is anyone else seeing such issues with the coordinates?
Thanks,
Strainu
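While the cause is unknown, a generic retry wrapper can at least keep a long run going and show which pages fail. This is only a sketch: call_with_retries is a hypothetical helper, not a pywikibot function, and which exception the framework raises on a timeout is an assumption that has to be checked locally.

```python
import time


def call_with_retries(call, attempts=3, delay=1.0, retry_on=(Exception,)):
    """Run call() and retry with exponential backoff on the given errors."""
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the error
            # Wait delay, 2*delay, 4*delay, ... between attempts.
            time.sleep(delay * 2 ** attempt)
```

It could be used as call_with_retries(lambda: page.coordinates(), retry_on=(SomeTimeoutError,)), where SomeTimeoutError stands for whatever exception actually shows up in the traceback.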
Hi, over the last weeks (and especially yesterday/today) I've been
working on a little script which goes through all the branches (except
the master branch) and checks whether the last commit of each branch
has already been merged on gerrit.
Previously it worked completely offline (and it is still available by
using --offline) and was checking for each branch whether the
change-id of the latest commit could be found as the change-id in the
master branch.
Yesterday I discovered that it is possible to query gerrit via SSH and
get the data back as JSON, so the script now has an online mode (the
default) which checks for each change-id whether it is open. This also
covers abandoned branches and shows whether a branch has even been
submitted. There is an advanced online mode (via
--load-additional-data) which, after the first query, checks for the
open changes whether the branch is up to date. I was able to fetch all
the information in a single request, so it makes one request (or two
with the additional data), which takes about 100 ms for 47 change-ids
(according to the response; it feels longer, probably because of the
overhead of the SSH connection).
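The online lookup can be sketched roughly as follows. This is not the code from the gist: the function names are hypothetical, and the sketch assumes the usual output of "gerrit query --format=JSON", namely one JSON object per line for each change followed by a final line of type "stats".

```python
import json


def build_query_command(change_ids, host, port=29418):
    """Build an ssh command that asks gerrit about all change-ids at once."""
    query = ' OR '.join('change:' + change_id for change_id in change_ids)
    return ['ssh', '-p', str(port), host,
            'gerrit', 'query', '--format=JSON', query]


def parse_query_output(output):
    """Return ({change_id: status}, stats) from the command's stdout."""
    statuses = {}
    stats = None
    for line in output.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get('type') == 'stats':
            stats = record  # summary line, e.g. row count and run time
        else:
            statuses[record['id']] = record['status']
    return statuses, stats


def branch_state(change_id, statuses):
    """A change-id unknown to gerrit means the branch was never submitted."""
    return statuses.get(change_id, 'NOT SUBMITTED')
```

The command list can be passed to subprocess.check_output, and its decoded output to parse_query_output. Joining all change-ids with OR is what keeps the lookup down to a single request.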
By default it doesn't delete any branches, but it can also delete
those branches automatically: either only those beginning with review/
or, after asking, all of them. I'm planning an automatic update (on
request) for branches that can be updated.
You can find the script in the following gist
https://gist.github.com/xZise/975251c90e531347fee7 . It should work on
any git repository, although the server/port are currently hardcoded
(I might have an idea to fix that soon).
I'm not sure how much energy I should put into this, in case we stop
using gerrit when the WMF does. But I originally didn't intend
for it to grow so “big”.
Have fun with it,
Fabian
Is there a style guide, or has this already been discussed? I think
it would be good to have at least some set of rules. Some are
already enforced by flake8, so jenkins will prevent some style rules
from being broken, but there are open points, for example:
* PEP-8 compatible names. Should all new attributes/methods/variables
be PEP-8 compatible? There is a task on phabricator to be compliant
with the 2.0 release (so with 3.0 the other code can be removed)
https://phabricator.wikimedia.org/T85328
* Another is whether we want to use %-notation or str.format. I
personally prefer the str.format because it doesn't require any magic
but many submissions use %-notation and I previously didn't care about
that. But should we only use one mode and which of those then? And
when we allow both and others ask what they should use, should I
recommend what we like or do we want to have one recommendation?
* Last week (afaik) I posted the line lengths on this list (I should
probably update them now that Amir's patch has been merged). Should we
enforce a specific limit (like 100 chars)? I also saw that gerrit
can show a line to indicate the 80 character width. Is it
possible to enable that on our gerrit (to make it easier for us to
see)?
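For the %-notation versus str.format point, a small side-by-side sketch may help the discussion. The function names are made up for illustration, and the sketch does not imply that either style has been chosen.

```python
def describe_percent(title, count):
    # %-notation: the right-hand side is a tuple of values.
    return 'Page %s has %d links' % (title, count)


def describe_format(title, count):
    # str.format with explicit indexes; bare '{}' fields need Python 2.7+.
    return 'Page {0} has {1} links'.format(title, count)


def show_percent(value):
    # Wrapping in a 1-tuple is needed so that a tuple value is shown as a
    # whole; 'Value: %s' % (1, 2) would raise a TypeError.
    return 'Value: %s' % (value,)


def show_format(value):
    # str.format takes the value as an ordinary argument, tuple or not.
    return 'Value: {0}'.format(value)


if __name__ == '__main__':
    print(describe_percent('Main Page', 3))
    print(describe_format('Main Page', 3))
    print(show_percent((1, 2)))
    print(show_format((1, 2)))
```

Both styles give the same result for simple cases. The tuple special case is the main place where %-notation needs extra care.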
If you have additional points which could be defined in a style guide
please let me know.
Fabian
Hello,
Yesterday I restricted the tox Jenkins jobs so that they are only run
for whitelisted people. That is rather inconvenient :-(
To whitelist a person, their Gerrit email address has to be added
to integration/config.git zuul/layout.yaml, both in a list and in a huge regex.
I have a patch pending that I have yet to test, which would cause tests
to run when a whitelisted person votes CR+1 and the patch hasn't been
tested yet. That would help a lot. The patch is:
https://gerrit.wikimedia.org/r/#/c/184886/
I have no idea when I will get to polish it up and deploy it though :(
--
Antoine "hashar" Musso
pywikibot/core has a 'python3' branch which was last updated in
October 2013.
Since the 'master' branch supports py3k fairly well now, it looks like
the old branch should be deleted.
Thoughts?
PyPy <http://pypy.org> is an alternative implementation of Python
primarily focused on performance.
During a very rough benchmark I made with CosmeticChangesToolkit and a
few pages, it didn't provide significant speed improvements, but it did
work.
Have you ever tested PyPy or used it in production?
Given the importance that MediaWiki people have recently given to HHVM,
what do you think we could/should do on the performance side of the
framework?