Happy Monday,
There are strange people who make links like this (kind of URL-encoded?):
[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban
.28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]
So the section title must have been copied from the URL.
Do we have a ready tool to fix these?
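A minimal sketch of how such links could be repaired, assuming the anchor uses MediaWiki's legacy encoding (each UTF-8 byte written as a dot plus two uppercase hex digits). The names decode_anchor and fix_section_links are made up for illustration and are not existing pywikibot functions. A section title that legitimately contains a dot followed by two hex digits (e.g. "1.28") would be decoded wrongly, so the result should be reviewed by hand:

```python
import re
from urllib.parse import unquote

# A dot followed by two uppercase hex digits, e.g. ".C3" (assumed legacy anchor encoding).
_ANCHOR_BYTE = re.compile(r'\.([0-9A-F]{2})')

# [[Title#anchor]] or [[Title#anchor|label]]; hypothetical helper pattern.
_SECTION_LINK = re.compile(r'\[\[([^\[\]|#]*)#([^\[\]|]*)((?:\|[^\[\]]*)?)\]\]')


def decode_anchor(anchor):
    """Turn '.C3.A1'-style byte escapes into readable text."""
    percent_encoded = _ANCHOR_BYTE.sub(r'%\1', anchor)
    return unquote(percent_encoded).replace('_', ' ')


def fix_section_links(text):
    """Decode the anchor part of every section link found in text."""
    def repl(match):
        title, anchor, label = match.groups()
        return '[[%s#%s%s]]' % (title, decode_anchor(anchor), label)
    return _SECTION_LINK.sub(repl, text)


print(decode_anchor('Szic.C3.ADli.C3.A1ban'))  # Szicíliában
```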
--
Bináris
Hello all
From one of my assignments as a bot operator I have some code which
does template parsing and general text parsing (e.g. Image/File tags).
It does not use regex and is thus able to correctly parse nested
templates and other such nasty things. I have written these as library
classes, with tests that cover almost all of the code.
I would now really like to contribute that code back to the community.
Would you be interested in adding this code to the pywikibot
framework? If yes, can I send the code to someone for code review or
how do you usually operate?
Greetings
Hannes
PS: wiki userpage is http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
Hello,
I've created an Ohloh project to get stats about all those bots and
scripts, often based on PWB, that can be found around. For now I added
50. https://www.ohloh.net/p/wikibots
Please add yours, or those you know of! It only takes a couple of clicks.
https://www.ohloh.net/p/wikibots/enlistments/new
I know that publishing and listing your code is boring, but the
incentive here is that you get pretty graphs. :)
Nemo
Dear Pywikipedia team
I have pushed a few of my coding projects using pywikipedia (the compat
version) to GitHub and I thought that some of you might be interested in the
code. I recently had some time to clean up the code and bring it into a
(hopefully) usable format, and I would be willing to make further adjustments
if you think the code would be more useful to you if I changed a few things.
Ultimately my hope is that the code will find its home in the pywikipedia
repository. Some of the code that I wrote might also duplicate what is already
present in core; if so, I apologize and you can happily ignore it.
== Template parser ==
https://github.com/hroest/pywikibot-compat/tree/feature/template_parser
For one bot project on the German Wikipedia I had to parse rather complex
templates and replace specific fields. The templates would contain nested
templates, math formulas and references inside. I thus wrote a template parser
which would parse these templates and return them as key-value pairs which
would make it easy to query specific keys and replace their values. The code
worked well on several thousand templates of the German chemistry project and
should be rather straightforward to use. This is library code, so there is no
bot associated with it; see templateparser.py and tests/test_templateparser.py.
In order to correctly handle nesting and properly differentiate equal signs
belonging to key-value pairs from those in mathematical formulas etc., I also
had to write a partial MediaWiki syntax parser which recognizes such
syntax in wikitext. This code is in textrange_parser.py and allows extracting
specific parts of a text (e.g. wikitables, templates, wikilinks, weblinks);
tests are in tests/test_textrange_parser.py.
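To illustrate the general idea of parsing by nesting depth rather than by regex, here is a minimal sketch. It is not the code from templateparser.py: the function names split_top_level and parse_template are hypothetical, and the sketch only tracks {{...}} and [[...]] nesting. It does not understand <math>, <ref> or <nowiki> tags, so an unnamed parameter containing an equal sign inside a formula would still be split incorrectly.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators nested in {{...}} or [[...]]."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ('{{', '[['):
            depth += 1
            i += 2
        elif pair in ('}}', ']]'):
            depth = max(depth - 1, 0)
            i += 2
        elif text[i] == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
            i += 1
        else:
            i += 1
    parts.append(text[start:])
    return parts


def parse_template(wikitext):
    """Return (name, params) for a single '{{...}}' template call."""
    inner = wikitext.strip()[2:-2]           # drop the outer braces
    fields = split_top_level(inner, '|')
    name = fields[0].strip()
    params = {}
    positional = 0
    for field in fields[1:]:
        key_value = split_top_level(field, '=')
        if len(key_value) > 1:
            # Only the first top-level '=' separates key from value.
            key = key_value[0].strip()
            value = '='.join(key_value[1:]).strip()
        else:
            positional += 1                   # unnamed parameters are numbered
            key, value = str(positional), field.strip()
        params[key] = value
    return name, params
```

With this sketch, parse_template('{{Infobox|formula={{chem|H|2|O}}}}') keeps the nested {{chem|H|2|O}} call intact as the value of the formula key, which is the behaviour a plain regex split on "|" cannot provide.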
== Spellchecking ==
https://github.com/hroest/pywikibot-compat/tree/feature/spellcheck
I added two new spellchecking bots, one based on hunspell, which is the same
spellchecker that LibreOffice uses (spellcheck_hunspell.py), and another
one based on a negative list (spellcheck_blacklist.py). They run from the
command line; both parse the given wikitext, skip text ranges that usually
contain more than plain human-readable text (templates, tables etc.) and check
each word against a spellchecking engine (again, either a simple blacklist or a
full-blown spellchecker that has stemming and morphological analysis like
hunspell). These spellcheckers may turn out to be useful since they understand
part of the wiki markup and know which parts of a text to spellcheck and which
parts not.
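The blacklist approach can be sketched roughly as follows. This is not the code from spellcheck_blacklist.py; mask_templates and find_misspellings are hypothetical names, and the sketch only skips {{...}} templates (tables, references and other markup are left out for brevity). The blacklist is assumed to be a dict mapping a lowercase wrong spelling to its correction.

```python
import re

# Runs of letters only (no digits or underscores); works for non-ASCII letters too.
_WORD = re.compile(r'[^\W\d_]+')


def mask_templates(text):
    """Replace everything inside {{...}} with spaces, keeping offsets intact."""
    out, depth, i = [], 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == '{{':
            depth += 1
            out.append('  ')
            i += 2
        elif pair == '}}' and depth > 0:
            depth -= 1
            out.append('  ')
            i += 2
        else:
            out.append(text[i] if depth == 0 else ' ')
            i += 1
    return ''.join(out)


def find_misspellings(text, blacklist):
    """Return (offset, wrong_word, suggestion) for each blacklisted word."""
    hits = []
    for match in _WORD.finditer(mask_templates(text)):
        word = match.group(0)
        suggestion = blacklist.get(word.lower())
        if suggestion is not None:
            hits.append((match.start(), word, suggestion))
    return hits
```

Because the masked text has the same length as the original, the reported offsets can be used directly to replace the word in the unmasked wikitext once the user has confirmed the correction.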
The wrong words can be processed interactively: each word can be confirmed
individually and the correction then sent to Wikipedia. I have a bot with
which I do this semi-automatically, and I have so far corrected 3000+ spelling
mistakes on the German Wikipedia:
https://de.wikipedia.org/wiki/Spezial:Beitr%C3%A4ge/HRoestTypo
For large-scale processing, one can process a complete Wikipedia XML dump, and
for small-scale processing one can use the Wikipedia web search functionality
to search for articles with a specific spelling error and then only process
these pages.
== Review edits ==
https://github.com/hroest/pywikibot-compat/tree/feature/review_pages
In the German Wikipedia, considerable work goes into reviewing individual
edits and marking them as reviewed. The above feature/review_pages branch
contains a script called review_pages which allows reviewing
revisions semi-automatically. For a given page, it fetches the revision history
up to the last reviewed change and displays the changes between the current and
the last reviewed version of the article on the command line. The user can then
interactively decide to accept the review, undo the change or go to the next
unreviewed change.
This bot uses the MediaWiki API, and thus it may not actually be
suitable for the compat version of pywikipedia. Reviewing, undoing and
retrieving full version histories are done through the API and can be
performed fully asynchronously. This allows relatively fast interactive response
while the bot fetches the revision histories in the background and performs the
review/undo actions requested by the user.
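The background fetching could look roughly like the sketch below. It is not taken from review_pages: revisions_query and prefetch_histories are hypothetical names, and the actual HTTP call is passed in as the fetch argument (an assumption made so that the sketch stays independent of any particular HTTP library). The request parameters are those of the standard action=query, prop=revisions API module.

```python
from concurrent.futures import ThreadPoolExecutor


def revisions_query(title, limit=50):
    """Build the parameters of an action=query request for recent revisions."""
    return {
        'action': 'query',
        'prop': 'revisions',
        'titles': title,
        'rvprop': 'ids|timestamp|user|comment',
        'rvlimit': str(limit),
        'format': 'json',
    }


def prefetch_histories(titles, fetch, max_workers=4):
    """Start fetching all histories in background threads.

    fetch is a caller-supplied function that takes the parameter dict and
    returns the API response. Returns a dict mapping each title to a Future.
    """
    pool = ThreadPoolExecutor(max_workers=max_workers)
    futures = {title: pool.submit(fetch, revisions_query(title))
               for title in titles}
    pool.shutdown(wait=False)   # submitted work keeps running in the background
    return futures
```

The interactive loop can then call futures[title].result() only when the user reaches that page; by then the history has usually already arrived, so the user does not have to wait for the network.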
== Summary ==
I provide this code in the hope that it is useful for people. If somebody
thinks that the described functionality could be provided by the pywikibot
project, I would be willing to make the adjustments necessary for the code
to be merged.
Best regards
Hannes
Hi Daniel,
Daniel Kinzler wrote on 15-4-2014 19:16:
> Today, Thiemo merged my patch introducing the ApiErrorReporter class:
> <https://gerrit.wikimedia.org/r/#/c/124323/>. This should help us with
> providing error reports from the API in a consistent manner; this way, we will
> hopefully soon be able to provide more localized error messages too.
>
> However, this means that some of the error codes used by the API may have
> changed, and more will change when more API modules start using this module.
> Also, it means that localized messages are included in a slightly different way.
> If you rely on error codes or localized error messages, please keep an eye out
> for breakage in that regard.
Did you document this somewhere? I assume we have to modify Pywikibot a
bit, so it would be nice to have a good overview.
Maarten
Hello all,
I just finished setting up a system to update the code of core and compat (for
now it only updates the languages by size, but extending this system is very
easy and I'll add other things very soon).
As a result you can see this commit
<https://gerrit.wikimedia.org/r/125009>, which was made completely
automatically; I only reviewed it. If
everything goes well and there is no objection, I'll give the bot +2 access
and set up a system to do the review automatically as well (like l10n-bot).
I set up a cron job to do this on days 1 and 3 of every week for core and
compat.
I will run this bot with xqt. Thank you, xqt!
What do you think about this?
Best
--
Amir
I can reproduce the error shown in the traceback if I use a wrong path to start a script, but I cannot reproduce the syntax error. This suggests that a source file might be corrupt, so that its compiled object file cannot be created, which may in turn cause the import error. Because version.py and pywikibot.__init__ give a correct traceback message, my bet is on wikipedia.py. It seems to have illegal characters on the first line or something like this.
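One way to test this hypothesis is to try compiling every script in the framework directory and see which file fails. This is only a diagnostic sketch: find_uncompilable is a hypothetical helper, not part of the framework, and it has to be run with the same Python version the framework is written for, otherwise perfectly healthy files will be reported as broken.

```python
import os


def find_uncompilable(directory):
    """Return (filename, error message) for each .py file that fails to compile."""
    broken = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith('.py'):
            continue
        path = os.path.join(directory, name)
        with open(path, 'rb') as handle:
            source = handle.read()
        try:
            # Compiling only checks the syntax; the script is not executed.
            compile(source, path, 'exec')
        except (SyntaxError, ValueError) as error:
            broken.append((name, str(error)))
    return broken


if __name__ == '__main__':
    for name, message in find_uncompilable('.'):
        print('%s: %s' % (name, message))
```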
Bináris: where did you download the framework from? I'll try to investigate the matter.
You may use the path setting for the compat or core release, but it may also be omitted.
You may also use the git or svn repository. I use both: git for development, svn for the bot.
The nightly dump is also available; it works without these version control systems, but it is hard to use with local changes to the framework's scripts because you have to merge new code yourself.
What does "back to the roots" mean?
Greetings
xqt
----- Original Message -----
From: John <phoenixoverride(a)gmail.com>
To: Pywikipedia discussion list <pywikipedia-l(a)lists.wikimedia.org>
Date: 05.04.2014 17:50
Subject: Re: [Pywikipedia-l] versionHistories
> It's not a path issue, compat doesn't use it. Odds are someone fucked up the
> code, especially with the changes in the last two years or so where we
> moved away from pywiki's origins. Requiring either
> git or svn to be installed just begs for something to break. Instead of
> depending on the repo, let's get back to our roots.
>
> On Sat, Apr 5, 2014 at 11:35 AM, Bináris <wikiposta(a)gmail.com> wrote:
>
> >
> >
> >
> > 2014-04-05 15:16 GMT+02:00 <info(a)gno.de>:
> >
> >> The monkey-patch for importing "pywikibot" directly instead of "wikipedia"
> >> was done in November 2012. It should run as expected.
> >>
> >> Do you have any setting for PYWIKIBOT_DIR?
> >>
> > I don't know about it. Should I have? I didn't have one for the previous
> > version, which worked.
> >
> >
> >> The error sounds like you have a wrong path to the framework or an absolute
> >> path on your command line.
> >>
> >> Could you tell me your framework path and the whole command line invoking
> >> the script?
> >>
> >
> > c:\Pywikipedia, since 2006 December. I open this directory in Total
> > Commander, then start a command line directly from this dir, then I type
> > the script name.
> >
> > _______________________________________________
> > Pywikipedia-l mailing list
> > Pywikipedia-l(a)lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
> >
> >
>
>
Did you change the framework's path with your update?
xqt
----- Original Message -----
From: Bináris
Sent: 05.04.2014 08:52
To: Pywikipedia discussion list
Subject: Re: [Pywikipedia-l] versionHistories
What kind of configuration does this mess want from me?
As far as I know trunk never needed any configuration except user files.
Tell me only one reason not to be angry.
2014-04-05 8:39 GMT+02:00 Bináris <wikiposta(a)gmail.com>:
Next stage. It is getting more and more horrific.
I downloaded the latest "compat" from the nightlies and unpacked it.
It is one third the size of the one from 11 January.
After unpacking, version.py says:
c:\Pywikipedia>version.py
syntax error: line 1, column 0
Serious import error; pywikibot not available - was it configured?
Traceback (most recent call last):
  File "C:\Pywikipedia\version.py", line 20, in <module>
    pywikibot.output('Pywikibot: %s' % getversion())
AttributeError: 'module' object has no attribute 'output'
version.py may have misunderstood something: I didn't want to configure anything, I just wanted to update my WORKING copy.
Then I tried to restore the previous version, but the result is the same: my Pywikibot does not work any more. I had a working version and tried to update it in the normal way before somebody cleverly asks me to do so, and now I only have broken ruins.
I am totally fed up with this whole mess!
Give us back Pywikibot!
--
Bináris
I cannot follow. In what way does it fix the import error?
xqt
----- Original Message -----
From: Amir Ladsgroup
Sent: 05.04.2014 10:39
To: Pywikipedia discussion list
Subject: Re: [Pywikipedia-l] versionHistories
I made a patch to fix it
https://gerrit.wikimedia.org/r/121891
Review it please :)
--
Amir