Happy Monday,
There are strange people who make links like this (kind of URL-encoded?):
[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban
.28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]
So the section title must have been copied from the URL.
Do we have a ready tool to fix these?
--
Bináris
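
A minimal sketch of one way to repair such links, not an existing Pywikibot tool. It assumes the anchor uses the legacy MediaWiki section-ID encoding, where each UTF-8 byte is written as `.XX` instead of `%XX`. The names `decode_anchor` and `fix_links` are made up for illustration, and the heuristic will misfire on headings that genuinely contain a dot followed by two hex digits.

```python
import re
from urllib.parse import unquote

# ".C3" style escapes: a dot followed by two uppercase hex digits.
DOT_ESCAPE = re.compile(r'\.([0-9A-F]{2})')
# "[[Title#anchor" up to the pipe or the closing brackets.
SECTION_LINK = re.compile(r'\[\[([^\[\]|#]*)#([^\[\]|]*)')


def decode_anchor(anchor):
    """Turn '.C3.A1' style escapes back into the characters they encode."""
    percent = DOT_ESCAPE.sub(r'%\1', anchor)
    try:
        # errors='strict' makes invalid UTF-8 raise instead of being replaced.
        return unquote(percent, errors='strict')
    except UnicodeDecodeError:
        # Not valid UTF-8 after all, so leave the anchor untouched.
        return anchor


def fix_links(wikitext):
    """Decode the section part of every [[Title#section...]] link."""
    def repl(match):
        return '[[%s#%s' % (match.group(1), decode_anchor(match.group(2)))
    return SECTION_LINK.sub(repl, wikitext)
```

Running `fix_links` over the page text only changes the part between `#` and `|`, so the link label stays as it was. A real bot run would still want a human to check the diff before saving.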
Hello all
From one of my assignments as a bot operator I have some code which
does template parsing and general text parsing (e.g. Image/File tags).
It is not using regex and thus able to correctly parse nested
templates and other such nasty things. I have written those as library
classes and written tests for them which cover almost all of the code.
I would now really like to contribute that code back to the community.
Would you be interested in adding this code to the pywikibot
framework? If yes, can I send the code to someone for code review or
how do you usually operate?
Greetings
Hannes
PS: wiki userpage is http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
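
To illustrate why a non-regex approach copes with nesting, here is a minimal sketch of a brace-tracking scanner. It is not Hannes's code (which is not shown in this thread); the function name `find_templates` is made up, and the sketch ignores `{{{parameter}}}` syntax, `<nowiki>` sections and comments.

```python
def find_templates(text):
    """Return (start, end) spans of every {{...}} template, inner ones first."""
    spans = []
    open_positions = []  # stack of indexes where an unclosed '{{' begins
    i = 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == '{{':
            open_positions.append(i)
            i += 2
        elif pair == '}}' and open_positions:
            # Close the most recently opened template.
            spans.append((open_positions.pop(), i + 2))
            i += 2
        else:
            i += 1
    return spans
```

The stack is what a single regular expression lacks: each closing `}}` is matched with the most recent unclosed `{{`, so an inner template never terminates the outer one early.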
We now have a lot of branches in gerrit.
2.0 correction HEAD master nexqt PropertyPage textdata
We previously only had branches for 2.0, HEAD and master in gerrit.
I think this happens when we push directly to the gerrit repo, i.e.
not using git review.
They are replicated to github, and as a result trigger builds on
Travis and Appveyor-CI.
https://github.com/wikimedia/pywikibot-core/branches
I guess this could be a good thing, if a developer doesn't want to have
a github.com account, but wants to do test builds on Travis/Appveyor.
The most recent one is `nexqt`, which is
https://gerrit.wikimedia.org/r/#/c/283940/ (newly uploaded today by
xqt), so it can be deleted I guess.
The other three are not in Gerrit review system, all by Andre Engels
<andreengels(a)gmail.com>
textdata is Ic753f1b2727d5142705041a296241a04274e65da / 2146a2d475a
PropertyPage is Id988417bfe67119aed2773a6becd6c4bd229c0c0 / 595742edb24
correction is I229e05a2cd47059a1682a5b6c6a353af04968139 / e56d112fd
These have useful changes in them, so we shouldn't delete these branches.
One strange aspect is that `correction` is attracting commits from
l10n-bot(a)translatewiki.net, while the other ones are not.
--
John Vandenberg
Andrea Zanni, 15/04/2016 09:03:
> I remember Alex Brollo was working with the djvu_xml layer
The XML output from ABBYY is still being published, AFAIK.
Nemo
Hi.
Is there any preference for a python pdf library, in case one would like to
add pdf file processing to pywikibot?
Or is it good enough, if possible, to rely on pdfinfo (which I guess is
linux-only)?
Suggestions/comments appreciated.
Mpaa
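
For comparison, a minimal sketch of the pdfinfo route. It assumes a `pdfinfo` executable (as shipped with poppler or xpdf) is on the PATH and that it prints one `Key: value` pair per line. The function names `parse_pdfinfo` and `pdfinfo` are made up for illustration and are not part of pywikibot.

```python
import shutil
import subprocess


def parse_pdfinfo(output):
    """Parse 'Key:   value' lines into a dict, splitting at the first colon."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(':')
        if sep:
            info[key.strip()] = value.strip()
    return info


def pdfinfo(path):
    """Run the external pdfinfo tool on `path` and return its fields."""
    if shutil.which('pdfinfo') is None:
        raise RuntimeError('pdfinfo executable not found on PATH')
    result = subprocess.run(['pdfinfo', path], capture_output=True,
                            text=True, check=True)
    return parse_pdfinfo(result.stdout)
```

The trade-off is the external dependency: the wrapper is short, but every bot operator needs the tool installed, whereas a pure-Python library would be installable alongside pywikibot itself.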
Hello,
I use Pywikibot for a Wikia wiki I run, mostly for deleting multiple
articles. I'm still really new at this since I only use my bot for
mass-categorizing and deleting.
Today while I was deleting pages my bot ran into an error. I unfortunately
didn't save the message but whenever I execute login.py, I get the following
message:
C:\Pywikibot>login.py
Traceback (most recent call last):
  File "C:\Pywikibot\login.py", line 59, in <module>
    import query
  File "C:\Pywikibot\query.py", line 30, in <module>
    import wikipedia as pywikibot
  File "C:\Pywikibot\wikipedia.py", line 162, in <module>
    from BeautifulSoup import BeautifulSoup, BeautifulStoneSoup, SoupStrainer
ImportError: No module named BeautifulSoup
I've switched from Python 2.7 to 3.4 and back, and I've also updated my
Pywikibot files. I am officially stumped. Again, I'd like to remind you all,
I'm not well-versed in this so I'll try my best to follow directions.
Thanks for your time.
-KP
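
Since the traceback ends in `ImportError: No module named BeautifulSoup`, one quick check is whether the interpreter that runs login.py can import that module at all. A minimal sketch, meant to run on both Python 2.7 and 3; the helper name `module_available` is made up for illustration.

```python
import sys


def module_available(name):
    """Return True if `name` can be imported by this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == '__main__':
    # Show which Python is actually running, then test the import.
    print('Python %s at %s' % (sys.version.split()[0], sys.executable))
    print('BeautifulSoup importable: %s' % module_available('BeautifulSoup'))
```

Saving this in C:\Pywikibot and running it the same way as login.py shows which Python version is being picked up, which matters after switching between 2.7 and 3.4.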
Hey all,
I'm working on extending the pywikibot.Page object to add methods to get an
article's assessment rating, predicted rating (through ORES[1]), and number
of page views (through the page view API). SuggestBot uses these three
pieces of information when posting suggestions on the English Wikipedia.
I have preliminary code that works[2], and am now trying to extend it with
a couple of generators and bulk loading to increase efficiency. Assessment
ratings are grabbed from talk pages, so that part appears straightforward.
ORES uses revision IDs, however, so I'm looking for an efficient way to
get "lastrevid" for a list of pages (pywikibot.Page objects).
I can use a PreloadingGenerator for it, as that sets `_revid`, but since I
don't need the page content that seems excessive. Modifying the
`preloadpages` method from site.py is of course possible, but perhaps
there's another alternative here?
References:
1: https://meta.wikimedia.org/wiki/Objective_Revision_Evaluation_Service
2:
https://github.com/nettrom/suggestbot/blob/newpopqual/suggestbot/utilities/…
Cheers,
Morten
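
One alternative, as a minimal sketch: the MediaWiki API returns `lastrevid` for `action=query&prop=info`, which does not fetch page content. The helpers below only build the request parameters and read the response; how the request is sent (through pywikibot or otherwise) is left open. The function names are made up for illustration, and the batch size of 50 assumes the usual titles-per-request limit for non-bot accounts.

```python
def chunks(items, size=50):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def info_params(titles):
    """Build query parameters asking for page info (no content) of `titles`."""
    return {
        'action': 'query',
        'prop': 'info',
        'titles': '|'.join(titles),
        'format': 'json',
    }


def extract_lastrevids(response):
    """Map page title to lastrevid; missing pages have no lastrevid and are skipped."""
    pages = response.get('query', {}).get('pages', {})
    return {page['title']: page['lastrevid']
            for page in pages.values() if 'lastrevid' in page}
```

Note that the API may normalize titles, so the keys in the returned dict can differ from the strings that were sent; matching them back to `pywikibot.Page` objects would need to account for that.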
Hi pywikibot experts,
How can I create new properties via pywikibot? (I'm trying to do it via
bot, because I'm doing some experiments on a dedicated wikibase
installation with - possibly - hundreds of properties to be created... and
Pywikibot would certainly be my favorite tool!)
In case I have, instead, to directly wrap the action "wbeditentity" from
mediawiki API (
https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity ), are
there some Python examples?
And, in case I have to use the php script "importProperties.php" (
https://github.com/JeroenDeDauw/Wikibase/blob/master/repo/maintenance/impor…
),
how can I manage properties more complex than the ones contained in the
example (
https://github.com/JeroenDeDauw/Wikibase/blob/master/repo/maintenance/en-el…
)?
Using pywikibot, I'm able to MODIFY existing properties with instructions
like the following ones (which let me generate the object-content in one
shot via json ... as I need):
In [1]: import pywikibot ; site = pywikibot.Site() ; repo = site.data_repository()
In [2]: property_page = pywikibot.PropertyPage(repo, u"P2")
In [3]: myjson = {u'descriptions': {u'en': {u'language': u'en', u'value': u'invented description'}},
   ...:           u'labels': {u'en': {u'language': u'en', u'value': u'test property'}}}
In [4]: property_page.editEntity(myjson)
...but I cannot CREATE new properties (instantiating a PropertyPage
object), because pywikibot asks for the identifier of an existing instance:
In [11]: p_page = pywikibot.PropertyPage(repo)
---------------------------------------------------------------------------
InvalidTitle                              Traceback (most recent call last)
<ipython-input-11-e72b812b8cd3> in <module>()
----> 1 p_page = pywikibot.PropertyPage(repo)

/home/user/src/pywikibot_repo/pywikibot/page.pyc in __init__(self, source, title)
   4027         if not title or not self.id.startswith('P'):
   4028             raise pywikibot.InvalidTitle(
-> 4029                 u"'%s' is not an property page title" % title)
   4030         Property.__init__(self, source, self.id)
   4031

InvalidTitle: '' is not an property page title
In [12]:
In fact, as far as I understand from the source code of the "WikibasePage"
class, the Item type has a "Special case for empty item" (
https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/page.py#L…
):
    # Special case for empty item.
    if title is None or title == '-1':
        super(ItemPage, self).__init__(site, u'-1', ns=ns)
        assert self.id == '-1'
        return
...for the Property type, instead, an empty object is NOT allowed (
https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/page.py#L…
)
    if not title or not self.id.startswith('P'):
        raise pywikibot.InvalidTitle(
            u"'%s' is not an property page title" % title)
    Property.__init__(self, source, self.id)
Thanks a lot for your attention!
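
For the direct "wbeditentity" route, a minimal sketch of the request parameters. It assumes the API accepts `new=property` together with a JSON `data` blob that includes a `datatype`, and that the request is sent as a POST with a valid CSRF token. The function name `new_property_params` is made up for illustration, and how the request is submitted (through pywikibot or any HTTP client) is left open.

```python
import json


def new_property_params(label, description, datatype, token, lang='en'):
    """Build wbeditentity parameters for creating one new property."""
    data = {
        'labels': {lang: {'language': lang, 'value': label}},
        'descriptions': {lang: {'language': lang, 'value': description}},
        # A new property needs a datatype, e.g. 'string' or 'wikibase-item'.
        'datatype': datatype,
    }
    return {
        'action': 'wbeditentity',
        'new': 'property',
        'data': json.dumps(data),
        'token': token,
        'format': 'json',
    }
```

The `data` structure has the same shape as the `myjson` dict used with `editEntity` above, plus the `datatype` key, so the one-shot JSON approach should carry over to creation.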