Bugs item #2627537, was opened at 2009-02-22 12:01
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2627537&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Missing #REDIRECT localisation for pdc- and de-wiki
Initial Comment:
There is no localisation for the pdc-wiki in the current family.py yet, but
there should be a new entry as follows:
redirect = {
    'de': [u'WEITERLEITUNG'],
    'pdc': [u'WEITERLEITUNG'],
}
but 'REDIRECT' should also be allowed;
see [[w:de:Wikipedia:Weiterleitung]]
and http://pdc.wikipedia.org/w/index.php?title=Kallitsch&action=edit
<w:de:User:Xqt>
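For illustration, a minimal sketch of how a bot could use such a per-language mapping to recognize redirect pages (the `redirect` table and `is_redirect` helper here are hypothetical, not the actual family.py contents):

```python
import re

# Hypothetical excerpt of the per-language keyword mapping discussed above;
# 'REDIRECT' is included alongside the localized form, as the report requests.
redirect = {
    'de':  [u'WEITERLEITUNG', u'REDIRECT'],
    'pdc': [u'WEITERLEITUNG', u'REDIRECT'],
}

def is_redirect(wikitext, lang):
    """Return True if the wikitext starts with a localized #REDIRECT keyword."""
    keywords = redirect.get(lang, [u'REDIRECT'])
    pattern = r'^#(?:%s)\s*\[\[' % u'|'.join(re.escape(k) for k in keywords)
    return re.match(pattern, wikitext, re.IGNORECASE) is not None
```

With such a mapping, `is_redirect(u'#WEITERLEITUNG [[Kallitsch]]', 'pdc')` would be recognized just like the English keyword.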
----------------------------------------------------------------------
Bugs item #2619054, was opened at 2009-02-20 09:04
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2619054&group_…
Category: rewrite
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: NicDumZ — Nicolas Dumazet (nicdumz)
Assigned to: Russell Blau (russblau)
Summary: clarify between limit, number, batch and step parameters
Initial Comment:
I ran into strange behavior with replace.py -weblink: that I couldn't quite diagnose: some pages were not processed.
First of all, those detailed logs are a great gift. They are a bit messy to understand at first, but thanks to them I found the bug and fixed it in r6386 ( http://svn.wikimedia.org/viewvc/pywikipedia?view=rev&revision=6386 ).
I believe that this parameter confusion is a very bad habit we inherited from the old framework (the only reason we have those bugs here is that we merged pagegenerators from trunk). We need to agree on common parameters for generators that have a global meaning, and stick to them.
I personally think that -limit might be a bit confusing (is it an API limit, a limit enforced by the local application on a huge fetched set, etc.?), while -number seems a bit clearer. But that's a personal opinion =)
What about -number for "number of items to retrieve", and -step or -maxstep for the maximum number of items to retrieve at once?
Actually, I don't care much about the names; we just need to agree on something meaningful enough, and document it in the file headings.
On a side note, replace.py -fix:yu-tld -weblink:*.yu is currently running on fr.wp. No issues sighted. =)
----------------------------------------------------------------------
>Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2009-02-22 05:50
Message:
Well I think that one of the first steps here is to consider what is
currently done in the old pagegenerators =)
Here's a small summary of the "limits" enforced by our old
pagegenerators.
Overall internal naming consistency is quite low for now, not to mention
the surprising facts I found :s
For each generator, I've considered the pagegenerators function and its
Site/Page/Image/Category counterpart: unless noted, the parameter naming of
both functions is consistent.
* shortpages, new(pages|images), unusedfiles, withoutinterwiki,
uncategorized(images|categories|pages), unwatchedpages, ancientpages,
deadendpages, longpages, search
They use "number" (meant as "batch"/"max") + a boolean "repeat". Overall,
you can get either "number" items, or all of them.
* random(page|redirect) are good examples of inconsistencies:
they use number (batch/max) + repeat, but since Special:Random gives only
one page at a time, the actual "batch" parameter is always 1. (behavior is
"for _ in range(number), fetch one page")
And if repeat=True ... those functions never stop, if I'm right.
irrrk !!
* filelinks, imagelinks, interwiki
they scrape the article wikipage, and yield everything in one step from the
wikitext
* categorymembers, subcategories
they scrape category pages. No parameter is available, since the UI doesn't
let us customize the number of displayed links. They follow the (next) links
on the category page, and stop when all the items have been retrieved.
* allpages, prefixindex, getReferences
no function parameters. They use config.special_page_limit as "batch/max",
and all items are retrieved through repeated queries.
if special_page_limit > 999, getReferences sets it back to 999. (?!)
* linksearch
pagegenerators has a "step=500" parameter, the corresponding Site function
uses "limit=500". Meant as "batch/batch": all the links are retrieved
through repeated queries
* usercontribs
number=250, meant as "batch/max". All the contribs are retrieved through
repeated queries. if number>500, sets it back to 500
It seems that the most commonly used combination is number+repeat. But I
really don't think that is the way to go, since you cannot accurately
specify the total number of items you want to retrieve: it's either
"number" items, or all of them...
I think a pair of "batch" + "total" integer parameters could be more useful
here (the names are illustrative)
On the other hand, users should be able to say "I want to retrieve all the
items": looking into the code, I see that a "-1" convention is used now. If
I understand things correctly, it is used in a "batch" context: if we call
set_maximum_items(-1), in most of the cases, the API uses its default
xxlimit number. We could use such a convention for our "total" parameter
too. Be it -1, or None, whatever, but I think that with such a policy, we
should cover all the use cases.
Given what I found, I really don't think that backwards compatibility
should be a priority here. I would rather introduce a breaking change in
naming, so that people don't expect the new limits to work "as in the old
framework"... because in the old framework, limit behaviors were not even
internally consistent...
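The "batch" + "total" proposal above, with a negative value meaning "retrieve everything", can be sketched as a plain generator (the names and the `fetch_page` callable are illustrative, not the actual pywikibot API):

```python
def batched_fetch(fetch_page, total=None, batch=50):
    """Yield up to `total` items, requesting at most `batch` per query.

    total=None or any negative value means "retrieve everything".
    `fetch_page` is a hypothetical callable modeling one API request:
    it takes (offset, limit) and returns a list of at most `limit` items
    (an empty list when the data is exhausted).
    """
    if total is not None and total < 0:
        total = None  # the -1 convention: no total cap
    offset = 0
    yielded = 0
    while True:
        # Shrink the last batch so we never overshoot the total.
        limit = batch if total is None else min(batch, total - yielded)
        if limit == 0:
            return
        items = fetch_page(offset, limit)
        if not items:
            return
        for item in items:
            yield item
            yielded += 1
        offset += len(items)
```

With this split, "batch" only tunes how much each API round-trip asks for, while "total" describes what the caller actually wants, which the number+repeat combination cannot express.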
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2009-02-20 16:00
Message:
A good point. A query can have two different types of limits: the limit on
the number of pages/links/whatever retrieved from the API in a single
request (defaults to "max"), and the limit on the total number of items to
be retrieved from a repeated query. We should do this in a way that is (a)
internally consistent among all generators, and (b) as much as possible,
backwards-compatible with the old pagegenerators module (but this is
secondary to getting something that works).
----------------------------------------------------------------------
Revision: 6404
Author: filnik
Date: 2009-02-21 22:03:59 +0000 (Sat, 21 Feb 2009)
Log Message:
-----------
<li> -> <li class=> -.-' fixed
Modified Paths:
--------------
trunk/pywikipedia/welcome.py
Modified: trunk/pywikipedia/welcome.py
===================================================================
--- trunk/pywikipedia/welcome.py 2009-02-21 21:14:31 UTC (rev 6403)
+++ trunk/pywikipedia/welcome.py 2009-02-21 22:03:59 UTC (rev 6404)
@@ -442,7 +442,9 @@
#FIXME: It counts the first 50 edits
# if number > 50, it won't work
# (not *so* useful, it should be enough).
- contribnum = contribs.count('<li>')
+ contribnum = contribs.count('<li class="">')
+ if contribnum == 0:
+ contribnum = contribs.count('<li>')
if contribnum >= number:
wikipedia.output(u'%s has enough edits to be welcomed' % userpage.titleWithoutNamespace() )
@@ -500,7 +502,7 @@
data = query.GetData(params,
useAPI = True, encodeTitle = False)
- # If there's not the blockedby parameter (that means the user isn't blocked), it will return False otherwise True.
+ # If there's not the blockedby parameter (that means the user isn't blocked), it will return False otherwise True.
try:
blockedBy = data['query']['users'][0]['blockedby']
except KeyError:
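The fix above counts `<li class="">` first and falls back to plain `<li>`. A regex that matches the tag name up to a word boundary would cover both variants (and any class value) in one pass; a sketch, not the committed code:

```python
import re

def count_contribs(html):
    """Count list items in a contributions page, tolerating both the old
    plain <li> markup and the newer <li class="..."> variant.

    The word boundary after "li" prevents false matches on tags such as
    <link>."""
    return len(re.findall(r'<li\b', html))
```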
Revision: 6403
Author: russblau
Date: 2009-02-21 21:14:31 +0000 (Sat, 21 Feb 2009)
Log Message:
-----------
Use API namespace parameters instead of post-retrieval namespace filters, where possible
Modified Paths:
--------------
branches/rewrite/pywikibot/pagegenerators.py
Modified: branches/rewrite/pywikibot/pagegenerators.py
===================================================================
--- branches/rewrite/pywikibot/pagegenerators.py 2009-02-21 21:02:57 UTC (rev 6402)
+++ branches/rewrite/pywikibot/pagegenerators.py 2009-02-21 21:14:31 UTC (rev 6403)
@@ -77,7 +77,7 @@
config.py for instructions.
Argument can also be given as "-google:searchstring".
--namespace Filter the page generator to only yield pages in the
+-namespace -ns Filter the page generator to only yield pages in the
specified namespaces. Separate multiple namespace
numbers with commas.
@@ -159,17 +159,21 @@
Only call this after all arguments have been parsed.
"""
+ if self.namespaces:
+ namespaces = [int(n) for n in self.namespaces]
+ for i in xrange(len(self.gens)):
+ if isinstance(self.gens[i], pywikibot.data.api.QueryGenerator):
+ self.gens[i].set_namespace(namespaces)
+ else:
+ self.gens[i] = NamespaceFilterPageGenerator(
+ self.gens[i], namespaces)
if len(self.gens) == 0:
return None
elif len(self.gens) == 1:
gensList = self.gens[0]
else:
gensList = CombinedPageGenerator(self.gens)
- genToReturn = DuplicateFilterPageGenerator(gensList)
- if self.namespaces:
- genToReturn = NamespaceFilterPageGenerator(
- genToReturn, map(int, self.namespaces))
- return genToReturn
+ return DuplicateFilterPageGenerator(gensList)
def getCategoryGen(self, arg, length, recurse = False):
if len(arg) == length:
@@ -268,6 +272,13 @@
else:
self.namespaces.extend(arg[len('-namespace:'):].split(","))
return True
+ elif arg.startswith('-ns'):
+ if len(arg) == len('-ns'):
+ self.namespaces.append(
+ pywikibot.input(u'What namespace are you filtering on?'))
+ else:
+ self.namespaces.extend(arg[len('-ns:'):].split(","))
+ return True
elif arg.startswith('-catr'):
gen = self.getCategoryGen(arg, len("-catr"), recurse = True)
elif arg.startswith('-category'):
@@ -394,13 +405,14 @@
return False
-def AllpagesPageGenerator(start ='!', namespace=None, includeredirects=True,
+def AllpagesPageGenerator(start ='!', namespace=0, includeredirects=True,
site=None):
"""
- Using the Allpages special page, retrieve all articles' titles, and yield
- page objects.
+ Iterate Page objects for all titles in a single namespace.
+
If includeredirects is False, redirects are not included. If
includeredirects equals the string 'only', only redirects are added.
+
"""
if site is None:
site = pywikibot.getSite()
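The commit above pushes namespace filtering into the API request via `set_namespace()` whenever the generator is API-backed; for other generators it still falls back to filtering after retrieval. That fallback can be sketched as a simplified stand-in for `NamespaceFilterPageGenerator` (assuming, as in pywikibot, that page objects expose a `namespace()` method):

```python
def namespace_filter(pages, namespaces):
    """Client-side fallback: drop pages outside the wanted namespaces
    after they have already been fetched from the wiki."""
    wanted = set(int(ns) for ns in namespaces)
    for page in pages:
        if page.namespace() in wanted:
            yield page
```

The server-side variant is preferable where available: the client-side filter still pays the bandwidth cost of fetching pages it then discards.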
Revision: 6402
Author: russblau
Date: 2009-02-21 21:02:57 +0000 (Sat, 21 Feb 2009)
Log Message:
-----------
New interface for setting maximum query size.
Modified Paths:
--------------
branches/rewrite/pywikibot/data/api.py
branches/rewrite/pywikibot/site.py
Modified: branches/rewrite/pywikibot/data/api.py
===================================================================
--- branches/rewrite/pywikibot/data/api.py 2009-02-21 20:54:03 UTC (rev 6401)
+++ branches/rewrite/pywikibot/data/api.py 2009-02-21 21:02:57 UTC (rev 6402)
@@ -359,12 +359,17 @@
if self.query_limit is None or limit < self.query_limit:
self.query_limit = int(limit)
- def set_query_item_limit(self, value):
+ def set_maximum_items(self, value):
"""Set the maximum number of items to be retrieved from the wiki.
If not called, most queries will continue as long as there is
more data to be retrieved from the API.
+ If set to -1 (or any negative value), the "limit" parameter will be
+ omitted from the request. For some request types (such as
+ prop=revisions), this is necessary to signal that only current
+ revision is to be returned.
+
"""
self.limit = int(value)
Modified: branches/rewrite/pywikibot/site.py
===================================================================
--- branches/rewrite/pywikibot/site.py 2009-02-21 20:54:03 UTC (rev 6401)
+++ branches/rewrite/pywikibot/site.py 2009-02-21 21:02:57 UTC (rev 6402)
@@ -1006,7 +1006,7 @@
and p._pageid > 0]
cache = dict((p.title(withSection=False), p) for p in sublist)
rvgen = api.PropertyGenerator("revisions|info", site=self)
- rvgen.limit = -1
+ rvgen.set_maximum_items(-1) # suppress use of "rvlimit" parameter
if len(pageids) == len(sublist):
# only use pageids if all pages have them
rvgen.request["pageids"] = "|".join(pageids)
@@ -1148,7 +1148,7 @@
"""
plgen = api.PageGenerator("links", site=self)
if isinstance(limit, int):
- plgen.limit = limit
+ plgen.set_maximum_items(limit)
if hasattr(page, "_pageid"):
plgen.request['pageids'] = str(page._pageid)
else:
@@ -1212,7 +1212,7 @@
if namespaces is not None:
cmgen.set_namespace(namespaces)
if isinstance(limit, int):
- cmgen.limit = limit
+ cmgen.set_maximum_items(limit)
return cmgen
def loadrevisions(self, page=None, getText=False, revids=None,
@@ -1307,9 +1307,9 @@
if section is not None:
rvgen.request[u"rvsection"] = unicode(section)
if latest or "revids" in rvgen.request:
- rvgen.limit = -1 # suppress use of rvlimit parameter
+ rvgen.set_maximum_items(-1) # suppress use of rvlimit parameter
elif isinstance(limit, int):
- rvgen.limit = limit
+ rvgen.set_maximum_items(limit)
if rvdir:
rvgen.request[u"rvdir"] = u"newer"
elif rvdir is not None:
@@ -1448,7 +1448,7 @@
if isinstance(protect_level, basestring):
apgen.request["gapprlevel"] = protect_level
if isinstance(limit, int):
- apgen.limit = limit
+ apgen.set_maximum_items(limit)
if reverse:
apgen.request["gapdir"] = "descending"
return apgen
@@ -1494,7 +1494,7 @@
if prefix:
algen.request["alprefix"] = prefix
if isinstance(limit, int):
- algen.limit = limit
+ algen.set_maximum_items(limit)
if unique:
algen.request["alunique"] = ""
if fromids:
@@ -1526,7 +1526,7 @@
if prefix:
acgen.request["gacprefix"] = prefix
if isinstance(limit, int):
- acgen.limit = limit
+ acgen.set_maximum_items(limit)
if reverse:
acgen.request["gacdir"] = "descending"
return acgen
@@ -1565,7 +1565,7 @@
if group:
augen.request["augroup"] = group
if isinstance(limit, int):
- augen.limit = limit
+ augen.set_maximum_items(limit)
return augen
def allimages(self, start="!", prefix="", minsize=None, maxsize=None,
@@ -1590,7 +1590,7 @@
if prefix:
aigen.request["gaiprefix"] = prefix
if isinstance(limit, int):
- aigen.limit = limit
+ aigen.set_maximum_items(limit)
if isinstance(minsize, int):
aigen.request["gaiminsize"] = str(minsize)
if isinstance(maxsize, int):
@@ -1643,7 +1643,7 @@
if users:
bkgen.request["bkusers"] = users
if isinstance(limit, int):
- bkgen.limit = limit
+ bkgen.set_maximum_items(limit)
return bkgen
def exturlusage(self, url, protocol="http", namespaces=None,
@@ -1664,7 +1664,7 @@
if namespaces is not None:
eugen.set_namespace(namespaces)
if isinstance(limit, int):
- eugen.limit = limit
+ eugen.set_maximum_items(limit)
return eugen
def imageusage(self, image, namespaces=None, filterredir=None,
@@ -1685,7 +1685,7 @@
if namespaces is not None:
iugen.set_namespace(namespaces)
if isinstance(limit, int):
- iugen.limit = limit
+ iugen.set_maximum_items(limit)
if filterredir is not None:
iugen.request["giufilterredir"] = (filterredir and "redirects"
or "nonredirects")
@@ -1730,7 +1730,7 @@
if reverse:
legen.request["ledir"] = "newer"
if isinstance(limit, int):
- legen.limit = limit
+ legen.set_maximum_items(limit)
return legen
def recentchanges(self, start=None, end=None, reverse=False, limit=None,
@@ -1781,7 +1781,7 @@
if reverse:
rcgen.request["rcdir"] = "newer"
if isinstance(limit, int):
- rcgen.limit = limit
+ rcgen.set_maximum_items(limit)
if namespaces is not None:
rcgen.set_namespace(namespaces)
if pagelist:
@@ -1840,7 +1840,7 @@
if getredirects:
srgen.request["gsrredirects"] = ""
if isinstance(limit, int):
- srgen.limit = limit
+ srgen.set_maximum_items(limit)
return srgen
def usercontribs(self, user=None, userprefix=None, start=None, end=None,
@@ -1888,7 +1888,7 @@
if reverse:
ucgen.request["ucdir"] = "newer"
if isinstance(limit, int):
- ucgen.limit = limit
+ ucgen.set_maximum_items(limit)
if namespaces is not None:
ucgen.set_namespace(namespaces)
if showMinor is not None:
@@ -1936,7 +1936,7 @@
if reverse:
wlgen.request["wldir"] = "newer"
if isinstance(limit, int):
- wlgen.limit = limit
+ wlgen.set_maximum_items(limit)
if namespaces is not None:
wlgen.set_namespace(namespaces)
filters = {'minor': showMinor,
@@ -2012,7 +2012,7 @@
if reverse:
drgen.request["drdir"] = "newer"
if isinstance(limit, int):
- drgen.limit = limit
+ drgen.set_maximum_items(limit)
return drgen
def users(self, usernames):
@@ -2041,7 +2041,7 @@
"""
rngen = api.PageGenerator("random", site=self)
- rngen.limit = limit
+ rngen.set_maximum_items(limit)
if namespaces is not None:
rngen.set_namespace(namespaces)
if redirects: