Patches item #2800449, was opened at 2009-06-03 11:28
Message generated for change (Settings changed) made by sf-robot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2800449...

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: Hannes Rst (hroest)
Assigned to: Nobody/Anonymous (nobody)
Summary: read more tags from xmldump
Initial Comment:
I changed the classes XmlDump and XmlEntry so that they now also carry information about namespaces and minor edits. The header of the XML dump is parsed by the new function _parseSiteinfo, and XmlDump gains the new field siteinfo as well as the fields "sitename, base, generator, case, namespaces".
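
For readers unfamiliar with the dump header, the following is a minimal, standalone sketch of the idea described above, not the submitted patch itself. The sample XML, the function name parse_siteinfo and the returned dictionary layout are assumptions made for illustration. The sketch waits for the "end" event of the siteinfo element, because ElementTree's iterparse only guarantees that an element's children are complete at that point.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened dump header; real dumps carry more namespaces
# and a versioned export namespace URI.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <siteinfo>
    <sitename>Wikipedia</sitename>
    <base>http://en.wikipedia.org/wiki/Main_Page</base>
    <generator>MediaWiki 1.15alpha</generator>
    <case>first-letter</case>
    <namespaces>
      <namespace key="-2">Media</namespace>
      <namespace key="0" />
      <namespace key="1">Talk</namespace>
    </namespaces>
  </siteinfo>
</mediawiki>"""


def parse_siteinfo(source):
    """Return the siteinfo fields of a dump as a dict, or None if absent."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if not elem.tag.endswith("}siteinfo"):
            continue
        # The tag looks like "{uri}siteinfo"; recover the namespace URI.
        uri = elem.tag[1:elem.tag.index("}")]
        info = {
            name: elem.findtext("{%s}%s" % (uri, name))
            for name in ("sitename", "base", "generator", "case")
        }
        # Map namespace name -> numeric key. The main namespace has no
        # text, so it is stored under the key None.
        info["namespaces"] = {
            ns.text: int(ns.get("key"))
            for ns in elem.iter("{%s}namespace" % uri)
        }
        return info
    return None


info = parse_siteinfo(io.BytesIO(SAMPLE))
```

Unlike the description above, this sketch converts the namespace keys to integers, so that a lookup and a numeric fallback value have the same type.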
----------------------------------------------------------------------
Comment By: SourceForge Robot (sf-robot)
Date: 2009-10-23 02:20
Message: This Tracker item was closed automatically by the system. It was previously set to a Pending status, and the original submitter did not respond within 14 days (the time period specified by the administrator of this Tracker).
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-09-25 00:06
Message: Cannot process diff.
----------------------------------------------------------------------
Comment By: Hannes Rst (hroest)
Date: 2009-06-03 12:19
Message: OK, I used diff -u. Is this better? Sorry, I haven't done this before. Greetings, hroest
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2009-06-03 12:06
Message: Please use "diff -u" or "svn diff" to create patches. Unified diffs are more readable and thus easier to review.
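
As an illustration of what a unified diff looks like, here is a small sketch using Python's standard difflib module instead of the diff command. The file names old.py and new.py are placeholders, and the two-line change is adapted from the patch in this thread.

```python
import difflib

# One line before the change and two lines after it.
old = ["# could get comment, minor as well\n"]
new = [
    "if revision.findtext(\"{%s}minor\" % self.uri) == '': minor = True\n",
    "else: minor = False\n",
]

# unified_diff yields header lines (---, +++), a hunk marker (@@ ... @@),
# and then the changed lines prefixed with "-" or "+".
diff = list(difflib.unified_diff(old, new, fromfile="old.py", tofile="new.py"))
print("".join(diff), end="")
```

Each hunk names both files and marks removed and added lines with "-" and "+", which is what makes this format easier to review than the bare line ranges of a normal diff.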
----------------------------------------------------------------------
Comment By: Hannes Rst (hroest)
Date: 2009-06-03 11:36
Message: diff:
59c59
<     def __init__(self, title, id, text, username, ipedit, timestamp, editRestriction, moveRestriction, revisionid, comment):
---
>     def __init__(self, title, id, text, username, ipedit, timestamp, editRestriction, moveRestriction, revisionid, comment, minor, namespace):
70a71,72
>         self.minor = minor
>         self.namespace = namespace
284a287
>         self.siteinfo = None
292a296,297
>             if event == 'start' and elem.tag == "{%s}siteinfo" % self.uri:
>                 self._parseSiteinfo(elem)
295a301,311
>     def _parseSiteinfo(self, elem):
>         self.sitename = elem.findtext( "{%s}sitename" % self.uri )
>         self.base = elem.findtext( "{%s}base" % self.uri )
>         self.generator = elem.findtext( "{%s}generator" % self.uri )
>         self.case = elem.findtext( "{%s}case" % self.uri )
>         self.namespaces = {}
>         for infoElement in elem:
>             if infoElement.tag == "{%s}namespaces" % self.uri:
>                 for name in infoElement:
>                     self.namespaces[name.text] = name.attrib['key']
327c343,344
<         # could get comment, minor as well
---
>         if revision.findtext("{%s}minor" % self.uri) == '': minor = True
>         else: minor = False
330a348,359
>         #here we get the namespace which is in a format like this "ns:title"
>         #note that we can find namespace zero in the dictionary under "None"
>         match = re.search('([^:]*):\w*', self.title)
>         try:
>             if match: nameSp = self.namespaces[match.group(1)]
>             else: nameSp = self.namespaces[match]
>         except KeyError:
>             #this means we dont have this one stored as a namespace or its an
>             #article like "2001: A Space Odyssey (film)"
>             #we assume that the namespace is zero
>             nameSp = 0
337c366,368
<                 comment=comment
---
>                 comment=comment,
>                 minor = minor,
>                 namespace = nameSp
407d437
<
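
The namespace lookup in the last hunks can be restated as a small standalone function. This is a sketch under assumptions, not the code from the patch: the function name namespace_of and the sample dictionary are hypothetical, and it uses str.partition with a membership test in place of the regular expression and KeyError handling shown above.

```python
def namespace_of(title, namespaces):
    """Return the namespace key for a page title.

    `namespaces` maps namespace names to keys; following the convention
    described in the patch comments, the main namespace is stored under
    None.
    """
    prefix, sep, _rest = title.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix]
    # No colon, or the text before the colon is not a known namespace
    # (for example "2001: A Space Odyssey (film)"): treat the page as
    # belonging to the main namespace.
    return namespaces.get(None, 0)


# Hypothetical namespace table for illustration.
namespaces = {None: 0, "Talk": 1, "User": 2}
examples = [
    namespace_of("Talk:Main Page", namespaces),
    namespace_of("2001: A Space Odyssey (film)", namespaces),
    namespace_of("Main Page", namespaces),
]
```

Checking membership before indexing avoids the try/except block and makes the fallback to the main namespace explicit.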
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org