Bugs item #2531935, was opened at 2009-01-23 23:29
Message generated for change (Comment added) made by nobody
You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2531935...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.
Category: interwiki
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Purodha B Blissenbach (purodha)
Summary: -hintfile: option
Initial Comment: The newly introduced option -hintfile: is either poorly documented or not working as expected.
It asks for a page to be checked (see below), whereas according to [2284955] (interwiki hints from file) it is supposed to read both a local page and a hint page from the file. Please fix it. Thanks!
python interwiki.py -hintfile:
Please enter the hint filename: hints.txt
Which page to check:

Pywikipedia [http] trunk/pywikipedia (r6291, Jan 23 2009, 16:08:14)
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody) Date: 2010-04-27 20:42
Message: Is anyone out there able to take care of this?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody) Date: 2009-06-26 13:25
Message: This simple code should work for this purpose:

    import codecs
    import re

    # Each match is a (page title, hint title) pair taken from one line.
    f = codecs.open(hintfilename, 'r', config.textfile_encoding)
    R = re.compile(ur'\[\[:?(.*?)\]\]\s+\[\[:?(.*?)\]\]')
    for line in R.findall(f.read()):
        pageTitle = line[0]
        hintTitle = line[1]
just make a proper call to
yield wikipedia.Page(site, pageTitle)
and
hints.append(hintTitle)
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody) Date: 2009-06-25 16:07
Message: I guess we need to combine "TextfilePageGenerator" from pagegenerators.py and "hintfile" from interwiki.py, so that both the page title and the hint are read, line by line, from the same hintfilename - page title from the first pair of brackets [[]], and the hint - from the second pair of brackets in the same line within hintfile. Is it possible to implement this, please?
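A minimal sketch of the combined reading described in this comment, assuming one page/hint pair per line of the hint file. The name iter_page_hint_pairs is hypothetical and not part of pywikipedia; the function only parses text and leaves the creation of Page objects and the handling of hints to the caller. It is written for Python 3, unlike the Python 2.5 setup reported in this thread.

```python
import re

# Matches "[[page]] [[hint]]" on one line; a leading ":" inside the
# brackets is optional and is dropped from the captured title.
PAIR_RE = re.compile(r'\[\[:?(.*?)\]\]\s+\[\[:?(.*?)\]\]')


def iter_page_hint_pairs(lines):
    """Yield (page_title, hint_title) for each line that holds two links.

    Hypothetical helper, not part of pywikipedia. Lines without two
    bracketed links are skipped.
    """
    for line in lines:
        match = PAIR_RE.search(line)
        if match:
            yield match.group(1).strip(), match.group(2).strip()


sample = [
    "# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]",
    "a line without links",
]
pairs = list(iter_page_hint_pairs(sample))
```

Keeping each hint attached to its own page title, rather than collecting all hints into one shared list, is a design choice of this sketch, not something the thread settles.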
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody) Date: 2009-03-07 09:52
Message: No, it's not exactly what I asked for. In the original feature request #2284955 [http://sourceforge.net/tracker/index.php?func=detail&aid=2284955&gro...], as far as I can see, the idea was to read both the starting pages and the hints from the same file, line by line, and to build an array of pages to be processed together with their relevant hints.
# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]
Working on a single page with -hintfile option doesn't seem to be that useful.
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha) Date: 2009-03-03 09:44
Message: What you want to have, in the above example, can be had with:
python interwiki.py -v -hintfile: -file:
Pywikipediabot (r6439 (wikipedia.py), Feb 24 2009, 21:48:26)
Python 2.5.2 (r252:60911, Jan 4 2009, 21:59:32) [GCC 4.3.2]
Please enter the hint filename: hints.txt
Please enter the local file name: local-page-title.txt
There is no documentation saying that -hintfile: overrides or alters the processing of any other parameter (and in fact, it does not).
Be aware that it is hardly useful to supply a file with several page titles via -file: when -hintfile: is being used, since the hints would apply to each of those pages, provoking interwiki conflicts. Thus -hintfile: is likely to be used more often with a single page title on the command line. That does not preclude, however, a single page title being read from a file using -file:
If, and only if, the file given via -hintfile: has only unspecific hints, such as [[10:]] or [[en:]] or [[latin:]] (or none of the specifically hinted pages exist), then supplying a list of pages via -file: would likely be free of conflicts.
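The distinction between specific and unspecific hints can be sketched as follows. This assumes a hint is written as [[code:title]] and counts as unspecific when nothing follows the colon. Both function names are hypothetical and not part of pywikipedia, and the check for whether a hinted page exists is left out because it needs a live wiki. Written for Python 3.

```python
def is_unspecific_hint(hint):
    """Return True for hints such as "[[en:]]" that name no page title."""
    inner = hint.strip()
    # Drop the surrounding double brackets if present.
    if inner.startswith("[[") and inner.endswith("]]"):
        inner = inner[2:-2]
    # A leading ":" as in "[[:en:]]" is treated as optional.
    inner = inner.lstrip(":")
    code, sep, title = inner.partition(":")
    return bool(code) and sep == ":" and title.strip() == ""


def only_unspecific_hints(hints):
    """True when no hint names a concrete page title."""
    return all(is_unspecific_hint(h) for h in hints)
```

Under these assumptions, only_unspecific_hints(["[[10:]]", "[[en:]]", "[[latin:]]"]) is true, while adding a hint such as "[[en:English_page_used_as_a_hint]]" makes it false, which is the case where sharing hints across several pages from -file: risks conflicts.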
There is a difference between hints and the page being processed. While for the outcome, in properly preset cases, it is often irrelevant where the bot starts processing, and which pages are then added because hinted, for the paths the bot follows while collecting links, it does make a huge difference sometimes. We can have hintless processing, but we cannot have a bot run on hints alone, without a starting page.
Maybe we should add some of these points to the documentation? Is that what you are asking for?
----------------------------------------------------------------------
Comment By: siebrand (siebrand) Date: 2009-01-27 08:54
Message: Assigned to committer.
----------------------------------------------------------------------
Comment By: siebrand (siebrand) Date: 2009-01-27 08:50
Message: Assigned to committer.
----------------------------------------------------------------------
Comment By: siebrand (siebrand) Date: 2009-01-27 08:45
Message: Assigned to committer.
----------------------------------------------------------------------
Comment By: siebrand (siebrand) Date: 2009-01-27 08:36
Message: Assigned to committer.
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org