Feature Requests item #1876050, was opened at 2008-01-20 22:58
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1876050&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: interwiki
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
>Assigned to: Purodha B Blissenbach (purodha)
Summary: Interwiki.py - earlier abort of a page wanted
Initial Comment:
When interwiki.py says:
WARNING: [[zxx:pagetitle1]] doesn't seem to be a disambiguation page, but [[und:pagetitle2]] is one. Follow it anyway? ([y]es, [n]o, [a]dd an alternative)
, one may be unable to decide, e.g. for lack of reading skills in the languages or scripts involved. It does not matter which answer one gives here, since later, when the same page comes up to be finalized, one would usually have to give up on it anyway.
I suggest adding a choice to abandon processing the page already at this earlier stage, when the above message is shown. The prompt could then be extended to:
... Follow it anyway? ([y]es, [n]o, [a]dd an alternative, [g]ive up)
----
The same holds for the message:
WARNING: [[zxx:namespace1:pagetitle1]] is in namespace 1, but [[und:namespace2:pagetitle2]] is in namespace 2. Follow it anyway? ([y]es, [n]o)
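A minimal sketch of such an extended prompt, using hypothetical helper names rather than the actual pywikipedia input routines:

```python
# Hypothetical sketch of the extended prompt; parse_answer and ask_follow
# are illustration names, not the real pywikipedia input helpers.
def parse_answer(answer):
    """Map a typed answer to an action; None means 'ask again'."""
    valid = {'y': 'follow', 'n': 'skip', 'a': 'alternative', 'g': 'giveup'}
    return valid.get(answer.strip().lower())

def ask_follow(warning, reader=input):
    """Repeat the warning until the operator gives a recognized answer."""
    prompt = warning + ' ([y]es, [n]o, [a]dd an alternative, [g]ive up) '
    while True:
        choice = parse_answer(reader(prompt))
        if choice is not None:
            return choice
```

With the extra 'g' answer, the caller can abandon the whole page immediately instead of deferring that decision to the finalization stage.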
----------------------------------------------------------------------
>Comment By: Purodha B Blissenbach (purodha)
Date: 2009-01-23 17:27
Message:
Both features have meanwhile been implemented.
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2008-11-21 14:16
Message:
2nd suggestion only: done with r6111.
(1st suggestion had been implemented earlier by someone else)
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2008-11-21 12:05
Message:
The 1st suggestion (disambiguation mismatch) has already been taken care
of.
I am working now on the 2nd suggestion (namespace mismatch).
----------------------------------------------------------------------
Feature Requests item #1771986, was opened at 2007-08-10 20:49
Message generated for change (Settings changed) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1771986&group_…
>Category: interwiki
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Aurimas Fischer (ebola_rulez)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py trusted language:page
Initial Comment:
When I try to fix interwiki conflicts, I usually check my native language wiki and remove/fix incorrect interwiki links. Then I use interwiki.py to manually choose correct interwiki links when presented with choice.
This sometimes means choosing from as many as 14 (!) different variants, displayed as hundreds of rows, which slows down the process:
try to find whether the page from the native wiki is in one of these groups. If it is, choose that group; if not, analyze the different languages or view the pages in a browser.
I suggest adding a command-line argument -trusted (interactive mode only).
When used, it should cause interwiki.py to automatically choose the correct variant number based on the initial language:page combination.
For example:
interwiki.py -lang:en -trusted Cat
...
(1) Found link to [[eo:Pantero]] in:
[[da:Panter]]
[[en:Panther]]
(2) Found link to [[eo:Hejma kato]] in:
[[da:Kat]]
[[en:Cat]]
...
The script should automatically choose variant 2, because all interwiki links in en:Cat are trusted.
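The selection rule described above can be sketched as a small pure function (the names are illustrative, not part of interwiki.py):

```python
def pick_trusted_variant(groups, trusted_links):
    """Return the 1-based number of the single variant whose pages all
    occur among the trusted page's interwiki links; None if no variant
    (or more than one) qualifies, so the operator is asked as before."""
    trusted = set(trusted_links)
    matches = [i for i, group in enumerate(groups, 1)
               if all(page in trusted for page in group)]
    return matches[0] if len(matches) == 1 else None

# The example from the request: en:Cat links to da:Kat and eo:Hejma kato.
groups = [['da:Panter', 'en:Panther'],   # variant 1
          ['da:Kat', 'en:Cat']]          # variant 2
trusted_links = ['en:Cat', 'da:Kat', 'eo:Hejma kato']
```

Here pick_trusted_variant(groups, trusted_links) yields 2, since every page in that group appears among the trusted page's links.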
----------------------------------------------------------------------
>Comment By: Purodha B Blissenbach (purodha)
Date: 2009-01-23 17:06
Message:
I believe the -localright parameter would, at least to a large degree,
provide what you want and fulfil your request. Sorry, I cannot translate
that to French.
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2007-09-04 12:19
Message:
Logged In: YES
user_id=880694
Originator: NO
This sounds quite similar to the -localright parameter that Andre Engels
has recently added.
----------------------------------------------------------------------
Comment By: Aurimas Fischer (ebola_rulez)
Date: 2007-08-11 10:34
Message:
Logged In: YES
user_id=959303
Originator: YES
I'm not a Python programmer but managed
to hack together a working prototype of this functionality.
File Added: interwiki_trusted.patch
----------------------------------------------------------------------
Feature Requests item #1500288, was opened at 2006-06-04 03:09
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1500288&group_…
Category: None
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Have weblinkchecker.py check the Internet Archive for backup
Initial Comment:
weblinkchecker.py apparently has an option to take action on finding a broken link (currently only to add something to a talk page; I haven't been able to get this to work, though). But it would be even better if it could insert, in a comment or perhaps an addendum after the broken link, a link to backups of that page in the Internet Archive/Wayback Machine.
I don't think this enhancement would be backbreakingly difficult or troublesome. The script would have to prepend "http://web.archive.org/web/" to the original URL and check whether the string "Not in Archive." (or whatever the current error message is) appears in the resulting Internet Archive page. If it does, simply carry on with the rest of the links to be checked; if not, i.e. if the Archive *does* have something backed up, take some boilerplate like "The preceding URL appeared to be invalid to weblinkchecker.py; however, backups of the URL can be found in the [[Internet Archive]] $HERE. You may want to consider amending the original link to point to the archived copies and not the live one.", replace $HERE with the URL prepended with the Archive bit, and insert it as a comment.
-maru
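The lookup described above could be sketched roughly as follows. The URL prefix and the "Not in Archive." marker are taken verbatim from the request; the live Wayback Machine may answer differently today, so this is a sketch of the idea, not a tested client:

```python
import urllib.request

ARCHIVE_PREFIX = 'http://web.archive.org/web/'

def archive_url(url):
    """Wayback Machine address for the backups of `url`."""
    return ARCHIVE_PREFIX + url

def archive_has_copy(page_text):
    """Decide from the fetched Archive page whether a backup exists."""
    return 'Not in Archive.' not in page_text

def suggestion_comment(url):
    """Boilerplate to place next to a dead link that has a backup."""
    return ('The preceding URL appeared to be invalid to weblinkchecker.py; '
            'however, backups of the URL can be found in the [[Internet '
            'Archive]] at %s. You may want to consider amending the original '
            'link to point to the archived copies.' % archive_url(url))

def check_dead_link(url):
    """Fetch the Archive page; return a comment, or None if no backup."""
    with urllib.request.urlopen(archive_url(url)) as response:
        text = response.read().decode('utf-8', 'replace')
    return suggestion_comment(url) if archive_has_copy(text) else None
```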
----------------------------------------------------------------------
>Comment By: Purodha B Blissenbach (purodha)
Date: 2009-01-23 16:53
Message:
The described feature has meanwhile been implemented.
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2008-01-31 01:03
Message:
Logged In: YES
user_id=880694
Originator: NO
By the way, I implemented the Internet Archive lookup long ago.
webcitation.org is not supported yet, though.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-01-30 20:09
Message:
Logged In: NO
Wouldn't it be possible to create a bot that checks whether the external
links work again? It could use the category with inaccessible external
links. When an external link is accessible again, the bot removes the
message from the talk page and marks the talk page with the template for
speedy deletion.
My apologies if I'm adding this message on the wrong page.
Regards,
Kenny (from the Dutch Wikipedia
http://nl.wikipedia.org/wiki/Gebruiker:Ken123 )
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2007-06-24 18:42
Message:
Logged In: NO
In the same vein, it would be good if WebCite
<http://www.webcitation.org/> archived pages were included as well. There
are apparently some nice programmatic ways of looking for archived URLs,
according to
<http://www.webcitation.org/doc/WebCiteBestPracticesGuide.pdf>.
While I'm writing, it would also be good if the bot proactively archived
pages that disappear and come back. Variable uptime, to me, bespeaks a
page that is likely to disappear permanently. It isn't hard either - it's
just
"www.webcitation.org/archive?url=" ++ url ++ "&email=foo(a)bar.com"
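That request line could be sketched in Python like this (the email address is a placeholder, and WebCite's current interface may differ):

```python
from urllib.parse import urlencode

def webcite_archive_request(url, email):
    """Build the on-demand WebCite archiving URL sketched above."""
    return 'http://www.webcitation.org/archive?' + urlencode(
        {'url': url, 'email': email})
```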
----------------------------------------------------------------------
Feature Requests item #2284955, was opened at 2008-11-14 16:45
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=2284955&group_…
Category: interwiki
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki hints from file
Initial Comment:
It would be helpful if interwiki.py could read hints not only from the console but also from a file, one hint per line, e.g. like this:
# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]
Please add this option. Thanks.
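A sketch of parsing such a hint file; the regex mirrors the one later committed in r6288-r6291 (see the commit log at the end of this digest), while the function names are illustrative:

```python
import codecs
import re

# A hint or title ends either before | or before ]], matching the hint
# file format suggested above.
HINT_RE = re.compile(r'\[\[(.+?)(?:\]\]|\|)')

def parse_hints(text):
    """Extract the link targets from one or more hint lines."""
    return HINT_RE.findall(text)

def read_hints(filename, encoding='utf-8'):
    """Read a hint file and return all hints found in it."""
    with codecs.open(filename, 'r', encoding) as f:
        return parse_hints(f.read())
```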
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2009-01-23 16:35
Message:
This artifact has been marked as a duplicate of artifact 2528275 with
reason:
No explanation provided.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-11-14 17:12
Message:
I think the best syntax for this option would be:
interwiki.py -hint:file:[hints_file.txt]
If no filename is specified after "file", it should be asked for on the
console. That would make this option consistent with other existing hint
options such as the number of languages, "all", or "latin".
----------------------------------------------------------------------
Feature Requests item #2326967, was opened at 2008-11-22 13:48
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=2326967&group_…
Category: interwiki
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Woo-Jin Kim (kwj2772)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py -wiktionary
Initial Comment:
interwiki.py -wiktionary takes too much time to process.
Currently, the interwiki bot retrieves 60 en articles, then the next 60 fr
articles (the language codes are just an example).
I suggest that the bot retrieve all wiki entries with the same name at the
same time, to reduce retrieval time.
Thank you.
----------------------------------------------------------------------
>Comment By: Purodha B Blissenbach (purodha)
Date: 2009-01-23 16:31
Message:
I am pretty confident that the current way of retrieval is usually much
faster, because it uses the [[Special:Export]] special page so as to
retrieve those up to 60 pages (unless -array is used) at once, in a single
HTTP request.
Pages of different language wikis are retrieved from URLs having different
domains, so there is no way known to me to have them bundled into a single
HTTP request; fetching them separately would take approximately (up to) 60
times as long, since each HTTP request consumes almost the same clock time
regardless of size - increasing its length only marginally adds to its
duration.
Thus I suggest closing this suggestion as "wontfix". I leave it open,
though, so as to give others an opportunity to disagree with me :-)
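A back-of-the-envelope model of that argument; the latency and per-page transfer numbers are made up for illustration, not measurements:

```python
def total_time(n_pages, pages_per_request, latency=0.5, per_page=0.01):
    """Seconds to fetch n_pages when every HTTP request pays a fixed
    round-trip latency plus a small per-page transfer cost."""
    requests = -(-n_pages // pages_per_request)  # ceiling division
    return requests * latency + n_pages * per_page

batched = total_time(60, 60)    # one Special:Export request
one_by_one = total_time(60, 1)  # one request per page
```

With these illustrative numbers the batched fetch pays a single round trip while the unbatched one pays sixty, so the unbatched total is dominated by latency and is many times slower.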
----------------------------------------------------------------------
Feature Requests item #2528275, was opened at 2009-01-22 10:46
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=2528275&group_…
>Category: interwiki
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
>Assigned to: Purodha B Blissenbach (purodha)
Summary: wikipedia.py / global bot flag
Initial Comment:
The bot flag warning in _getUserData should be suppressed if the bot has a global bot flag.
----------------------------------------------------------------------
>Comment By: Purodha B Blissenbach (purodha)
Date: 2009-01-23 16:19
Message:
Solved with revision 6291.
See:
http://svn.wikimedia.org/viewvc/pywikipedia/trunk/pywikipedia/interwiki.py?…
Please make the bot flag request for GhalyBot at:
http://meta.wikimedia.org/wiki/Steward_requests/Bot_status#Global_bot_reque…
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2009-01-22 12:20
Message:
I would like to apply for a global bot flag for GhalyBot
----------------------------------------------------------------------
Revision: 6291
Author: purodha
Date: 2009-01-23 16:08:14 +0000 (Fri, 23 Jan 2009)
Log Message:
-----------
Typing error.
Modified Paths:
--------------
trunk/pywikipedia/interwiki.py
Modified: trunk/pywikipedia/interwiki.py
===================================================================
--- trunk/pywikipedia/interwiki.py 2009-01-23 16:06:51 UTC (rev 6290)
+++ trunk/pywikipedia/interwiki.py 2009-01-23 16:08:14 UTC (rev 6291)
@@ -1617,7 +1617,7 @@
hints.append(arg[6:])
elif arg.startswith('-hintfile:'):
hintfilename = arg[10:]
- if (hintfilename is None) or (hintfilenname == ''):
+ if (hintfilename is None) or (hintfilename == ''):
hintfilename = wikipedia.input(u'Please enter the hint filename:')
f = codecs.open(hintfilename, 'r', config.textfile_encoding)
R = re.compile(ur'\[\[(.+?)(?:\]\]|\|)') # hint or title ends either before | or before ]]
Revision: 6290
Author: purodha
Date: 2009-01-23 16:06:51 +0000 (Fri, 23 Jan 2009)
Log Message:
-----------
Forgotten case added.
Modified Paths:
--------------
trunk/pywikipedia/interwiki.py
Modified: trunk/pywikipedia/interwiki.py
===================================================================
--- trunk/pywikipedia/interwiki.py 2009-01-23 16:03:32 UTC (rev 6289)
+++ trunk/pywikipedia/interwiki.py 2009-01-23 16:06:51 UTC (rev 6290)
@@ -1617,7 +1617,7 @@
hints.append(arg[6:])
elif arg.startswith('-hintfile:'):
hintfilename = arg[10:]
- if hintfilename is None:
+ if (hintfilename is None) or (hintfilenname == ''):
hintfilename = wikipedia.input(u'Please enter the hint filename:')
f = codecs.open(hintfilename, 'r', config.textfile_encoding)
R = re.compile(ur'\[\[(.+?)(?:\]\]|\|)') # hint or title ends either before | or before ]]
Revision: 6289
Author: purodha
Date: 2009-01-23 16:03:32 +0000 (Fri, 23 Jan 2009)
Log Message:
-----------
Typing error.
Modified Paths:
--------------
trunk/pywikipedia/interwiki.py
Modified: trunk/pywikipedia/interwiki.py
===================================================================
--- trunk/pywikipedia/interwiki.py 2009-01-23 16:00:52 UTC (rev 6288)
+++ trunk/pywikipedia/interwiki.py 2009-01-23 16:03:32 UTC (rev 6289)
@@ -1616,10 +1616,10 @@
elif arg.startswith('-hint:'):
hints.append(arg[6:])
elif arg.startswith('-hintfile:'):
- hintfile = arg[10:]
- if filename is None:
- filename = wikipedia.input(u'Please enter the hint filename:')
- f = codecs.open(filename, 'r', config.textfile_encoding)
+ hintfilename = arg[10:]
+ if hintfilename is None:
+ hintfilename = wikipedia.input(u'Please enter the hint filename:')
+ f = codecs.open(hintfilename, 'r', config.textfile_encoding)
R = re.compile(ur'\[\[(.+?)(?:\]\]|\|)') # hint or title ends either before | or before ]]
for pageTitle in R.findall(f.read()):
hints.append(pageTitle)
Revision: 6288
Author: purodha
Date: 2009-01-23 16:00:52 +0000 (Fri, 23 Jan 2009)
Log Message:
-----------
Cannot use page generator for hintfile.
Modified Paths:
--------------
trunk/pywikipedia/interwiki.py
Modified: trunk/pywikipedia/interwiki.py
===================================================================
--- trunk/pywikipedia/interwiki.py 2009-01-23 14:58:03 UTC (rev 6287)
+++ trunk/pywikipedia/interwiki.py 2009-01-23 16:00:52 UTC (rev 6288)
@@ -1617,10 +1617,13 @@
hints.append(arg[6:])
elif arg.startswith('-hintfile:'):
hintfile = arg[10:]
- hintPageGen = pagegenerators.TextfilePageGenerator(hintfile)
- for page in hintPageGen:
- hints.append(page.title())
- del hintPageGen
+ if filename is None:
+ filename = wikipedia.input(u'Please enter the hint filename:')
+ f = codecs.open(filename, 'r', config.textfile_encoding)
+ R = re.compile(ur'\[\[(.+?)(?:\]\]|\|)') # hint or title ends either before | or before ]]
+ for pageTitle in R.findall(f.read()):
+ hints.append(pageTitle)
+ f.close()
elif arg == '-force':
globalvar.force = True
elif arg == '-same':