Feature Requests item #3603329, was opened at 2013-02-04 12:52
Message generated for change (Tracker Item Submitted) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=3603329&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Merlijn S. van Deen (valhallasw)
Assigned to: Nobody/Anonymous (nobody)
Summary: Extended version information in user-agent
Initial Comment:
See the discussion at https://www.mediawiki.org/wiki/Special:Code/pywikipedia/11027#c33303
Implementation notes:
Hash of a file:
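>>> import hashlib
>>> m = hashlib.sha1()
>>> # Feed the file contents to the hash; 'wikipedia.py' is an assumed example path.
>>> with open('wikipedia.py', 'rb') as f:
...     m.update(f.read())
...
>>> m.hexdigest()
'93ae86148e74a7c3a3d63f7810b48c51889fba46'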
Classes used in stack trace:
>>> import inspect
>>> [(x.__module__, x.__name__) for x in (s[0].f_locals.get('self', None).__class__ for s in inspect.stack())]
Example result:
[('wikipedia_family', 'Family'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('wikipedia_family', 'Family'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Page'), ('wikipedia', 'Page'), ('__main__', 'Subject'), ('__main__', 'Subject'), ('__main__', 'InterwikiBot'), ('__main__', 'InterwikiBot'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType')]
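The one-liner above reports every frame, including repeats and frames with no 'self' (the NoneType entries). A minimal sketch of a helper that keeps only distinct classes follows. The name classes_in_stack and the Page/Bot classes are hypothetical, used only for illustration; they are not part of Pywikipedia.

```python
import inspect


def classes_in_stack():
    """Return (module, class name) pairs for each distinct class whose
    method appears on the current call stack, innermost first."""
    seen = []
    for frame_info in inspect.stack():
        frame = frame_info[0]  # the frame object is the first element
        obj = frame.f_locals.get('self')
        if obj is None:
            continue  # plain function or module-level frame
        cls = obj.__class__
        pair = (cls.__module__, cls.__name__)
        if pair not in seen:
            seen.append(pair)
    return seen


# Hypothetical classes standing in for Page, Site, InterwikiBot, etc.
class Page(object):
    def get(self):
        return classes_in_stack()


class Bot(object):
    def run(self):
        return Page().get()


result = Bot().run()
print(result)
```

Dropping repeats and self-less frames keeps the list short enough to be a plausible candidate for inclusion in a user-agent string, though which classes to report is left open here.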
----------------------------------------------------------------------
Patches item #3603143, was opened at 2013-02-02 16:46
Message generated for change (Settings changed) made by yurochek
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3603143&group_…
Category: None
Group: None
Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Ori Livneh (atdt)
Assigned to: Yuri Astrakhan (yurochek)
Summary: Make the user-agent string more descriptive
Initial Comment:
The current default (and widely-used) user-agent string for Pywikipedia bot is "PythonWikipediaBot/1.0". This is both inconsistent with the documentation (which claims it is "Pywikipediabot/1.0") and not as informative as it could be. This patch changes the default user-agent string format to "Pywikipediabot/1.0 (r<revId>; <scriptName>)", where 'revId' is the SVN revision of Pywikipedia and 'scriptName' is the tail path component and file name of the currently executing script. Here is a full example: "Pywikipediabot/1.0 (r11026; pywikipedia/wikipedia.py)"
The name of the currently executing script could help developers and ops engineers at the Wikimedia Foundation pinpoint client implementation issues. For example, some implementations do not efficiently batch requests for multiple titles, but without a more descriptive user-agent string it is hard to know whom to notify or where to submit a patch.
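As a rough illustration of the proposed format, the string could be assembled as sketched below. This is not the code from the patch: the function names script_name and default_user_agent are hypothetical, and the revision number and script path are passed in explicitly because how the patch obtains them is not described here.

```python
import os


def script_name(script_path):
    """Return the parent directory name plus the file name,
    e.g. 'pywikipedia/wikipedia.py'."""
    directory, filename = os.path.split(os.path.abspath(script_path))
    parent = os.path.basename(directory)
    return '%s/%s' % (parent, filename) if parent else filename


def default_user_agent(rev_id, script_path):
    """Format 'Pywikipediabot/1.0 (r<revId>; <scriptName>)'."""
    return 'Pywikipediabot/1.0 (r%s; %s)' % (rev_id, script_name(script_path))


# The path below is an assumed example location.
ua = default_user_agent(11026, '/home/bot/pywikipedia/wikipedia.py')
print(ua)
```

A caller would presumably pass something like sys.argv[0] as script_path; that choice is an assumption as well.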
----------------------------------------------------------------------
Comment By: Yuri Astrakhan (yurochek)
Date: 2013-02-03 23:37
Message:
merged
----------------------------------------------------------------------
Comment By: Ori Livneh (atdt)
Date: 2013-02-03 15:59
Message:
Well, OK. Updated patch.
----------------------------------------------------------------------
Comment By: Yuri Astrakhan (yurochek)
Date: 2013-02-03 13:04
Message:
Yes, this is what I think would be the best format.
Browsers do it for historical reasons, and the value is inside the
parens - so it is all part of the first value. The RFC describes each
value to be separated by a space, and I think we should follow that, not the
hacks that browsers introduce one on top of the other. The "official" RFC
position is for values to be separated by spaces, with a version following
the slash.
Example:
Mozilla/5.0 (Windows NT 5.1; U; en; rv:1.8.1) Gecko/20061208 Firefox/2.0.0 Opera 9.50
----------------------------------------------------------------------
Comment By: Ori Livneh (atdt)
Date: 2013-02-03 00:31
Message:
Is this what you have in mind?
mylib-myscript.py/r1234 Pywikipediabot/1.0
It seems odd to tack the SVN revision ID of Pywikipediabot to
mylib-myscript.py, which would be versioned differently.
Detailed implementation information in parens, with items separated by
semicolon, has been the norm since around Netscape 2.0, and is the current
practice of all major browsers:
http://www.nczonline.net/blog/2010/01/12/history-of-the-user-agent-string/
----------------------------------------------------------------------
Comment By: Yuri Astrakhan (yurochek)
Date: 2013-02-02 23:08
Message:
Sorry, there shouldn't be a semicolon per spec - only a space. Also, since
the patch includes the parent dir, the path separator should probably be
replaced with a '-'.
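The format suggested in this comment (space-separated values, with the path separator replaced by '-') might look like the sketch below. The function name rfc_style_user_agent is hypothetical, and the script name and revision are example values only.

```python
def rfc_style_user_agent(rev_id, script_name):
    """Format '<scriptname>/r<revNumber> Pywikipediabot/1.0'."""
    # Replace path separators so the script name forms a single token.
    token_name = script_name.replace('\\', '/').replace('/', '-')
    return '%s/r%s Pywikipediabot/1.0' % (token_name, rev_id)


ua = rfc_style_user_agent(1234, 'mylib/myscript.py')
print(ua)
```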
----------------------------------------------------------------------
Comment By: Yuri Astrakhan (yurochek)
Date: 2013-02-02 23:04
Message:
Per http://tools.ietf.org/html/rfc2616#section-14.43 I think the UA should
be "<scriptname>/<revNumber>; Pywikipediabot/1.0"
----------------------------------------------------------------------