Is it possible to find duplicates of an original image (scaled up) by
giving a program the smaller version as input? Is there a Python API
for this, something like hashlib?
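(For what it's worth, hashlib alone cannot do this, since rescaling
changes every byte; a perceptual hash can. A minimal sketch of the
average-hash idea, assuming Pillow is available; the file names are
placeholders:)

from PIL import Image

def average_hash(path, size=8):
    # Shrink to a size x size grayscale thumbnail, discarding the
    # resolution difference between the original and the copy.
    img = Image.open(path).convert('L').resize((size, size))
    pixels = list(img.getdata())
    avg = sum(pixels) / float(len(pixels))
    # One bit per pixel: brighter than the average or not.
    return sum(1 << i for i, p in enumerate(pixels) if p > avg)

def hamming(a, b):
    # Differing bits between two hashes; a small distance suggests
    # the same picture at a different scale.
    return bin(a ^ b).count('1')

# Hypothetical usage; a distance of roughly 0-5 bits usually means
# the same image, but the threshold depends on your data:
# print(hamming(average_hash('small.jpg'), average_hash('big.jpg')))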
Thanks
Y. Jenith
Hi,
I use replace.py, and the Euro sign appears as a gray question mark (?)
(that is, the default in transliteration.py, but not even yellow) instead
of a yellow *E*. I checked with copy and paste that it literally appears
on line 230 of transliteration.py. I also checked the Japanese yen, which
appears correctly (just two lines below the Euro in transliteration.py).
The Ukrainian hryvnia (₴) is not listed; it appears as a *yellow ?*. Can
anyone suggest where to begin debugging?
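(A hedged first step, assuming the console codec is the culprit rather
than transliteration.py itself; run this in the same console where
replace.py shows the gray ?:)

import sys

print(sys.stdout.encoding)   # which codec the console claims to use
print(repr(u'\u20ac'))       # the Euro sign as Python stores it
try:
    print(u'\u20ac')         # can the console render it at all?
except UnicodeEncodeError as e:
    print('console cannot encode it: %s' % e)

(If the last line fails, the gray ? comes from the console encoding,
not from the transliteration table.)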
--
Bináris
Hi,
I just want to share a new experience. Maybe this is trivial for everybody
except me, but it was new to me.
So I started a Python command-line interpreter from the Pywikipedia
directory, and it began to work immediately:
import wikipedia
site=wikipedia.getSite()
and you may begin to do anything interactively. It's great fun! :-))
My task was to delete every second section title from my subpage, and it
could be done by a few commands without saving any script.
I was never aware of how much information Page.put() returns, but in
the interpreter environment return values are displayed automatically
without print, so now I know. :-)
That's what I always wanted for quick one-time tasks. I don't know why I
never experimented with this earlier.
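(A hedged sketch of what such a session can look like; the subpage
name is a placeholder and the regex assumes plain level-2
"== Title ==" headings:)

import re
import wikipedia

site = wikipedia.getSite()
page = wikipedia.Page(site, u'User:Example/Sandbox')
text = page.get()

# Find the level-2 section titles, then drop every second one.
titles = re.findall(r'(?m)^==[^=].*==$', text)
for title in titles[1::2]:
    text = text.replace(title, u'', 1)

# In the interpreter, put()'s return value is displayed automatically.
page.put(text, comment=u'remove every second section title')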
--
Bináris
Good morning (to whom it may concern in their time zone),
Last night Merlijn had a maintenance day and closed so many bugs that I
could not even read all the mails in fewer than three sessions. :-) Thank you!
Merlijn copied "open in browser" from replace.py to add_text.py:
http://www.mediawiki.org/wiki/Special:Code/pywikipedia/10034
Just a day earlier I had been planning to do the same for
solve_disambiguation.py, but I hadn't had time yet. What about
introducing a browseropen() method on the Page class instead?
I think any error message during this process should come from
webbrowser, not from the pywiki script, am I right?
--
Bináris
Currently, we have some places in the code where we call
wikipedia.output("%s" % e), with e an exception. This breaks if the
exception's message (which is of type str) is printed through
.output() (which requires unicode).
See my comments on this issue below; is there a reason to print errors
through .output() instead of using the built-in Python functions?
---------- Forwarded message ----------
Subject: [Pywikipedia-bugs] [ pywikipediabot-Patches-3092870 ] non
ascii in system messages and max retry
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3092870&group_…
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:49
Message:
I think we should either
a) skip the entire output() machinery and use traceback.print_exc()
instead
or
b) write a wrapper that does what you propose here (but which can also
be used for traceback.format_exc),
and replace all exception printing with one of those two options.
----------------------------------------------------------------------
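(A hedged sketch of option (b), assuming Python 2 semantics where
str(e) is a byte string that may carry non-ASCII system messages; the
function names are made up:)

import traceback
import wikipedia

def output_exception(e, encoding='utf-8'):
    # Hypothetical wrapper: normalise the message to unicode before
    # it reaches wikipedia.output().
    try:
        msg = unicode(e)
    except UnicodeDecodeError:
        # Byte-string message with non-ASCII content.
        msg = str(e).decode(encoding, 'replace')
    wikipedia.output(u'%s: %s' % (e.__class__.__name__, msg))

def output_traceback(encoding='utf-8'):
    # The same idea applied to a full traceback.
    wikipedia.output(traceback.format_exc().decode(encoding, 'replace'))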
I'm having trouble with this script, which I'm running on Appropedia.org...
it's not a huge deal if it doesn't work, but I'd appreciate it if anyone
has the patience to help me understand how to debug this, or *why* it
doesn't work.
I've narrowed it down to the \2 in the replace term, as the problem
disappears when I remove it:
python replace.py -regex '(?si)\b(WordPress)\b(.*$)'
'\1\2\n[[Category:Appropedia WordPress site]]'
-excepttext:'(?si)\[\[\s*Category:\s*Appropedia WordPress site'
-excepttext:'(?si)(\#redirect\s*\[\[)' -namespace:4 -namespace:12
-summary:'add [[Category:Appropedia WordPress site]] based on search and
manual check.' -log:CategoryAdd -xml:currentdump.xml
Output is:
Reading XML dump...
Traceback (most recent call last):
File "/home/cwg23/pwb/pagegenerators.py", line 1182, in __iter__
for page in self.wrapped_gen:
File "/home/cwg23/pwb/pagegenerators.py", line 1039, in
NamespaceFilterPageGenerator
for page in generator:
File "/home/cwg23/pwb/pagegenerators.py", line 1084, in
DuplicateFilterPageGenerator
for page in generator:
File "replace.py", line 217, in __iter__
new_text = pywikibot.replaceExcept(new_text, old, new, self.excsInside,
self.site)
File "/home/cwg23/pwb/pywikibot/textlib.py", line 175, in replaceExcept
match.group(groupID) + \
IndexError: no such group
no such group
0 pages were changed.
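(For what it's worth, the final exception is easy to reproduce outside
the bot: match.group() raises exactly this IndexError when asked for a
group the pattern does not define, which suggests replaceExcept ends up
requesting a group number the compiled pattern lacks:)

import re

m = re.search(r'(?si)\b(WordPress)\b', 'Our WordPress site')
print(m.group(1))   # 'WordPress' -- group 1 exists
try:
    m.group(2)      # this pattern has only one group
except IndexError as e:
    print(e)        # 'no such group'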
And then it gets interesting... to speed things up while debugging, I made
a modified replace script called replace2.py which only loads 2 pages at a
time (by setting "maxquerysize = 2" in that file). Funny thing - I can run
exactly the same command but with "replace2.py" and it works... up until it
gets to a particular page. Then I press n and get the error. (Btw, I've run
versions of this bot in the past with only the match & replace text
changed, with no problems, so it makes sense that the error only occurs in
specific conditions.)
The last page that it gives me is Appropedia:A Humourless Lot staging area
(http://www.appropedia.org/Appropedia:A_Humourless_Lot_staging_area) -
I assume the page where the problem occurs is one of the next 2 being
loaded, and I don't know how to tell which pages they are. I can't see how
the order of pages is determined, as it changed during my debugging/testing.
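(One hedged way to find the culprit: wrap whatever generator replace.py
builds so each title is printed before the page is processed;
echo_titles is a made-up helper:)

def echo_titles(generator):
    # Debugging aid: the last title printed before the traceback is
    # the failing page. wikipedia.output(page.title()) may be safer
    # than print for non-ASCII titles.
    for page in generator:
        print(page.title())
        yield page

# e.g. wrap the generator just before the main loop:
# gen = echo_titles(gen)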
Thanks for any ideas.
--
Chris Watkins
Appropedia.org - Sharing knowledge to build rich, sustainable lives.
PyWikipedians,
Regardless of whether this talk gets accepted at Wikimania, we would like
to talk with anyone interested in building this kind of bot. Please
contact me off-list if this is your kind of thing.
-jrf
http://wikimania2012.wikimedia.org/wiki/Submissions/TREC-KBA-Mining-Content…
TREC KBA - Mining Content Streams to Recommend Page Updates to Editors
Abstract: We have organized a new session in NIST's Text Retrieval
Conference (TREC) called Knowledge Base Acceleration (KBA). TREC KBA
challenges computer science researchers to develop algorithms that mine
content streams, such as news and blogs, to recommend edits to knowledge
bases (KB), such as Wikipedia. We consider a KB to be "large" if the
number of entities described by the KB is larger than the number of humans
maintaining the KB. As entities change and evolve in the real world, large
KBs often lag behind by months or years. Such large KBs are an
increasingly important tool in several industries, including biomedical
research, law enforcement, and financial services. TREC KBA aims to
develop algorithms for helping KB editors stay abreast of changes to the
organizations, people, proteins, and other entities described by their
KBs. In this talk, we will give an overview of the TREC KBA data sets and
tasks for 2012 and future years. In addition to developing text analytics,
we are also working on a Wikipedia bot for connecting KBA-type systems to
users' talk pages in MediaWiki. After presenting the current state of our
bot development, we hope to engage the audience in an open discussion
about how such algorithms might be most fruitfully employed in the
Wikipedia community.
http://trec-kba.org/
(If you want this talk to get accepted in Wikimania this July, consider
putting your name on the "interested" list in the wiki linked above.)