Every Monday I publish a new weekly botting list from my Interwiki link
checker tool.
After publishing, I run the list with my bot, FlaBot. At the end of the week
I have a tool to find out whether an article is now linked in both
languages in the wiki.
If it is not, perhaps because my bot can't handle it in autonomous mode, I
will post a list with all the still-missing entries from my database.
Here is the batch list for botting:
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:af
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:es
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:fi
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:tr
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:ca
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:da
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:de
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:nds
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:en
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:nl
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:no
python interwiki.py -warnfile:warning_bot_rebot_need.log -lang:sv
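For convenience, the same batch could also be driven from a small Python
loop; this is just a sketch equivalent to the commands above (same warnfile,
same language codes):

import subprocess

# Run interwiki.py once per language against the shared warnfile.
LANGS = ['af', 'es', 'fi', 'tr', 'ca', 'da', 'de', 'nds', 'en', 'nl', 'no', 'sv']

for lang in LANGS:
    subprocess.call(['python', 'interwiki.py',
                     '-warnfile:warning_bot_rebot_need.log',
                     '-lang:%s' % lang])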
You can get the data here:
http://www.flacus.de/wikipedia/Interwiki-Link-Checker/bot-reb.php
At the moment it is week 49. My bot is processing week 48.
The list above has entries up to week 47.
See you next week ;-)
--
[[:de:Benutzer:Flacus]][[:de:Benutzer:FlaBot]]
http://www.flacus.de/wikipedia/Interwiki-Link-Checker/
Hi,
the main directory keeps filling up with scripts. To keep some order, I
would suggest moving some older scripts to an archive directory. That's
why I want to know which of the following scripts are still in use. I
don't want to offend anyone; if you think one of the scripts in the list
is still in use or under development, please simply say so.
are-identical.py
http://tools.wikimedia.de/~flacus/IWLC/start.php works much better
brackethttp.py
I don't think anyone still uses it
check_extern.py
Replaced by weblinkchecker.py
copy_table.py
Too much work to maintain it
editarticle.py
No longer maintained, but we should re-use parts of it for other
scripts.
extract_names.py
Doesn't write the file format expected by most scripts.
find.py
Never worked; we might also consider deleting it.
getimages.py
imagetransfer.py can do everything this can do
pagefromfile.py
Should either be updated or moved to the archive
saveHTML.py
No longer maintained and maybe also no longer used
sqldump.py
All scripts have been changed and now use xmldump.py instead
translator.py
Part of copy_table.py
us-states.py
Can be archived, unless someone is still using it
vertexgen.py
Needs comments, also in interwiki.py. At the moment its purpose is
unclear.
WdT.py and WdTXMLParser.py
No longer used/maintained.
windows_chars.py
I don't think there are any ISO 8859-1 wikis left anywhere, are
there?
Daniel
The getReferences() function needs to be rewritten because of the recent
change to the "What links here" page: entries are now marked with
'(inclusion)' when they are transcluded as templates, and the current regex
counts those as redirects.
I am currently in the process of rewriting this function, but in case anyone
wants to beat me to it, I suggest using the following all-encompassing
regular expression:
re.compile('<li><a href=".*?" title=".*?">(.*?)</a> *\(*(inclusion|redirect page)*\)*.*?</li>')
group(1) will give you the title, and group(2) will be either:
'', 'inclusion', or 'redirect page'
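For what it's worth, a small sketch of how the pattern could be applied to
the HTML of a "What links here" page (the function and variable names here
are illustrative, not existing pywikipedia code):

import re

references_pattern = re.compile(
    r'<li><a href=".*?" title=".*?">(.*?)</a> *\(*(inclusion|redirect page)*\)*.*?</li>')

def parse_references(html):
    # Yield (title, kind) tuples; kind is '', 'inclusion' or 'redirect page'.
    for match in references_pattern.finditer(html):
        title = match.group(1)
        kind = match.group(2) or ''  # group(2) is None when neither marker matched
        yield title, kind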
--
Jason Y. Lee
AKA AllyUnion
[[Wikipedia:Sandbox]] does not exist in projects other than
Wikipedia.
** 1 ** with [[Wikipedia:Sandbox]]
======Post-processing [[pt:Alexis de Tocqueville]]======
Updating links on page [[he:?????? ??-??????]].
Changes to be made: Adding: it
+ [[it:Alexis de Tocqueville]]
NOTE: Updating live wiki...
Getting a page to check if we're logged in on wikiquote:he
Getting page to get a token.
Getting page [[he:Wikipedia:Sandbox]]
Sleeping for 8.5 seconds
Retrieving MediaWiki messages for wikiquote:he
Parsing MediaWiki messages
WARNING: No text area found on
he.wikiquote.org/w/index.php?title=Wikipedia%3ASandbox&action=edit.
Maybe the server is down. Retrying in 1 minutes...
Traceback (most recent call last):
File "interwiki.py", line 1330, in ?
bot.run()
File "interwiki.py", line 1114, in run
self.queryStep()
File "interwiki.py", line 1093, in queryStep
subj.finish(self)
File "interwiki.py", line 762, in finish
if self.replaceLinks(page, new, sa):
File "interwiki.py", line 858, in replaceLinks
status, reason, data = pl.put(newtext, comment = wikipedia.translate(pl.site().lang, msg)[0] + mods)
File "C:\Python24\wikipedia.py", line 677, in put
return self.putPage(newtext, comment, watchArticle, minorEdit, newPage, self.site().getToken(sysop = sysop), sysop = sysop)
File "C:\Python24\wikipedia.py", line 2559, in getToken
Page(self, "Wikipedia:Sandbox").get(force = True, sysop = sysop)
File "C:\Python24\wikipedia.py", line 351, in get
self._contents, self._isWatched, self.editRestriction = self.getEditPage(get_redirect = get_redirect, throttle = throttle, sysop = sysop)
File "C:\Python24\wikipedia.py", line 448, in getEditPage
i2 = re.search('</textarea>', text).start()
AttributeError: 'NoneType' object has no attribute 'start'
** 2 ** with Non-existing page
======Post-processing [[pt:Alexis de Tocqueville]]======
Updating links on page [[he:?????? ??-??????]].
Changes to be made: Adding: it
+ [[it:Alexis de Tocqueville]]
NOTE: Updating live wiki...
Getting a page to check if we're logged in on wikiquote:he
Getting page to get a token.
Getting page [[he:Non-existing page]]
Changing page [[he:?????? ??-??????]]
Updating links on page [[en:Alexis de Tocqueville]].
Changes to be made: Adding: it
+ [[it:Alexis de Tocqueville]]
NOTE: Performing a recursive query first to save time....
NOTE: Nothing left to do 2
NOTE: Updating live wiki...
Getting a page to check if we're logged in on wikiquote:en
...
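For what it's worth, the first traceback suggests where a fix could go:
getToken() in wikipedia.py (line 2559 above) fetches a hard-coded
[[Wikipedia:Sandbox]] page, which simply does not exist on non-Wikipedia
projects such as Wikiquote. A hypothetical sketch, fetching a page that
exists on every project instead (the names mediawiki_message and _token
are assumptions, not the actual attributes):

def getToken(self, sysop=False):
    # The main page exists on every project, unlike a per-project sandbox;
    # its title is available from the MediaWiki messages that the bot
    # already retrieves ("Retrieving MediaWiki messages" in the log above).
    mainpage = self.mediawiki_message('mainpage')
    Page(self, mainpage).get(force=True, sysop=sysop)
    return self._token  # assumed to be stashed during the page fetch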
Leonardo Gregianin
Q) Bot usage help! (admin) bot usage questions:
1. user:A tagged *{{no license|~~~~~}}* on "image:sample.jpg", which was
uploaded by user:B.
2. user:Cbot tags *{{subst:Image copyright|Image:sample.jpg}} --~~~~* on
user_talk:B, and automatically changes *{{no license|~~~~~}}* to *{{no
license notified by bot|~~~~~}}*. What is the bot command?
3. user:Cbot automatically inserts a speedy deletion tag on
Image:sample.jpg after 7 days. What is the bot command?
4. user:Dbot (admin) automatically deletes all "image:..." pages in the
relevant speedy deletion category. What is the bot command?
--WonYong <http://en.wikipedia.org/wiki/User:WonYong> 13:41, 30 December
2005 (UTC)
Source: 'http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29'
--
WonYong
I've been wondering about some kind of parser to add to the Python Wikipedia
project, such that it knows how to handle transwiki links as well as
trans-interwiki links. This email contains my thoughts on the matter;
please feel free to correct me, add to it, or elaborate further.
I felt it necessary to send this email so that we can all agree on several
points, which will help me or someone else pin down the problem
properly.
----
I've come to realize some assumptions:
* Each wikilink will typically not use more than 2-3 colons; anything
above that is usually redundant.
Example: [[q:fr:w:en:Test]] on the English Wikipedia does process properly,
and following such a link will take you to the English Wikiquote then to
the French Wikiquote then back to the English Wikipedia.
Problem: Is there a simple and easy way to test for a namespace?
Discussion: Not easily, I think (see the sketch after this list).
* The last part of the wikilink is ALWAYS the article title.
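On the namespace question above, one possible test, assuming the Site
object can report the namespace names its family file knows (the exact
accessors here are assumptions):

def looks_like_namespace(site, prefix):
    # Return True if prefix matches a known namespace name on this site.
    prefix = prefix.strip().lower()
    for ns_number in range(-2, 16):  # the standard MediaWiki namespaces
        name = site.namespace(ns_number)
        if name and name.lower() == prefix:
            return True
    return False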
----
Most language codes are two characters long, but there are some exceptions:
als, ang, arc, ast, cho, chr, chy, csb, fur, haw, jbo, mus, nah, nds, roa-rup,
simple, tlh, tokipona, tpi, tum, zh-min-nan, zh-cn, zh-tw, minnan,
& zh-cfr
I noticed in the family file there is another one: bug
There may be more on that list that I'm unaware of... It's likely safe to
assume the same language codes apply across the other project families,
even if those wikis may not exist yet. If they don't, they will likely
exist in the future.
----
Okay, here is a list of cases, based on colon count:
1 colon:
1) Article namespace with no leading character in front
2) Interwiki link
3) Namespace preceding the colon
2 colons:
1a) Interwiki link + Namespace (untranslated in English)
1b) Interwiki link + Namespace (translated in the Interwiki link's language)
2) Transwiki link + Interwiki link
3) Transwiki link + Namespace (may have to consider different names for
the "Project" namespace)
3 colons:
1) Transwiki link + Interwiki link + Namespace (translated/untranslated)
2) Interwiki link + Transwiki link + Interwiki link (stupid, but possible)
3) Interwiki link + Transwiki link + Namespace link (stupid, but possible)
4) Transwiki link + Interwiki link + Interwiki link (stupid, but possible)
5) Transwiki link X3 (stupid, but possible)
6) Transwiki link + Transwiki link + Namespace link (stupid, but possible)
4+ colons:
Any combination above
Possible solution(s):
* Create a function specifically to determine transwiki links
* Create a function specifically to determine interwiki links based on
transwiki link information
* Create a function specifically to determine namespace links based on
transwiki and interwiki link information
* Develop a class that uses the information from:
http://meta.wikimedia.org/wiki/Interwiki_map
* Develop a class for conversion only for the current available families
-- ignore the rest
----
If we split anything between '[[' and ']]' using the ':' as the separator,
we know the following to be true:
If the list has size 1, then it has no interwiki links, no category links,
and no transwiki links. We also know that [0] is the name of the article.
No matter what the situation of the split, index of -1 will always point to
the name of the article.
Now, the matter is: In what order should we proceed?
Should we scan forwards or backwards?
In what order should we look for links?
1) Transwiki, interwiki, namespace
2) namespace, interwiki, transwiki
3) interwiki, namespace, transwiki
etc.
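To make the ordering question concrete, here is a minimal sketch of one
possible classifier that scans the colon-separated parts from the left.
The predicates is_family, is_language_code and is_namespace are
hypothetical; real code would consult the family files and the namespace
tables:

def classify_link(link_text):
    # Split the '[[...]]' contents on ':' and label each prefix part.
    parts = link_text.strip('[]').split(':')
    title = parts[-1]  # the last part is always the article title
    labels = []
    for index in range(len(parts) - 1):
        part = parts[index]
        if is_family(part):
            labels.append(('transwiki', part))
        elif is_language_code(part):
            labels.append(('interwiki', part))
        elif is_namespace(part):
            labels.append(('namespace', part))
        else:
            # Unknown prefix: everything from here on is the title itself.
            title = ':'.join(parts[index:])
            break
    return labels, title

For [[q:fr:w:en:Test]] this would yield the label list [('transwiki', 'q'),
('interwiki', 'fr'), ('transwiki', 'w'), ('interwiki', 'en')] and the
title 'Test'.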
----
One thing is certain: the regular expression for this will be extremely
long. If we do manage to resolve this, then our parser for wikilinks should
be able to handle anything we throw at it, and it would make any related
bugs regarding linkedPages() and getRedirectPage() easier to fix. One thing
I have a problem with is that getRedirectPage() returns a string object
rather than a Page object. But it is obvious why it has to return a string:
the target could be any of the situations I've described above.
The principal reason I'm concerned about this matter is that I'm in the
process of developing a notification bot. Unfortunately, I've run into
several users who have, in their wisdom, decided to redirect their user
pages to either a different project or a different language, and sometimes
a combination of both. So I've been thinking of a way to properly parse the
information from getRedirectPage() so that I can pass the correct
parameters to the Site class.
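Combining the classifier sketched earlier with the Site class, the
notification-bot case could hypothetically look like this (getSite-style
construction and the family.name attribute are assumptions here):

def resolve_redirect_target(current_site, target):
    # Turn a cross-project redirect target into a (site, title) pair.
    labels, title = classify_link('[[%s]]' % target)
    lang = current_site.lang
    family = current_site.family.name
    for kind, value in labels:
        if kind == 'interwiki':
            lang = value  # the last language prefix wins
        elif kind == 'transwiki':
            family = value  # the last project prefix wins
    return wikipedia.getSite(lang, family), title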
Thoughts, anyone?
--
Jason Y. Lee
AKA AllyUnion
I'd like to request a -namespace option for the interwiki.py script. I see
that the allpages() function in wikipedia.py supports this, but I don't
know Python well enough (or at all) to add it myself. This would be
useful, e.g., to add/update interwiki links to/on templates and
categories.
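Not an implementation, just a sketch of what the option could look like:
parse "-namespace:N" from the command line and filter the generated pages
by namespace number before interwiki processing (argument handling is
simplified here; Page.namespace() follows pywikipedia conventions):

import sys

namespace = None
for arg in sys.argv[1:]:
    if arg.startswith('-namespace:'):
        namespace = int(arg[len('-namespace:'):])

def in_namespace(pages, ns):
    # Yield only pages in namespace ns; ns of None means all pages.
    for page in pages:
        if ns is None or page.namespace() == ns:
            yield page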
At the Icelandic Wikipedia we put <noinclude> after the first paragraph of
an article and </noinclude> at the very end of it, so that we can put
{{:Article}} on the article's main category page to get a summary of the
topic. However, interwiki.py converts something like:
"
[[en:Topic]]
</noinclude>
"
into
"
</noinclude>
[[fr:Topico]]
[[en:Topic]]
...
"
which means that the interwikis get transcluded onto the category page.
Is there a way to make the bot place the added interwiki links where it
found the first one, instead of putting them all at the bottom?
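A rough sketch of the requested behaviour (not what interwiki.py does
today): insert the new language links right after the first existing
interwiki link instead of appending them at the bottom. The pattern below
is naive; real code would have to distinguish interwiki prefixes from
other namespaces such as categories:

import re

def insert_interwiki(text, new_links):
    first = re.search(r'\[\[[a-z\-]+:[^\]]+\]\]', text)
    if first is None:
        # No existing interwiki link: fall back to appending at the end.
        return text + '\n' + '\n'.join(new_links)
    pos = first.end()
    return text[:pos] + '\n' + '\n'.join(new_links) + text[pos:]

In the example above this would leave the new [[fr:Topico]] link before
the closing </noinclude> tag, next to [[en:Topic]].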