Hello all,
While I was in the shower today, I had the following weird idea: we have several interwiki bots on the toolserver that do more or less the same thing but are run by different ts-users. Some bots run old versions of the software, some stopped for unknown reasons and were never restarted (probably because the ts-user left), and some just work as they should. I often read on the Wikimedia projects (mostly on my home wiki dewp, of course) that there are problems with a bot (because it added a wrong interwiki link again) and that the users don't know how to contact the bot owner or what they should do.
So my idea: amalgamate all interwiki bots (in the same programming language, of course) into one multi-maintainer project. The advantages would be that we would use fewer resources, the bot maintainers could work together and share a database, it would be easier for Wikimedia project users to contact us (in JIRA, for example, or via a mailing list) if something is wrong, it would be easier to contact the bot software maintainers, and so on.
Any thoughts about that? Good idea or a "you had too much hot water in the shower"-idea?
Sincerely, DaB.
This has already happened for the main namespace on Wiktionary, see http://en.wiktionary.org/wiki/User:Interwicket. Obviously it's easier for that bot because wiktionary interwiki links are always between identical titles.
I think it is very sensible to have "the toolserver" be responsible for all interwiki linking on projects (including other namespaces on wiktionary). Doing so would allow a shared database of "correct" links to be built, particularly for articles that are known to be problematic (hey, there could even be a web interface for reporting errors).
Whether the toolserver should be responsible for making "all" interwiki edits is more debatable, but it would certainly be useful to create a more central place for this to happen.
Conrad
2010/1/7 DaB. WP@daniel.baur4.info:
Any thoughts about that? Good idea or a "you had too much hot water in the shower"-idea?
I think it is an “obvious idea”—with which I mean “great idea”. I would _love_ to see something like that; or, at the very least, something going in that direction (i.e. reduction of the number of bots). The multiplication of interwiki bots is a complete waste of everyone’s time and space, IMHO.
-- [[cs:User:Mormegil | Petr Kadlec]]
On Thu, Jan 7, 2010 at 4:22 PM, Petr Kadlec petr.kadlec@gmail.com wrote:
2010/1/7 DaB. WP@daniel.baur4.info:
Any thoughts about that? Good idea or a "you had too much hot water in the shower"-idea?
I think it is an “obvious idea”—with which I mean “great idea”. I would _love_ to see something like that; or, at the very least, something going in that direction (i.e. reduction of the number of bots). The multiplication of interwiki bots is a complete waste of everyone’s time and space, IMHO.
Optimally, they should not exist at all and interwiki linking handled by MW itself. Until this happens, the idea of one multi-maintainer bot doing *ALL* IW linking would be great.
Marco
Keep in mind that running a multi-maintainer bot would be problematic, since the local policies of many wikis (at least ru: and en:) prohibit sharing a single account between different users.
--vvv
Hi Victor,
Victor Vasiliev schreef:
Keep in mind that running a multi-maintainer bot would be problematic, since the local policies of many wikis (at least ru: and en:) prohibit sharing a single account between different users.
I don't think that rule applies to bots. Take CommonsDelinker, for example. It is run from a multi-maintainer account and runs on just about every Wikimedia wiki.
Maarten
On 10-01-07 11:23 AM, Victor Vasiliev wrote:
Keep in mind that running a multi-maintainer bot would be problematic, since the local policies of many wikis (at least ru: and en:) prohibit sharing a single account between different users.
--vvv
If that policy prevents multi-maintainer bots, then the policy is wrong and must be ignored. Mocking whoever came up with it would be optional.
-Mike
I think it would be a good idea to replace the fully automated iw-bots with a multi-maintainer-bot.
You can also do iw "manually", where you (as a human) choose which iw-link to accept. Therefore it should also be possible to have bots that you can run/operate "manually".
I think there are many things that are done at the different projects where it would be possible to share bots. But let's start with iw and see how it works.
So if it was because of the hot water please also take a hot shower tomorrow ;-)
:-) MGA73
-----Original message----- From: toolserver-l-bounces@lists.wikimedia.org [mailto:toolserver-l-bounces@lists.wikimedia.org] On behalf of DaB. Sent: 7 January 2010 16:14 To: toolserver-l@lists.wikimedia.org Subject: [Toolserver-l] Interwiki-Bots
Hello all,
OK, since I got no "stupid idea!" responses from you, I will create a multi-maintainer project as a start. The problem: what should I name it? Interwikibot? Do we still have interwiki bots that don't use the Python framework? Should I rather name it python-interwikibot, or something completely different? Suggestions?
Sincerely, DaB.
Hello, on Saturday, 9 January 2010, 23:18:13, Nakor wrote:
what about putting TS in its name to point out that it runs from the toolserver?
I was not speaking of the name of the bot, but of the name of the multi-maintainer project. For the bot name I would suggest ts-interwikibot or something similar (or we could use an already existing bot name of a ts-user who joins the multi-maintainer project).
Sincerely, DaB.
The project name should be the same as the bot, since both should be one.
The name should be as simple as possible. We already have two compound nouns (inter+wiki, tool+server), and if we try to construct a meta-compound noun out of them, it starts to horrify everything.
My suggestion: Interbot.
This clear, sense-making and recognizable name is free on all Wikimedia projects and is now registered to DaB.’s email.
— Kalan
Great idea.
There are also two interesting points that have not been mentioned.
First, having a single bot avoids "interwiki bot wars" on pages. It is not so uncommon to have several bots editing back and forth on the same page, simply because they don't share the same settings or run the same software.
And second, a more personal/controversial remark: at pywikipedia, and more generally in the bot communities on each project, we regularly have young power users showing up and trying out bots, because, well... mostly e-fame I would say, and the excitement of playing with a new toy. Depending on the bot policies, those rather Python-unskilled people can get bot flags on some wikis, and end up running tools they don't understand and don't update. That hurts, in many contexts. Perhaps you'll remember the WikiDreamer case, which had me running around all projects. And periodically we have other troubles, with users not able to communicate with the projects they run their bot on, etc.
That would give us a good opportunity to clear those issues, and to improve bot images in general by having a single entity doing those changes.
2010/1/10 Kalan kalan.001@gmail.com:
The name should be as simple as possible. We already have two compound nouns (inter+wiki, tool+server), and if we try to construct a meta-compound noun out of them, it starts to horrify everything.
My suggestion: Interbot.
This clear, sense-making and recognizable name is free on all Wikimedia projects and is now registered to DaB.’s email.
I disagree. Interbot sounds very unclear to me. But we digress: I don't see why we need to find a user name now.
Also, I don't see why we should use a single SUL account crosswiki. Think international: it is a nice opportunity to have a bot name that would be localized for each project. Make sure that any beginner contributor can understand what the bot has done by giving it a local name that they can understand.
That would give us very little extra work (just a single page listing all usernames with a few contribs links for each wiki).
I am also very excited by the technical promise of that proposal.
* I am pretty sure that a single process would not be enough to ensure a decent refresh rate. It will also need to cycle through the projects to make sure that all of them are covered.
* Also, the only way we have to "slice" interwiki work is to select a portion of the source wiki's Special:Allpages for the bot to process: that might not be good enough if several processes run at the same time, because they will end up duplicating work.
* Centralizing all the work gives us several nice performance and logging opportunities.
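One possible way to slice the work without overlapping Special:Allpages ranges is to assign each starting page to exactly one process by a stable hash. This is only a sketch under assumptions: the names `worker_for` and `pages_for_worker` are hypothetical and not part of pywikipedia, and it only partitions the starting pages. Since one interwiki run touches pages on several wikis, it would not by itself prevent two processes from reaching the same group of pages from different starting points.

```python
import hashlib


def worker_for(site, title, num_workers):
    """Map a starting page to exactly one worker index via a stable hash.

    hashlib is used instead of the built-in hash(), whose string hashes
    differ between Python processes.
    """
    key = f"{site}:{title}".encode("utf-8")
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def pages_for_worker(pages, worker_index, num_workers):
    """Return the (site, title) pairs that belong to the given worker."""
    return [
        (site, title)
        for (site, title) in pages
        if worker_for(site, title, num_workers) == worker_index
    ]
```

Each process would be started with its own `worker_index` and the same `num_workers`, so every starting page is handled by one process and none is skipped.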
Regards,
I'm a bit more wary about these changes. I do not currently have an account at the toolserver; I used to have one, but I had no idea what to do with it. Does this mean that in the future if I want to do some interwiki botting work I will have to do it from the toolserver? How is that supposed to be an improvement? At the moment, if I want to run a bot, I open a screen and run it. Am I supposed to get some kind of remote desktop connection in the future?
Hello, At Wednesday 13 January 2010 21:26:13 DaB. wrote:
Does this mean that in the future if I want to do some interwiki botting work I will have to do it from the toolserver?
That's a decision for the Wikimedia projects. We can't decide for them. We can only decide that there will be only one interwiki bot on the toolserver in future; interwiki bots outside the TS are not our business.
Sincerely, DaB.
Andre Engels wrote:
At the moment, if I want to run a bot, I open a screen and run it. Am I supposed to get some kind of remote desktop connection in the future?
As DaB says: we can't impose any rules on bot use on the individual projects, and don't want to. It just seems silly to run several interwiki bots on the toolserver, instead of cooperating to run one.
But as to your question about "some kind of remote desktop connection": close. You get a remote console. It's called SSH. That's how most people use the toolserver. What did you think?
-- daniel
On Wed, Jan 13, 2010 at 9:53 PM, Daniel Kinzler daniel@brightbyte.de wrote:
Andre Engels wrote:
At the moment, if I want to run a bot, I open a screen and run it. Am I supposed to get some kind of remote desktop connection in the future?
As DaB says: we can't impose any rules on bot use on the individual projects, and don't want to.
Ok, then I'll just shut up...
It just seems silly to run several interwiki-bots on the toolserver, instead of cooperating to run one.
Still, there's the matter of what it means to 'cooperate to run one'. Would that mean that there's only a single interwiki bot process running? That can only work if it is fast enough to go through _all_ pages on _all_ languages in a reasonable time. And even then you have only replaced the autonomous bots. Running with hints would be hard to get in; interactively running the bot would not be possible at all under such a scheme. Or does it just mean using a single username? In that case I don't see the advantage. Or is it that you want to be sure to use a single codebase? Keeping bots up-to-date is a good idea, but the problem it solves is not a very big one, in my opinion.
I think better results could be reached by putting the emphasis on the "cooperating" part rather than the "running one" part. Get some database or such where the bots notify when they have either updated a page or found that it did not need updating, then have the bots (except when running with hints or such) request this database before doing interwiki on a page and skip it if its last notification was less than so-and-so-much time ago (for example one week).
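A minimal sketch of such a notification database, assuming SQLite as the store. The table name `page_checks`, the function names, and the `with_hints` flag are illustrative assumptions and not part of any existing bot; the one-week interval is the example value from the paragraph above.

```python
import sqlite3
import time

RECHECK_INTERVAL = 7 * 24 * 3600  # one week, in seconds


def open_db(path=":memory:"):
    """Open (or create) the shared table of last-check timestamps."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_checks ("
        " site TEXT NOT NULL,"
        " title TEXT NOT NULL,"
        " checked_at REAL NOT NULL,"
        " PRIMARY KEY (site, title))"
    )
    return conn


def notify(conn, site, title, now=None):
    """Record that a bot updated the page or found it needed no update."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO page_checks (site, title, checked_at)"
        " VALUES (?, ?, ?)",
        (site, title, now),
    )
    conn.commit()


def should_skip(conn, site, title, now=None, with_hints=False):
    """Return True if the page was checked less than RECHECK_INTERVAL ago.

    Runs with hints always process the page, as in the proposal.
    """
    if with_hints:
        return False
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT checked_at FROM page_checks WHERE site = ? AND title = ?",
        (site, title),
    ).fetchone()
    return row is not None and now - row[0] < RECHECK_INTERVAL
```

A bot would call `should_skip` before working on a page and `notify` afterwards, whether or not it made an edit.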
Andre Engels wrote:
It just seems silly to run several interwiki-bots on the toolserver, instead of cooperating to run one.
Still, there's the matter of what it means to 'cooperate to run one'. Would that mean that there's only a single interwiki bot process running? That can only work if it is fast enough to go through _all_ pages on _all_ languages in a reasonable time.
There may be several instances, but they would be the same bot.
And even then you have only replaced the autonomous bots. Running with hints would be hard to get in; interactively running the bot would not be possible at all under such a scheme.
That's a valid concern.
I think better results could be reached by putting the emphasis on the "cooperating" part rather than the "running one" part. Get some database or such where the bots notify when they have either updated a page or found that it did not need updating, then have the bots (except when running with hints or such) request this database before doing interwiki on a page and skip it if its last notification was less than so-and-so-much time ago (for example one week).
Several cooperating instances of the same bot running from the toolserver can do a lot of magic sharing information and using the tables replicated at the toolserver.
2010/1/14 Platonides platonides@gmail.com:
Andre Engels wrote:
It just seems silly to run several interwiki-bots on the toolserver, instead of cooperating to run one.
Still, there's the matter of what it means to 'cooperate to run one'. Would that mean that there's only a single interwiki bot process running? That can only work if it is fast enough to go through _all_ pages on _all_ languages in a reasonable time.
There may be several instances, but they would be the same bot.
Obviously. Starting languages also have to be different to cover all sources.
And even then you have only replaced the autonomous bots. Running with hints would be hard to get in; interactively running the bot would not be possible at all under such a scheme.
That's a valid concern.
I would say that more than 80% of TS interwiki bots, if not 100%, are running with -autonomous, as those seem to be "launch and forget" instances. If bot owners want to run bots interactively, I don't really see why the toolserver is needed: they can just do this on their own machine. I think that the idea of the Toolserver is to have a server that is always up, so that you can schedule regular processes without having to worry about a colleague shutting down your apparently inactive local box while you're away. Again, most Toolserver instances should be non-interactive.
I would not have any problem with users who, for a specific reason, need to run an _interactive_ interwiki bot instance from the Toolserver.
Andre Engels wrote:
I think better results could be reached by putting the emphasis on the "cooperating" part rather than the "running one" part.
Yes, I agree. Basically, making this a multi-maintainer project ensures that:
* there's a common code base, and a single toolserver-account
* operators are aware of each other, and will hopefully coordinate
* thus, we get less redundant runs and less concurrent processes
Get some database or such where the bots notify when they have either updated a page or found that it did not need updating, then have the bots (except when running with hints or such) request this database before doing interwiki on a page and skip it if its last notification was less than so-and-so-much time ago (for example one week).
Such a database sounds like an interesting idea. In fact, it sounds like a good reason for including you in the project :) Talk to the bot ops about it.
-- daniel
On Thu, Jan 14, 2010 at 10:23 AM, Daniel Kinzler daniel@brightbyte.de wrote:
Get some database or such where the bots notify when they have either updated a page or found that it did not need updating, then have the bots (except when running with hints or such) request this database before doing interwiki on a page and skip it if its last notification was less than so-and-so-much time ago (for example one week).
Such a database sounds like an interesting idea. In fact, it sounds like a good reason for including you in the project :) Talk to the bot ops about it.
And where am I to find the bot ops? Who are they? What is meant by "bot ops" in fact? (yes, I do realize that it is short for "bot operators", but is there any meaning beyond the literal "someone who operates some bot on some wikimedia project"?)
Andre Engels wrote:
And where am I to find the bot ops? Who are they? What is meant by "bot ops" in fact? (yes, I do realize that it is short for "bot operators", but is there any meaning beyond the literal "someone who operates some bot on some wikimedia project"?)
Well, in this case, it would be the people who have access to the interwiki-bot multi-maintainer project. That is, the people running such bots on the toolserver now, and willing to cooperate on a shared account. I don't know who they are, but I suppose some of them took part in this thread :)
-- daniel
At pywikipedia, and more generally in the bot communities on each project, we regularly have young power users showing up and trying out bots, because, well... mostly e-fame I would say, and the excitement of playing with a new toy. Depending on the bot policies, those rather Python-unskilled people can get bot flags on some wikis, and end up running tools they don't understand and don't update. That hurts, in many contexts.
This remark is spot on. I'm an admin on dewikinews and we have had all this trouble in the past. One bot operator even had the boldness to tell us he couldn't fix his misbehaving bot because he just uses pywikipedia and doesn't know how to modify it.
Of course I can't speak for dewikinews as a whole, but I believe that the existence of a central iw-bot would be reason enough to disallow all other iw-bots, and this would really change things for the better. To be frank, I'm soooo sick of all this iw-bot madness.
Regards, Michael
On 16.01.2010, 19:42 Michael wrote:
Of course i can't speak for dewikinews in whole, but i believe that the existence of a central iw-bot would be enough reason to disallow all other iw-bots and this would really change things to the better. To be frank, i'm soooo sick of all this iw-bot madness.
Will a single instance of pywikipedia, even running on a fast box with a fat channel to pmtpa, be able to perform all the edits needed? How many edits per minute will be needed to keep interwikis synchronized on all projects?
Will a single instance of pywikipedia, even running on a fast box with a fat channel to pmtpa able to perform all edits needed? How many
There seems to be confusion about the meaning of "instance". As far as I understand we are talking about a single codebase (possibly linked to a multi user account on the toolserver). There is no limit on how many processes can be run at the same time from that same code.
On Sat, Jan 16, 2010 at 5:42 PM, Michael Holzt < wikimedia-toolserver@michael.holzt.de> wrote:
Of course i can't speak for dewikinews in whole, but i believe that the existence of a central iw-bot would be enough reason to disallow all other iw-bots and this would really change things to the better. To be frank, i'm soooo sick of all this iw-bot madness.
On dewiki, there was once an epic battle between 3 or 4 IW bots on one article... it turned out to be different configurations, AFAIR. But actually, I don't see why resources should be *wasted* on interwiki bots. All the time and money already invested would be better invested in some central "data wiki" or similar, so there'd be absolutely no need for any IW bots.
Marco
Hello again,
Since this discussion seems to be finished for now, I created [1] to drive and coordinate the process from here on.
Sincerely, DaB.
[1] https://jira.toolserver.org/browse/TS-494