thanks, Morten:
if you know how to program in Python
well, I don't, ... not yet, that is, one reason why I came here to ask :-)
my favourite is b.:
a. find the time and set the priorities to do it myself
b. ask sb else to support me by programming this query
c. follow a different kind of interest and not rely on the Wikipedia community
anyone up for b.?
thanks C.
On Tue, 8 May 2012 11:00:22 -0500, Morten Wang wrote
Claudia asked:
Q: being in none of the special Wikipedia roles, which of these ideas would I be able to try out by myself?
All the metadata is available through the Wikipedia API, and the Pywikipediabot framework makes a lot of it easily accessible, so if you know how to program in Python, it's doable :)
Cheers, Morten
On 8 May 2012 10:40, koltzenburg@w4w.net wrote:
hi Bináris, Merlijn, Alchimista, and Morten,
thank you very much
does anyone of you remember hearing a very new type of song, and being fascinated for sure but not quite trusting your ears?
btw, on his talk page yesterday, JAn came up with an idea that sounds like "new song" to me, too: http://cs.wikipedia.org/w/index.php?title=Diskuse_s_wikipedistou:JAn_Dudík&diff=8497947&oldid=8497773
Morten said
Hope some of this helps, let me know if there's any questions.
I guess there are, Morten, thanks :-)
Q: being in none of the special Wikipedia roles, which of these ideas would I be able to try out by myself?
btw, thanks for asking @Morten,
cheers, Claudia
On Tue, 8 May 2012 10:01:23 -0500, Morten Wang wrote
I did some data gathering last fall that is more or less the same as what Claudia is asking about. Looking up the bot flag or checking the username is often regarded as a reasonable way of filtering out the bots. I chose to apply both: if there's no bot flag, we look for a typical bot signature in the username (regex: "bot$| ", username either ends with bot or a part of it does). I used a case-insensitive match since some users have usernames like "FoObOt".
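To make the two-step filter concrete, here is a minimal sketch in Python. It is not the script I used back then: the function name and the has_bot_flag argument are made up for illustration, and since the second alternative of the regex did not survive in the text above, the sketch only uses the "bot$" part.

```python
import re

# Only the "bot$" alternative from the description above is used here;
# the second alternative is not legible in the thread, so it is left out.
BOT_NAME_RE = re.compile(r"bot$", re.IGNORECASE)


def is_probable_bot(username: str, has_bot_flag: bool) -> bool:
    """Return True if the account looks like a bot.

    has_bot_flag is assumed to come from wherever you look up user groups
    (API or database); this function does not fetch it itself.
    """
    if has_bot_flag:
        return True
    # No flag: fall back to a case-insensitive check of the username.
    return BOT_NAME_RE.search(username) is not None
```

With this, is_probable_bot("FoObOt", False) is true because of the case-insensitive match, while a name like "Botanist" without a flag is not treated as a bot.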
Checking the edit history to find when interwiki links were first added can be time-consuming if the page had lots of activity. I therefore chose to use a binary search, halving the distance between two test points until either the actual edit is found, or we're down to so few edits that all can be efficiently grabbed through the API (e.g. using Pywikibot's PreloadingGenerator). Otherwise you might be examining thousands of edits for no reason.
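The search idea can be sketched roughly like this. It is only an outline under a few assumptions: revisions is a list of revision ids ordered oldest to newest, has_feature is a hypothetical callback that fetches one revision and tells you whether it contains interwiki links, and the links are assumed to stay once they were added (if they were removed and re-added, a binary search can land on the wrong edit). batch_size stands for "so few edits that all can be grabbed at once".

```python
from typing import Callable, Optional, Sequence


def first_revision_with(revisions: Sequence[int],
                        has_feature: Callable[[int], bool],
                        batch_size: int = 50) -> Optional[int]:
    """Return the oldest revision for which has_feature() is true, or None.

    revisions must be ordered oldest to newest. Assumes the feature is
    never removed again once it has been added.
    """
    batch_size = max(batch_size, 1)  # guard against a non-terminating loop
    if not revisions or not has_feature(revisions[-1]):
        return None  # the newest revision lacks it, so nothing to find

    # Invariant: revisions[hi] has the feature, the answer lies in [lo, hi].
    lo, hi = 0, len(revisions) - 1
    while hi - lo >= batch_size:
        mid = (lo + hi) // 2
        if has_feature(revisions[mid]):
            hi = mid          # first occurrence is at mid or earlier
        else:
            lo = mid + 1      # first occurrence is after mid

    # Few enough revisions left: check them in order (in practice these
    # would be fetched in one batch instead of one request each).
    for rev in revisions[lo:hi + 1]:
        if has_feature(rev):
            return rev
    return None
```

For a page with a few thousand revisions this needs only a handful of single-revision lookups before the final batch, instead of walking the whole history.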
Having Toolserver access simplifies the process a lot since all the metadata is more easily accessible, but the revision text will still have to be grabbed from the API.
Hope some of this helps, let me know if there's any questions.
Cheers, Morten
On 8 May 2012 08:39, Bináris wikiposta@gmail.com wrote:
2012/5/8 Merlijn van Deen valhallasw@arctus.nl
This is not completely true - the bot flag is also a property of the user account. You can query e.g.
title=Speciaal:Gebruikerslijst&offset=&limit=500&group=bot&uselang=en
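The same list can also be requested from the MediaWiki API via list=allusers with augroup=bot. A small sketch that only builds the request URL (nothing is fetched); the nl.wikipedia.org host is an assumption based on the Dutch special page name above, and the function name is made up:

```python
from urllib.parse import urlencode


def bot_users_query_url(api_url: str, limit: int = 500) -> str:
    """Build an API URL that lists accounts currently in the 'bot' group."""
    params = {
        "action": "query",
        "list": "allusers",
        "augroup": "bot",    # only users in the bot group
        "aulimit": limit,    # number of users per request
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


# Host is an assumption for illustration.
url = bot_users_query_url("https://nl.wikipedia.org/w/api.php")
```

Note that this only tells you who holds the flag now, not since when.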
Yes, that's true. And if you want to be quite accurate, you must also determine the date of acquiring the bot flag from bureau logs and compare it to the page history. :-)
-- Bináris
Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
thanks & cheers, Claudia koltzenburg@w4w.net