This can be a good option too. Thanks, Isaac.
Haifeng Zhang
Postdoctoral Research Fellow
Human-Computer Interaction Institute
Carnegie Mellon University
________________________________
From: Wiki-research-l wiki-research-l-bounces@lists.wikimedia.org on behalf of Isaac Johnson isaac@wikimedia.org
Sent: Tuesday, March 12, 2019 5:21:11 PM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Sampling new editors in English Wikipedia
Hey Haifeng, If you decide to process the dumps, you should be able to easily repurpose some quick code that I wrote for a similar project: https://github.com/geohci/miscellaneous-wikimedia/tree/master/editor-turnove...
Notably, I'd suggest using the stub history dumps as they are much smaller because they do not include the actual content. For instance, for March 1st and English Wikipedia (https://dumps.wikimedia.org/enwiki/20190301/), this file would be enwiki-20190301-stub-meta-history.xml.gz and is 57.9 GB.
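As a rough illustration of the approach (a minimal sketch only, not the code from the repository linked above): the stub dump can be streamed once to record each registered account's earliest revision timestamp, and the accounts can then be grouped by the month of that first edit and sampled. The function names, the definition of "new editor" as "month of first edit visible in the dump", and the fixed random seed are all assumptions made for this example. Bots are not filtered out, and edits to since-deleted pages do not appear in the dump.

```python
import gzip
import random
import xml.etree.ElementTree as ET
from collections import defaultdict


def local(tag):
    """Drop the '{namespace}' prefix that the export schema puts on every tag."""
    return tag.rsplit("}", 1)[-1]


def first_edit_by_user(xml_stream):
    """Stream the dump and return {username: earliest revision timestamp}."""
    first = {}
    context = iter(ET.iterparse(xml_stream, events=("start", "end")))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end":
            continue
        name = local(elem.tag)
        if name == "revision":
            ts, user = None, None
            for child in elem:
                if local(child.tag) == "timestamp":
                    ts = child.text
                elif local(child.tag) == "contributor":
                    # Anonymous edits carry <ip> instead of <username>; skip them.
                    for sub in child:
                        if local(sub.tag) == "username":
                            user = sub.text
            # ISO 8601 UTC timestamps sort correctly as plain strings.
            if user and ts and (user not in first or ts < first[user]):
                first[user] = ts
            elem.clear()  # free this revision's children
        elif name == "page":
            root.clear()  # drop finished pages so memory stays bounded
    return first


def sample_new_editors(first, per_month=100, seed=0):
    """Group users by the 'YYYY-MM' of their first edit and sample each group."""
    by_month = defaultdict(list)
    for user, ts in first.items():
        by_month[ts[:7]].append(user)
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return {
        month: rng.sample(sorted(users), min(per_month, len(users)))
        for month, users in sorted(by_month.items())
    }


def main(path="enwiki-20190301-stub-meta-history.xml.gz"):
    # Not called automatically; the full file takes a long time to scan.
    with gzip.open(path, "rb") as stream:
        samples = sample_new_editors(first_edit_by_user(stream))
    for month, users in samples.items():
        print(month, len(users))
```

Calling main() with the path to the downloaded file would print one line per month with the number of sampled accounts. The whole history has to be scanned before sampling, because an account's first edit can appear on any page in the dump.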
Best, Isaac
On Tue, Mar 12, 2019 at 3:56 PM Pine W wiki.pine@gmail.com wrote:
Hi Haifeng, thanks for the information. I think that your idea of looking in the dumps makes sense. Am I understanding correctly that you would like advice regarding how to do that in the most efficient way?
Hi Leila, I believe that I asked for more information regarding Haifeng's work. There has been discussion on English Wikipedia regarding volunteers being unhappy with the interventions or proposed interventions of researchers. I think that asking about the nature of Haifeng's research is legitimate, and I tried to provide some examples of possible types of research. I'm trying to protect the community from problematic interventions, while also welcoming research that is accepted by the community.
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
On Tue, Mar 12, 2019 at 8:00 PM Haifeng Zhang haifeng1@andrew.cmu.edu wrote:
Pine and Stuart,
I meant extracting a random sample of new editors (month by month) from Wikipedia edit history.
It is not about a survey of new editors, but thanks for your suggestions all the same.
Thanks, Haifeng Zhang
Postdoctoral Research Fellow
Human-Computer Interaction Institute
Carnegie Mellon University
________________________________
From: Wiki-research-l wiki-research-l-bounces@lists.wikimedia.org on behalf of Stuart A. Yeates syeates@gmail.com
Sent: Tuesday, March 12, 2019 3:46:19 PM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Sampling new editors in English Wikipedia
There are a number of new-editor-heavy noticeboards. I would suggest posting an invite to your survey (or whatever) there. If you ask for editors' usernames, you can filter out those who don't meet your definition of 'new'.
I'm thinking of places like: https://en.wikipedia.org/wiki/Wikipedia:Teahouse and https://en.wikipedia.org/wiki/Wikipedia:Help_desk
cheers stuart
-- ...let us be heard from red core to black sky
On Wed, 13 Mar 2019 at 08:37, Leila Zia leila@wikimedia.org wrote:
Hi Pine,
Haifeng has a simple question about how to sample editors other than via dumps. It would be great if someone who knows the answer could help them move forward.
If you are interested in learning more about their research, instead of answering their question, my recommendation would be to start the conversation with a "can you tell us more about your research?" kind of question. I find the current way of communication very speculative, and that is not good for building a vibrant research community that can help us address some of our big questions.
Best, Leila
On Tue, Mar 12, 2019 at 12:08 PM Pine W wiki.pine@gmail.com wrote:
Hi, can you expand on what you mean by "sample"? If you're referring to analyzing users' edit histories, then that should be fine. However, if you're planning to send surveys or messages to them, send them barnstars, or otherwise manipulate their on-wiki experience, that would be problematic.
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
On Tue, Mar 12, 2019 at 6:19 PM Haifeng Zhang haifeng1@andrew.cmu.edu wrote:
Hi folks,
My work needs to randomly sample new editors in each month, e.g., 100 editors per month.
Do any of you have good suggestions for how to do this efficiently?
I could think of using the dump files, but I wonder whether there are other options.
Thanks,
Haifeng Zhang _______________________________________________ Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
-- Isaac Johnson -- Research Scientist -- Wikimedia Foundation