Hello!
Yes, I was thinking about user sessions, which would mean aggregating the pageviews of a user over a short period of time (I should check the cutoff, but it could be around an hour or less).
I am particularly interested in understanding the order in which pages are seen (start, end), duration, etc. I wouldn't need data from a long period either, but I think data from multiple languages would be helpful.
I imagined reader data could be privacy-sensitive, but would an NDA with my university and some sort of data encoding help with this? As I said, it is for a scientific purpose.
Thanks,
Marc
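The session idea described in this message (grouping one reader's pageviews that fall within a short inactivity cutoff, and keeping the order in which pages were seen) can be sketched roughly as follows. This is only an illustration: the one-hour cutoff is an assumed value taken from "around an hour or less", the `(timestamp, page_title)` record layout and the function name `split_sessions` are hypothetical, and the sample views are invented. As the replies in this thread explain, per-reader data of this kind is not kept.

```python
from datetime import datetime, timedelta

# Assumed cutoff: the message only says "around an hour or less".
CUTOFF = timedelta(hours=1)

def split_sessions(views, cutoff=CUTOFF):
    """Group one reader's (timestamp, page_title) views into sessions.

    Views are sorted by time; a new session starts whenever the gap
    since the previous view is larger than `cutoff`.
    """
    sessions = []
    for ts, title in sorted(views):
        if sessions and ts - sessions[-1][-1][0] <= cutoff:
            sessions[-1].append((ts, title))
        else:
            sessions.append([(ts, title)])
    return sessions

# Invented sample data for a single hypothetical reader.
views = [
    (datetime(2016, 6, 28, 10, 0), "Barcelona"),
    (datetime(2016, 6, 28, 10, 5), "Catalonia"),
    (datetime(2016, 6, 28, 10, 20), "Spain"),
    (datetime(2016, 6, 28, 15, 0), "Wikipedia"),
]
sessions = split_sessions(views)
for session in sessions:
    print([title for _, title in session])
```

The first and last entries of each session give the start and end pages, and the difference between their timestamps gives a lower bound on the session duration.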
On Tue, 28 June 2016 at 21:09, Nuria Ruiz (nuria@wikimedia.org) wrote:
Hello!
> I am considering studying reader engagement for different article topics
> in different languages. Because of this, I would like to know if there is
> any plan to make available pageviews dumps detailing activity log at
> session level per user - in a similar way to editor sessions.
Are you thinking of "all-pageviews-visited-by-a-certain-user"? If so, no: we do not have any projects to provide that data because, due to privacy concerns, we neither have nor keep that information.
Thanks,
Nuria
On Tue, Jun 28, 2016 at 6:55 PM, Leila Zia leila@wikimedia.org wrote:
- Analytics
On Tue, Jun 28, 2016 at 6:36 AM, Marc Miquel marcmiquel@gmail.com wrote:
Hello,
I have a question for you regarding pageviews data dumps.
I am considering studying reader engagement for different article topics in different languages. Because of this, I would like to know if there is any plan to make available pageviews dumps detailing activity log at session level per user - in a similar way to editor sessions.
Since this would be for a research project for which I might request funding, I would like to know whether I could count on that, what the nature of the available data is, what the procedure to obtain this data would be, and whether there would be any implications because of privacy concerns.
Thank you very much!
Best,
Marc Miquel
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
If historic data is okay, there's already a dataset released (https://figshare.com/articles/Activity_Sessions_datasets/1291033) that was designed specifically to answer questions around how best to calculate session length with regard to Wikipedia (http://arxiv.org/abs/1411.2878).
Thanks for the answer, Oliver, but I am not sure it answers my questions. I'd like to study aspects such as how much time is spent on certain pages, as a proxy for how content is approached, read, and understood. I'd be happy with the time of entering the page and the time of leaving it. This is not entirely centered on 'user activity'; I put it that way because I imagined the data would be stored in a similar way to editor sessions, or in a database, and that I would need to do the time calculations myself.
Cheers,
Marc
Aye, as Joseph says, the time-on-page or time-leaving is not collected, except as an extension of session reconstruction work. If you want a concrete time, you're not gonna get it.
While PC-based data is more reliable than mobile, that does not necessarily mean "reliable". I'm sort of confused, I guess, as to why the datasets I linked (unless I'm misremembering them?) don't help: you would have to do the calculation yourself but they should contain all the data necessary to make that calculation (unless you want to have the pageID or title associated with the time-on-page, in which case...yeah, that's an issue).
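The "do the calculation yourself" step mentioned above could look something like the sketch below: since no leave time is recorded, the time on a page is approximated by the gap until the next pageview in the same session, and the last page of a session gets no estimate. The `(timestamp, page_title)` layout, the function name `time_on_page`, and the sample session are all hypothetical; this is not the format of the linked datasets, which, as noted above, may not associate a page ID or title with each timestamp.

```python
from datetime import datetime

def time_on_page(session):
    """Estimate seconds spent on each page of one time-ordered session.

    `session` is a list of (timestamp, page_title) pairs. The estimate
    for a page is the gap until the next view; the last page gets None
    because no later request marks when the reader left it.
    """
    estimates = []
    for (ts, title), (next_ts, _) in zip(session, session[1:]):
        estimates.append((title, (next_ts - ts).total_seconds()))
    if session:
        estimates.append((session[-1][1], None))
    return estimates

# Invented sample session for illustration only.
session = [
    (datetime(2016, 6, 29, 9, 0, 0), "Page_A"),
    (datetime(2016, 6, 29, 9, 2, 30), "Page_B"),
    (datetime(2016, 6, 29, 9, 3, 0), "Page_C"),
]
estimates = time_on_page(session)
for title, seconds in estimates:
    print(title, seconds)
```

These gaps are only a proxy: a long gap may mean careful reading or simply an abandoned tab, which is part of why the estimates are not "reliable" in the sense discussed above.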
Yes, the whole thing is about page_title or page_ids. I think Wikipedia as a project provides very different types of information, and it would be interesting to see how they are actually read, checked, etc. Likewise, I would need to see variations across different language editions. It would not need to be large-scale or cover long periods, which is why a sample of a few days would be valuable.
Anyway, thanks for the datasets link, Oliver.
Marc