Dear all,
As I just mentioned to Dan in a private email conversation, keeping datasets even with imperfect measurements is important, particularly for longitudinal analysis.
Also, from what I understand (being a newbie here), the data are stored in separate files. Dan suggested reordering the page into categories. Another option might be to create more extensive datasets that combine more measurements in a single data file. On the other hand, the files would become even bigger. That is not an issue for me, but for users in the field, accessibility (download bandwidth) could become an issue.
My two cents, Maurice
On Thu, Dec 24, 2015 at 2:58 PM, Alex Druk alex.druk@gmail.com wrote:
Nothing against this approach!
On Thu, Dec 24, 2015 at 2:55 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
On Thu, Dec 24, 2015 at 8:48 AM, Alex Druk alex.druk@gmail.com wrote:
Hi Dan, Happy holidays! Good idea to combine these datasets! However, we have one more dataset by Erik Zachte: http://dumps.wikimedia.org/other/pagecounts-ez/
And that's an important one! But I was thinking we could re-organize the page into categories. Erik's dataset could go into a "processed data" category or something like that. The three I wanted to talk about on this thread are just the raw data.
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
-- Thank you.
Alex Druk alex.druk@gmail.com (775) 237-8550 Google voice