Hi,
Who is downloading the Domas visit logs in /mnt/user-store/stats? The script is not downloading the projectcounts-* files, and they are necessary too. Can anyone give me a copy of the projectcounts-200911* files for November 2009? Their combined size should be < 15MB. Thanks.
Regards, emijrp
On 07.11.2010 09:47, emijrp wrote:
> Who is downloading the Domas visit logs in /mnt/user-store/stats? The
I am!
> script is not downloading the projectcounts-* files, and they are necessary too.
Indeed, they are not being downloaded -- this is not done on purpose, but in any case, I have never used them and no one has ever asked about them. What do they contain exactly?
> Can anyone give me a copy of the projectcounts-200911* files for November 2009? Their combined size should be < 15MB. Thanks.
Unfortunately, I don't have them, and neither does anyone else I have talked to...
Frédéric
2010/11/7 Frédéric Schütz schutz@mathgen.ch
> On 07.11.2010 09:47, emijrp wrote:
>> Who is downloading the Domas visit logs in /mnt/user-store/stats? The
> I am!
>> script is not downloading the projectcounts-* files, and they are necessary too.
> Indeed, they are not being downloaded -- this is not done on purpose, but in any case, I have never used them and no one has ever asked about them. What do they contain exactly?
The pageviews per hour and per project. Please add them to the download script from now on.
>> Can anyone give me a copy of the projectcounts-200911* files for November 2009? Their combined size should be < 15MB. Thanks.
> Unfortunately, I don't have them, and neither does anyone else I have talked to...
WHAT!? I hope someone has them. Please, I need them to close the November 2009 archive in IA.
> Frédéric
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
emijrp wrote:
>> Indeed, they are not being downloaded -- this is not done on purpose, but in any case, I have never used them and no one has ever asked about them. What do they contain exactly?
> The pageviews per hour and per project. Please add them to the download script from now on.
This is done -- on my personal archive for now, which is currently downloading everything that is available, but I will add this to the toolserver too.
>>> Can anyone give me a copy of the projectcounts-200911* files for November 2009? Their combined size should be < 15MB. Thanks.
>> Unfortunately, I don't have them, and neither does anyone else I have talked to...
> WHAT!? I hope someone has them. Please, I need them to close the November 2009 archive in IA.
Can't we generate them from the pagecounts files, by summing the data from all pages in each project? It'd be pretty straightforward; I'm happy to do it if you need it.
Frédéric
On 08.11.2010 11:52, Frederic Schutz wrote:
>>>> Can anyone give me a copy of the projectcounts-200911* files for November 2009? Their combined size should be < 15MB. Thanks.
>>> Unfortunately, I don't have them, and neither does anyone else I have talked to...
>> WHAT!? I hope someone has them. Please, I need them to close the November 2009 archive in IA.
> Can't we generate them from the pagecounts files, by summing the data from all pages in each project? It'd be pretty straightforward; I'm happy to do it if you need it.
About 40 lines of Perl and a few "diff"s later, it looks like the projectcounts files can indeed be regenerated easily from the pagecounts files -- so you don't have to worry about them being lost.
The files for November 2009 are currently being created in /mnt/user-store/stats -- but this means reading all the pagecounts files, and it is going to be veeeery sloooow.
Frédéric
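
For readers who want to try the regeneration described above, here is a minimal sketch in Python (not the roughly 40-line Perl script mentioned in the message, which is not shown in this thread). It assumes that each pagecounts line has four whitespace-separated fields (project, page title, view count, bytes) and that a projectcounts line looks like "project - views bytes"; both layouts are assumptions to verify against real files. The function names are made up for illustration.

```python
import gzip
from collections import defaultdict


def project_totals(lines):
    """Sum view counts and byte counts per project.

    Assumes each line is "project title views bytes"; lines that do not
    match that shape are skipped rather than guessed at.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        project, _title, views, size = fields
        try:
            views, size = int(views), int(size)
        except ValueError:
            continue
        totals[project][0] += views
        totals[project][1] += size
    return {project: tuple(pair) for project, pair in totals.items()}


def format_projectcounts(totals):
    """Render totals as "project - views bytes" lines (assumed layout)."""
    return [f"{p} - {v} {b}" for p, (v, b) in sorted(totals.items())]


def regenerate(pagecounts_path):
    """Read one gzipped pagecounts file and return projectcounts-style lines."""
    with gzip.open(pagecounts_path, "rt", encoding="utf-8", errors="replace") as fh:
        return format_projectcounts(project_totals(fh))
```

Comparing the output of regenerate() with a surviving projectcounts file for the same hour (for example with diff, as in the message above) is the obvious way to confirm that the assumed layouts are right before relying on regenerated files.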
I have requested them from Ariel, from the copy that WMF hosts. I hope they exist; if not, I will use yours. Thanks.
On 08.11.2010 21:47, emijrp wrote:
> I have requested them from Ariel, from the copy that WMF hosts. I hope they exist; if not, I will use yours. Thanks.
I believe that Ariel's collection was copied from mine, so the files may be missing there too (but it is also possible that when the copy was performed, the projectcounts files for 200911 were still available from Domas's server and that Ariel archived them).
In any case, the files will be available if you need them.
Frédéric
No response from Ariel yet. Where are your files?
2010/11/7 Frédéric Schütz schutz@mathgen.ch:
> On 07.11.2010 09:47, emijrp wrote:
>> Who is downloading the Domas visit logs in /mnt/user-store/stats?
> I am!
Excellent. It may or may not interest you that /mnt/user-store/stats/pagecounts-20101116-130001.gz is corrupt. Probably a partial download.
On 17.11.2010 23:52, Johan G wrote:
>> On 07.11.2010 09:47, emijrp wrote:
>>> Who is downloading the Domas visit logs in /mnt/user-store/stats?
>> I am!
> Excellent. It may or may not interest you that /mnt/user-store/stats/pagecounts-20101116-130001.gz is corrupt. Probably a partial download.
Yes, that seems to be the case. I've deleted the file and re-run the script that downloads the stats, so it should be OK now. The file was correct in my personal archive (but the archiving process there does some integrity checking).
Thanks for the information!
Frédéric
2010/11/17 Frédéric Schütz schutz@mathgen.ch:
> On 17.11.2010 23:52, Johan G wrote:
>>> On 07.11.2010 09:47, emijrp wrote:
>>>> Who is downloading the Domas visit logs in /mnt/user-store/stats?
>>> I am!
>> Excellent. It may or may not interest you that /mnt/user-store/stats/pagecounts-20101116-130001.gz is corrupt. Probably a partial download.
> Yes, that seems to be the case. I've deleted the file and re-run the script that downloads the stats, so it should be OK now. The file was correct in my personal archive (but the archiving process there does some integrity checking).
I confirm that the file is no longer corrupt. Thanks for the fast response.
> Thanks for the information!
> Frédéric
How can we know whether any of the older pageview files are corrupted too?
On 11/18/2010 03:59 PM, emijrp wrote:
> How can we know whether any of the older pageview files are corrupted too?
Since the files are gzipped, you can run
gzip -t
to test their integrity. I do it on my own archive; the script is on the toolserver but is not used regularly.
I just launched the script on all files since January 2010. It takes quite a bit of time to run, since it also calculates SHA1 fingerprints for all files (this does not directly help detect corruption, but it allows me to compare the results of archiving on different servers).
Frédéric
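
As an illustration of the check described above, here is a minimal Python sketch that does roughly what "gzip -t" does (decompress the whole file and report whether that succeeds) and also computes a SHA1 fingerprint of the file as stored. It is not the script on the toolserver, which is not shown in this thread; the function name and chunk size are made up for the example.

```python
import gzip
import hashlib
import zlib


def check_gzip(path, chunk_size=1 << 20):
    """Return (ok, sha1_hex) for one gzipped file.

    ok is False if the file cannot be fully decompressed (truncated
    download, bad CRC, damaged data). sha1_hex is the SHA1 of the
    compressed file itself, useful for comparing copies across servers.
    """
    sha1 = hashlib.sha1()
    with open(path, "rb") as raw:
        for chunk in iter(lambda: raw.read(chunk_size), b""):
            sha1.update(chunk)

    ok = True
    try:
        with gzip.open(path, "rb") as gz:
            # Read to the end so the trailing CRC and length get verified.
            while gz.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        ok = False
    return ok, sha1.hexdigest()
```

Running this over every pagecounts-*.gz file in a directory and listing the paths where ok is False would flag partial downloads like the one reported earlier in the thread.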