I am reading the list :-)
I had started reorganizing the files earlier this year but did not finish; apart from a few leftover directories, there should be no duplicate files. In the meantime, I have restarted a few wget processes in order to get the files.
An MMP is a good idea, and I can look into it. However, the most pressing problem, as I see it, is disk space, which will become tight very quickly.
And the second thing to do would be to produce daily/monthly/whatever summary files from these raw files, as Erik Zachte does -- there should be no reason to use the raw files.
Frédéric
On 02/14/2012 02:03 PM, José Emilio Mori Recio wrote:
I'm also interested in this. As Adam said, it should be no problem to wget the new stats daily from dumps.wikimedia.org, but we would need a writable subdirectory in /mnt/user-store to do this. I don't know if the owner of the current one, schutz, is reading the list... Maybe the best solution is an MMP, as has been done with the dumps; that would also help to keep the directory free of duplicates (not the case currently) and to document it on the wiki (as with the dumps, again). Or maybe it could be integrated into the dumps MMP. I'm happy to help with whatever is needed.
--- Shaurabh Bharti ...
I too need the new page stats.
--- Adam Klimont adamkli@gmail.com ...
Pagecounts files from http://dumps.wikimedia.org/other/pagecounts-raw/ used to be downloaded by a script automatically into /mnt/user-store/stats (...) I do not think I have permissions to write in /mnt/user-store/stats. (...)
José Emilio Mori Recio - http://es.wikipedia.org/wiki/User:-jem-
IT Administrator of the Archdiocese of Valladolid
Administrator on the Spanish Wikipedia - Promoter of Wikimedia España
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette