Hi all,
I think we need to tidy up our giant /mnt/user-store directory a bit. It contains zillions of petabytes of cool info, and I think that some percentage of it is probably duplicated. A well-structured directory tree would be nice for all of us.
Perhaps a good first step would be to add a line to a /mnt/user-store/README file, explaining what the directory contains, every time we create one.
Also, the 7z dumps for all ~700 wikis come to only ~100 GB. We have about 3 TB of disk space on the Toolserver, and I have no idea where those 7z dumps are, or whether anyone has downloaded them, manually or with a cron job. So if one of us can't find something, we end up re-downloading it again and again. A waste of resources and time.
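One possible way to check how much of the tree is duplicated is to group files by size and then by content hash. The following is a minimal Python sketch, not an existing Toolserver tool; the names file_digest and find_duplicates are made up for illustration, and it assumes the caller can read every file under the given root.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    # First pass: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files that share a size with another file.
    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        duplicates.extend(
            sorted(group) for group in by_hash.values() if len(group) > 1
        )
    return duplicates
```

Calling find_duplicates("/mnt/user-store") would return one list per set of identical files. Hashing a tree of this size is slow, so running it on one subdirectory at a time is probably more practical.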
Regards, emijrp
Hi,
On Mon, Nov 1, 2010 at 12:49 PM, emijrp <emijrp@gmail.com> wrote:
> I think we need to tidy up our giant /mnt/user-store directory a bit. It contains zillions of petabytes of cool info, and I think that some percentage of it is probably duplicated. A well-structured directory tree would be nice for all of us.
Yep, definitely.
> Perhaps a good first step would be to add a line to a /mnt/user-store/README file, explaining what the directory contains, every time we create one.
README is write-protected, so I went ahead and created /mnt/user-store/INDEX (chmodded 777).
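Adding a line to the INDEX file whenever a directory is created could be scripted. Here is a minimal Python sketch under stated assumptions: the tab-separated "directory, description" line format and the add_index_entry helper are hypothetical, since the thread does not define any format for INDEX.

```python
import os


def add_index_entry(index_path, directory, description):
    """Append 'directory<TAB>description' unless the directory is already listed.

    Returns True if a line was written, False if the entry already existed.
    """
    directory = directory.rstrip("/")
    existing = set()
    if os.path.exists(index_path):
        with open(index_path, encoding="utf-8") as handle:
            # The first tab-separated field of each non-empty line is the name.
            existing = {
                line.rstrip("\n").split("\t", 1)[0]
                for line in handle
                if line.strip()
            }
    if directory in existing:
        return False
    with open(index_path, "a", encoding="utf-8") as handle:
        handle.write(f"{directory}\t{description}\n")
    return True
```

For example, add_index_entry("/mnt/user-store/INDEX", "dumps", "7z and XML wiki dumps") would record the dumps directory once. The sketch does no file locking, so two users appending at the same moment could still produce a duplicate line.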
> Also, the 7z dumps for all ~700 wikis come to only ~100 GB. We have about 3 TB of disk space on the Toolserver, and I have no idea where those 7z dumps are, or whether anyone has downloaded them, manually or with a cron job. So if one of us can't find something, we end up re-downloading it again and again. A waste of resources and time.
Actually, they are in user-store/dumps, at least some of the enwiki ones. Some XML dumps are also scattered throughout user-store, by the looks of it. And I vaguely remember another directory for dumps...
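To see where dumps are actually scattered, one could walk the tree and list files by name pattern. This is only a sketch: the find_dumps helper is made up for illustration, and the file name patterns are an assumption about how the dumps are named, not something confirmed in this thread.

```python
import fnmatch
import os

# Assumed naming patterns for dump files; adjust to what is really on disk.
DUMP_PATTERNS = ("*.7z", "*.xml", "*.xml.bz2", "*.xml.gz")


def find_dumps(root, patterns=DUMP_PATTERNS):
    """Yield (path, size_in_bytes) for files under root matching any pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                path = os.path.join(dirpath, name)
                try:
                    yield path, os.path.getsize(path)
                except OSError:
                    continue  # unreadable or vanished file; skip it
```

Iterating over find_dumps("/mnt/user-store") and printing each path with its size would give a first list of candidates to move into a single dumps directory.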
Marco
On Mon, Nov 1, 2010 at 13:19, Marco Schuster <marco@harddisk.is-a-geek.org> wrote:
> Hi,
> On Mon, Nov 1, 2010 at 12:49 PM, emijrp <emijrp@gmail.com> wrote:
>> I think we need to tidy up our giant /mnt/user-store directory a bit. It contains zillions of petabytes of cool info, and I think that some percentage of it is probably duplicated. A well-structured directory tree would be nice for all of us.
> Yep, definitely.
>> Perhaps a good first step would be to add a line to a /mnt/user-store/README file, explaining what the directory contains, every time we create one.
> README is write-protected, so I went ahead and created /mnt/user-store/INDEX (chmodded 777).
There's already a page on the toolserver wiki about user-store. Can we use that to avoid duplicating the efforts?
>> Also, the 7z dumps for all ~700 wikis come to only ~100 GB. We have about 3 TB of disk space on the Toolserver, and I have no idea where those 7z dumps are, or whether anyone has downloaded them, manually or with a cron job. So if one of us can't find something, we end up re-downloading it again and again. A waste of resources and time.
> Actually, they are in user-store/dumps, at least some of the enwiki ones. Some XML dumps are also scattered throughout user-store, by the looks of it. And I vaguely remember another directory for dumps...
There's a bunch of stats dumps in user-store/stats too.
> Marco
-- VMSoft GbR Nabburger Str. 15 81737 München Geschäftsführer: Marco Schuster, Volker Hemmert http://vmsoft-gbr.de
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
On Mon, Nov 1, 2010 at 4:00 PM, Johan G <johang@toolserver.org> wrote:
> There's already a page on the toolserver wiki about user-store. Can we use that to avoid duplicating the efforts?
Somehow I don't like the idea of this information being public...
Marco
Marco Schuster wrote:
> On Mon, Nov 1, 2010 at 4:00 PM, Johan G <johang@toolserver.org> wrote:
>> There's already a page on the toolserver wiki about user-store. Can we use that to avoid duplicating the efforts?
> Somehow I don't like the idea of this information being public...
Huh? I don't understand why you'd have an issue with a public index of a largely public directory.
The page on the Toolserver wiki is located here: https://wiki.toolserver.org/view/User-store
There's also a README file inside /mnt/user-store.
MZMcBride
On Tue, Nov 2, 2010 at 1:10 AM, Marco Schuster <marco@harddisk.is-a-geek.org> wrote:
> On Mon, Nov 1, 2010 at 4:00 PM, Johan G <johang@toolserver.org> wrote:
>> There's already a page on the toolserver wiki about user-store. Can we use that to avoid duplicating the efforts?
> Somehow I don't like the idea of this information being public...
> Marco
Based on the description, there shouldn't really be anything that can't be publicly shared, so what is wrong with having a list of the contents available so that people [other TS users] can easily identify whether there is stuff there they may like to use? -Peachey