Would this also be a good time to retire the /a directory and use a more appropriate place, or would that break too much stuff? D
On Mon, Aug 19, 2013 at 10:09 AM, Andrew Otto otto@wikimedia.org wrote:
Update!
/a/squid is no longer accessible on stat1. I was just going to remove /a/squid/archive, but I noticed that there were some webrequest logs in other subdirectories of /a/squid. These have been copied to stat1002 anyway, so this shouldn't be a problem.
Let me know if you have any trouble and I'll see what we can do. If I don't hear any objections I'll delete these for good in a few days.
Thanks! -Andrew
On Jul 31, 2013, at 9:52 AM, Andrew Otto otto@wikimedia.org wrote:
Hello again!
Ok, we're actually going to do this this time. As far as we know, people who need access to private webrequest data have migrated their stuff over to stat1002.eqiad.wmnet. The private webrequest data that currently exists on stat1 will soon be deleted.
Soon is August 7th. That's in 1 week. We announced this back in May, so there should have been plenty of notice. If you are still using the webrequest logs in /a/squid/archive on stat1, find me on IRC (ottomata) or email me and we can work together to make sure you can continue to do your work on stat1002.
On Wednesday, August 7th, we will be removing private webrequest logs from stat1.
Thanks all! -Andrew Otto
On May 20, 2013, at 2:13 PM, Andrew Otto otto@wikimedia.org wrote:
"Before that happens, you should make sure that any personal stuff on stat1 that you need for number crunching is copied over to stat1002."
From your note it looks like this is only related to webrequest data. Is that correct?
Yup! That is correct. stat1002 will be primarily used as a host for sensitive private data. Only those users who have personal unpuppetized code and cronjobs that use this data need to worry about moving them from stat1 to stat1002.
What are the criteria for deciding who has access to stat1002? I see that contractors like Aaron Halfaker or Jonathan Morgan currently don't have access to it.
The criteria will be the same as before: RT request + manager approval.
However, the request should only be made if the user actually needs access to the webrequest logs to do analysis. For example, if the main reason someone already has access to stat1 is so that they can access the research slave databases, then they won't need access to stat1002.
Can you give us more information on the long-term plans/scope of stat1 vs stat1002 (and update https://office.wikimedia.org/wiki/Data_access as needed)?
I've added a small bit about stat1002 on that page.
I don't know much about a long-term plan for stat1. It is hosted at the Tampa datacenter, and in the long term (yearish?) all the machines there will have to be decommissioned or relocated elsewhere. When it finally does move, it will most likely no longer have a public IP. stat1 is intended to be used as a workspace for analysts to do their thing on non-private data.
-Ao
wmfresearch mailing list wmfresearch@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wmfresearch