2017-11-14 20:00:02,245 INFO force is enabled
2017-11-14 20:00:02,330 INFO removing tools-project-backup
2017-11-14 20:00:02,421 INFO removing tools-project-backup
2017-11-14 20:00:02,994 INFO creating tools-project-backup at 2T
2017-11-14 20:00:03,782 INFO force is enabled
2017-11-14 20:00:03,800 INFO removing tools-snap
2017-11-14 20:00:03,848 INFO removing tools-snap
2017-11-14 20:00:04,930 INFO creating tools-snap at 1T
Hi all,
Ops meeting updates:
* Mark is out this week, Jaime is back
* I thanked Manuel for being awesome to us while being the only person around
* No other updates or questions tied to our work
Clinic duty notes:
* Helped Arturo with https://phabricator.wikimedia.org/T173647
* Worked a bit on the k8s etcd ferm issues - https://phabricator.wikimedia.org/T179955#3743449
* IRC help for a few folks - nothing major
* Responded to a page on Thursday late night for phabricator
* Great weekend IRC help and collaboration by zhuyifei1999 and vallhala
- Thank you!
* Worked a little bit on this dashboard -
https://grafana-admin.wikimedia.org/dashboard/db/labs-monitoring
--
Madhu :)
2017-11-08 20:00:02,842 INFO force is enabled
2017-11-08 20:00:02,887 INFO removing misc-project-backup
2017-11-08 20:00:02,986 INFO removing misc-project-backup
2017-11-08 20:00:03,510 INFO creating misc-project-backup at 2T
2017-11-08 20:00:04,426 INFO force is enabled
2017-11-08 20:00:04,446 INFO removing misc-snap
2017-11-08 20:00:04,496 INFO removing misc-snap
2017-11-08 20:00:04,796 INFO creating misc-snap at 1T
- I chatted for a bit with Rion Dooley who manages a scientific
computing platform called Agave. His use cases overlap ours by quite a
lot -- he recently migrated his users from grid engine to K8s, supports
a Jupyter install, etc. He seems to live in Austin so I said I'd invite
him to meet up with us when we're down there in December.
- Real people are really using CephFS and Manila as an NFS replacement
and claiming that it pretty much works. So this could easily be a part
of our big ceph/cinder future plan.
- At a 'state of Designate' talk: They're deprecating the 1.0 API in
the Q release which will probably break some of our minor internal
tooling but modern Horizon dashboards have long since moved off of it.
It's pretty clear that the Designate team is limping along -- last cycle
they had 0 paid staff on the project, now they have one (or maybe one
part-timer). Nevertheless adoption of designate is climbing (12% of
installed clouds run it now) so it's unlikely to die off.
- I pestered the Horizon team about some of our performance issues. It
sounds like the super-slow Identity issues have fixes in progress (but
not yet released). It's less clear whether there's anywhere good to go
with the puppet UI -- there's a big caching patch for that widget, but
we might already be running it; it's unclear.
- I attended a lightning talk about the metadata service to ask about
using metadata to provide per-tenant or per-instance secrets to VMs:
Q: metadata -- is it private? Specifically, if the metadata agent is
providing private data based on instance/tenant can I depend on it not
providing that data to another badly-behaved instance?
A: It's intended to prevent spoofing but you should probably audit this
yourself
A (from a later talk): The metadata server checks x-forwarded-for and
instances can spoof that and steal creds from different VMs. Either
disable x-forwarded-for or use the config drive for security.
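The spoofing risk in that second answer is easy to illustrate. A minimal sketch (the identify_instance_* helpers are hypothetical, not actual Nova/Neutron code) of why a metadata proxy must not trust the client-supplied header:

```python
# Illustrative only: why trusting X-Forwarded-For leaks other tenants' data.
# These helpers are hypothetical, not real Nova/Neutron code.

def identify_instance_naive(headers: dict, peer_ip: str) -> str:
    # BAD: the client fully controls this header, so a hostile VM can
    # claim any IP and receive another instance's metadata/credentials.
    return headers.get("X-Forwarded-For", peer_ip)

def identify_instance_safe(headers: dict, peer_ip: str) -> str:
    # Safer: ignore client-supplied headers and use the actual TCP
    # peer address seen by the metadata service.
    return peer_ip

# A hostile instance at 10.0.0.9 forging the header of 10.0.0.5:
forged = {"X-Forwarded-For": "10.0.0.5"}
print(identify_instance_naive(forged, "10.0.0.9"))  # -> 10.0.0.5 (spoofed!)
print(identify_instance_safe(forged, "10.0.0.9"))   # -> 10.0.0.9
```

Hence the advice: disable the header path entirely, or sidestep the network with the config drive.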
- Horizon: Can use the ui-cookiecutter project to create a base
template for a new panel with angular examples. Briefly I thought that
they were threatening to break all of our existing Django-based custom
panels in favor of Angular but I talked to the team lead and it sounds
like there's no actual plan for that.
- Keystone: In the P release they've totally redone policy config and I
don't quite understand the change. They're moving policies 'into code'
which I think means that there are live-alterable database-stored policy
rules. I asked about migration path and they said that the existing
policy files will be considered an override for the time being, so we
don't need to tear down the existing system. (Of course the new system
will be better, but we can move over incrementally.) Keystone STILL
doesn't support project-local admin roles, which is stupid but
definitely means that we should just write our own implementation of
this since an upstream fix is clearly many years away.
- Ops designate feedback talk: I asked about sharing a domain among
multiple projects and they put it on the wishlist. I think it's been on
the wishlist for a while though. It sounds like the next upgrade cycle
(or the next two, L->M->N) will be rocky but after that it may go a bit
better.
Keynote takeaways:
- https://allisonrandal.com/ is really cool
- ATT is doing: an Openstack shim mostly for Ironic integration to manage
baremetal; then k8s to deploy a full customer-oriented Openstack
deployment, using k8s-native rolling deploy and management for Openstack
running within containers; and then Openstack-deployed instances that can
themselves have k8s and other container orchestration platforms deployed
within them. It's turtles all the way down, people. This also isn't
unique, actually. Uses openstack-helm. A lot of talk this year around the
integration of Openstack and k8s platforms and the layering thereof.
Really interesting stuff.
- Ironic is a huge winner in the last 6m - 1y. 9% => 20% adoption across
surveyed customers. It seems to have gone from a thing you had to stretch
to your use case with a lot of time and attention to a popular deployment
pattern for all hardware (including hardware underpinning the openstack
clouds themselves). So that's really interesting considering it's on our
roadmap to visit in the PN[0] age.
- Zuulv3 is being touted as a community part of the O[1] ecosystem. They
said these words "Now that Zuul v3 is released it's actually ready to be
used by other people". Should we laugh or cry?
- Good description of Openstack components as "programmable infrastructure".
- O[1] foundation views their work as an integration engine for the full
stack of components
-- Sessions --
* O[1] Passports
- https://www.openstack.org/passport is a cool initiative to make it
possible to get trial credentials to all participating public clouds
running O[1]. Really curious to see how folks have put their custom stamp
on 'vanilla O[1]'.
* O[1] CephFS now fully awesome
- Manila is a project that aims to manage file servers (usually shared
file servers) with several backends, including NFS and the new-hotness CephFS.
That means in theory we could replace some of our homebrew things with a
combination of these components down the line. We could even in theory use
Manila on top of NFS. It's pretty interesting and could present some better
deployment and management models like presentation of these shares as block
storage to instances from the host level. That could mean better
management with use of flavor constraints and quota tie-in.
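For a feel of the workflow, a hedged sketch of what a Manila-managed CephFS share might look like; the share type, share, and client names here are all invented, not from the talk:

```shell
# All names here are invented for the sketch.
manila type-create cephfsnative false          # no driver-managed share servers
manila create CEPHFS 500 --name tools-home --share-type cephfsnative
manila access-allow tools-home cephx tools-client
manila list
```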
- CephFS seems super awesome. I'm skeptical though; my impression is that
historically, new Ceph features reach fever pitch for adoption way sooner
than maturity.
* Overview and Status of Ceph File System
- More CephFS talk and magic. It's new, it's here. It integrates with
Manila, and you can now deploy Ceph as a backend to Glance, Cinder, Nova
(yeah..), and Manila
* How does Neutron Do That?
- Pretty misleading title, let me say! We never touched Neutron, nor did
they show any actual Neutron internals as I expected. It was, however, a
pretty decent hands-on with the components that underpin Neutron:
namespaces, veth interfaces, Linux bridging, etc. I did learn a few
tidbits, but it was a sort of pitch for Rackspace training. Worth going
to, but could have been 30m and not 1h30m.
* Dos and Don'ts of Ceph and O[1]
This guy finally addressed our specific problem of overzealous tenants'
IO affecting other tenants. The reason I haven't been able to find a
baked-in Ceph QoS component is that there isn't one :) The workarounds
discussed were generic and won't work for us atm, but could in a future
deployment. Lots of discussion of configuration gotchas and when to use
defaults and when not to. I would go back and watch this before a trial
Ceph deployment. Straightforward stuff. Much of it is too in the weeds
for our current lack-of-Ceph status, but food for thought. He talked
about when Ceph is appropriate, and about the age-old wisdom of fast
network storage being 10x faster than local IO and how that has fared
with the growth in local storage solutions. Ironically, he talked for a
bit about when not to use Ceph and instead go for colocated LVM block
allocations local to the compute node. Convinced me we should be doing
this with Cinder even without a real Ceph solution. Cinder volume types
can be used to make QoS profiles at the host level, assuming the correct
IO scheduler etc.
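That last point can be sketched with the stock CLI. A hedged example, assuming front-end (hypervisor-enforced) QoS; the spec/type names and limits are my own, not anything we run:

```shell
# Create a front-end QoS spec (enforced by libvirt on the hypervisor)
# and hang it off a volume type; names and limits are arbitrary.
openstack volume qos create standard-iops \
    --consumer front-end \
    --property read_iops_sec=500 \
    --property write_iops_sec=250
openstack volume type create standard
openstack volume qos associate standard-iops standard
# New volumes of type "standard" now carry the IOPS caps.
```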
Cinder backend share of surveyed deployments:
- Ceph RBD: 65%
- LVM: 24%
- NFS: 11% (we are not alone!)
* Kubernetes on O[1]
This was mostly about the k8s-side cloud-provider-native tie-ins -- none
of which we use, and honestly none of which were that interesting for us
atm. But the ability to use Neutron LBaaS in place of the cloud-agnostic
intra-k8s options could be a win in the future. I wonder how much of our
ingress problem we could sidestep. A lot of assumptions are made for
cloud provider integration, much of which we could work through pretty
easily. He stepped on the idea of container security a bit, especially
colocated container customers on bare metal, and said this quote that is
funny when you consider our model: "You should have a k8s deployment per
hostile tenant..."
* Neutron to Neutron routing
Basically a pitch to use BGP VPNs between unrelated Neutron l3 instances.
This is pretty relevant to us, and he talked a bit about the ambiguity of
cellsv2 vs regions vs strategy of choice. Worth thinking about; he
reviewed a few of the options with no big surprises. There are a lot of
assumptions about high-trust environments without the need for IPSec etc.
https://docs.openstack.org/networking-bgpvpn/latest/
* Inside Neutron routing architectures
Given by one of the early leaders in the Neutron (then Quantum) space.
Neutron was originally just a test harness, but enough features were
added that folks started deploying it in production, and the rest is
history. That somewhat helps explain the early wild-west days of Neutron.
Neutron has grown and changed a ton because of evolving understanding of
use cases and almost entirely organic growth. For instance, they stopped
putting metadata servers in each namespace in Ocata and now have haproxy
and one metadata server to rule them all.
Neutron /only/ uses network namespace isolation. Unspoken in the session,
but that means that all processes for all tenants see the global process
space. I haven't figured out how that's exploitable yet but it sure seems
like there could be interesting effects.
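The namespace-only isolation is easy to poke at on a network node; an illustrative sketch, where qrouter-<uuid> stands in for a real router namespace and root is required:

```shell
# Network state is per-namespace...
ip netns list                          # one qrouter-*/qdhcp-* per router/network
ip netns exec qrouter-<uuid> ip addr   # interfaces are namespace-local
# ...but the process table is global: this ps sees every process on the host,
# which is exactly the "all tenants see the global process space" point.
ip netns exec qrouter-<uuid> ps aux
```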
The model of topology we are targeting is called "legacy" and so at the end
of the session I asked if this nomenclature meant it would go away at some
point into the future. Both the session leader and the chair of the
Neutron subcommittee (who was sitting right in front of me) guaranteed it's
not even a conversation. The phrasing was "no one is even talking about
it" and "There would be a riot, and it won't happen". So that was worth
going to the session right there.
Much work with OpenFlow integration and related tools: OVN, MidoNet,
OpenDaylight, OpenContrail (a Juniper crossover project, I think?).
Acknowledged that with Neutron there is no right answer to any particular
problem, it is so flexible you can shoot yourself in the foot. Described
as a general trade off between operational complexity vs data path
efficiency. So that's a nice summation of the general concerns. There was
no giant 'aha' moment here, which I take to be a good thing. Neutron has
moved on from some of the assumptions and models of Mitaka and below, but
for the most part we are thinking in the right terms.
* Questions to make storage vendors squirm
Demonstration of the interplay between latency and throughput. I.e. moving
x payload w/ y latency will only ever demonstrate z throughput and that's
not an indictment of a storage system but instead of a testing methodology.
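The arithmetic behind that demonstration is worth writing down. A back-of-envelope sketch (the function and numbers are mine, not from the talk): with a serial workload, latency alone caps throughput no matter how fast the backend is.

```python
# Latency bounds throughput: at queue depth 1 you can only issue one IO
# per round trip, so IOPS = 1000 / latency_ms regardless of backend speed.

def max_throughput_mib_s(payload_kib: float, latency_ms: float,
                         queue_depth: int = 1) -> float:
    ios_per_sec = queue_depth * 1000.0 / latency_ms
    return ios_per_sec * payload_kib / 1024.0

# 4 KiB IOs at 1 ms round-trip, one at a time: 1000 IOPS, ~3.9 MiB/s --
# a "slow" result that indicts the test methodology, not the storage.
print(round(max_throughput_mib_s(4, 1.0), 1))  # -> 3.9
```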
* Lots of talk about not being fooled by caching in demos, asking about
the IO cost of fancy features, and talking about backups early.
"Storage benchmarking is mostly a lie"
I talked to him a bit afterward about the problem before this problem:
profiling load to know what parameters to actually look for in a
solution, and my difficulties getting a good picture of live load on old
systems that need to be replaced. My approach was based on anecdotal
pattern capture and a lot of supposition, and surely that's bad. He
acknowledged that's pretty much the only solution. There is no codified
way to profile a system in place, and the general difficulty of the
problem and its cross-vendor/discipline nature mean no one wants to
tackle creating one. So that's both mildly validating and exhausting.
Nice guy; talked a mile a minute and was clearly excited about storage.
* Running Openshift on Openstack
Based on
http://www.bu.edu/hic/research/highlighted-sponsored-projects/massachusetts…
I went to this for obvious reasons :) It was mostly a features pitch from
Redhat. The guy who spoke most of the time was a Redhat employee, who did
know the product inside and out, intermixed with example use cases.
Learned that Openshift FaaS is settling on OpenWhisk. Kuryr is an
integration project coming to tie Openshift (and vanilla k8s) more
directly into Neutron, avoiding flannel and overlays in general and (in
some cases) the double-encapsulation insanity of vxlan inside vxlan that
would otherwise happen.
That's been on my mind so it was an interesting note.
This setup is so incredibly similar to ours in ideology and deployment that
we could have given a mirror presentation.
Afterward I spoke with Robert, the person from Boston U there to
represent the project, and a few things were clarified:
* This was only really deployed over the summer and has had mostly trivial
usage. A large 300-person use case was meant to happen but did not.
* They have some awesome open data partnerships with MIT
* I got his contact info and I think we should follow up. I want to know
in 6 months if they are as high on openshift as they are now.
Openshift is the only thing operating in its space that has maturity now.
Redhat is basically killing it here, partly through a well-executed
project and partly through every alternative being gobbled up and
mismanaged AFAICT. Kudos to them.
* Move to Cellsv2
(This was a followup to a talk given in Boston 6m ago; I need to track
that down.)
In Ocata, Nova splits into multiple DBs for components, and every
deployment inherently becomes celled even if there is only the implicit
cell0. So everyone is moving to cellsv2 no matter what. Cells all the way
down.
Reasons to spin off a cell:
- message queue is melting from load
- isolate failure domains
- remove cells of compute
Noted that clustering rabbit and rabbit load is tough.
A cellv2 is basically a distinct nova-api instance, associated DB, message
queue (rabbit), and compute farm.
Lots of weaknesses even up to and including Pike.
- no migration between cells
- sorting and reporting is weird
- no scheduler retries
- anti-affinity does not work
3/4 of that is in theory fixed in Queens. They made the appeal for more
engaged users, esp at scale.
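For reference, the moving parts they described map onto the nova-manage tooling roughly like this (a sketch: the connection URLs and cell name are placeholders, not our config):

```shell
# cell0 holds instances that failed scheduling; every Ocata+ deployment gets one.
nova-manage cell_v2 map_cell0 \
    --database_connection mysql+pymysql://nova:PASS@db/nova_cell0
# A "real" cell = its own nova DB + rabbit + compute farm.
nova-manage cell_v2 create_cell --name cell1 \
    --database_connection mysql+pymysql://nova:PASS@db/nova_cell1 \
    --transport-url rabbit://guest:PASS@rabbit:5672/
nova-manage cell_v2 discover_hosts   # map compute nodes into their cell
nova-manage cell_v2 list_cells
```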
Summary (because I found cellsv2 a bit hard to reason about before this
session): cellsv2 is entirely a Nova sharding scheme. It is ignorant of
any and all associated Neutron topology. Many of the examples I have seen
had implicit relationships between Neutron scaling and cellsv2, but that
is only a function of the operator's layout. Modern-ish Neutron has
"routed network segments" that can be aligned with a cellv2, providing l3
segmentation that overlaps the cellv2 arch, but it's all decoupled
management-wise in the control plane. Neutron network segmentation is a
thing we should look at in more detail, but it's not there in Mitaka. It
seems odd to me that the scaling methodologies for Nova and Neutron are
completely ships in the night, and most examples I have seen do not
address this well, so this was a good primer.
* Thoughts
Big bold letters on my notepad from day 1: Ceph has won. It's over. --
Cinder has something like 20+ backends (most of them proprietary), and
even the proprietary backends are trying to figure out how to play nice
with Ceph as non-commodity raw storage. It's not a total sweep though, as
it's big and complex and really not one-size-fits-all. Ceph has no native
QoS-style bindings or management at this time. [combo notes from other days]
Both Andrew and I talked with different folks, or sat in different
presentations, where this was acknowledged separately. The workarounds I
saw were to use flavor integration with cgroup IOPS limiting at the host
level. This needs several assumptions to hold, most of which we don't
satisfy atm, and it comes with many of the downsides of our current
low-brow scheme with TC for NFS. But it's on the "roadmap" and in the
minds of the Ceph devs for sure.
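As I understood it, the flavor-driven throttling amounts to libvirt IO limits set via flavor extra specs. A sketch, where the flavor name and limits are arbitrary examples of mine:

```shell
# These quota:* extra specs translate into libvirt iotune limits applied
# to every instance booted from the flavor, enforced at the hypervisor.
openstack flavor set m1.throttled \
    --property quota:disk_read_iops_sec=500 \
    --property quota:disk_write_iops_sec=500
```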
Openshift is killing it.
This conference was really useful for me, and I got to ask Neutron, Ceph,
and Cinder developers pointed questions about things that are not at all
clear from documentation (esp regarding roadmaps). There was a really
great dev-and-operator interaction vibe, with onboarding for project
contribution alongside sessions purely about operating at scale. Well run
and very dense in engaging content. I mainly stayed with the storage and
networking sessions, but that's self-selection: those are deep problem
sets we have, and I was trying to gather as much insight as I could.
These two arenas are also some of the most complex, with a high rate of
change in the last 6 releases. We /need/ to find a way to come to these
in an ongoing fashion. I think Andrew came away feeling very much the same.
[0] Post Neutron
[1] Openstack
--
Chase Pettet
chasemp on phabricator <https://phabricator.wikimedia.org/p/chasemp/> and
IRC
2017-11-07 20:00:02,501 INFO force is enabled
2017-11-07 20:00:02,519 INFO removing tools-project-backup
2017-11-07 20:00:02,553 INFO removing tools-project-backup
2017-11-07 20:00:03,079 INFO creating tools-project-backup at 2T
2017-11-07 20:00:03,894 INFO force is enabled
2017-11-07 20:00:03,944 INFO removing tools-snap
2017-11-07 20:00:03,982 INFO removing tools-snap
2017-11-07 20:00:05,134 INFO creating tools-snap at 1T
The main callout for us from the meeting is godog's feedback needed
for porting NFS stats from Diamond to Prometheus:
* <https://phabricator.wikimedia.org/T177196#3723497>
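One plausible shape for that port (a sketch only: the metric names and the node_exporter textfile-collector approach are my assumptions, not what godog asked for in the task):

```python
# Render NFS counters in Prometheus text exposition format, suitable for
# dropping into node_exporter's textfile collector directory. The
# nfsd_*_total metric names are invented for this sketch.

def prometheus_lines(stats: dict) -> str:
    out = []
    for name, value in sorted(stats.items()):
        metric = "nfsd_%s_total" % name
        out.append("# TYPE %s counter" % metric)
        out.append("%s %d" % (metric, value))
    return "\n".join(out) + "\n"

print(prometheus_lines({"read": 1234, "write": 567}))
```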
* Gerrit merge policy discussion; set gerrit to "rebase-if-necessary"?
** Tests may not run on the auto rebase so there is some concern
** Should there be a gate-and-submit job?
** Giuseppe to follow up with RelEng on options
* Keith making progress on Puppet4 setup task --
https://phabricator.wikimedia.org/T177254
* Input needed by godog from WMCS on nfs client/server Prometheus
stats https://phabricator.wikimedia.org/T177196#3723497
* Monitoring project workboard updated:
https://phabricator.wikimedia.org/tag/monitoring/
* 100+ servers over 5 years old in eqiad; new audit in process.
* new OpenJDK and OpenSSL packages coming from Moritz
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services Boise, ID USA
irc: bd808 v:415.839.6885 x6855
In a talk ATM; Arturo, could you look when you're around? Never recovered
On Nov 6, 2017 4:48 AM, "shinken" <shinken(a)shinken-01.shinken.eqiad.wmflabs>
wrote:
> Notification Type: PROBLEM
>
> Service: Free space - all mounts
> Host: tools-webgrid-lighttpd-1428
> Address: 10.68.19.172
> State: WARNING
>
> Date/Time: Sun 05 Nov 17:48:16 UTC 2017
>
> Additional Info:
>
> WARNING: tools.tools-webgrid-lighttpd-1428.diskspace._tmp.byte_percentfree
> (<100.00%)
2017-11-01 20:00:02,979 INFO force is enabled
2017-11-01 20:00:03,022 INFO removing misc-project-backup
2017-11-01 20:00:03,105 INFO removing misc-project-backup
2017-11-01 20:00:03,512 INFO creating misc-project-backup at 2T
2017-11-01 20:00:04,393 INFO force is enabled
2017-11-01 20:00:04,421 INFO removing misc-snap
2017-11-01 20:00:04,467 INFO removing misc-snap
2017-11-01 20:00:04,868 INFO creating misc-snap at 1T