Thanks Count Count. I have identified a new issue with the new k8s cluster and am looking into it.
On Sun, 12 Jan 2020 at 21:43, Count Count <countvoncount123456(a)gmail.com> wrote:
Yes, I switched back to the old cluster. This is a new tool, but it is
used in production, even if only rarely, and I can't leave it offline for
hours. I have created a test tool as a copy with which I can reproduce
the issue:
tools.countcounttest@tools-sgebastion-07:~$ kubectl get pods
NAME                              READY   STATUS              RESTARTS   AGE
countcounttest-6b58f5c547-mf4jx   0/1     ContainerCreating   0          77s
I will leave that running. If the container gets created I might also
be able to reproduce the segfault.
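For anyone following along, this is roughly how I am keeping an eye on the test pod. A minimal sketch; the pod name is the one from the output above, and the `stuck_pods` helper name is made up:

```shell
# stuck_pods: read `kubectl get pods --no-headers` output on stdin and
# print the names of pods whose STATUS column is ContainerCreating.
# (Helper name is made up; adjust to taste.)
stuck_pods() {
  awk '$3 == "ContainerCreating" { print $1 }'
}

# Typical usage against the cluster (not run here):
#   kubectl get pods --no-headers | stuck_pods   # list stuck pods
#   kubectl get pods -w                          # watch status changes live
#   kubectl describe pod countcounttest-6b58f5c547-mf4jx
#     # the Events section at the bottom usually says *why* the
#     # container is not being created
```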
Best regards,
Count Count
On Sun, Jan 12, 2020 at 10:20 PM Alex Monk <krenair(a)gmail.com> wrote:
> Hi Count Count,
>
> I'm afraid you seem to have no pods on the new cluster to look at:
>
> # kubectl get -n tool-flaggedrevspromotioncheck pod
> No resources found.
>
> Alex
>
> On Sun, 12 Jan 2020 at 21:07, Count Count <countvoncount123456(a)gmail.com> wrote:
>
>> Hi!
>>
>> I don't have much luck with a webservice based on the python3.7
>> image. It is running fine on the legacy K8s cluster.
>>
>> On the new cluster I got a segfault. After stopping the webservice
>> and trying again I got an empty log, and the pod is now stuck in
>> ContainerCreating.
>>
>> A few minutes ago:
>> tools.flaggedrevspromotioncheck@tools-sgebastion-08:~$ kubectl get pods
>> NAME                                         READY   STATUS              RESTARTS   AGE
>> flaggedrevspromotioncheck-7cbfff44fc-jnhmq   0/1     ContainerCreating   0          2m48s
>>
>> ...and just now:
>> tools.flaggedrevspromotioncheck@tools-sgebastion-08:~$ kubectl get pods
>> NAME                                         READY   STATUS              RESTARTS   AGE
>> flaggedrevspromotioncheck-7cbfff44fc-q55gm   0/1     ContainerCreating   0          5m18s
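Two things worth checking for a pod in this state: the Events section of `kubectl describe pod` usually explains why creation is stuck, and for the earlier segfault, the recorded container exit code names the signal. A small sketch, assuming standard Kubernetes behavior; the `exit_signal` helper name is made up:

```shell
# A container killed by a signal exits with 128 + signal number, so a
# segfault (SIGSEGV, signal 11) shows up as "Exit Code: 139" in
# `kubectl describe pod`. Tiny helper (name made up) to decode it:
exit_signal() {
  if [ "$1" -gt 128 ]; then echo "$(( $1 - 128 ))"; else echo 0; fi
}

# Typical usage against the cluster (not run here):
#   kubectl describe pod flaggedrevspromotioncheck-7cbfff44fc-q55gm
#     # Events section explains the ContainerCreating hang
#   kubectl logs --previous <pod>   # output of the crashed container
```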
>>
>> Best regards,
>>
>> Count Count
>>
>> On Thu, Jan 9, 2020 at 10:58 PM Bryan Davis <bd808(a)wikimedia.org> wrote:
>>
>>> I am happy to announce that a new and improved Kubernetes cluster is
>>> now available for use by beta testers on an opt-in basis. A page has
>>> been created on Wikitech [0] outlining the self-service migration
>>> process.
>>>
>>> Timeline:
>>> * 2020-01-09: 2020 Kubernetes cluster available for beta testers on an opt-in basis
>>> * 2020-01-23: 2020 Kubernetes cluster general availability for migration on an opt-in basis
>>> * 2020-02-10: Automatic migration of remaining workloads from 2016 cluster to 2020 cluster by Toolforge admins
>>>
>>> This new cluster has been a work in progress for more than a year
>>> within the Wikimedia Cloud Services team, and a top priority project
>>> for the past six months. About 35 tools, including
>>>
>>> https://tools.wmflabs.org/admin/, are currently running on what we are
>>> calling the "2020 Kubernetes cluster". This new cluster is running
>>> Kubernetes v1.15.6 and Docker 19.03.4. It is also using a newer
>>> authentication and authorization method (RBAC), a new ingress routing
>>> service, and a different method of integrating with the Developer
>>> account LDAP service. We have built a new tool [1] which makes the
>>> state of the Kubernetes cluster more transparent and on par with the
>>> information that we already expose for the grid engine cluster [2].
>>>
>>> With a significant number of tools managed by Toolforge administrators
>>> already migrated to the new cluster, we are fairly confident that the
>>> basic features used by most Kubernetes tools are covered. It is likely
>>> that a few outlying issues remain to be found as more tools move, but
>>> we are confident that we can address them quickly. This has led us to
>>> propose a fairly short period of voluntary beta testing, followed by a
>>> short general availability opt-in migration period, and finally a
>>> complete migration of all remaining tools, which will be done by the
>>> Toolforge administration team for anyone who has not migrated
>>> themselves.
>>>
>>> Please help with beta testing if you have some time and are willing to
>>> get help on IRC, Phabricator, and the cloud(a)lists.wikimedia.org
>>> mailing list for early adopter issues you may encounter.
>>>
>>> I want to publicly praise Brooke Storm and Arturo Borrero González for
>>> the hours that they have put into reading docs, building proof of
>>> concept clusters, and improving automation and processes to make the
>>> 2020 Kubernetes cluster possible. The Toolforge community can look
>>> forward to more frequent and less disruptive software upgrades in this
>>> cluster as a direct result of this work. We have some other feature
>>> improvements in planning now that I think you will all be excited to
>>> see and use later this year!
>>>
>>> [0]: https://wikitech.wikimedia.org/wiki/News/2020_Kubernetes_cluster_migration
>>> [1]: https://tools.wmflabs.org/k8s-status/
>>> [2]: https://tools.wmflabs.org/sge-status/
>>>
>>> Bryan (on behalf of the Toolforge admins and the Cloud Services team)
>>> --
>>> Bryan Davis, Technical Engagement, Wikimedia Foundation
>>> Principal Software Engineer, Boise, ID USA
>>> [[m:User:BDavis_(WMF)]]  irc: bd808
>>>
>>> _______________________________________________
>>> Wikimedia Cloud Services announce mailing list
>>> Cloud-announce(a)lists.wikimedia.org (formerly labs-announce(a)lists.wikimedia.org)
>>> https://lists.wikimedia.org/mailman/listinfo/cloud-announce
>>>
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud(a)lists.wikimedia.org (formerly labs-l(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud