You have to create the venv in a container, using `webservice shell` for the right runtime. We support Python versions from Debian Jessie, Stretch and Buster by building them in containers, so we cannot have more than one of those versions on the bastion. We have moved a lot of Python tools back and forth between the clusters without issue, but the container images have been rebuilt, and that can introduce issues. Tomorrow I will try a few things to be sure, but a venv built with a different interpreter can easily cause problems like this.
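For reference, the usual recipe for rebuilding a venv inside the matching runtime looks roughly like this (only a sketch, assuming the standard ~/www/python/venv location and a requirements.txt under ~/www/python/src for a python3.7 webservice):

    webservice --backend=kubernetes python3.7 shell
    # inside the container, recreate the venv with the container's own interpreter
    rm -rf ~/www/python/venv
    python3 -m venv ~/www/python/venv
    source ~/www/python/venv/bin/activate
    pip install --upgrade pip
    # the requirements.txt path is just an example; use wherever the tool keeps it
    pip install -r ~/www/python/src/requirements.txt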
On Sun, Jan 12, 2020, 16:30 Alex Monk krenair@gmail.com wrote:
Interesting, uwsgi had Python 3.7.3 but `./www/python/venv/bin/python --version` says 3.7.6. Is that a big enough difference to cause problems?
On Sun, 12 Jan 2020 at 23:19, Chico Venancio chicocvenancio@gmail.com wrote:
Maybe a venv created in a different python version?
Chico Venancio
On Sun, 12 Jan 2020 at 20:14, Alex Monk krenair@gmail.com wrote:
I think I've seen that particular error in stdout/stderr (via kubectl logs) before - on pods that were in fact working.
Meanwhile, uwsgi.log says:
Python version: 3.7.3 (default, Apr 3 2019, 05:39:12) [GCC 8.3.0]
Set PythonHome to /data/project/countcounttest/www/python/venv
Fatal Python error: initfsencoding: Unable to get the locale encoding
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00007fe50490e780 (most recent call first):
!!! uWSGI process 1 got Segmentation Fault !!!
followed by a backtrace. That suggests the problem is related to something inside the image/application code rather than the cluster itself anyway. I notice the pod on the new cluster seems to be using the sssd variant of the toolforge-python37-web image, which pods in the old cluster are not using. I doubt it's the source of the problem, as uwsgi shouldn't be segfaulting over some issue talking to LDAP... Needs further investigation by someone during the week, I think.
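One quick way to see which image a pod is running is something along these lines (sketched with the pod name from earlier in this thread):

    kubectl get pod countcounttest-6b58f5c547-785mr -o jsonpath='{.spec.containers[*].image}'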
On Sun, 12 Jan 2020 at 23:00, Count Count countvoncount123456@gmail.com wrote:
Your pod started a container and it crashed, I see a uwsgi.log file
with a python module problem and a uwsgi segfault.
Yes. It was working fine on the legacy cluster. The service is started via `webservice --backend=kubernetes python3.7 start`.
Apparently it cannot load the uwsgi shared library if deployed on the new cluster?

tools.countcounttest@tools-sgebastion-07:~$ kubectl logs countcounttest-6b58f5c547-785mr
open("/usr/lib/uwsgi/plugins/python_plugin.so"): No such file or directory [core/utils.c line 3724]
!!! UNABLE to load uWSGI plugin: /usr/lib/uwsgi/plugins/python_plugin.so: cannot open shared object file: No such file or directory !!!
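If it helps narrow things down, one could check whether that plugin file is actually present in the runtime image by opening a shell in the same image (a rough sketch):

    webservice --backend=kubernetes python3.7 shell
    # then, inside the container shell:
    ls -l /usr/lib/uwsgi/plugins/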
On Sun, Jan 12, 2020 at 11:42 PM Alex Monk krenair@gmail.com wrote:
Hi Count Count, I believe I may have sorted out an issue that prevented some pods (depending partially on luck) from creating containers. Your pod started a container and it crashed; I see a uwsgi.log file with a python module problem and a uwsgi segfault.
On Sun, 12 Jan 2020 at 22:12, Alex Monk krenair@gmail.com wrote:
Thanks Count Count. I have identified a new issue with the new k8s cluster and am looking into it.
On Sun, 12 Jan 2020 at 21:43, Count Count <countvoncount123456@gmail.com> wrote:
> Yes, I switched back to the old cluster. This is a new tool that was
> used in production even if only rarely. I can't leave it offline for hours.
>
> I have created a test tool as a copy with which I can reproduce the issue:
>
> tools.countcounttest@tools-sgebastion-07:~$ kubectl get pods
> NAME                              READY   STATUS              RESTARTS   AGE
> countcounttest-6b58f5c547-mf4jx   0/1     ContainerCreating   0          77s
>
> I will leave that running. If the container gets created I might also be
> able to reproduce the segfault.
>
> Best regards,
>
> Count Count
>
> On Sun, Jan 12, 2020 at 10:20 PM Alex Monk krenair@gmail.com wrote:
>
>> Hi Count Count,
>>
>> I'm afraid you seem to have no pods on the new cluster to look at:
>>
>> # kubectl get -n tool-flaggedrevspromotioncheck pod
>> No resources found.
>>
>> Alex
>>
>> On Sun, 12 Jan 2020 at 21:07, Count Count <countvoncount123456@gmail.com> wrote:
>>
>>> Hi!
>>>
>>> I don't have much luck with a webservice based on the python3.7 image.
>>> It is running fine on the legacy K8s cluster.
>>>
>>> On the new cluster I got a segfault. After stopping the webservice and
>>> trying again to get an empty log, the pod is now stuck in ContainerCreating.
>>>
>>> A few minutes ago:
>>> tools.flaggedrevspromotioncheck@tools-sgebastion-08:~$ kubectl get pods
>>> NAME                                         READY   STATUS              RESTARTS   AGE
>>> flaggedrevspromotioncheck-7cbfff44fc-jnhmq   0/1     ContainerCreating   0          2m48s
>>>
>>> ...and just now:
>>> tools.flaggedrevspromotioncheck@tools-sgebastion-08:~$ kubectl get pods
>>> NAME                                         READY   STATUS              RESTARTS   AGE
>>> flaggedrevspromotioncheck-7cbfff44fc-q55gm   0/1     ContainerCreating   0          5m18s
>>>
>>> Best regards,
>>>
>>> Count Count
>>>
>>> On Thu, Jan 9, 2020 at 10:58 PM Bryan Davis bd808@wikimedia.org wrote:
>>>
>>>> I am happy to announce that a new and improved Kubernetes cluster is
>>>> now available for use by beta testers on an opt-in basis. A page has
>>>> been created on Wikitech [0] outlining the self-service migration
>>>> process.
>>>>
>>>> Timeline:
>>>> * 2020-01-09: 2020 Kubernetes cluster available for beta testers on an
>>>>   opt-in basis
>>>> * 2020-01-23: 2020 Kubernetes cluster general availability for
>>>>   migration on an opt-in basis
>>>> * 2020-02-10: Automatic migration of remaining workloads from 2016
>>>>   cluster to 2020 cluster by Toolforge admins
>>>>
>>>> This new cluster has been a work in progress for more than a year
>>>> within the Wikimedia Cloud Services team, and a top priority project
>>>> for the past six months. About 35 tools, including
>>>> https://tools.wmflabs.org/admin/, are currently running on what we are
>>>> calling the "2020 Kubernetes cluster". This new cluster is running
>>>> Kubernetes v1.15.6 and Docker 19.03.4. It is also using a newer
>>>> authentication and authorization method (RBAC), a new ingress routing
>>>> service, and a different method of integrating with the Developer
>>>> account LDAP service. We have built a new tool [1] which makes the
>>>> state of the Kubernetes cluster more transparent and on par with the
>>>> information that we already expose for the grid engine cluster [2].
>>>>
>>>> With a significant number of tools managed by Toolforge administrators
>>>> already migrated to the new cluster, we are fairly confident that the
>>>> basic features used by most Kubernetes tools are covered. It is likely
>>>> that a few outlying issues remain to be found as more tools move, but
>>>> we have confidence that we can address them quickly. This has led us
>>>> to propose a fairly short period of voluntary beta testing, followed
>>>> by a short general availability opt-in migration period, and finally a
>>>> complete migration of all remaining tools which will be done by the
>>>> Toolforge administration team for anyone who has not migrated
>>>> themselves.
>>>>
>>>> Please help with beta testing if you have some time and are willing to
>>>> get help on IRC, Phabricator, and the cloud@lists.wikimedia.org
>>>> mailing list for early adopter issues you may encounter.
>>>>
>>>> I want to publicly praise Brooke Storm and Arturo Borrero González for
>>>> the hours that they have put into reading docs, building proof of
>>>> concept clusters, and improving automation and processes to make the
>>>> 2020 Kubernetes cluster possible. The Toolforge community can look
>>>> forward to more frequent and less disruptive software upgrades in this
>>>> cluster as a direct result of this work. We have some other feature
>>>> improvements in planning now that I think you will all be excited to
>>>> see and use later this year!
>>>>
>>>> [0]: https://wikitech.wikimedia.org/wiki/News/2020_Kubernetes_cluster_migration
>>>> [1]: https://tools.wmflabs.org/k8s-status/
>>>> [2]: https://tools.wmflabs.org/sge-status/
>>>>
>>>> Bryan (on behalf of the Toolforge admins and the Cloud Services team)
>>>> --
>>>> Bryan Davis              Technical Engagement      Wikimedia Foundation
>>>> Principal Software Engineer                               Boise, ID USA
>>>> [[m:User:BDavis_(WMF)]]                                      irc: bd808
>>>>
>>>> _______________________________________________
>>>> Wikimedia Cloud Services announce mailing list
>>>> Cloud-announce@lists.wikimedia.org (formerly labs-announce@lists.wikimedia.org)
>>>> https://lists.wikimedia.org/mailman/listinfo/cloud-announce
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly labs-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud