On Tue, Nov 20, 2018 at 4:41 PM Mukunda Modell mmodell@wikimedia.org wrote:
Despite all of our efforts, there is still a configuration error preventing MediaWiki from working in beta. I've been stuck for several hours now, but I'm hopeful that this is the last major issue, and I'm sure it's simple for someone who understands MediaWiki internals better than I do. Unfortunately, I'm completely stumped, so I could really use some help from someone who understands the configuration of MediaWiki session storage and the underlying object cache.
The problem is described in https://phabricator.wikimedia.org/T210030 so I won't repeat it here. I'll simply appeal to those of you who know something about how "BagOStuff" is configured: please take a look at T210030 and point me in the right direction.
The beta cluster wikis are working again. It turns out that there was some confusion when moving/removing servers because of implementation drift between our production clusters and the beta cluster.
Before the move to the new region, we had both "memc*" and "redis*" servers in the beta cluster project. The "memc*" servers are the equivalent of our production "mc*" servers. In production, the "mc*" servers run both memcached and redis services. In the beta cluster, our "memc*" servers were only providing memcached, and the configuration relied on the "redis*" servers for session storage. The "redis*" servers were removed while migrating virtual machines to the eqiad1-r region, under the assumption that they were legacy servers from the time when we used redis as storage for the job queue. The fix was to set up the "memc*" servers with both memcached for arbitrary data caching and redis for session storage. If you are interested in the gory details, see the notes left on https://phabricator.wikimedia.org/T210030.
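As a rough illustration of the mismatch described above (not the actual beta cluster configuration), here is a minimal Python sketch that models which configured backends have no host providing them. The host names `memc-example` and `redis-example`, the role names, and the helper `unmet_roles` are all hypothetical and exist only for this example.

```python
from typing import Dict, List, Set, Tuple

# role -> (host, service) that the configuration expects to be available
Required = Dict[str, Tuple[str, str]]
# host -> set of services actually running on that host
Inventory = Dict[str, Set[str]]


def unmet_roles(required: Required, hosts: Inventory) -> List[str]:
    """Return the roles whose configured host is gone or lacks the service."""
    problems = []
    for role, (host, service) in sorted(required.items()):
        # A missing host is treated the same as a host without the service.
        if service not in hosts.get(host, set()):
            problems.append(role)
    return problems


# Hypothetical state after the "redis*" hosts were removed:
# the config still points session storage at a host that no longer exists.
config_before = {
    "object cache": ("memc-example", "memcached"),
    "sessions": ("redis-example", "redis"),
}
hosts_before = {"memc-example": {"memcached"}}

# Hypothetical state after the fix: the "memc*" host runs both services
# and the session configuration points at it.
config_after = {
    "object cache": ("memc-example", "memcached"),
    "sessions": ("memc-example", "redis"),
}
hosts_after = {"memc-example": {"memcached", "redis"}}

broken = unmet_roles(config_before, hosts_before)  # ["sessions"]
fixed = unmet_roles(config_after, hosts_after)     # []
```

A check along these lines, run against the real inventory before deleting instances, is one way such drift between clusters could be caught earlier; the data model here is an assumption, not how the beta cluster is actually described.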
Bryan