Hi, let me recycle this reply, originally posted at "Determine phabricator.wikimedia.org service level" - https://phabricator.wikimedia.org/T76381
Currently Phabricator is getting the same service level that Bugzilla had. Looking at the whole Wikimedia picture, I think this is the most sensible option. I don't see any strong reason to change it.
Bugzilla went down unexpectedly several times in past years, and if Ops was able to react more quickly then, it was only because we were luckier with the cause, timing and location of the failures. If we had had Bugzilla instead of Phabricator in the rack that went down this weekend, the service provided by Ops would have been exactly the same.
We can reopen this discussion when planning the migration of code review and (eventually) continuous integration. For now, I think we are good. This is the opinion of the Engineering Community team. If this also works for Operations and Platform Engineering, then we can resolve this task.
PS: About the downtime itself, 5 hours on a weekend is clearly unfortunate, but imho nothing that should make us revise the current service level. Was anybody left unable to work, sitting with arms crossed? Was any project delayed? I'm counting volunteers as much as employees. Personally, I only learned about the downtime from wikitech-l, having used Phabricator during the night from Saturday to Sunday at 1am CET, and then again on Sunday at 1pm.