Aha, replication lag would explain it! Since I'm only doing things during off-hours, how long should replication lag be these days?
It shouldn't normally be more than 2-3 seconds. We did have an issue with the S4 cluster (commonswiki) this weekend: the master crashed, and replication lag was very long until I brought the master back up. You could have seen some issues due to that.
We are also currently having some network issues, specifically with multicast UDP, which we use for Squid cache purging. This could also be causing the problem you are seeing. The scheduled downtime will address the multicast issues.
- Ryan