The checklist I ran through is in my personal bug tracker [0]. Almost everything worked and almost everything went smoothly. Almost.
Here are some of the notes I made as things went along:
* Running make-wmf-branch on bast1001 needed a change to the build dir because reedy already owns the default dir there.
** Since the first thing it does is `rm -rf` the dir, I think we could do something smarter in the script, like prompt to delete after completion and/or add the script pid to the dir name.
* Cutting the branch should be automatic. Jenkins could do this easily and make the timing predictable for all parties.
* It seems like the php-1.XwmfY checkouts on tin could be either shallow or single branch checkouts. I think Chad started playing with having multiple working copies that share the same repository which might be even nicer.
* Speaking of Chad's prototype work, /a/common/php-git makes `updateWikiversions` throw a warning: "updateBitsBranchPointers: link target /usr/local/apache/common-local/php-git/skins does not exist."
* I tried to make a script to automate copying security patches from one branch checkout to the next. It worked sort of. `git apply` wasn't smart enough to figure out that the patches I pulled off of the wmf15 checkout were already applied in the wmf16 branch. It would be nice to figure this out and get it automated or find a better way in general to manage security patches.
* Creating the on-wiki deploy notes is a PITA. I read the script a couple of times and tried running on my own and with tips from Sam. I never did get it to work for me (kept getting empty output). Sam ran it and it worked fine but he said "I recall the script being temperamental". We should definitely make this an automated job in Jenkins or elsewhere. Nobody should have to babysit this kind of communications process. (The cobbler's children have no shoes.)
* My ssh-agent (OS X 10.8.5) croaked badly when trying to run sync-wikiversions. This seems to be triggered by the full fanout (not batched) dsh call. Aaron had to step in and run both sync-wikiversions for me.
* l10n sync went badly again. wmf16 got partial en l10n data, and then my `scap --versions php-1.23wmf16` to fix it blew up badly in the scap-rebuild-cdbs step:
** Bug 62018 [1] - scap-rebuild-cdbs fails when scap called with `--versions` command line flag - is in the Python scap code, I'm pretty sure, and I'll get on fixing that.
** Bug 51174 [2] - Scap broken for deploying new versions of MediaWiki due to ExtensionMessage file not being created - looks like the things I added in I5467ac8 [3] were necessary but not sufficient to fix this. I stupidly didn't save a copy of the first .json files, but wmf16 didn't get a full English l10n cache in the cdb until a second scap was run. It seems likely to me that this is related to the "bootstrap" en l10n build that I put in there to get mergeMessageFileList.php to run.
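On the build dir note above, here is a minimal sketch of the "unique dir, prompt before delete" idea. I have not checked how make-wmf-branch actually chooses its build dir, so `make_build_dir`, `cleanup_build_dir` and the `wmf-branch` prefix are hypothetical names, not anything from the real script.

```shell
#!/bin/sh
# Hypothetical helpers; not taken from make-wmf-branch itself.

# Create a private, uniquely named build dir instead of reusing (and
# rm -rf'ing) a shared default one.
make_build_dir() {
    mktemp -d "${TMPDIR:-/tmp}/wmf-branch.XXXXXX"
}

# Ask before deleting the build dir once the branch has been cut.
cleanup_build_dir() {
    dir=$1
    printf 'Delete build dir %s? [y/N] ' "$dir"
    read -r answer
    case $answer in
        y|Y) rm -rf -- "$dir" ;;
        *)   echo "Keeping $dir" ;;
    esac
}
```

Using `mktemp -d` rather than the pid alone avoids collisions between users and between repeated runs; the prompt defaults to keeping the dir.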
Things actually went pretty smoothly though. Thanks a lot to Chad and Sam for helping me make a checklist and Aaron for being around to lend a hand when I fell and couldn't get up.
[0]: https://github.com/bd808/wmf-kanban/issues/57
[1]: https://bugzilla.wikimedia.org/show_bug.cgi?id=62018
[2]: https://bugzilla.wikimedia.org/show_bug.cgi?id=51174
[3]: https://gerrit.wikimedia.org/r/#/c/113260/
Bryan
On Thu, Feb 27, 2014 at 2:02 PM, Bryan Davis <bd808@wikimedia.org> wrote:
- Cutting the branch should be automatic. Jenkins could do this easily
and make the timing predictable for all parties.
Yeah this is true. Especially since we got rid of wmf-specific hacks in the branch :)
- It seems like the php-1.XwmfY checkouts on tin could be either
shallow or single branch checkouts. I think Chad started playing with having multiple working copies that share the same repository which might be even nicer.
Yeah. I'll start poking this again once I'm back Monday.
- Speaking of Chad's prototype work, /a/common/php-git makes
`updateWikiversions` throw a warning: "updateBitsBranchPointers: link target /usr/local/apache/common-local/php-git/skins does not exist."
Well it shouldn't exist since it's a bare repo ;-) Script needs adjusting.
- I tried to make a script to automate copying security patches from
one branch checkout to the next. It worked sort of. `git apply` wasn't smart enough to figure out that the patches I pulled off of the wmf15 checkout were already applied in the wmf16 branch. It would be nice to figure this out and get it automated or find a better way in general to manage security patches.
Yes. This a thousand times. Let's brainstorm next week :)
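As one possible starting point for that brainstorm: `git apply --check --reverse` succeeds only when a patch can be cleanly un-applied, which can serve as an "is it already in this checkout?" test. This is a sketch under that assumption; `patch_already_applied` and `apply_if_missing` are hypothetical helper names, not part of any existing deploy script.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the patch file $1 is already applied
# in the current working tree, i.e. the patch can be cleanly reversed.
patch_already_applied() {
    git apply --reverse --check "$1" 2>/dev/null
}

# Apply a patch only when the checkout does not already contain it.
apply_if_missing() {
    if patch_already_applied "$1"; then
        echo "skip (already applied): $1"
    else
        git apply "$1" && echo "applied: $1"
    fi
}
```

Note that this only detects patches that reverse exactly; if the new branch has since touched the same lines, the check fails and the patch still needs a human to look at it.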
- My ssh-agent (OS X 10.8.5) croaked badly when trying to run
sync-wikiversions. This seems to be triggered by the full fanout (not batched) dsh call. Aaron had to step in and run both sync-wikiversions for me.
I've been complaining about this for months. It's one of the reasons I'm pretty convinced Tampa is often horribly out of sync, because I imagine I'm not alone in getting such errors.
Things actually went pretty smoothly though. Thanks a lot to Chad and Sam for helping me make a checklist and Aaron for being around to lend a hand when I fell and couldn't get up.
You're very welcome. Hopefully we can tidy up some of the pain points you mentioned.
-Chad
Hello,
Replies inline with random thoughts; I dropped the sections for which I had no reply.
On 27/02/2014 23:02, Bryan Davis wrote: <snip>
- Running make-wmf-branch on bast1001 needed a change to the build dir
because reedy already owns the default dir there. ** Since the first thing it does is `rm -rf` the dir I think we could do something smarter in the script like prompt to delete after completion and/or add the script pid to the dir name.
We should not run anything on bastion hosts. Maybe use terbium or tin directly. For permission issues, make sure the dir belongs to the wikidev group and has the setgid bit set:
chmod 2775 whateverdir
Everyone should have a umask that leaves the write bit set for the wikidev group.
Maybe the script could check whether the umask is properly set.
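A minimal sketch of such a check, assuming a POSIX shell. The function name `umask_allows_group_write` is made up for illustration; the only real requirement is that the group write bit (octal 020) is not masked out.

```shell
#!/bin/sh
# Return 0 if the given umask (default: the current one) leaves the
# group write bit enabled, so that new files are group-writable.
umask_allows_group_write() {
    mask=${1:-$(umask)}
    # The umask is octal; 020 is the group write bit.
    [ $(( 0$mask & 020 )) -eq 0 ]
}

if ! umask_allows_group_write; then
    echo "warning: umask $(umask) removes group write; try 'umask 002'" >&2
fi
```

With the common default of 022 this prints the warning; with 002 it stays silent.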
- Cutting the branch should be automatic. Jenkins could do this easily
and make the timing predictable for all parties.
Can you file this as a bug under either the Deployment or Continuous Integration component? I would be more than happy to pair with someone to craft the job.
We might want to have that job on a secured/private Jenkins instead of the CI one though.
- It seems like the php-1.XwmfY checkouts on tin could be either
shallow or single branch checkouts. I think Chad started playing with having multiple working copies that share the same repository which might be even nicer.
We can use Gerrit replication to push everything to some path on the same disk where the wmf checkouts are done. Then one can clone using that local copy as reference which saves IO and disk space:
git clone --reference /some/path mediawiki/core.git destdir
The reference HAS to be on the same physical disks or hardlinks are not possible.
I have been using that trick on the CI production slaves, which receive replicas of all repositories. That speeds up operations dramatically.
With submodules that might be tricky. Although git submodule accepts a --reference parameter, I am not sure how one can make it map the submodules to the local checkouts.
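To make the `--reference` idea concrete, here is a throwaway demo that can be run anywhere. Every path in it is made up: `upstream` stands in for Gerrit, `replica.git` for the locally replicated copy, and `php-demo` for a wmf checkout. It only shows the mechanism and says nothing about how tin is actually laid out.

```shell
#!/bin/sh
set -e
# Throwaway demo; all paths below are made up for illustration.
work=$(mktemp -d)

# Stand-in for the "upstream" repository (Gerrit, in the real setup).
git init -q "$work/upstream"
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "initial commit"

# Stand-in for the local replica that Gerrit replication would maintain.
git clone -q --bare "$work/upstream" "$work/replica.git"

# Clone from upstream, borrowing objects from the local replica.
git clone -q --reference "$work/replica.git" "$work/upstream" "$work/php-demo"

# The borrowed object store is recorded here:
cat "$work/php-demo/.git/objects/info/alternates"
```

The new clone records the replica's object directory in `objects/info/alternates`, so the replica must stay in place for as long as the checkout is in use.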
<snip php-git>
- Speaking of Chad's prototype work, /a/common/php-git makes
`updateWikiversions` throw a warning: "updateBitsBranchPointers: link target /usr/local/apache/common-local/php-git/skins does not exist."
- I tried to make a script to automate copying security patches from
one branch checkout to the next. It worked sort of. `git apply` wasn't smart enough to figure out that the patches I pulled off of the wmf15 checkout were already applied in the wmf16 branch. It would be nice to figure this out and get it automated or find a better way in general to manage security patches.
If the patch is in a local topic branch, you can use 'git cherry' to find out whether it is already applied:
$ git checkout -b security01
$ git apply
$ git cherry -v origin/master
+ caaa889a831c33a95017b76d665e5304d1a20004 Security patch 001
$

'+' means the patch is NOT in origin/master
'-' means an equivalent is in upstream
So you can cherry-pick the commits having a '+'.
I use it constantly to find out whether my local topic branches have landed, so that I can delete them safely (though they are in Gerrit anyway).
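For anyone who wants to see both markers side by side, here is a small self-contained demo in a scratch repository. The branch names (`upstream`, `security`) and file names are invented for the example; in the real workflow the comparison would be against the new wmf branch.

```shell
#!/bin/sh
set -e
# Throwaway repo; branch and file names are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.org

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch upstream            # plays the role of the newer branch

# Two security patches committed on a local topic branch.
git checkout -q -b security
echo fix1 > fix1.txt; git add fix1.txt; git commit -q -m "Security patch 001"
echo fix2 > fix2.txt; git add fix2.txt; git commit -q -m "Security patch 002"

# The newer branch has moved on and already carries patch 001.
git checkout -q upstream
echo other > other.txt; git add other.txt
git commit -q -m "unrelated upstream change"
git cherry-pick security~1 >/dev/null

# '-' = equivalent change already in upstream, '+' = still missing.
git checkout -q security
git cherry -v upstream
```

The last command should list patch 001 with a '-' and patch 002 with a '+', even though the cherry-picked commit has a different hash, because `git cherry` compares the changes themselves rather than commit ids.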
- Creating the on-wiki deploy notes is a PITA. I read the script a
couple of times and tried running on my own and with tips from Sam. I never did get it to work for me (kept getting empty output). Sam ran it and it worked fine but he said "I recall the script being temperamental". We should definitely make this an automated job in Jenkins or elsewhere. Nobody should have to babysit this kind of communications process. (The cobbler's children have no shoes.)
Can you file this as a bug under either the Deployment or Continuous Integration component? I can pair with Sam during our afternoons.
Again, we might want a dedicated Jenkins server.
- My ssh-agent (OS X 10.8.5) croaked badly when trying to run
sync-wikiversions. This seems to be triggered by the full fanout (not batched) dsh call. Aaron had to step in and run both sync-wikiversions for me.
Tim, being far from tin, has an ssh key pair on tin and launches an agent there. That way it is the ssh-agent on tin that serves the keys to the hosts, which makes the sync way faster. I am doing the same (plus this way I can disable ssh-agent forwarding and avoid exposing my credentials to the root user on tin).
<snip l10n scap>
Things actually went pretty smoothly though. Thanks a lot to Chad and Sam for helping me make a checklist and Aaron for being around to lend a hand when I fell and couldn't get up.
Well done folks! Maybe we should have more people deploy more often instead of relying solely on Sam (no offense). That is a good opportunity to identify automatable tasks and culprits, as you have highlighted, freeing up more time for us to diagnose potential issues or... code!
cheers,
mediawiki-core@lists.wikimedia.org