On Thu, May 29, 2014 at 9:14 AM, Bryan Davis <bd808@wikimedia.org> wrote:
On Thu, May 29, 2014 at 8:58 AM, Bryan Davis <bd808@wikimedia.org> wrote:
I thought I'd start this discussion in earnest here on the mw-core list and then take it to a larger list if needed once we have a reasonable plan.
My logging changes [0][1][2][3] are getting closer to being mergeable (the first has already been merged). Tony Thomas' Swift Mailer change [4] is also progressing. Both sets of changes introduce the concept of specifying external library dependencies, both required and suggested, to mediawiki/core.git via composer.json. Composer can be used by people directly consuming the git repository to install and manage these dependencies. I gave an example set of usage instructions in the commit message for my patch that introduced the dependency on PSR-3 [0]. In the production cluster, on Jenkins job runners, and in the tarball releases prepared by M&M we will want a different solution.
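For illustration, a minimal composer.json along these lines might declare PSR-3 as a required dependency and Swift Mailer as a suggested one (the package names are the real Packagist ones; the other fields and version constraints are just a sketch, not what will necessarily be merged):

    {
        "name": "mediawiki/core",
        "license": "GPL-2.0+",
        "require": {
            "php": ">=5.3.2",
            "psr/log": "1.0.*"
        },
        "suggest": {
            "swiftmailer/swiftmailer": "Alternate mail transport"
        }
    }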
My idea of how to deal with this is to create a new gerrit repository (mediawiki/core/vendor.git?) that contains a composer.json file similar to the one I had in patch set 7 of my first logging patch [5]. This composer.json file would be used to tell Composer the exact versions of libraries to download. Someone would manually run Composer in a checkout of this repository and then commit the downloaded content, composer.lock file and generated autoload.php to the repository for review.

We would then be able to branch and use this repository as a git submodule in the wmf/1.2XwmfY branches that are deployed to production, and ensure that it is checked out along with mw-core on the Jenkins nodes. By placing this submodule at $IP/vendor in mw-core we would be mimicking the configuration that direct users of Composer will experience. WebStart.php already includes $IP/vendor/autoload.php when present, so integration with the rest of mw-core should follow from that. It would also be possible for M&M to add this repo to their tarballs for distribution.
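Roughly, the update workflow I'm picturing would look like this (the repository URL, paths and layout are a sketch under the assumptions above, not something settled):

    # Check out the vendor repository and update a pinned library.
    git clone https://gerrit.wikimedia.org/r/mediawiki/core/vendor
    cd vendor
    # Edit composer.json to add or bump a library version, then let
    # Composer fetch the code and regenerate the autoloader. Setting
    # Composer's "vendor-dir" config option to "." would keep the
    # downloaded libraries at the repository root.
    composer update
    # Commit the downloaded content, composer.lock and autoload.php,
    # then push the change to gerrit for review.
    git add -A && git commit

    # In a wmf/1.2XwmfY deploy branch, attach the repository at $IP/vendor:
    git submodule add https://gerrit.wikimedia.org/r/mediawiki/core/vendor vendor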
I think Ori has a slightly different idea about how to approach this issue. I'd like to hear his idea in this thread and then reach consensus on how to move forward here or take both ideas (and any other credible alternatives) to a large list for a final decision.
[5]: https://gerrit.wikimedia.org/r/#/c/119939/7/libs/composer.json,unified
I was just talking about this email with RobLa, and he brought up a use case that my current description doesn't fully explain. It also reminded me of one that Ori gave on IRC that is similar but slightly different.
RobLa's example is that of an external library that we need to patch for WMF usage, with the change sent upstream. To keep from blocking things for our production cluster, we would want to fork the upstream repository, add our patch for local use, and submit the patch upstream. While the patch was pending review upstream, we would want to use our locally patched version in production and on Jenkins.
Composer provides a solution for this with its "repositories" package source configuration. The Composer documentation actually gives this exact example in its discussion of the "vcs" repository type [6]. We would create a gerrit repository tracking the external library, add our patch(es), adjust the composer.json file in mediawiki/core/vendor.git to reference our fork, and finally run Composer in mediawiki/core/vendor.git to pull in our patched version.
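Concretely, the vendor composer.json could point at our fork like this (the library and fork URL here are hypothetical, and "dev-wmf-patches" would select our patched branch):

    {
        "repositories": [
            {
                "type": "vcs",
                "url": "https://gerrit.wikimedia.org/r/mediawiki/libs/monolog"
            }
        ],
        "require": {
            "monolog/monolog": "dev-wmf-patches"
        }
    }

Composer matches the fork to the package name declared in the fork's own composer.json, so the require line stays the normal package name while the code comes from our gerrit repository.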
The example that Ori gave on irc was for libraries that we are extracting from mw-core and/or extensions to be published externally. This may be done for any and all of the current $IP/includes/libs classes and possibly other content from core such as FormatJson.
My idea for this would be to create a new gerrit repository for each exported project. The project repo would contain a composer.json manifest describing the project so that it can be published at packagist.org like most Composer-installable libraries. In the mediawiki/core/vendor.git composer.json file we would then pull in these libraries just like any third-party developed library. This isn't functionally much different from the way that we use git submodules today. There is one extra level of indirection when a library is changed: the mediawiki/core/vendor.git repository will have to be updated with the new library version before the hash for the git submodule of mediawiki/core/vendor.git is updated in a deploy or release branch.
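As a sketch, the manifest for one of these extracted libraries (the package name and namespace here are made up for illustration) would look like any other Packagist package:

    {
        "name": "mediawiki/format-json",
        "description": "JSON formatting helpers extracted from MediaWiki core",
        "license": "GPL-2.0+",
        "autoload": {
            "psr-4": { "MediaWiki\\FormatJson\\": "src/" }
        },
        "require": {
            "php": ">=5.3.2"
        }
    }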
I'm assuming we'll eventually branch the project repo for each MediaWiki release, so that if MediaWiki 1.24 relies on one version of a library and 1.25 on another, that will all get handled?
Obligatory security questions:

* Who is going to approve what libraries we use, since we're basically blessing the version we use? And are we going to require code reviews for all of them?

* Who is going to remain responsible for making sure that security updates in those dependencies are merged into our repos and new versions of the MediaWiki tarballs released?
(/me yells "Not it!")
As long as we have strong, ongoing internal commitment to this, I don't see a problem.
Bryan
Bryan Davis              Wikimedia Foundation    <bd808@wikimedia.org>
[[m:User:BDavis_(WMF)]]  Sr Software Engineer    Boise, ID USA
irc: bd808               v:415.839.6885 x6855