Architecturally it may be desirable to factor our codebase into multiple independent services with clear APIs, but small wikis would clearly like a "single server" installation with all of the services running under one roof, as it were. Some options previously proposed have involved VM containers that bundle PHP, Node, MediaWiki and all required services into a preconfigured full system image. (T87774 https://phabricator.wikimedia.org/T87774)
This summit topic/RFC proposes an alternative: tightly integrating PHP/HHVM with a persistent server process running under node.js. The central service bundles together multiple independent services, written in either PHP or JavaScript, and coordinates their configurations. Running a wiki-with-services can be done on a shared node.js host like Heroku.
This is not intended as a production configuration for large wikis -- in those cases having separate server farms for PHP, PHP services, and JavaScript services is best: that independence is indeed the reason why refactoring into services is desirable. But integrating the services into a single process allows for hassle-free configuration and maintenance of small wikis.
A proof-of-concept has been built. The node package php-embed https://www.npmjs.com/package/php-embed embeds PHP 5.6.14 into a node.js (>= 2.4.0) process, with bidirectional property and method access between PHP and node. The package mediawiki-express https://www.npmjs.com/package/mediawiki-express uses this to embed MediaWiki into an express.js http://expressjs.com/ HTTP server. (Other HTTP server frameworks could equally well be used.) A hook in `LocalSettings.php` allows you to configure the MediaWiki instance in JavaScript.
A bit of further hacking would allow you to fully configure the MediaWiki instance (in either PHP or JavaScript) and to dispatch to Parsoid (running in the same process).
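To make that concrete, here is a minimal sketch of what a single-process wiki could look like. The module names are real, but the `serve` entry point, its options, and the JavaScript configuration hook are assumptions for illustration, not necessarily the actual mediawiki-express API:

    // Hypothetical sketch of a single-process wiki. The 'serve' function,
    // its options, and the 'localSettings' hook are invented for
    // illustration; consult the mediawiki-express README for the real API.
    var express = require('express');
    var mw = require('mediawiki-express'); // embeds PHP via php-embed

    var app = express();

    // Mount MediaWiki (running on the embedded PHP engine) under /w and /wiki.
    mw.serve(app, {
      localSettings: function (config) {
        // Configure the wiki from JavaScript instead of editing
        // LocalSettings.php directly.
        config.wgSitename = 'My Little Wiki';
      }
    });

    app.listen(8080); // one process serving the whole wiki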
*SUMMIT GOALS / FOR DISCUSSION*
- Determine whether this technology (or something similar) might be an acceptable alternative for small sites which are currently using shared hosting. See T113210 https://phabricator.wikimedia.org/T113210 for related discussion.
- Identify and address technical roadblocks to deploying modular single-server wikis (see below).
- Discuss methods for deploying complex wikitext extensions. For example, the WikiHiero https://www.mediawiki.org/wiki/Extension:WikiHiero extension would ideally be distributed with (a) PHP code hooking MediaWiki core, (b) client-side JavaScript extending VisualEditor, and (c) server-side JavaScript extending Parsoid. Can these be distributed as a single integrated bundle? (A strawman manifest is sketched below.)
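As a strawman for that last question, an integrated bundle might declare all three pieces in one manifest that both composer-style and npm-style tooling could read. Everything below is hypothetical; no such `mediawiki` manifest key exists today, and the file paths are invented:

    {
      "name": "mediawiki-ext-wikihiero",
      "description": "Hypothetical integrated WikiHiero bundle",
      "mediawiki": {
        "php": "php/WikiHiero.php",
        "visualeditor": "modules/ve-wikihiero.js",
        "parsoid": "lib/parsoid-wikihiero.js"
      }
    }

The installer (whether PHP or node) would then wire each piece up to the right service: (a) to core, (b) to the client, and (c) to Parsoid.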
*TECHNICAL CHALLENGES*
- Certain pieces of our code are hardwired to specific directories underneath the mediawiki-core code. This complicates efforts to run MediaWiki from a "clean tree", and to distribute pieces of MediaWiki separately. In particular:
  - It would be better if the `vendor` directory could (optionally) live outside the core MediaWiki tree, so it could be distributed separately from the main codebase, and allow for alternative package structures.
  - Extensions and skins would benefit from allowing a "path-like" list of directories, rather than a single location underneath the core MediaWiki tree. Extensions/skins could then be distributed as separate packages, with a simple hook to add their locations to the search path.
- Tim Starling has suggested that when running in single-server mode, some internal APIs (for example, between MediaWiki and Parsoid) would be better exposed as Unix sockets on the filesystem, rather than as internet domain sockets bound to localhost. For one, this would be more "secure by default" and avoid inadvertent exposure of internal service APIs.
- It would be best to define a standardized mechanism for "services" to declare themselves and be connected and configured. This may mean standard routes on a single-server install (`/w` and `/wiki` for core, `/parsoid` for Parsoid, `/thumb` for the thumbnailer service, etc.), standard ports for each service (with their own HTTP servers), or else standard locations for Unix sockets. (See the sketch after this list.)
- Can we leverage some of the various efforts to bridge composer and npm (for example https://github.com/eloquent/composer-npm-bridge), so we don't end up with incompatible packaging?
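On the Unix-socket point, node's HTTP stack supports this already, so a single-server install could keep internal APIs off the network entirely. A minimal sketch, where the service, route, and socket path are all invented examples:

    // Minimal sketch: serve an internal API on a filesystem socket rather
    // than a TCP port. The route and socket path are invented examples.
    var express = require('express');

    var internalApi = express();
    internalApi.get('/version', function (req, res) {
      res.json({ name: 'parsoid', version: '0.0.0' });
    });

    // Unlike listen(8000), this never binds a network port: the API is
    // reachable only by local processes with access to the socket file.
    internalApi.listen('/var/run/wiki/parsoid.sock');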
Phabricator ticket: https://phabricator.wikimedia.org/T114457
Download the code for mediawiki-express and play with it a bit and let's discuss! --scott
That is just *all kinds* of awesome.
-- brion
Is this simply to support hosted providers? npm is one of the worst package managers around. This really seems like a case where thin docker images and docker-compose really shine. It's easy to handle from the packager's side, it's incredibly simple from the user side, and it doesn't require reinventing the world to distribute things.
If this is the kind of stuff we're doing to support hosted providers, it seems it's really time to stop supporting hosted providers. It's $5/month to have a proper VM on DigitalOcean. There are even cheaper solutions around. Hosted providers at this point aren't cheaper. At best they're slightly easier to use, but MediaWiki is seriously handicapping itself to support this use-case.
I view it as partly an effort to counteract the perceived complexity of running a forest full of separate services. It's fine to say they're all preinstalled in this VM image, but that's still a lot of complexity to dig through: where are all the servers? What ports are they listening on? Did one of them crash? How do I restart it?
For some users, the VM (or an actual server farm) is indeed the right solution. But this was an attempt to see if I could recapture the "everything's here in this one process (and one code tree)" simplicity for those for whom that's good enough. --scott
On Thu, Nov 5, 2015 at 5:38 PM, C. Scott Ananian cananian@wikimedia.org wrote:
I view it as partly an effort to counteract the perceived complexity of running a forest full of separate services. It's fine to say they're all preinstalled in this VM image, but that's still a lot of complexity to dig through: where are all the servers? What ports are they listening on? Did one of them crash? How do I restart it?
When you run docker-compose, your containers are linked together. If you have the following containers:
- parsoid
- mathoid
- mediawiki
- mysql
- cassandra (hopefully not)
- redis
You'd talk to redis from mediawiki via redis://redis:6379, to parsoid via http://parsoid, to mathoid via http://mathoid, etc. It handles the networking for you.
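Concretely, that linking is just DNS: within the compose network each service name resolves to its container. A minimal node-side sketch, assuming a 'mediawiki' service answering HTTP on port 80:

    // Minimal sketch: from inside any container on the same docker-compose
    // network, peer services resolve by name; no ports or IPs to configure
    // (this assumes a 'mediawiki' service listening on port 80).
    var http = require('http');

    http.get('http://mediawiki/w/api.php?action=query&meta=siteinfo&format=json',
      function (res) {
        res.pipe(process.stdout);
      });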
If one of them crashes, docker-compose will tell you. If any of them fail to start, it will also tell you.
I'm not even a huge proponent of docker, but the docker-compose solution for this is way simpler and way more standard than what you're proposing, and it doesn't require investing a ton of effort into something that no other project will ever consider using.
Ride the ocean on the big boat, not the life raft.
For some users, the VM (or an actual server farm) is indeed the right solution. But this was an attempt to see if I could recapture the "everything's here in this one process (and one code tree)" simplicity for those for whom that's good enough.
There's no server farm here. If you're running Linux it's just a set of processes running in containers on a single node (which could be your laptop). If you're on OS X or Windows it's a VM, but that can be totally abstracted away using Vagrant.
If you're launching in the cloud, you could launch directly to Joyent or AWS ECS, or very easily stand something up on DigitalOcean. If you're really feeling like making things easier for end-users, provide orchestration code that will automatically provision MW and its dependencies via docker-compose in a VM in one of these services.
Orchestration + containers is what most people are doing for microservices. Don't build something that's complex to maintain and completely out of the ordinary out of fear of complexity. Go with the solutions everyone else is using and wrap tooling around them to make things easier for people.
- Ryan
On 2015-11-05, Ryan Lane rlane32@gmail.com wrote:
Is this simply to support hosted providers? npm is one of the worst package managers around. This really seems like a case where thin docker images and docker-compose really shine. It's easy to handle from the packager's side, it's incredibly simple from the user side, and it doesn't require reinventing the world to distribute things.
I got heavily involved in the node world recently and I fully share your opinion about npm; npm@3 takes the disaster to the next level.
Are we using any native npm modules in our stack? *That* is hard to support.
If this is the kind of stuff we're doing to support hosted providers, it seems it's really time to stop supporting hosted providers. It's $5/month to have a proper VM on DigitalOcean. There are even cheaper solutions around. Hosted providers at this point aren't cheaper. At best they're slightly easier to use, but MediaWiki is seriously handicapping itself to support this use-case.
I feel very strongly there is a need for a quick setup for people who have their LAMP stack already working and feel familiar with that environment. The problem is that a full-stack MediaWiki is no longer a LAMP application. Those people aren't going away any time soon, nor are they about to join the coolest game in town.
I have already written scripts to keep code, vendor and core skins in sync from git. I am beginning to write even more scripts to quickly deploy/destroy MW instances. (My platform does not do Docker, btw.).
Maybe the right strategic move will be to implement MediaWiki phase four in server-side JavaScript. Then the npm way is probably the only way forward.
Saper