[Labs-l] Evaluation of opt-in alternatives to Grid Engine on Tool Labs ('clustering solution')

Yuvi Panda yuvipanda at gmail.com
Thu Aug 20 14:45:00 UTC 2015


Hello!

One of the Labs team's experimental goals for this quarter is to
make available a new, more modern GridEngine alternative just for
webservices on Tool Labs. We are starting to evaluate which systems we
should use - this is tracked at
https://phabricator.wikimedia.org/T106475. The (still incomplete)
evaluation spreadsheet is at
https://docs.google.com/spreadsheets/d/1YkVsd8Y5wBn9fvwVQmp9Sf8K9DZCqmyJ-ew-PAOb4R4/edit?usp=sharing

We are evaluating Kubernetes and Mesos/Marathon as alternatives.
GridEngine is also being scored along with them, so if we find that it
wins we'll abandon the experiment and continue using GridEngine only.
Do provide comments on the phab ticket and follow along :)

== WHY? ==

Because our current webservices setup is a pile of hacks on top of
GridEngine, causing... interesting problems due to the complexity
involved.

GridEngine doesn't support a lot of features that people using more
modern systems take for granted, such as containerization and
isolation, a nice API, continuous deployment, and autoscaling. Having
an alternative to play with allows us to build newer, better-featured
and more robust systems.

== OMG, WILL I HAVE TO CHANGE THE WAY MY CODE WORKS NOW?! ==

Nope. For now this is just an alternative - when completed, you will
be able to run your webservice on this cluster with something like:

    webservice --provider=<something> start

And nothing else will change - everything else should still be
compatible. We'll eventually provide more features on the new setup,
but there will be no forced migration of any sort. If the alternative
becomes the default at any point, we'll ensure that things that worked
before continue working without any extra effort on the tool
author's part.


-- 
Yuvi Panda T
http://yuvi.in/blog


