In order to gain access to Universe, we must fulfil the requirements laid out in the currently open PR: d2iq-archive/universe#703
To achieve this, we must make some breaking changes to the Riak-Mesos Framework. Those changes, and the reasoning behind them, are laid out below.
For a long time we have discussed moving to a split-package world, where we maintain separate DC/OS packaging for KV and TS. This has seemed like the appropriate way to continue for some time, but upon reflection I feel it is not how we should proceed.
In the light of the revelation that we may not now, nor ever, delete a package from Universe, the path forward is much clearer: we must maintain a single package - the one which already exists.
One thing that drove us to want separate packages for KV and TS was the difficulty (impossibility) of running simultaneous KV and TS clusters under a single framework instance. The changes below outline how we rid ourselves of this restriction.
It's a noteworthy caveat that our use of relocatable packages is restricted to
- TS 1.4.0+
- KV 2.2.0+
They have not been produced for any versions prior to those (there are still no released packages for OSS KV - only EE RC packages).
Currently we only configure a single Riak version in an RMF instance. We need to upgrade this
to include a map of configured Riak versions: by default we will include the latest supported
version of TS and KV. The configuration for these versions will be an optional field in the
config.json file, at .riak.node.uris. This same value can then be copied into the
resources.json file in the DCOS Universe package definition.
This requires a slightly different approach between DCOS and pure-Mesos configuration:
in DCOS resources.json, we will specify .assets.uris with the same value as one would use
in config.json at .riak.node.uris.
With this in place, in DCOS marathon.json.mustache we will need to (ab)use templating to
build the list of uris in the appropriate format to tell marathon what to download. This might
prove difficult.
We will also need to build the appropriate env var map of (version -> uri) to give to the Scheduler.
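For the pure-Mesos case, where the tools build the Marathon app definition themselves, a rough sketch of how the same (version -> uri) map could feed both the fetch list and the Scheduler env var might look like this. The helper name `marathon_fragments` is hypothetical, and encoding the map as JSON is an assumption, since the format is still an open question:

```python
import json


def marathon_fragments(uris):
    """Build a Marathon `fetch` list and an env value from a (version -> uri) map.

    Hypothetical helper: not part of the existing tools.
    """
    # One fetch entry per Riak archive, sorted by version key for stable output.
    # extract=False mirrors the non-scheduler entries in the example below.
    fetch = [{"uri": uri, "extract": False} for _, uri in sorted(uris.items())]
    # Assumption: the Scheduler receives the whole map as one JSON string.
    env_value = json.dumps(uris, sort_keys=True)
    return fetch, env_value


uris = {
    "riak-kv-2.1.4": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak-2.1.4-ubuntu-14.04.tar.gz",
    "riak-ts-1.3.1": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak_ts-1.3.1-ubuntu-14.04.tar.gz",
}
fetch, env_value = marathon_fragments(uris)
```

This does not solve the DCOS side, where the same result has to come out of mustache templating.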
Example:
DCOS resources.json:
{
"assets": {
"uris": {
"scheduler": "https://github.com/basho-labs/riak-mesos-scheduler/releases/download/1.8.1/riak_mesos_scheduler-1.8.1-mesos-1.0.0-ubuntu-14.04.tar.gz",
"director": "https://github.com/basho-labs/riak-mesos-director/releases/download/1.0.1/riak_mesos_director-1.0.1-ubuntu-14.04.tar.gz",
"executor": "https://github.com/basho-labs/riak-mesos-executor/releases/download/1.7.0/riak_mesos_executor-1.7.0-mesos-1.0.0-ubuntu-14.04.tar.gz",
"explorer": "https://github.com/basho-labs/riak_explorer/releases/download/1.2.1/riak_explorer-1.2.1.patch-ubuntu-14.04.tar.gz",
"patches": "https://github.com/basho-labs/riak-mesos-executor/releases/download/1.7.0/riak_erlpmd_patches-1.7.0-mesos-1.0.0-ubuntu-14.04.tar.gz",
"riak-kv-2.1.4": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak-2.1.4-ubuntu-14.04.tar.gz",
"riak-ts-1.3.1": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak_ts-1.3.1-ubuntu-14.04.tar.gz"
}
},
...
}
DCOS marathon.json.mustache:
{
...
"fetch": [
{"uri": "{{resource.assets.uris.scheduler}}"},
{"uri": "{{resource.assets.uris.director}}", "extract": false},
{"uri": "{{resource.assets.uris.executor}}", "extract": false},
{"uri": "{{resource.assets.uris.explorer}}","extract": false},
{"uri": "{{resource.assets.uris.patches}}", "extract": false},
{"uri": "{{resource.assets.uris.riak-ts-1.3.1}}", "extract": false},
{"uri": "{{resource.assets.uris.riak-kv-2.1.4}}", "extract": false}
],
...
"env": {
...
"RIAK_MESOS_ASSETS": "{{resource.assets.uris}}",
...
},
...
}
TODO: Perhaps we should consider changing config.json to expect a .resources.assets section
too? That might reduce some complexity...
DCOS config.json:
TODO: How to define a jsonschema that allows ANY key in the "uris" object?
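One candidate answer is the JSON Schema `additionalProperties` keyword: when given a sub-schema instead of a boolean, it constrains the value of every property not listed explicitly, without fixing the property names. A minimal sketch follows, written as a Python dict, with a hand-rolled check that only mirrors what the schema expresses. Whether the DCOS package tooling accepts this fragment as-is is an assumption that would need checking:

```python
# Hypothetical schema fragment for the "uris" object: any key is allowed,
# but every value must be a string.
URIS_SCHEMA = {
    "type": "object",
    "additionalProperties": {"type": "string"},
}


def check_uris(value):
    """Hand-rolled equivalent of URIS_SCHEMA, for illustration only.

    A real validator would interpret the schema itself; this function simply
    encodes the same rule directly.
    """
    if not isinstance(value, dict):
        return False
    return all(isinstance(k, str) and isinstance(v, str) for k, v in value.items())
```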
Tools config.json:
{
"riak": {
...
"node": {
"uris": {
"riak-kv-2.1.4": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak-2.1.4-ubuntu-14.04.tar.gz",
"riak-ts-1.3.1": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak_ts-1.3.1-ubuntu-14.04.tar.gz"
},
...
}
}
The CLI tools will need to be changed to pass through these sensibly.
TODO Since the Scheduler must know how to deal with the blunt-force methods of DCOS, perhaps we should just do the same thing for consistency?
Caveat 1: never use outputFile in marathon.json: if we do, we'll need to change the config in a way that is probably incompatible with DCOS resources.json
rms_config:artifacts() -> returns the full URI of every configured Riak version; the filename is taken from the end of each URI
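The filename derivation is small enough to sketch. The following is an illustration in Python, not the Scheduler's own code, and the function names are hypothetical:

```python
import posixpath
from urllib.parse import urlparse


def artifact_filename(uri):
    """Return the last path segment of a URI, ignoring any query or fragment."""
    return posixpath.basename(urlparse(uri).path)


def artifacts(uris):
    """Map each configured version to the filename at the end of its URI."""
    return {version: artifact_filename(uri) for version, uri in uris.items()}
```

Taking the basename from the URI only works while the downloaded file keeps that name, which is one reason to steer clear of outputFile.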
Currently we set a bunch of env vars e.g. RIAK_MESOS_EXECUTOR_PKG which contain just the name
of the package. The scheduler then uses this filename to serve the artifacts to executors via
HTTP.
We need to change this to give us the map of (version -> uri) from the configuration.
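As a sketch of the receiving side, assuming the map arrives JSON-encoded in a single env var: the variable name RIAK_MESOS_ASSETS is taken from the marathon.json.mustache example, while the JSON encoding and the rule that Riak entries are the keys starting with "riak-" are both assumptions made here for illustration.

```python
import json
import os


def riak_versions_from_env(environ=os.environ, var="RIAK_MESOS_ASSETS"):
    """Read the (version -> uri) map from the environment.

    Assumes the value is a JSON object. Non-Riak assets such as the scheduler
    or director archives are dropped by key prefix (an assumed convention).
    """
    raw = environ.get(var, "")
    if not raw:
        return {}
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("%s must contain a JSON object" % var)
    return {k: v for k, v in parsed.items() if k.startswith("riak-")}
```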
To support this, we'll need to change how we define clusters and interact with them via the CLI.
We'll need a new command, and changes to a couple of existing ones:
$ riak-mesos list-versions
riak-kv-2.1.4
riak-ts-1.3.1
$ riak-mesos list-versions --json
# Prints the full version/URI configured, as a valid JSON blob
{
"riak-kv-2.1.4": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak-2.1.4-ubuntu-14.04.tar.gz",
"riak-ts-1.3.1": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak_ts-1.3.1-ubuntu-14.04.tar.gz"
}
$ riak-mesos cluster create my-kv riak-kv-2.1.4
{"success": "true"}
$ riak-mesos cluster destroy my-kv
{"success": "true"}
$ riak-mesos cluster add-node my-kv
{"success": "true"}
$ riak-mesos cluster add-node my-kv --nodes 3
{"success": "true"}
{"success": "true"}
{"success": "true"}
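To make the proposed command shapes concrete, here is a minimal argparse sketch covering list-versions and cluster create. It is not how the existing riak-mesos tool is structured; the CONFIGURED_VERSIONS constant stands in for whatever the tool would read from config.json, and all function names are hypothetical.

```python
import argparse
import json

# Placeholder for the map the real tool would load from config.json.
CONFIGURED_VERSIONS = {
    "riak-kv-2.1.4": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak-2.1.4-ubuntu-14.04.tar.gz",
    "riak-ts-1.3.1": "https://github.com/basho-labs/riak-mesos/releases/download/1.1.0/riak_ts-1.3.1-ubuntu-14.04.tar.gz",
}


def build_parser():
    parser = argparse.ArgumentParser(prog="riak-mesos")
    commands = parser.add_subparsers(dest="command", required=True)

    list_cmd = commands.add_parser("list-versions")
    list_cmd.add_argument("--json", action="store_true", dest="as_json")

    cluster = commands.add_parser("cluster")
    actions = cluster.add_subparsers(dest="action", required=True)
    create = actions.add_parser("create")
    create.add_argument("name")
    # Only versions present in the configuration are accepted.
    create.add_argument("version", choices=sorted(CONFIGURED_VERSIONS))
    return parser


def list_versions(as_json=False):
    """Render the configured versions as plain names or as a JSON blob."""
    if as_json:
        return json.dumps(CONFIGURED_VERSIONS, indent=2, sort_keys=True)
    return "\n".join(sorted(CONFIGURED_VERSIONS))
```

Restricting the version argument to the configured keys means a typo fails at the CLI, before any request reaches the Scheduler.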
There's one thing that we need to address before we can go down this road: patches and explorer artifacts, and the paths they extract to.
We've suffered in trying to support both our format of custom-built riak and the relocatable
rel packages now coming into the official buildchain. To summarise:
We inject 2 sets of patches into the Riak dir when creating an Executor:
- riak_explorer
- erlpmd patches (to allow using our customised EPMD setup)
In order to correctly inject these into Riak's env, we rely on Mesos to extract them to the correct path, but it does not support (even in v1.0.1) customising this path at runtime - the archive must contain the correct path beforehand.
Solution: we stop Mesos extracting these, and have the Executor do so - this way we can
control exactly where it goes to. We can allow Mesos to extract Riak - that will give us
an immediate way to determine where the appropriate path to extract the archives to is. To
achieve this, we'll change our explorer and patches artifacts to contain simply riak/lib
paths - then we can extract that directly into the Riak dir inside the Executor. Simples!
[NB: We already have functionality in the Scheduler to derive the path to Riak inside the supplied Riak archive that gets sent to the Executor for this purpose.]
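The extraction step could look roughly like the following. This is a Python illustration of the idea, not Executor code: it assumes Mesos has already extracted Riak to a riak/ directory in the sandbox, and that the explorer and patches archives contain only riak/lib paths as proposed above. The function name is hypothetical.

```python
import os
import tarfile


def extract_into_riak(archive_path, sandbox_dir):
    """Extract an archive of riak/lib/... paths on top of sandbox_dir/riak.

    Refuses any member that would land outside riak/lib, so a malformed
    archive cannot scatter files elsewhere in the sandbox.
    """
    lib_root = os.path.join("riak", "lib")
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        for member in members:
            name = os.path.normpath(member.name)
            allowed = name in ("riak", lib_root) or name.startswith(lib_root + os.sep)
            if not allowed:
                raise ValueError("unexpected path in archive: %s" % member.name)
        # Paths are relative to the sandbox, so files end up in riak/lib.
        tar.extractall(sandbox_dir, members=members)
```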
- Stop using our own special snowflake Riak archives
- Use only official relocatable builds (which have ./riak/... paths)
- Deprecate .riak.*.url and move all to a single .riak.uris entry
- Add multiple versions of riak to .riak.uris (to start, by default, KV 2.2.0 and TS 1.4.0)
- Pass the .riak.uris values both to Marathon and to the Scheduler (via env var)
- TODO Find an appropriate format to give this map to the Scheduler in - has to be achievable from the DCOS marathon.json.mustache file
- Add list-versions command to print the configured Riak versions
- Alter cluster commands to require cluster name, Riak version
A note about deprecation: we should, for a while, support the old configuration but print a brief-yet-informative blurb with a link for more detail on the matter, whenever it is detected.
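A sketch of what that detection might look like, assuming the old settings appear as a "url" key inside sub-sections of "riak" in config.json (my reading of .riak.*.url). The link is a placeholder and the function name is hypothetical:

```python
import sys

# Placeholder link: the real notice would point at the migration notes.
DEPRECATION_URL = "https://example.invalid/riak-mesos/uris-migration"


def warn_on_legacy_config(config, stream=sys.stderr):
    """Print a short notice if old-style .riak.*.url keys are present.

    Returns the list of legacy key paths found, so callers can still honour
    them during the deprecation period.
    """
    riak = config.get("riak", {})
    legacy = sorted(
        ".riak.%s.url" % name
        for name, section in riak.items()
        if isinstance(section, dict) and "url" in section
    )
    if legacy:
        stream.write(
            "DEPRECATED: %s found in config; use .riak.uris instead. See %s\n"
            % (", ".join(legacy), DEPRECATION_URL)
        )
    return legacy
```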
- Pick up the URIs from the Env as per Tools above
- No framework-wide Riak version: move to per-cluster
- Don't expect RIAK_MESOS_*_PKG vars to tell us package names: figure it out from the URIs provided in env
- No longer expect explorer or patches archives to be extracted by Mesos by default: add steps for manual extraction prior to starting Riak node
@sanmiguel what would the output of riak-mesos list-versions and what would the tools config.json look like as more and more versions of KV and TS are published? Would they include every published version? For example, I assume after KV 2.2 and TS 1.5 are released and RMF is updated appropriately to include all available versions, riak-mesos list-versions would return:
And I assume the tools config.json would include lines for the same.
cc @ph07