As part of helping a customer develop their proof-of-concept monitoring system with Sensu Enterprise, I worked up a mutator which uses stash data to determine whether an event occurred within a pre-defined maintenance window.
The idea here is that event data needs to be annotated to indicate whether an event occurred during a scheduled maintenance window, for SLA reporting purposes. With this added downtime context, events logged to an external source (e.g. Graylog, Elasticsearch) via the Sensu Enterprise event bridge should provide enough information to determine whether or not a client's check result falls within a scheduled downtime window.
Please note that I have done very little in the way of testing, so this plugin is unlikely to be very robust. Since this mutator probably needs to be applied to every event, and a pipe mutator spawns a new process for each event it handles, it should probably be implemented as an extension before being put into a production system.
The mutator assumes the following:
- Relative to the Sensu event processor, the Sensu API is running on 127.0.0.1:4567. This will be true of any Sensu Enterprise server.
- Sensu clients are configured with a custom attribute, services, whose value is an array containing zero or more strings defining service names, which will be compared to named stashes under the downtime path.
- Stashes will be created via the Sensu API under the downtime path, with a name matching a service defined on clients, and with start and end attributes whose values are Unix epoch timestamps.
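For illustration, here is a minimal Ruby sketch of the matching logic these assumptions imply. It is not the actual scheduled-downtime.rb script: the helper names matching_downtimes and annotate are hypothetical, the stashes are passed in as a plain hash keyed by service name rather than fetched from the API, and treating both ends of the window as inclusive is my assumption.

```ruby
require "json"

# Hypothetical helper: collect the content of every downtime stash whose name
# matches one of the client's services and whose window contains the event
# timestamp. `stashes` maps a service name to that stash's content.
def matching_downtimes(event, stashes)
  services = Array(event.dig("client", "services"))
  timestamp = event["timestamp"]
  services.each_with_object([]) do |service, matches|
    stash = stashes[service]
    next if stash.nil? || timestamp.nil?
    # Assumption: start and end are both inclusive.
    matches << stash if timestamp.between?(stash["start"], stash["end"])
  end
end

# Hypothetical helper: return a copy of the event with a "downtime" array added.
def annotate(event, stashes)
  event.merge("downtime" => matching_downtimes(event, stashes))
end

# Sample data only; the values mirror the examples in this post.
event = {
  "client" => { "name" => "datboi", "services" => ["arbitrary_service_id"] },
  "check" => { "name" => "test", "status" => 2 },
  "timestamp" => 1493160075
}
stashes = {
  "arbitrary_service_id" => { "start" => 1493159338, "end" => 1493169338 }
}
puts JSON.pretty_generate(annotate(event, stashes))
```

Keeping the window check separate from any API access makes it easy to exercise without a running Sensu server; a real mutator would still need to look up the stashes via the API on 127.0.0.1:4567.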
Example client definition. Note "arbitrary_service_id" as a value in the services array:
{
"client":{
"name":"datboi",
"address":"192.168.2.227",
"subscriptions":[
"client:datboi"
],
"environment":"staging",
"tags":[],
"services":[
"arbitrary_service_id"
]
}
}
Example curl command to create a "scheduled downtime" stash under the downtime path, matching the arbitrary_service_id service defined on the client above:
curl -X POST -H 'Content-Type: application/json' -d '{"path":"downtime/arbitrary_service_id","content":{"start":1493158003,"end":1493168003,"creator":"Your Name Here","description":"this is a test"}}' http://127.0.0.1:4567/stashes
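If you would rather compute the window than type epoch values by hand, the same request can be sketched in Ruby. The helper names downtime_stash and post_stash are hypothetical, and the API URL is simply the one assumed earlier in this post; the call that actually posts is left commented out so the sketch runs without a Sensu server.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: build the stash body for a window that starts at
# `start_time` and lasts `duration_seconds`.
def downtime_stash(service, start_time, duration_seconds, creator:, description:)
  start_epoch = start_time.to_i
  {
    "path" => "downtime/#{service}",
    "content" => {
      "start" => start_epoch,
      "end" => start_epoch + duration_seconds,
      "creator" => creator,
      "description" => description
    }
  }
end

# Hypothetical helper: POST the stash to the Sensu API.
# The default URL assumes the API is listening on 127.0.0.1:4567.
def post_stash(payload, api_url = "http://127.0.0.1:4567/stashes")
  uri = URI(api_url)
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(payload)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

payload = downtime_stash(
  "arbitrary_service_id",
  Time.at(1493158003),
  10_000,
  creator: "Your Name Here",
  description: "this is a test"
)
puts JSON.generate(payload)
# post_stash(payload)  # uncomment to send the stash to a running API
```

With these inputs the payload carries the same path, start and end values as the curl example above.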
With a client configured and a stash created, the mutator can be defined in configuration and applied to a handler. Here's the combined handler and mutator configuration I used in my testing:
{
"handlers": {
"downtime_test": {
"type": "pipe",
"command": "tee /tmp/downtime_test",
"mutator": "scheduled_downtime"
}
},
"mutators": {
"scheduled_downtime": {
"command": "/usr/local/bin/scheduled-downtime.rb"
}
}
}
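For context on what the command in that mutator definition has to do: a pipe mutator is handed the event as JSON on STDIN and is expected to print the mutated event to STDOUT, which is then passed to the handler. A rough sketch of that contract follows; run_mutator is a hypothetical wrapper, not something Sensu provides, and the block passed to it stands in for the real downtime lookup.

```ruby
require "json"
require "stringio"

# Hypothetical wrapper: parse the event from `input`, yield it for mutation,
# and write the result to `output` as JSON.
def run_mutator(input, output)
  event = JSON.parse(input.read)
  mutated = yield(event)
  output.puts(JSON.generate(mutated))
end

# Demonstration with in-memory IO objects so the sketch runs on its own.
# A real script would call run_mutator($stdin, $stdout) instead.
input = StringIO.new('{"client":{"name":"datboi"},"check":{"name":"test","status":2}}')
output = StringIO.new
run_mutator(input, output) { |event| event.merge("downtime" => []) }
puts output.string
```

Passing the IO objects in as arguments keeps the mutation testable without having to pipe data through a shell.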
After restarting Sensu services to apply configuration, I tested the mutator using nc (netcat) to send a check result to the local client socket:
echo '{"name":"test","status":2,"output":"test output","handler":"downtime_test"}' | nc 127.0.0.1 3030
The data written to disk by tee includes a copy of the downtime stash, incorporated in the event data under the downtime array, as I expected:
$ cat /tmp/downtime_test | jq .
{
"client": {
"name": "datboi",
"address": "192.168.2.227",
"subscriptions": [
"client:datboi"
],
"environment": "staging",
"tags": [],
"services": [
"arbitrary_service_id"
],
"version": "0.29.0",
"timestamp": 1493160058
},
"check": {
"name": "test",
"status": 2,
"output": "test output",
"handler": "downtime_test",
"executed": 1493160075,
"issued": 1493160075,
"type": "standard",
"history": [
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2",
"2"
],
"total_state_change": 0
},
"occurrences": 21,
"occurrences_watermark": 21,
"action": "create",
"timestamp": 1493160075,
"id": "fc081db1-961a-4f64-8412-d5a56a152ed4",
"last_state_change": 1491797301,
"last_ok": 1491797301,
"silenced": false,
"silenced_by": [],
"downtime": [
{
"start": 1493159338,
"end": 1493169338,
"creator": "Your Name Here",
"description": "this is a test"
}
]
}
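On the reporting side, a consumer of these events only needs to look at the downtime array. As a rough sketch of how that might be used for SLA purposes (the helper names during_downtime? and split_by_downtime are hypothetical, and the ids and second event below are made-up sample data), events can be split into those that occurred inside a scheduled window and those that did not:

```ruby
require "json"

# Hypothetical helper: true when the mutator attached at least one downtime window.
def during_downtime?(event)
  !Array(event["downtime"]).empty?
end

# Hypothetical helper: split events into those inside and outside a window.
def split_by_downtime(events)
  inside, outside = events.partition { |event| during_downtime?(event) }
  { "during_downtime" => inside, "outside_downtime" => outside }
end

# Made-up sample events; only the first carries a downtime entry.
events = [
  { "id" => "a", "timestamp" => 1493160075,
    "downtime" => [{ "start" => 1493159338, "end" => 1493169338 }] },
  { "id" => "b", "timestamp" => 1493170000, "downtime" => [] }
]

split_by_downtime(events).each do |label, group|
  ids = group.map { |event| event["id"] }
  puts "#{label}: #{ids.join(', ')}"
end
```

Events that were never passed through the mutator have no downtime key at all, so the sketch treats a missing array the same as an empty one.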