AWS:Ec2 in Mgmt

PUBLISHED ON APR 28, 2018

Before dealing with the nitty-gritty of the EC2 API for Golang (and boy is it gritty), it will help to understand the basics of mgmt and how resources are implemented. Mgmt is an open-source, event-based config management solution written entirely in Golang and built around a directed acyclic graph (DAG) model. It is the brainchild of @purpleidea, who maintains the project and mentors new contributors (myself included). Each of mgmt’s resource modules defines methods to watch, verify and change the state of a resource the user needs to maintain, as well as any relationships to other resources where appropriate. This is the story of my battle to write one of those modules.

With that out of the way, let’s dive into the implementation. There are two crucial methods that must be implemented in order to get a resource working in mgmt: Watch() and CheckApply(). Both methods do what they say. Watch() monitors the resource and sends an event when it detects that the state has changed, and CheckApply() checks the state and, if necessary, applies the actions required to bring the resource into the desired state.

One thing I feel compelled to point out is that, like all of Amazon’s APIs, the AWS SDK for Golang is auto-generated, as is its documentation. While this undoubtedly saves Amazon’s developers countless hours of work, it makes programming against the API significantly more frustrating than it ought to be.

I began my implementation with the CheckApply() method, as that tends to be the quickest way to a working proof-of-concept. The check portion of the method needs to query the API for any instance that matches the name specified by the user. If it finds one, it then needs to check its state against the definition. If the state matches, we are done. That was easy.
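The check logic can be sketched roughly as follows. To keep the example self-contained it does not call the AWS SDK at all: `Instance`, `lister` and `fakeAPI` are hypothetical stand-ins for the real lookup, and treating a missing instance as "terminated" is an assumption of this sketch rather than something the module necessarily does.

```go
package main

import (
	"errors"
	"fmt"
)

// Instance is a hypothetical, pared-down view of an EC2 instance.
type Instance struct {
	Name  string
	State string // "running", "stopped" or "terminated"
}

// lister is a hypothetical stand-in for the part of the API that finds
// instances by name.
type lister interface {
	InstancesByName(name string) ([]Instance, error)
}

// fakeAPI is an in-memory lister, used so the example runs without AWS.
type fakeAPI []Instance

func (f fakeAPI) InstancesByName(name string) ([]Instance, error) {
	var out []Instance
	for _, inst := range f {
		if inst.Name == name {
			out = append(out, inst)
		}
	}
	return out, nil
}

var errAmbiguous = errors.New("more than one instance matches the name")

// check reports whether the named instance is already in the desired state.
// Assumption: an instance that cannot be found counts as "terminated".
func check(api lister, name, desired string) (bool, error) {
	found, err := api.InstancesByName(name)
	if err != nil {
		return false, err
	}
	if len(found) > 1 {
		return false, errAmbiguous
	}
	current := "terminated"
	if len(found) == 1 {
		current = found[0].State
	}
	return current == desired, nil
}

func main() {
	api := fakeAPI{{Name: "web1", State: "stopped"}}
	ok, err := check(api, "web1", "running")
	fmt.Println(ok, err) // false <nil>
}
```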

If the resource is not in the correct state, we continue to the apply portion of the method. Here’s where the real work gets done. Apply defines the actions needed to bring the resource into the desired state. It consists of the logic required to create, start, stop and terminate EC2 instances. Once I wrapped my head around the API (which was no small feat), this code was straightforward to write, and was fairly close to the examples available in the documentation.
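One way to picture the apply logic is as a lookup from a (current, desired) pair of states to the next action. The sketch below is an illustration under assumptions, not the module’s code: the `Action` type and `actionFor` are hypothetical, only three states are modelled, and a not-yet-created instance is treated as "terminated". The comments name the EC2 operations each action would correspond to.

```go
package main

import "fmt"

// Action names the step needed to move an instance toward the desired
// state. The type and its values are illustrative.
type Action string

const (
	None      Action = "none"
	Create    Action = "create"    // EC2 operation: RunInstances
	Start     Action = "start"     // EC2 operation: StartInstances
	Stop      Action = "stop"      // EC2 operation: StopInstances
	Terminate Action = "terminate" // EC2 operation: TerminateInstances
)

// known reports whether s is one of the three states this sketch models.
func known(s string) bool {
	return s == "running" || s == "stopped" || s == "terminated"
}

// actionFor picks the next action for a (current, desired) pair.
// "terminated" doubles as the state of an instance that does not exist yet.
func actionFor(current, desired string) (Action, error) {
	if !known(current) || !known(desired) {
		return None, fmt.Errorf("unsupported transition %q -> %q", current, desired)
	}
	if current == desired {
		return None, nil
	}
	switch {
	case desired == "running" && current == "stopped":
		return Start, nil
	case desired == "stopped" && current == "running":
		return Stop, nil
	case desired == "terminated":
		return Terminate, nil
	default:
		// current is "terminated": create first. Assumption: if the goal
		// is "stopped", a later pass stops the new instance.
		return Create, nil
	}
}

func main() {
	fmt.Println(actionFor("stopped", "running")) // start <nil>
}
```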

Once the CheckApply() method was finished, I could actually run it using mgmt’s poll meta-parameter. Poll essentially bypasses the Watch() method, and just repeats CheckApply() on an interval defined by the user. It worked! There were a few bugs to work out in the logic, but once they were addressed, I could be certain that no matter what actions I took, mgmt made sure that the resources I defined would always return to the correct state.
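Conceptually, polling boils down to a loop like the one below. This is a sketch of the idea only, not how mgmt implements the poll meta-parameter: `poll` and `demo` are hypothetical helpers, and the callback stands in for a resource’s CheckApply().

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// poll runs checkApply once per interval until ctx is cancelled or
// checkApply returns an error. Hypothetical helper, for illustration.
func poll(ctx context.Context, interval time.Duration, checkApply func() (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for ctx.Err() == nil {
		if _, err := checkApply(); err != nil {
			return err
		}
		select {
		case <-ticker.C: // interval elapsed, run another pass
		case <-ctx.Done(): // shutting down, the loop condition ends it
		}
	}
	return nil
}

// demo runs three passes and then cancels the loop from the callback.
func demo() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	passes := 0
	err := poll(ctx, 10*time.Millisecond, func() (bool, error) {
		passes++
		if passes == 3 {
			cancel()
		}
		return true, nil
	})
	return passes, err
}

func main() {
	passes, err := demo()
	fmt.Println(passes, err) // 3 <nil>
}
```

The trade-off is the usual one for polling: a change goes unnoticed for up to one interval, and every pass costs an API query even when nothing has changed.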

The next step was to implement Watch(). This would prove to be more difficult. Until I implement an HTTP server to receive events from AWS CloudWatch, we have to rely on the API’s WaitUntilInstance… methods to alert us that the state may have changed. I say may, because these methods return after about ten minutes, whether or not the state has changed. It would be really helpful if the API returned an error on timeout, which we could use to trigger a retry. Instead, when the wait method returns, we have to recheck the state to find out whether it really changed or the wait just timed out. If the state really has changed, we send an event to the engine.
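The wait-then-recheck pattern can be sketched like this. Nothing here touches the real SDK: `wait` is a hypothetical stand-in for one of the WaitUntilInstance… methods (it may return because of a change or because of a timeout), `getState` stands in for a state query, and the `demo` function feeds in a canned sequence of states so the example is deterministic.

```go
package main

import (
	"context"
	"fmt"
)

// watch blocks in wait, rechecks the state afterwards, and sends an event
// only if the state really differs from the last one seen.
func watch(ctx context.Context, wait func(), getState func() (string, error), events chan<- string) error {
	last, err := getState()
	if err != nil {
		return err
	}
	for ctx.Err() == nil {
		wait() // returns on a real change or just on a timeout
		cur, err := getState()
		if err != nil {
			return err
		}
		if cur == last {
			continue // timed out with nothing new, so wait again
		}
		last = cur
		select {
		case events <- cur:
		case <-ctx.Done():
		}
	}
	return nil
}

// demo replays a fixed sequence of observed states through watch.
func demo() []string {
	seq := []string{"stopped", "stopped", "running", "running", "stopped", "stopped"}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	i := 0
	getState := func() (string, error) {
		s := seq[i]
		if i < len(seq)-1 {
			i++
		} else {
			cancel() // sequence exhausted, stop watching
		}
		return s, nil
	}
	events := make(chan string, len(seq))
	_ = watch(ctx, func() {}, getState, events)
	close(events)
	var got []string
	for e := range events {
		got = append(got, e)
	}
	return got
}

func main() {
	fmt.Println(demo()) // [running stopped]
}
```

Six observations produce only two events, because the repeated readings are treated as timeouts rather than changes.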

There are some issues with this approach, including two subtle races. Avoiding the first means notifying the engine that we’re running only once the watching has actually begun (rather than when we ask it to begin). Avoiding the second means sending events only once we are sure the watching has restarted after detecting a change. Luckily, @purpleidea recently demonstrated a race-free HTTP server implementation that I may use in a future patch to leverage AWS CloudWatch events in Watch().
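The first of those races is the easier one to illustrate. The sketch below shows one general way to avoid it, and it is my own simplified reading of the problem rather than mgmt’s or the module’s actual mechanism: `run`, `subscribe` and the `ready` channel are hypothetical names. The point is that `ready` is closed only after the subscription exists, so a change made after the engine sees `ready` cannot slip past the watcher.

```go
package main

import (
	"context"
	"fmt"
)

// run subscribes to changes first and only then signals readiness.
// subscribe is a hypothetical stand-in for whatever starts the real watch.
func run(ctx context.Context, subscribe func() <-chan string, ready chan<- struct{}, events chan<- string) {
	changes := subscribe() // watching has actually begun once this returns
	close(ready)           // only now report that we are running
	for {
		select {
		case c, ok := <-changes:
			if !ok {
				return
			}
			select {
			case events <- c:
			case <-ctx.Done():
				return
			}
		case <-ctx.Done():
			return
		}
	}
}

// demo plays the engine: it waits for ready before causing a change.
func demo() string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	source := make(chan string, 1)
	subscribed := false
	subscribe := func() <-chan string {
		subscribed = true
		return source
	}
	ready := make(chan struct{})
	events := make(chan string)
	go run(ctx, subscribe, ready, events)
	<-ready // proceed only after watching has begun
	if !subscribed {
		return "signalled ready before subscribing"
	}
	source <- "stopped" // a change made after the ready signal
	return <-events
}

func main() {
	fmt.Println(demo()) // stopped
}
```

If the two lines at the top of `run` were swapped, the engine could act between the ready signal and the subscription, and that change would never be observed.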

Between navigating the auto-generated API documentation and puzzling together the logic required to watch the resources reliably, writing this module was a challenge. If the API were a little more sane, I might have saved a little of my own sanity. I couldn’t have done it without @purpleidea’s awesome mentoring and the community he’s built around mgmt. Come join us on freenode at #mgmtconfig and on GitHub.
