As more and more of my CloudFormation (CF) stacks use a base image and CloudFormation::Init magic, it’s become imperative to have an AMI that has the helper scripts (cfn-signal, cfn-init, etc.) built-in. This isn’t a problem if you use the Amazon Linux AMI, but if you’re playing with things like immutable infrastructure or baking your own custom AMIs for CIS hardening or some other regulatory requirement, it can become a big issue quickly. There’s a little documentation out there on installing the CF helper scripts (http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-helper-scripts-reference.html) but the installation process is not quite so straightforward as one would hope.
The solution to this issue varies depending on your OS. I’ve had no issue with Windows AMIs because the Ec2Config service takes care of everything, but in CentOS and RHEL, there are a few extra steps. I’ll break them down by OS. Note that you may need to search for an updated version of packages such as epel-release to make sure it matches your OS release and that you’re using the most current version.
This was relatively painless, thanks to the contents of the cloudformation-examples bucket being publicly visible. The latest version of the helper-scripts requires some Python elements/versions that are a pain to set up, but you can use an older version of the helper scripts without any issues. As of CentOS 6.8, you can (more…)
I was having trouble getting a Windows CloudFormation stack that leveraged CloudFormation::Init (cfn-init) to work properly. All I found in the cfn-init.log file was a repeating error that looked like this:
Traceback (most recent call last):
  File "cfnbootstrap\util.pyc", line 159, in _retry
  File "cfnbootstrap\util.pyc", line 231, in _timeout
ConnectionError: ('Connection aborted.', error(10060, 'A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
2016-03-24 20:13:40,355 [DEBUG] Sleeping for 0.873665 seconds before retrying
I ran through some Google searches but came back with surprisingly few hits on this error. Everything regarding the syntax of the stack was clean: it passed all checks and came back as valid. But I got this error every time I deployed the stack. The culprit? Internet connectivity. The stack referenced a file to download and was unable to do so thanks to some strict Network ACLs. I had seen this issue once before when a NAT instance in the VPC was stopped. The lesson: check your internet connectivity when dealing with this error.
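One quick way to confirm that kind of connectivity problem from the instance itself is a plain TCP check. The following is a minimal sketch rather than anything cfn-init provides: the function name can_connect and the example host are assumptions, and you would substitute whatever host your template downloads from.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers timeouts, refused connections, and DNS failures.
        return False


# Example usage (hypothetical host; use the one your stack references):
#   print(can_connect("s3.amazonaws.com", 443))
```

If this returns False from the instance but True from your workstation, Network ACLs, route tables, or a stopped NAT instance are reasonable places to look next.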
AWS Lambda functions currently have a five-minute execution time limit, and while this is not a big problem for many functions, it becomes problematic when you’re executing a task that has some inherent latency. I created a function that stops all instances, creates snapshots of all attached EBS volumes, and starts those instances back up. This was easily feasible in my personal environment, but in larger environments, the time it takes to stop all instances and back up all those volumes without hitting a CreateSnapshot limit can easily exceed five minutes.
The solution is two-fold.
First, make sure you insert an increasing or variable sleep timer between creating snapshots. I had to do this for the CreateSnapshot limit issue.
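As a rough illustration of an increasing sleep timer, here is a minimal sketch. It is not the original function: the name snapshot_with_backoff, the delay values, and the create_snapshot callable are all assumptions chosen for the example.

```python
import time


def snapshot_with_backoff(volume_ids, create_snapshot, base_delay=1.0,
                          step=0.5, max_delay=30.0, sleep=time.sleep):
    """Call create_snapshot(volume_id) per volume, pausing a bit longer each time.

    `create_snapshot` is any callable you supply; the delay values are
    illustrative assumptions, not documented AWS limits.
    """
    results = []
    delay = base_delay
    for index, volume_id in enumerate(volume_ids):
        results.append(create_snapshot(volume_id))
        # No need to wait after the final snapshot request.
        if index < len(volume_ids) - 1:
            sleep(delay)
            delay = min(delay + step, max_delay)
    return results


# Example usage with boto3 (assumes boto3 is installed and credentials are set):
#   import boto3
#   ec2 = boto3.client("ec2")
#   snapshot_with_backoff(ids, lambda v: ec2.create_snapshot(VolumeId=v))
```

Passing the snapshot call in as a parameter keeps the timing logic separate from the AWS call, which also makes it easy to try out without touching real volumes.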
Second, to shut down all the instances properly, create snapshots of the volumes, and start the instances back up, I had to use three separate functions and chain them together through the magic of CloudWatch and SNS.
Here’s how it works:
The function will output logs in CloudWatch. When you find those logs, you’ll usually see something akin to “END RequestId” when the function has completed. You can create a metric filter in that log group that looks for “END RequestId”. Once that filter is created, you can create an alarm on its metric. The alarm will trigger when the filter has matched and, if configured to do so, it can send a notification to an SNS topic of your choice.
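The same wiring can be scripted instead of clicked through in the console. This is a minimal sketch assuming boto3-style clients are passed in; the helper name wire_completion_alarm, the “LambdaChain” namespace, and the alarm settings are assumptions for illustration, while put_metric_filter and put_metric_alarm are the underlying CloudWatch Logs and CloudWatch calls.

```python
def wire_completion_alarm(logs, cloudwatch, function_name, topic_arn,
                          namespace="LambdaChain"):
    """Create a metric filter on a function's log group and an alarm on its metric.

    `logs` and `cloudwatch` are expected to behave like boto3.client("logs")
    and boto3.client("cloudwatch"). The namespace and naming scheme are
    assumptions made for this sketch.
    """
    log_group = "/aws/lambda/" + function_name  # default Lambda log group name
    metric_name = function_name + "-completed"

    # Count every log event that contains the phrase "END RequestId".
    logs.put_metric_filter(
        logGroupName=log_group,
        filterName=metric_name,
        filterPattern='"END RequestId"',
        metricTransformations=[{
            "metricName": metric_name,
            "metricNamespace": namespace,
            "metricValue": "1",
        }],
    )

    # Alarm as soon as one completion is seen, and notify the SNS topic.
    cloudwatch.put_metric_alarm(
        AlarmName=metric_name,
        Namespace=namespace,
        MetricName=metric_name,
        Statistic="Sum",
        Period=60,
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        AlarmActions=[topic_arn],
    )
    return metric_name


# Example usage (assumes boto3 and a hypothetical topic ARN):
#   import boto3
#   wire_completion_alarm(boto3.client("logs"), boto3.client("cloudwatch"),
#                         "stop-instances", "arn:aws:sns:...:start-backup")
```

One thing to keep in mind with this approach: an alarm only notifies when its state changes, so it has to leave the ALARM state between runs before it can fire again.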
The SNS topic can be tied to a Lambda function and should be considered a trigger to get the next function started. Tie your CloudWatch alarm for the function that shuts down instances to the SNS topic that is tied to your backup function. Go through the same process of creating a CloudWatch metric filter with an alarm and have that alarm notify a second SNS topic.
The second SNS topic should be tied to a Lambda function that will start your instances back up again.
So, in essence, we’re chaining everything together like this:
Lambda function > CloudWatch log > Metric Filter > Alarm > SNS topic > next Lambda function > CloudWatch log > Metric Filter … and so on. You can daisy chain these Lambda functions together ad infinitum to achieve your desired effect.
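On the receiving end, each “next” function gets the alarm notification wrapped in an SNS event. The handler below is a minimal sketch of how that could be unpacked; run_next_step is a hypothetical placeholder for the real work (creating snapshots or starting instances), and the check on NewStateValue is an assumption about how you would want to filter notifications.

```python
import json


def alarm_names_from_event(event):
    """Return the names of alarms in an SNS event that report the ALARM state."""
    names = []
    for record in event.get("Records", []):
        # CloudWatch alarm notifications arrive as a JSON string in Sns.Message.
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            names.append(message.get("AlarmName"))
    return names


def run_next_step():
    """Hypothetical placeholder for the next link in the chain."""
    return "started"


def lambda_handler(event, context):
    triggered = alarm_names_from_event(event)
    result = run_next_step() if triggered else None
    return {"triggered": triggered, "result": result}
```

Filtering on the alarm state means a notification for the alarm returning to OK does not kick off the next function by accident.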
I recently tried to use the AWS CLI to upload a folder full of files to S3 using a custom KMS key. This is possible with the “aws s3api put-object” command, but I could not do it with the “aws s3 sync” command. If you’re just uploading a few files, this isn’t a big deal, but the frustration grows with each extra file that needs to be uploaded.
The “s3 sync” command is a container for the s3api PUT action, so in order to use it for an entire folder (with a custom KMS key), you would need to write some kind of wrapper for it.
Otherwise, you can use one of the stock encryption keys and upload your entire folder to S3.
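The wrapper idea mentioned above could look something like this. It is a minimal sketch, not a tested tool: the function name build_put_object_commands and the prefix parameter are assumptions, and it only builds one “aws s3api put-object” command per file so you can review them before running anything. It may also be worth checking “aws s3 sync help” on your installed CLI first, since newer releases may accept encryption options directly.

```python
import os


def build_put_object_commands(folder, bucket, kms_key_id, prefix=""):
    """Build one `aws s3api put-object` command (as an argument list) per file.

    The S3 key for each file is its path relative to `folder`, with an
    optional `prefix` in front. Nothing is executed here.
    """
    commands = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            path = os.path.join(root, name)
            # S3 keys use forward slashes regardless of the local OS.
            rel = os.path.relpath(path, folder).replace(os.sep, "/")
            commands.append([
                "aws", "s3api", "put-object",
                "--bucket", bucket,
                "--key", prefix + rel,
                "--body", path,
                "--server-side-encryption", "aws:kms",
                "--ssekms-key-id", kms_key_id,
            ])
    return commands


# Example usage (hypothetical bucket and key ID):
#   import subprocess
#   for cmd in build_put_object_commands("./files", "my-bucket", "my-key-id"):
#       subprocess.run(cmd, check=True)
```

Unlike sync, this uploads every file each time; it does not compare what is already in the bucket.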
A while ago, I went through the setup here ( http://lg.io/2015/07/05/revised-and-much-faster-run-your-own-highend-cloud-gaming-service-on-ec2.html ) to build a gaming machine in AWS and loved the result. It’s a great write-up, and running a gaming machine in the cloud is a lot cheaper than shelling out loads of cash for a new one.
The downside, although it was a one-time downside, was going through many of the settings to create the AMI I needed and to tighten security the way I wanted. Also, the article was written with a Mac client in mind and I run Windows.
So, with my Windows experience and with all the AWS work I’ve been doing lately, I put together a CloudFormation template to automate many of the steps. If you’re looking (more…)