Why do we need CDK bootstrap?
3 min read, last updated on 2020-12-22
Last time, during deployment, we skimmed over one very important aspect of the CDK: we did not talk about bootstrap, why we need it, or how CDK works under the hood.
What is the bootstrap process?
When we ran the deployment last time, the bootstrapping process was executed for us on AWS behind the scenes.
It is a separate subcommand in the AWS CDK command-line interface responsible for populating a given environment (that is, a combination of AWS account and region) with resources required by the CDK to perform deployments into that environment.
# If you want to execute this manually, you can use
# the following command (assuming you have AWS
# credentials configured).
$ cdk bootstrap
Currently, it creates just an S3 bucket, but AWS is planning much more (e.g., ECR repository, IAM roles, etc.), and you can read about it here.
Why do we need a bucket? It is used for holding the file assets and the resulting CloudFormation template to deploy.
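Because bootstrapping is done per environment, it can help to spell out which account and region pairs are involved. Here is a minimal Python sketch that builds the `cdk bootstrap` command line for a list of environments, using the `aws://ACCOUNT/REGION` form that the command accepts. The `Environment` type, the `bootstrap_command` helper, and the account number are hypothetical and exist only for illustration.

```python
from typing import List, NamedTuple


class Environment(NamedTuple):
    """A CDK environment: one AWS account combined with one region."""
    account: str
    region: str

    def uri(self) -> str:
        # Form accepted by `cdk bootstrap`: aws://ACCOUNT/REGION
        return f"aws://{self.account}/{self.region}"


def bootstrap_command(environments: List[Environment]) -> List[str]:
    """Build the argument list for bootstrapping the given environments."""
    return ["cdk", "bootstrap", *[env.uri() for env in environments]]


# Placeholder account number, not a real one.
envs = [
    Environment("111111111111", "eu-west-1"),
    Environment("111111111111", "us-east-1"),
]
print(" ".join(bootstrap_command(envs)))
```

Each environment listed this way gets its own set of bootstrap resources, so the same account bootstrapped in two regions ends up with two buckets.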
How does CDK work under the hood?
So far, so good. Our generated AWS CloudFormation templates will be deployed with the use of the created S3 bucket. However, that’s not all.
Let’s review the resulting template generated with cdk synth:
# ...
VPCIGWB7E252D3:
  Type: AWS::EC2::InternetGateway
  Properties:
    Tags:
      - Key: Name
        Value: FirstVPC/VPC
  Metadata:
    aws:cdk:path: FirstVPC/VPC/IGW
# ...
SSHSG26D56496:
  Type: AWS::EC2::SecurityGroup
  Properties:
    # ...
  Metadata:
    aws:cdk:path: FirstVPC/SSH-SG/Resource
CDKMetadata:
  Type: AWS::CDK::Metadata
  Properties:
    Modules: aws-cdk=1.33.0,@aws-cdk/aws-cloudwatch=1.33.0,...
  Condition: CDKMetadataAvailable
Let’s start at the end. Information about how the AWS CDK is used, including the versions of the libraries used by AWS CDK applications, is collected and reported through a resource identified as AWS::CDK::Metadata. This resource is added to AWS CloudFormation templates and can easily be reviewed. The information can also be used to identify stacks that rely on a package with known serious security or reliability issues, so that their users can be contacted with relevant information.
By default, it records the name and version of the following modules when they are loaded at synthesis time:
- AWS CDK Core,
- AWS Construct Library,
- AWS Solutions Constructs.
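To review what is reported, the Modules value can be split into name and version pairs. Below is a minimal Python sketch that does this for the two entries visible in the template above. The `parse_modules` helper is hypothetical, and the sketch assumes the comma-separated `name=version` format shown in that output.

```python
from typing import Dict


def parse_modules(modules: str) -> Dict[str, str]:
    """Split a 'name=version,name=version' string into a dictionary."""
    versions = {}
    for entry in modules.split(","):
        entry = entry.strip()
        if "=" not in entry:
            continue  # skip empty or elided entries such as "..."
        # Split on the last "=" so the module name is kept whole.
        name, _, version = entry.rpartition("=")
        versions[name] = version
    return versions


# Only the entries that are visible in the template excerpt.
modules = "aws-cdk=1.33.0,@aws-cdk/aws-cloudwatch=1.33.0"

for name, version in parse_modules(modules).items():
    print(f"{name}: {version}")
```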
However, CDKMetadata is just one thing. The tool also leverages tags (where the resource supports them) and per-resource Metadata inside the CloudFormation template. It saves an attribute in Metadata called aws:cdk:path, which represents the path where the resource is located inside the infrastructure graph.
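To make the aws:cdk:path attribute more tangible, here is a minimal Python sketch that walks a template and maps each logical name to its path. The `template` dictionary is a hand-copied excerpt of the output above, placed under the Resources section where CloudFormation keeps resource definitions, and `paths_by_logical_id` is a hypothetical helper, not part of the CDK.

```python
from typing import Dict

# Hand-copied excerpt of the synthesized template, expressed as a dict.
template = {
    "Resources": {
        "VPCIGWB7E252D3": {
            "Type": "AWS::EC2::InternetGateway",
            "Metadata": {"aws:cdk:path": "FirstVPC/VPC/IGW"},
        },
        "SSHSG26D56496": {
            "Type": "AWS::EC2::SecurityGroup",
            "Metadata": {"aws:cdk:path": "FirstVPC/SSH-SG/Resource"},
        },
        "CDKMetadata": {
            "Type": "AWS::CDK::Metadata",
            "Condition": "CDKMetadataAvailable",
        },
    }
}


def paths_by_logical_id(tpl: dict) -> Dict[str, str]:
    """Return {logical name: aws:cdk:path} for resources that carry a path."""
    result = {}
    for logical_id, resource in tpl.get("Resources", {}).items():
        path = resource.get("Metadata", {}).get("aws:cdk:path")
        if path is not None:
            result[logical_id] = path
    return result


for logical_id, path in paths_by_logical_id(template).items():
    print(f"{logical_id} -> {path}")
```

A lookup like this is handy when a CloudFormation error mentions only a logical name and you want to find the matching place in your code.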
How does it calculate diffs?
Last but not least, let’s talk about diffs. There is a dedicated command called cdk diff that lets us compare the locally synthesized template with the one that is already deployed remotely (with the configuration drift feature as a possible third source). How can it do that?
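You can see that the CDK tool adds a hash at the end of the logical name inside a template. This hash is calculated from the resource’s path in the infrastructure graph (the same path that is stored in aws:cdk:path), not from its attributes, so the logical name stays stable as long as the resource stays in the same place. Thanks to that stability, CDK can match each resource in the freshly synthesized template with its counterpart in the deployed one and report which attributes have changed between the sources.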
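The comparison step can be pictured with a small Python sketch: resources from the deployed template and from the locally synthesized one are matched by logical name, and every resource whose definition differs is reported. This is a simplification for illustration and not how cdk diff is actually implemented. The `diff_templates` helper and the sample property values are hypothetical.

```python
from typing import Dict, List


def diff_templates(deployed: dict, local: dict) -> Dict[str, List[str]]:
    """Match resources by logical name and classify the differences."""
    old = deployed.get("Resources", {})
    new = local.get("Resources", {})
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            name for name in old.keys() & new.keys() if old[name] != new[name]
        ),
    }


# What is currently deployed (made-up property values).
deployed = {
    "Resources": {
        "SSHSG26D56496": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {"GroupDescription": "SSH access"},
        },
    }
}

# What a fresh synthesis produced locally (made-up property values).
local = {
    "Resources": {
        "SSHSG26D56496": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {"GroupDescription": "SSH access from the office"},
        },
        "VPCIGWB7E252D3": {"Type": "AWS::EC2::InternetGateway"},
    }
}

result = diff_templates(deployed, local)
print(result)
```

Matching by logical name is what makes a change to an existing resource show up as "changed" rather than as one removal plus one addition.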
Summary
So now we know how CDK works behind the scenes. Lifting the veil not only helps with understanding but also demystifies the tool itself. And that’s great, because after the first tutorials the tool can look very complicated and its process rather opinionated.
In the next article, we will talk about testing: how we can shorten the feedback cycle thanks to the unit tests, improve reliability, reduce complexity, and build a safety net for our infrastructure code.