Local testing of CDK-defined Step Functions state machine
This blog post describes the different pieces of the puzzle needed to unit test a Step Functions state machine defined with CDK code.
CDK-defined state machine
What is CDK?
CDK stands for Cloud Development Kit, an open-source framework that lets you define your infrastructure as real code (TypeScript, Python, Java, …) rather than just YAML or JSON templates. The CDK CLI translates this code into CloudFormation templates, which are then used to provision the resources on AWS.
CDK provides “Constructs”, the basic building blocks to create resources in AWS. A Construct can be a single AWS resource (for example, an EC2 instance) or a set of resources (for example, a VPC with a set of subnets and other network components).
State machine definition
In our case, we want to define a Step Functions state machine. AWS provides a Construct Library, where you can search for the resources you need. We will use the Step Functions module. The following code (in TypeScript) shows how to define the different states of the machine and the state machine itself (full code is available here):
CDK2ASL
Later, we will need the state machine definition in the Amazon States Language format (“ASL”, which is JSON). There are two ways to get it (TL;DR: the second is better):
1. Using outputs
This is the solution I initially used. The StateMachine construct does not expose the definition in this format. To get it, we need a lower-level construct, CfnStateMachine, which is the equivalent of the CloudFormation resource (note the Cfn prefix). Fortunately, we can retrieve the lower-level construct from the higher-level one. From that construct, we can then read the definition string and expose it as an output of the CloudFormation stack (just like any CloudFormation output):
To actually retrieve the output, we need to deploy the stack using the CDK CLI, with the command cdk deploy -O cdk.out/output.json. It generates a JSON file in the CDK working folder:
{
  "DirectBankAccountCreation": {
    "workflowStateMachineDefinitionE0E0A7BF": "{\"StartAt\":\"Input checks\",\"States\":{\"Input checks\":{\"Type\":\"Parallel\",\"Next\":\"Create User\",\"Branches\":[{\"StartAt\":\"Extract info from ID\",\"States\":{\"Extract info from ID\":{\"Next\":\"Crosscheck Identity\",\"Retry\":...\"User created, account creation initiated\":{\"Type\":\"Succeed\"}}}",
    "userCreationAPIBankAccountCreationApiEndpoint1A0514D2": "https://9xko5dgqbf.execute-api.eu-west-1.amazonaws.com/prod/"
  }
}
This file contains all the outputs of our stack (called “DirectBankAccountCreation”). Each output name is composed of the construct in which it is defined (“workflow”), the name of the output itself (“StateMachineDefinition”), and an eight-character hash suffix that CDK appends to keep names unique.
Using jq, a powerful JSON processor for the command line, we can select the output whose key starts with the known prefix and extract the ASL:
jq -r '.DirectBankAccountCreation|with_entries(select(.key | startswith("workflowStateMachineDefinition"))) | to_entries | .[].value' cdk.out/output.json > cdk.out/state_machine.asl.json
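If you would rather stay in TypeScript than use jq, the same selection can be sketched as follows. It assumes the outputs file has the shape shown above (stack name, then output name, then value). The helper name extractDefinition and the inline sample object are illustrative only, not part of any CDK API.

```typescript
// Shape of the file written by `cdk deploy -O`: stack name -> output name -> value.
type StackOutputs = Record<string, Record<string, string>>;

// Returns the first output of `stackName` whose key starts with `keyPrefix`.
// Hypothetical helper that mirrors the jq filter.
function extractDefinition(
  outputs: StackOutputs,
  stackName: string,
  keyPrefix: string
): string {
  const stack = outputs[stackName];
  if (stack === undefined) {
    throw new Error(`Stack "${stackName}" not found in outputs`);
  }
  const entry = Object.entries(stack).find(([key]) => key.startsWith(keyPrefix));
  if (entry === undefined) {
    throw new Error(`No output starting with "${keyPrefix}"`);
  }
  return entry[1];
}

// Inline sample instead of reading cdk.out/output.json (values are made up).
const sampleOutputs: StackOutputs = {
  DirectBankAccountCreation: {
    workflowStateMachineDefinitionE0E0A7BF: '{"StartAt":"Input checks","States":{}}',
    userCreationAPIEndpoint: "https://example.invalid/prod/",
  },
};

const asl = extractDefinition(
  sampleOutputs,
  "DirectBankAccountCreation",
  "workflowStateMachineDefinition"
);
console.log(JSON.parse(asl).StartAt); // prints "Input checks"
```

In a real script, the sample object would be replaced by the parsed content of cdk.out/output.json.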
2. Using a 3rd party tool
While writing this blog post, I discovered cdk-asl-extractor, which simplifies this step a lot:
- First, install the tool:
npm install -g cdk-asl-extractor
- Second, use the CDK CLI and the cdk synth command to synthesize the CloudFormation template. It generates the template in JSON format in the cdk.out folder (named [stackname].template.json).
- Finally, run the tool with this template as parameter. In my case:
cdk-asl-extractor cdk.out/DirectBankAccountCreation.template.json
- It generates as many files as there are state machines in the template. Files are named asl-0.json, asl-1.json, …
There are multiple advantages with this solution:
- First, it doesn’t require deploying the stack on AWS. Nothing deployed means it is faster and costs nothing (to be fair, a state machine is free as long as it doesn’t run).
- Second, we don’t have a useless output (which can be quite big).
- And finally, you don’t need to fight with jq for hours to figure out how to extract a key with a generated suffix 🌀🌩 ☠️❗️💥🤯…
Important consideration: cdk-asl-extractor does not provide proper resource ARNs (nothing was deployed, so there are no ARNs yet). That’s not an issue if we perform local tests and mock every call to AWS services (we’ll see this in a minute), but if some service integrations are not mocked, Step Functions will fail to call the service and the test will fail.
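To see where the ARNs go missing: when a definition references other resources, the synthesized template usually stores DefinitionString as an Fn::Join that mixes literal JSON fragments with Ref or Fn::GetAtt objects, which CloudFormation only resolves at deploy time. The sketch below flattens such a value and substitutes dummy values for the unresolved pieces. It only illustrates the idea and is not how cdk-asl-extractor is implemented; flattenDefinition, the dummy ARN, and the CreateUserFn logical ID are all hypothetical.

```typescript
// A DefinitionString is either a plain string or {"Fn::Join": [separator, parts]},
// where each part is a literal string or an unresolved intrinsic (Ref, Fn::GetAtt, ...).
type Intrinsic = Record<string, unknown>;
type DefinitionString = string | { "Fn::Join": [string, Array<string | Intrinsic>] };

// Assumed stand-ins for values that only exist after deployment.
const DUMMY_ARN = "arn:aws:lambda:us-east-1:123456789012:function:Dummy";
const PSEUDO_PARAMETERS: Record<string, string | undefined> = {
  "AWS::Partition": "aws",
  "AWS::Region": "us-east-1",
  "AWS::AccountId": "123456789012",
};

// Literal parts are kept; pseudo parameters get a fixed value;
// any other intrinsic becomes the dummy ARN.
function resolvePart(part: string | Intrinsic): string {
  if (typeof part === "string") {
    return part;
  }
  const ref = part["Ref"];
  const pseudo = typeof ref === "string" ? PSEUDO_PARAMETERS[ref] : undefined;
  return pseudo ?? DUMMY_ARN;
}

function flattenDefinition(definition: DefinitionString): string {
  if (typeof definition === "string") {
    return definition;
  }
  const [separator, parts] = definition["Fn::Join"];
  return parts.map(resolvePart).join(separator);
}

// Hand-written example of what a synthesized DefinitionString can look like.
const fromTemplate: DefinitionString = {
  "Fn::Join": [
    "",
    [
      '{"StartAt":"Create User","States":{"Create User":{"Type":"Task","Resource":"',
      { "Fn::GetAtt": ["CreateUserFn", "Arn"] },
      '","End":true}}}',
    ],
  ],
};

const flat = flattenDefinition(fromTemplate);
console.log(JSON.parse(flat).States["Create User"].Resource);
```

With every integration mocked, the dummy values are never called, which is why the missing ARNs do not matter for local tests.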
Step Functions local
I think it’s pretty unique for AWS: Step Functions Local is a downloadable version of Step Functions that lets you test your state machine locally. Once it is started, you can create a state machine, execute it, and so on, just like you would in the cloud, but on your computer.
The thing is, Step Functions is not a standalone service. It is meant to be integrated with other AWS services, especially since AWS announced the ability to call almost any service and API (more than 200 services and 10,000 API actions as of today). Having Step Functions Local is great, but we would also need these 200 services… unless we can mock them.
Mocked Service Integrations
In January 2022, AWS announced Mocked Service Integrations for Step Functions Local. It lets you provide mock responses for service integrations and avoids reaching out to the cloud when testing a state machine locally.
Before starting Step Functions Local, we must first define these mocks. We create a JSON file (MockConfigFile.json) that contains the following configuration:
- One or several test cases (TestCases section).
- Each test case lists the mocked states and the mocked response names.
- Below, in the MockedResponses section, we have the definition of each mock: either a Return with the expected payload returned by the mocked service, or a Throw to simulate an error.
- You can get more details on this file in the doc.
There are two important things to note and remember for later:
- The name of the state machine: DirectIntegrationTest (the key under StateMachines)
- The name of the test case: HappyPath (the key under TestCases)
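As a rough illustration of that structure, here is a mock configuration written as a TypeScript object and serialized to JSON. The state machine name, test case name, and state names come from this post; the mocked response names and payloads are invented for the example, and the small missingResponses check is a hypothetical convenience, not something Step Functions Local provides.

```typescript
// One mocked response: keyed by invocation index ("0", "1-2", ...); each entry
// either returns a payload or throws an error.
type MockedResponse = Record<
  string,
  { Return: unknown } | { Throw: { Error: string; Cause: string } }
>;

interface MockConfig {
  // state machine name -> test case name -> state name -> mocked response name
  StateMachines: Record<string, { TestCases: Record<string, Record<string, string>> }>;
  MockedResponses: Record<string, MockedResponse>;
}

const mockConfig: MockConfig = {
  StateMachines: {
    DirectIntegrationTest: {
      TestCases: {
        HappyPath: {
          "Extract info from ID": "ExtractInfoSuccess",
          "Crosscheck Identity": "CrosscheckSuccess",
        },
      },
    },
  },
  MockedResponses: {
    // Payloads are placeholders; real ones depend on the mocked service.
    ExtractInfoSuccess: { "0": { Return: { FirstName: "Jane", LastName: "Doe" } } },
    CrosscheckSuccess: { "0": { Return: { Match: true } } },
    // Not used by HappyPath; shows how an error would be simulated.
    CrosscheckFailure: {
      "0": { Throw: { Error: "States.TaskFailed", Cause: "Identity mismatch" } },
    },
  },
};

// Lists response names that a test case references but that are not defined.
function missingResponses(config: MockConfig): string[] {
  const missing: string[] = [];
  for (const machine of Object.values(config.StateMachines)) {
    for (const testCase of Object.values(machine.TestCases)) {
      for (const responseName of Object.values(testCase)) {
        if (config.MockedResponses[responseName] === undefined) {
          missing.push(responseName);
        }
      }
    }
  }
  return missing;
}

// Content to save as test/MockConfigFile.json.
const mockConfigJson = JSON.stringify(mockConfig, null, 2);
console.log(mockConfigJson);
console.log("Missing responses:", missingResponses(mockConfig));
```

Writing the file from typed code is optional; a hand-written JSON file with the same keys works just as well.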
Starting Step Functions local 🚀
Step Functions Local comes in two versions: an executable jar file and a Docker image. I chose the Docker image and run it with the following command:
docker run -p 8083:8083 -d --rm --name stepfunctions-local \
--mount type=bind,readonly,source=$(ROOT_DIR)/test/MockConfigFile.json,destination=/home/StepFunctionsLocal/MockConfigFile.json \
-e SFN_MOCK_CONFIG="/home/StepFunctionsLocal/MockConfigFile.json" \
amazon/aws-stepfunctions-local
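The important thing here is the environment variable SFN_MOCK_CONFIG, which points to the mock config file created previously. With it, Step Functions Local uses the mocks instead of calling the services on AWS. Note that $(ROOT_DIR) is Makefile variable syntax; if you run the command directly in a shell, use ${ROOT_DIR} or an absolute path instead.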
Creating the state machine
Once Step Functions Local is running, we can create our state machine locally. We use the AWS CLI, specify the local endpoint, and pass the ASL definition retrieved previously. Pay attention to the name of the state machine: it must match the name used in MockConfigFile.json so that Step Functions Local can link the state machine to the mocks:
# we can also use the asl-0.json file as the --definition
aws stepfunctions create-state-machine \
--endpoint-url http://localhost:8083 \
--definition file://cdk.out/state_machine.asl.json \
--name "DirectIntegrationTest" \
--role-arn "arn:aws:iam::123456789012:role/DummyRole" \
--no-cli-pager
Tests
This was the initial topic of this post: testing! And here we are. We now have Step Functions local running, and able to mock service integrations. We have a state machine available locally that we can execute. All that remains is to write the unit tests.
Setup
To test our state machine, we use the AWS SDK. We use the SFNClient with a local endpoint and the state machine that we’ve just created locally (DirectIntegrationTest).
To start the execution of the state machine, we must specify which test case from MockConfigFile.json we want to use. To do so, we suffix the state machine ARN with # and the test case name (for example, #HappyPath). With this suffix, Step Functions Local replaces all calls to AWS services with the mocked responses defined for this test case and makes no calls to the cloud.
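The suffix convention is simple enough to capture in a small helper. The ARN below assumes the us-east-1 region and the 123456789012 account ID, which I believe are the Step Functions Local defaults; adjust it if your setup differs. The helper name withTestCase is made up for this sketch.

```typescript
// Appends "#<testCase>" to a state machine ARN so that Step Functions Local
// picks the matching test case from the mock configuration.
function withTestCase(stateMachineArn: string, testCase: string): string {
  return `${stateMachineArn}#${testCase}`;
}

// Assumed ARN of the state machine created locally.
const localStateMachineArn =
  "arn:aws:states:us-east-1:123456789012:stateMachine:DirectIntegrationTest";

console.log(withTestCase(localStateMachineArn, "HappyPath"));
// prints arn:aws:states:us-east-1:123456789012:stateMachine:DirectIntegrationTest#HappyPath
```

The returned string is the value to pass as stateMachineArn when starting the execution.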
We use the StartExecutionCommand and not the StartSyncExecutionCommand because the latter fails with a network error. As a result, we cannot retrieve the result of the execution directly and need to call DescribeExecutionCommand to retrieve the status and the output of the state machine.
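A minimal polling sketch of that start-then-describe flow follows. To keep it independent of the SDK, the describe call is passed in as a function; in a real test it would wrap a call such as SFNClient.send(new DescribeExecutionCommand({ executionArn })). The names waitForExecution, intervalMs, maxAttempts, and fakeDescribe, as well as the sample output, are illustrative.

```typescript
// Subset of the DescribeExecution response used here.
interface ExecutionDescription {
  status?: string; // e.g. "RUNNING", "SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"
  output?: string; // JSON string, present when the execution succeeded
}

type Describe = (executionArn: string) => Promise<ExecutionDescription>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the execution leaves the RUNNING state or the attempts run out.
async function waitForExecution(
  describe: Describe,
  executionArn: string,
  intervalMs = 200,
  maxAttempts = 50
): Promise<ExecutionDescription> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const description = await describe(executionArn);
    if (description.status !== "RUNNING") {
      return description;
    }
    await sleep(intervalMs);
  }
  throw new Error(`Execution still running after ${maxAttempts} attempts`);
}

// Stand-in for the SDK call: reports RUNNING `runningPolls` times, then SUCCEEDED.
function fakeDescribe(runningPolls: number): Describe {
  let calls = 0;
  return async () =>
    calls++ < runningPolls
      ? { status: "RUNNING" }
      : { status: "SUCCEEDED", output: '{"userId":"42"}' };
}

waitForExecution(fakeDescribe(2), "arn:hypothetical:execution", 1).then((result) => {
  // The output is a JSON string, so it has to be parsed before asserting on it.
  console.log(result.status, JSON.parse(result.output ?? "null"));
});
```

Bounding the number of attempts keeps a stuck execution from hanging the whole test run.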
Unit tests
Finally we can add our unit tests. As previously mentioned, we pass the test case name (here “HappyPath”) to get the appropriate mocks, and we simply execute the state machine with a JSON event. We then wait for the end of the execution and assert on the result:
We could go further and get the details of the execution, using the GetExecutionHistory API, to retrieve information about each state. In my case, I just wanted to get the output and the status.
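For instance, the list of states an execution went through can be derived from the history events. The sketch below works on a minimal subset of the event shape returned by GetExecutionHistory (a type plus an optional stateEnteredEventDetails.name). The helper name enteredStates is hypothetical, and the sample events are hand-written, not real output.

```typescript
// Minimal subset of a history event, as returned by GetExecutionHistory.
interface HistoryEventLike {
  type?: string; // e.g. "ExecutionStarted", "TaskStateEntered", "SucceedStateEntered"
  stateEnteredEventDetails?: { name?: string };
}

// Names of the states entered, in order of appearance.
function enteredStates(events: HistoryEventLike[]): string[] {
  return events
    .filter((event) => event.type?.endsWith("StateEntered"))
    .map((event) => event.stateEnteredEventDetails?.name)
    .filter((name): name is string => name !== undefined);
}

// Hand-written sample using state names from the definition shown earlier.
const sampleEvents: HistoryEventLike[] = [
  { type: "ExecutionStarted" },
  { type: "ParallelStateEntered", stateEnteredEventDetails: { name: "Input checks" } },
  { type: "TaskStateEntered", stateEnteredEventDetails: { name: "Extract info from ID" } },
  { type: "TaskStateExited" },
  {
    type: "SucceedStateEntered",
    stateEnteredEventDetails: { name: "User created, account creation initiated" },
  },
];

console.log(enteredStates(sampleEvents));
```

A test could then assert that a given path visits exactly the expected states, in addition to checking the final status.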
You can then run your tests (npm run test or yarn test or anything else depending on your programming language) and add more tests to verify all the paths in your state machine.
Conclusion
The puzzle is complete: we’ve shown how to locally test a state machine defined with CDK code. I’m a big fan of Step Functions! With its integration with almost all AWS APIs, I’m pretty sure it will become an important piece of serverless architectures, even replacing Lambda functions in many cases. State machines must be tested in the same way we test Lambda functions. They provide “business” logic and deserve the same attention. So, test your state machine!