Local testing of CDK-defined Step Functions state machine

Jérôme Van Der Linden
May 26, 2022

This blog post describes the different pieces of the puzzle needed to unit test a Step Functions state machine defined with CDK code.

CDK-defined state machine

CDK?

CDK stands for Cloud Development Kit, an open-source framework that lets you define your infrastructure as real code (TypeScript, Python, Java, …) rather than just YAML or JSON templates. The CDK CLI translates this code into CloudFormation templates, which are then used to provision the resources on AWS.

CDK provides “Constructs”, the basic building blocks to create resources in AWS. A Construct can be a single AWS resource (for example, an EC2 instance) or a set of resources (for example, a VPC with its subnets and other network components).

State machine definition

In our case, we want to define a Step Functions state machine. AWS provides a Construct Library, where you can search for the resources you need. We will use the Step Functions module. The following code (in TypeScript) shows how to define the different states of the machine and the state machine itself (full code is available here):

CDK2ASL

Later, we will need the state machine definition in the Amazon States Language format (“ASL”, which is JSON). To get it, we have two options (TL;DR: the second one is better):

1. Using outputs

This is the solution I initially used. The StateMachine construct itself does not expose the definition in this format. To get it, we need a lower-level construct, CfnStateMachine, which is the equivalent of the CloudFormation resource (note the Cfn prefix). Fortunately, we can retrieve the lower-level construct from the higher-level one. From this construct, we can then retrieve the definition string and expose it as an output of the CloudFormation stack (just like any CloudFormation output):

To actually retrieve the output, we need to deploy the stack using the CDK CLI with the command cdk deploy -O cdk.out/output.json. It generates a JSON file in the CDK working folder:

{
  "DirectBankAccountCreation": {
    "workflowStateMachineDefinitionE0E0A7BF": "{\"StartAt\":\"Input checks\",\"States\":{\"Input checks\":{\"Type\":\"Parallel\",\"Next\":\"Create User\",\"Branches\":[{\"StartAt\":\"Extract info from ID\",\"States\":{\"Extract info from ID\":{\"Next\":\"Crosscheck Identity\",\"Retry\":...\"User created, account creation initiated\":{\"Type\":\"Succeed\"}}}",
    "userCreationAPIBankAccountCreationApiEndpoint1A0514D2": "https://9xko5dgqbf.execute-api.eu-west-1.amazonaws.com/prod/"
  }
}

This file contains all the outputs of our stack (called “DirectBankAccountCreation”). Each output has a name composed of the construct in which it is defined (“workflow”), the name of the output itself (“StateMachineDefinition”), and a random string to avoid name collisions.

Using jq, a powerful JSON processor for the command line, we can extract the interesting part and get the ASL:

jq -r '.DirectBankAccountCreation|with_entries(select(.key | startswith("workflowStateMachineDefinition"))) | to_entries | .[].value' cdk.out/output.json > cdk.out/state_machine.asl.json
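If jq is not at hand, the same lookup can be sketched in a few lines of TypeScript. This is only a minimal sketch, not code from the original project: the helper name findOutputByPrefix is hypothetical, the sample data stands in for the content of cdk.out/output.json, and the endpoint URL in it is a placeholder.

```typescript
// Shape of the file written by `cdk deploy -O`: stack name -> output key -> value.
type StackOutputs = Record<string, Record<string, string>>;

// Hypothetical helper: returns the value of the single output whose key starts
// with `prefix`, whatever suffix was appended to the key.
function findOutputByPrefix(outputs: StackOutputs, stackName: string, prefix: string): string {
  const stack = outputs[stackName];
  if (stack === undefined) {
    throw new Error(`No stack named "${stackName}" in the outputs`);
  }
  const keys = Object.keys(stack).filter((key) => key.startsWith(prefix));
  if (keys.length !== 1) {
    throw new Error(`Expected exactly one output starting with "${prefix}", found ${keys.length}`);
  }
  return stack[keys[0]];
}

// Inline sample standing in for the parsed content of cdk.out/output.json.
const sampleOutputs: StackOutputs = {
  DirectBankAccountCreation: {
    workflowStateMachineDefinitionE0E0A7BF: '{"StartAt":"Input checks","States":{}}',
    userCreationAPIBankAccountCreationApiEndpoint1A0514D2: "https://example.invalid/prod/",
  },
};

const asl = findOutputByPrefix(
  sampleOutputs,
  "DirectBankAccountCreation",
  "workflowStateMachineDefinition",
);

// The output value is itself a JSON string, so it can be parsed again.
console.log(JSON.parse(asl).StartAt); // prints "Input checks"
```

Failing when zero or several keys match is a deliberate choice here: a silent wrong match would only show up later, when the state machine is created with the wrong definition.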

2. Using a 3rd party tool

While writing this blog post, I discovered cdk-asl-extractor which simplifies this step a lot:

  • First install the tool: npm install -g cdk-asl-extractor
  • Second, use the CDK CLI and the cdk synth command to synthesise the CloudFormation template. It will generate the template in JSON format in the cdk.out folder (named [stackname].template.json).
  • Finally, run the tool with this template as parameter. In my case:

cdk-asl-extractor cdk.out/DirectBankAccountCreation.template.json

  • It will generate as many files as there are state machines in the template. Files are named asl-0.json, asl-1.json, …

There are multiple advantages with this solution:

  • First, it doesn’t require deploying the stack on AWS. Nothing is deployed, so it is faster and costs nothing (although a state machine is free anyway when it doesn’t run).
  • Second, we don’t have a useless output (which can be quite big).
  • And finally, you don’t need to fight with jq for hours to find how to extract a key with a random string 🌀🌩 ☠️❗️💥🤯…

Important consideration: cdk-asl-extractor does not provide proper resource ARNs (obviously, nothing was deployed, so we don’t have any ARN). That’s not an issue if we perform local tests and mock every call to AWS services (we’ll see this in a minute), but if some service integrations are not mocked, Step Functions will fail calling the service and the test will fail…
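To make the ARN limitation more concrete, here is a rough sketch of what extracting a definition from a synthesized template involves. It is my own illustration, not the actual implementation of cdk-asl-extractor: the function names, the UNRESOLVED_AT_SYNTH_TIME placeholder and the sample template are all hypothetical. It assumes the definition is stored in the DefinitionString property, either as a plain string or as an Fn::Join whose non-string parts are references that only get a value at deployment.

```typescript
// Minimal types for the parts of a CloudFormation template used in this sketch.
type Intrinsic = { [fn: string]: unknown };
type DefinitionString = string | { "Fn::Join": [string, Array<string | Intrinsic>] };

interface Template {
  Resources: Record<string, { Type: string; Properties?: { DefinitionString?: DefinitionString } }>;
}

// Hypothetical placeholder for values that only exist after deployment (ARNs, ...).
const UNRESOLVED = "UNRESOLVED_AT_SYNTH_TIME";

// Concatenates the string parts and replaces every intrinsic with the placeholder.
function renderDefinition(definition: DefinitionString): string {
  if (typeof definition === "string") {
    return definition;
  }
  const [separator, parts] = definition["Fn::Join"];
  return parts.map((part) => (typeof part === "string" ? part : UNRESOLVED)).join(separator);
}

// Returns one rendered definition per state machine found in the template.
function extractStateMachines(template: Template): string[] {
  const definitions: string[] = [];
  for (const logicalId of Object.keys(template.Resources)) {
    const resource = template.Resources[logicalId];
    const definition = resource.Properties?.DefinitionString;
    if (resource.Type === "AWS::StepFunctions::StateMachine" && definition !== undefined) {
      definitions.push(renderDefinition(definition));
    }
  }
  return definitions;
}

// Made-up template with one Lambda function and one state machine calling it.
const sampleTemplate: Template = {
  Resources: {
    CreateUserFunction: { Type: "AWS::Lambda::Function" },
    WorkflowStateMachine: {
      Type: "AWS::StepFunctions::StateMachine",
      Properties: {
        DefinitionString: {
          "Fn::Join": [
            "",
            [
              '{"StartAt":"Create User","States":{"Create User":{"Type":"Task","End":true,"Resource":"',
              { "Fn::GetAtt": ["CreateUserFunction", "Arn"] },
              '"}}}',
            ],
          ],
        },
      },
    },
  },
};

const [firstDefinition] = extractStateMachines(sampleTemplate);
console.log(JSON.parse(firstDefinition).States["Create User"].Resource);
// prints "UNRESOLVED_AT_SYNTH_TIME"
```

The Resource field ends up with a value that is not a real ARN, which is harmless as long as the corresponding state is mocked and a problem as soon as it is not.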

Step Functions local

I think it’s pretty unique among AWS services: Step Functions local is a downloadable version of Step Functions that lets you test your state machine locally. Once it is started, you can create a state machine, execute it, and so on, just like you would in the cloud, but on your computer.

The thing is, Step Functions is not a solitary service: it is meant to be integrated with other AWS services, especially since AWS announced the ability to call almost any service and API (more than 200 services and 10,000 APIs as of today). Having Step Functions local is great, but we also need these 200 services… unless we can mock them.

Mocked Service Integrations

In January 2022, AWS announced Mocked Service Integrations for Step Functions Local. This feature lets you provide mock responses for service integrations and avoid reaching out to the cloud when testing a state machine locally.

Before starting Step Functions local, we must first define these mocks. We create a JSON file (MockConfigFile.json) that will contain the following configuration:

  • One or several test cases (TestCases section).
  • Each test case lists the mocked states and the names of the mocked responses.
  • Below, in the MockedResponses section, we have the definitions of each mock: either a Return with the expected payload returned by the mocked service, or a Throw to simulate an error.
  • You can get more details on this file in the doc.

There are 2 important things to note and to remember for later:

  • The name of the state machine: DirectIntegrationTest (line 3)
  • The name of the test case: HappyPath (line 5)
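To give an idea of the structure described above, here is a minimal sketch that builds such a configuration as a TypeScript object and prints it as JSON. The state machine name, the HappyPath test case and the two state names come from this post; the second test case, the mocked response names and every payload are made up for illustration. The small findUndefinedMocks helper is hypothetical too: it only checks that each test case points to a response that actually exists in MockedResponses.

```typescript
// A mocked response is keyed by invocation index ("0" is the first call) and
// either returns a payload or throws an error.
type MockedInvocation =
  | { Return: unknown }
  | { Throw: { Error: string; Cause: string } };

interface MockConfig {
  // state machine name -> test case name -> state name -> mocked response name
  StateMachines: Record<string, { TestCases: Record<string, Record<string, string>> }>;
  MockedResponses: Record<string, Record<string, MockedInvocation>>;
}

const mockConfig: MockConfig = {
  StateMachines: {
    DirectIntegrationTest: {
      TestCases: {
        HappyPath: {
          "Extract info from ID": "ExtractInfoSuccess",
          "Crosscheck Identity": "CrosscheckSuccess",
        },
        IdentityMismatch: {
          "Extract info from ID": "ExtractInfoSuccess",
          "Crosscheck Identity": "CrosscheckFailure",
        },
      },
    },
  },
  MockedResponses: {
    ExtractInfoSuccess: { "0": { Return: { firstName: "Jane", lastName: "Doe" } } },
    CrosscheckSuccess: { "0": { Return: { identityChecked: true } } },
    CrosscheckFailure: {
      "0": { Throw: { Error: "IdentityError", Cause: "Identity does not match" } },
    },
  },
};

// Hypothetical sanity check: lists the references to mocked responses that are
// used by a test case but not defined in MockedResponses.
function findUndefinedMocks(config: MockConfig): string[] {
  const missing: string[] = [];
  for (const machineName of Object.keys(config.StateMachines)) {
    const testCases = config.StateMachines[machineName].TestCases;
    for (const testCaseName of Object.keys(testCases)) {
      const mockedStates = testCases[testCaseName];
      for (const stateName of Object.keys(mockedStates)) {
        const responseName = mockedStates[stateName];
        if (!(responseName in config.MockedResponses)) {
          missing.push(`${machineName}/${testCaseName}/${stateName} -> ${responseName}`);
        }
      }
    }
  }
  return missing;
}

console.log(findUndefinedMocks(mockConfig)); // prints []
// The JSON below is what would be saved as MockConfigFile.json.
console.log(JSON.stringify(mockConfig, null, 2));
```

A typo in a response name is an easy mistake to make in this file, which is why a check like this one can be worth running before starting the container.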

Starting Step Functions local 🚀

Step Functions local comes in 2 versions: as an executable jar file and as a Docker image. I chose the Docker image and run it with the following command:

docker run -p 8083:8083 -d --rm --name stepfunctions-local \
--mount type=bind,readonly,source=$(ROOT_DIR)/test/MockConfigFile.json,destination=/home/StepFunctionsLocal/MockConfigFile.json \
-e SFN_MOCK_CONFIG="/home/StepFunctionsLocal/MockConfigFile.json" \
amazon/aws-stepfunctions-local

The important thing here is the environment variable SFN_MOCK_CONFIG configured with the mock config file previously created. With this, Step Functions local will be able to use the mocks instead of calling the services on AWS.

Creating the state machine

Once Step Functions local is running, we can create our state machine locally. We use the AWS CLI, specify the local endpoint, and pass the ASL definition retrieved previously. Note the name of the state machine: it must be the same as in MockConfigFile.json, so that Step Functions Local can link the state machine to its mocks:

# we can also use the asl-0.json file for --definition
aws stepfunctions create-state-machine \
--endpoint-url http://localhost:8083 \
--definition file://cdk.out/state_machine.asl.json \
--name "DirectIntegrationTest" \
--role-arn "arn:aws:iam::123456789012:role/DummyRole" \
--no-cli-pager

Tests

This was the initial topic of this post: testing! And here we are. We now have Step Functions local running, and able to mock service integrations. We have a state machine available locally that we can execute. All that remains is to write the unit tests.

Setup

To test our state machine, we use the AWS SDK. As shown in the snippet below, we use the SFNClient with a local endpoint and the state machine that we’ve just created locally (DirectIntegrationTest).

To start the execution of the state machine, we must specify which test case defined in MockConfigFile.json we want to use. To do so, on line 12, we suffix the state machine ARN with # and the test case name (e.g. #HappyPath). Thanks to this, Step Functions local replaces all calls to AWS services with the mocked responses of this test case and avoids calls to the cloud.
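The suffix convention itself fits in a tiny helper. The sketch below is only an illustration: withTestCase is a hypothetical name, the ARN is an assumed value (in practice, use the one returned by create-state-machine), and the execution name and event are made up. The resulting object carries the three fields a StartExecution request takes (stateMachineArn, name, input).

```typescript
// Hypothetical helper: append "#<TestCaseName>" to the state machine ARN so that
// Step Functions local selects the matching test case from the mock config file.
function withTestCase(stateMachineArn: string, testCaseName: string): string {
  if (stateMachineArn.includes("#")) {
    throw new Error("The ARN already carries a test case suffix");
  }
  return `${stateMachineArn}#${testCaseName}`;
}

// Assumed ARN: in practice, use the value returned by create-state-machine.
const stateMachineArn =
  "arn:aws:states:us-east-1:123456789012:stateMachine:DirectIntegrationTest";

// Plain object shaped like a StartExecution request; the event content is made up.
const startExecutionInput = {
  stateMachineArn: withTestCase(stateMachineArn, "HappyPath"),
  name: "happy-path-run",
  input: JSON.stringify({ firstName: "Jane", lastName: "Doe" }),
};

console.log(startExecutionInput.stateMachineArn);
// arn:aws:states:us-east-1:123456789012:stateMachine:DirectIntegrationTest#HappyPath
```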

We use the StartExecutionCommand and not the StartSyncExecutionCommand because of a network error (see comment on line 8–9). Thus we cannot retrieve the result of the execution and need to perform a DescribeExecutionCommand to retrieve the status and the output of the state machine (lines 18–33).
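Since the execution is asynchronous, the test has to poll until it reaches a terminal status. Here is a minimal sketch of such a polling loop, not the code of the original project. To keep it runnable without the SDK or a running container, the describe call is injected as a function: with the AWS SDK it would wrap a DescribeExecutionCommand, while here a fake implementation is used. All names (waitForExecution, DescribeFn, fakeDescribe), the polling interval, the attempt limit and the sample output are assumptions.

```typescript
type ExecutionStatus = "RUNNING" | "SUCCEEDED" | "FAILED" | "TIMED_OUT" | "ABORTED";

interface ExecutionDescription {
  status: ExecutionStatus;
  output?: string; // JSON string, available once the execution has succeeded
}

// Function describing an execution; injected so that the sketch runs offline.
type DescribeFn = (executionArn: string) => Promise<ExecutionDescription>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the execution leaves the RUNNING status or the attempts run out.
async function waitForExecution(
  describe: DescribeFn,
  executionArn: string,
  intervalMs = 200,
  maxAttempts = 50,
): Promise<ExecutionDescription> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const description = await describe(executionArn);
    if (description.status !== "RUNNING") {
      return description; // terminal status reached
    }
    await sleep(intervalMs);
  }
  throw new Error(`Execution still running after ${maxAttempts} attempts`);
}

// Fake describe function: RUNNING twice, then SUCCEEDED with a made-up output.
let describeCalls = 0;
const fakeDescribe: DescribeFn = async (): Promise<ExecutionDescription> => {
  describeCalls += 1;
  if (describeCalls < 3) {
    return { status: "RUNNING" };
  }
  return { status: "SUCCEEDED", output: '{"userId":"42"}' };
};

waitForExecution(
  fakeDescribe,
  "arn:aws:states:us-east-1:123456789012:execution:DirectIntegrationTest:demo",
  10,
).then((result) => console.log(result.status, result.output));
```

The attempt limit matters in a test suite: without it, a state machine stuck in RUNNING would hang the whole run instead of failing one test.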

Unit tests

Finally we can add our unit tests. As previously mentioned, we pass the test case name (here “HappyPath”) to get the appropriate mocks, and we simply execute the state machine with a JSON event. We then wait for the end of the execution and assert on the result:

We could go further and get the details of the execution, using the GetExecutionHistory API, to retrieve information about each state. In my case, I just wanted to get the output and the status.
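For those who want to go down that road, one simple use of the history is to list the states the execution went through and assert on that path. The following is a sketch under assumptions: the HistoryEvent interface below only mirrors the two fields used here (type and stateEnteredEventDetails.name), the enteredStates helper is hypothetical, and the sample events are hand-written rather than taken from a real execution.

```typescript
// Local mirror of the fields of a history event that this sketch relies on.
interface HistoryEvent {
  type: string;
  stateEnteredEventDetails?: { name?: string };
}

// Hypothetical helper: names of the states entered, in execution order.
function enteredStates(events: HistoryEvent[]): string[] {
  return events
    .filter((event) => event.type.endsWith("StateEntered"))
    .map((event) => event.stateEnteredEventDetails?.name)
    .filter((name): name is string => name !== undefined);
}

// Hand-written sample using state names from the definition shown earlier.
const sampleHistory: HistoryEvent[] = [
  { type: "ExecutionStarted" },
  { type: "ParallelStateEntered", stateEnteredEventDetails: { name: "Input checks" } },
  { type: "TaskStateEntered", stateEnteredEventDetails: { name: "Extract info from ID" } },
  { type: "TaskStateExited" },
  { type: "TaskStateEntered", stateEnteredEventDetails: { name: "Crosscheck Identity" } },
  { type: "TaskStateExited" },
  {
    type: "SucceedStateEntered",
    stateEnteredEventDetails: { name: "User created, account creation initiated" },
  },
  { type: "ExecutionSucceeded" },
];

console.log(enteredStates(sampleHistory));
```

A test can then compare this list with the expected path, which catches a wrong branch even when the final status and output look fine.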

You can then run your test (npm run test or yarn test or anything else depending on your programming language) and add more tests to verify all the paths in your state machine.

Conclusion

The puzzle is complete: we’ve shown how to locally test a state machine defined with CDK code. I’m a big fan of Step Functions! With its integration with almost all AWS APIs, I’m pretty sure it will become an important piece of serverless architectures, even replacing Lambda functions in many cases. State machines must be tested in the same way we test Lambda functions. They provide “business” logic and deserve the same attention. So, test your state machine!

Jérôme Van Der Linden

Senior Solution Architect @AWS - software craftsman, agile and devops enthusiast, cloud advocate. Opinions are my own.