Serverless testing strategy

  • Another essential criterion is the accuracy of the feedback: everyone wants reliable information (no flaky tests) and confidence that if the tests pass, the application will work.
  • Lastly, the precision of the feedback is key to determining the origin of an error or a failed test. The more precise the feedback, the faster the fix.

‘Test pyramid’ or ‘testing honeycomb’?

Test pyramid (left) vs Testing honeycomb (right)
Sociable tests (left) vs Solitary tests (right)

Testing options

For the rest of the article, I will use the architecture represented in the following diagram:

Option #1: Solitary tests

As briefly introduced, solitary tests validate a piece of code without its dependencies, completely isolated from the outside world. For serverless applications, this means testing the Lambda function code independently of the cloud. There are different ways to do so:

1. Using mock libraries

At the test level, you can leverage libraries such as Mockito (for Java), Sinon.JS (for JavaScript) or unittest.mock / moto (for Python) to mock dependencies. We generally use them to mock the AWS SDK calls. This AWS blog post (“Mocking modular AWS SDK for JavaScript v3 in Unit Tests”) explains how to use aws-sdk-client-mock to easily mock the AWS SDK for JavaScript v3:

aws-sdk-client-mock example with SNS
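
Independently of any particular library, the underlying mechanism can be sketched in plain TypeScript. The sketch below is only an illustration under stated assumptions, not the aws-sdk-client-mock API: `snsClient`, `notifyOrderCreated`, `stubMethod` and the topic ARN are made-up names, and the client is synchronous for brevity (real SDK calls return promises).

```typescript
// Stand-in for an SDK client (hypothetical); calling it for real fails loudly.
const snsClient = {
  publish(topicArn: string, message: string): string {
    throw new Error(`unexpected network call to ${topicArn} (${message.length} chars)`);
  },
};

const TOPIC_ARN = "arn:aws:sns:eu-west-1:000000000000:orders"; // made-up ARN

// Code under test: hard-wired to the client above.
function notifyOrderCreated(orderId: string): string {
  const message = JSON.stringify({ type: "ORDER_CREATED", orderId });
  return snsClient.publish(TOPIC_ARN, message);
}

// Minimal stub helper: swaps a method and returns a function that restores it.
function stubMethod<T, K extends keyof T>(target: T, key: K, replacement: T[K]): () => void {
  const original = target[key];
  target[key] = replacement;
  return () => {
    target[key] = original;
  };
}

// Test: replace publish with a stub that records its arguments.
const calls: { topicArn: string; message: string }[] = [];
const restore = stubMethod(snsClient, "publish", (topicArn: string, message: string) => {
  calls.push({ topicArn, message });
  return "stubbed-message-id";
});

const messageId = notifyOrderCreated("42");
restore(); // always restore, so other tests see the original client
```

A mocking library essentially automates the `stubMethod` and `calls` bookkeeping, and adds matchers and canned responses on top.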

2. Using hexagonal architecture

When your Lambda function is growing (be careful not to let it grow recklessly), or simply to avoid mocking AWS services, you should think about decoupling the business logic from the external dependencies. Hexagonal architecture can help achieve this by using ports and adapters to isolate the core business logic (also called the domain) from the outside world:

Hexagonal Architecture (ports & adapters)
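
As a rough illustration of ports and adapters, here is a minimal TypeScript sketch. Every name in it (`Order`, `OrderRepository`, `OrderService`, `InMemoryOrderRepository`, `createHandler`) is hypothetical, and the port is synchronous for brevity; a real adapter backed by DynamoDB would return promises and implement the same interface.

```typescript
// Domain model and driven port: no AWS types anywhere in this part.
interface Order {
  id: string;
  amount: number;
}

interface OrderRepository {
  save(order: Order): void;
  findById(id: string): Order | undefined;
}

// Core business logic: depends only on the port.
class OrderService {
  private readonly repository: OrderRepository;

  constructor(repository: OrderRepository) {
    this.repository = repository;
  }

  placeOrder(id: string, amount: number): Order {
    if (amount <= 0) throw new Error("amount must be positive");
    if (this.repository.findById(id)) throw new Error(`order ${id} already exists`);
    const order: Order = { id, amount };
    this.repository.save(order);
    return order;
  }
}

// Adapter used by solitary tests; a cloud-backed adapter would implement the same port.
class InMemoryOrderRepository implements OrderRepository {
  private readonly orders: Order[] = [];

  save(order: Order): void {
    this.orders.push(order);
  }

  findById(id: string): Order | undefined {
    return this.orders.filter((order) => order.id === id)[0];
  }
}

// Driving adapter: the handler only translates the event into a domain call.
function createHandler(service: OrderService) {
  return (event: { body: string }): { statusCode: number; body: string } => {
    try {
      const input = JSON.parse(event.body) as { id: string; amount: number };
      return { statusCode: 201, body: JSON.stringify(service.placeOrder(input.id, input.amount)) };
    } catch (error) {
      return { statusCode: 400, body: JSON.stringify({ message: (error as Error).message }) };
    }
  };
}

// Solitary test setup: the whole domain runs without any cloud dependency.
const handleOrder = createHandler(new OrderService(new InMemoryOrderRepository()));
const created = handleOrder({ body: JSON.stringify({ id: "o-1", amount: 30 }) });
const rejected = handleOrder({ body: JSON.stringify({ id: "o-1", amount: 30 }) });
```

With this split, the business rules (positive amount, no duplicate order) are tested without any mock of the AWS SDK; only the thin adapters remain to be covered by the other kinds of tests.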

3. Using “local cloud” frameworks

Have you ever dreamed of having the cloud on your computer? This is the promise of LocalStack: “A fully functional local cloud stack. Develop and test your cloud and serverless apps offline!”. Looks promising, doesn’t it?! All you have to do is install (pip install localstack) and start (localstack start) LocalStack, then change the AWS SDK endpoint to the local address within your code. Here is an example with DynamoDB:

import AWS = require('aws-sdk');

// Point the SDK at the local LocalStack endpoint instead of AWS
const docClient = new AWS.DynamoDB.DocumentClient({
  region: "eu-west-1",
  endpoint: "http://localhost:4566"
});
  • On partially supported features, there may be (actually, there are) discrepancies with the cloud. Having green tests locally doesn’t mean the application will work when deployed on AWS.
  • Last but not least, it’s not that easy to set up, and you soon find yourself troubleshooting LocalStack rather than your application, adding glue code or even modifying your own application to make it testable against LocalStack. I recommend having a look at this article (“Is it possible to do cloud development without the cloud?”) for more concrete feedback on the tool.

4. Wrap-up

Let’s rate our solitary tests against the four criteria given in the introduction (0 is the lowest score, 5 the highest):

Solitary tests radar: Speed: 5, Precision: 5, Cost: 3, Accuracy: 1
  • Favour a clean architecture: if not hexagonal (which can be heavy to implement), at least keep your business logic separate from all the cloud-specific code (even the handler).

Option #2: Sociable tests with emulated Lambda

We are now moving to the sociable tests, where the tested code is not completely isolated from its dependencies. There are no mocks anymore in this situation. Applied to Lambda, it means the function will actually perform real calls to real AWS services, deployed on a real cloud.

AWSTemplateFormatVersion: 2010-09-09
Description: >-
  sam-sqs
Transform:
  - AWS::Serverless-2016-10-31
Parameters:
  # Use the command: sam deploy --parameter-overrides 'EnvType="test"'
  # to deploy the test environment (and testing SQS queue)
  # Use sam deploy only for production deployment (in a CI/CD pipeline)
  EnvType:
    Description: Environment type.
    Default: prod
    Type: String
    AllowedValues:
      - prod
      - test
    ConstraintDescription: must specify prod or test.
Conditions:
  Testing: !Equals
    - !Ref EnvType
    - test
Resources:
  RealQueue:
    Type: AWS::SQS::Queue
  TestingQueue:
    Type: AWS::SQS::Queue
    Condition: Testing # only created when the condition is met
  SQSPublisher:
    Type: AWS::Serverless::Function
    Properties:
      Description: A Lambda function that sends messages to a queue.
      Runtime: nodejs16.x
      Architectures:
        - x86_64
      Handler: src/handlers/sqs-publisher.handler
      Policies:
        - AWSLambdaBasicExecutionRole
        - SQSSendMessagePolicy:
            QueueName:
              !If [Testing, !GetAtt TestingQueue.QueueName, !GetAtt RealQueue.QueueName]
      Environment:
        Variables:
          # give the appropriate queue reference according to the condition (test or not)
          SQS_QUEUE: !If [Testing, !Ref TestingQueue, !Ref RealQueue]
  SQSConsumer:
    Type: AWS::Serverless::Function
    Properties:
      Description: A Lambda function that logs the payload of messages sent to an associated SQS queue.
      Runtime: nodejs16.x
      Architectures:
        - x86_64
      Handler: src/handlers/sqs-payload-logger.handler
      Events:
        SQSQueueEvent:
          Type: SQS
          Properties:
            # This function remains plugged to the real queue
            Queue: !GetAtt RealQueue.Arn
      MemorySize: 128
      Timeout: 25
      Policies:
        - AWSLambdaBasicExecutionRole
Sociable tests (emulated Lambda) radar: Speed: 3, Precision: 4, Cost: 3, Accuracy: 3
  • The accuracy is a bit better (3) for the interactions of our Lambda functions, but we are still in an emulated environment, with no verification of the permissions.
  • Precision is also pretty good (4), even if not at the level of unit tests as we don’t solely test a specific piece of code but the whole function.
  • As for the cost, I keep a 3 because we can test most of our architecture thanks to the free tier. But automating the tests is also more complex and will take more time.
  • You can optionally use the SAM or Serverless CLI to execute and debug (breakpoints / step-by-step) your function locally.

Option #2bis: Sociable tests

We cannot leverage the CLI tools mentioned in option #2 at scale to build a complete test harness (too slow and hard to automate). So instead of using them, why not simply invoke the handler method of our Lambda function, while keeping the calls to real services? The picture is very similar to the previous one, apart from the actions performed:

@ParameterizedTest
@Event(value = "sqs/sqs_event.json", type = SQSEvent.class)
public void testInjectSQSEvent(SQSEvent event) {
    // test your handleRequest method with this event as parameter
}

@ParameterizedTest
@HandlerParams(
    events = @Events(folder = "apigw/events/", type = APIGatewayProxyRequestEvent.class),
    responses = @Responses(folder = "apigw/responses/", type = APIGatewayProxyResponseEvent.class))
public void testMultipleEventsResponsesInFolder(APIGatewayProxyRequestEvent event, APIGatewayProxyResponseEvent response) {
    // will inject multiple events (API GW requests) and assert responses of the Lambda
}
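
For a Node.js function, the same idea (build an event fixture, call the handler directly) might look like the TypeScript sketch below. It is only a sketch: `handler` here is a hypothetical stand-in that makes no AWS call so the example can run offline, `sqsEventOf` is a made-up helper, and the event type is a reduced subset of the real SQS event. In a real sociable test you would import your own handler and let it reach the real services with your local credentials.

```typescript
// Minimal subset of the SQS event shape that Lambda passes to a handler.
interface SqsRecord {
  messageId: string;
  body: string;
}

interface SqsEvent {
  Records: SqsRecord[];
}

// Stand-in for the function under test (hypothetical). In a real sociable test,
// import your own handler instead, and let it call the real AWS services.
async function handler(event: SqsEvent): Promise<{ processed: string[] }> {
  const processed = event.Records.map((record) => {
    const payload = JSON.parse(record.body) as { orderId: string };
    return payload.orderId;
  });
  return { processed };
}

// Test helper: wraps plain payloads into an SQS event fixture.
function sqsEventOf(...payloads: object[]): SqsEvent {
  return {
    Records: payloads.map((payload, index) => ({
      messageId: `test-${index}`,
      body: JSON.stringify(payload),
    })),
  };
}

// The test calls the handler like any other function: no CLI, no container.
const invocation = handler(sqsEventOf({ orderId: "42" }, { orderId: "43" }));
invocation.then((result) => console.log(result.processed));
```

Because the handler is invoked in-process, the usual test runner features (watch mode, breakpoints, parallel runs) remain available.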
Sociable tests radar: Speed: 4, Precision: 4, Cost: 4, Accuracy: 3
  • Implementation is less complex and therefore cheaper, so the cost score moves to 4 too.
  • Accuracy and precision remain the same.
  • Do not rely solely on these tests: we are still missing some parts (permissions, real events/triggers). See options #3 and #4.

Going further

If you don’t want to bother with a testing SQS queue and want to reduce the risk on the infrastructure part, I would suggest extending the test as in the following picture. It tends to become an integration test and you will lose some precision in the feedback, but you keep your infrastructure as code straight to the point.

  • I would encourage you to have one AWS account per developer. Doing so reduces the risk of going beyond the free tier on a single account and the risk of collisions during tests (everyone using the same SQS queue…). If that is not possible, be sure to deploy one stack per developer and define a naming convention (e.g. “nickname-mystack”).

Option #3: Integration tests

In all the previous options, we are missing two essential aspects:

  • The Lambda function permissions (execution role) to interact with downstream services.
Integration tests radar: Speed: 2, Precision: 3, Cost: 3, Accuracy: 4

Option #4: End-to-end tests

We now arrive at our last option: end-to-end tests. As their name suggests, the aim is to validate that the whole system is well integrated and works from end to end in the cloud. As with the integration tests, we write our tests using JUnit/Jest/… and the AWS SDK. This time, however, we are not splitting the job: we test the whole “feature”.
In our example, we validate that when calling the API with some data, we get the expected output in the RDS database at the end:

E2E tests radar: Speed: 1, Precision: 1, Cost: 1, Accuracy: 5
  • Precision is really low (1) and makes it much harder to troubleshoot.
  • We are losing speed (1): we need to deploy everything, the tests take quite long to execute, and being harder to troubleshoot means we spend much more time on them.
  • Time is money! That’s the main reason why I’m downgrading the cost score to 1.
  • The objective here is to validate only what could not be validated before: integration between each service and permissions. If you can validate something with another kind of test, please do so.
  • Do not run these tests continuously. Unlike solitary and sociable tests, which you want to run hundreds of times a day, these tests are run less often, and generally not from your laptop but from your CI/CD pipeline. Remember they are slow, so don’t wait for them.

Conclusion

We don’t really care about the shape of our tests, be it a triangle, a honeycomb or anything else. Here’s a flower:

  • Don’t rely on sociable tests with emulated Lambda.
  • Favour sociable tests: invoke your handler method locally without mocking external calls to the AWS services. These tests obtain the best score with 3.75/5 based on the four criteria listed in the introduction.
  • Don’t rely on integration tests; prefer end-to-end tests, which provide better accuracy, but don’t over-invest in them. Just validate the infrastructure paths (integration between services: events & messages, configuration, permissions).
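
The 3.75/5 figure appears to be the unweighted mean of the four radar scores given for sociable tests. The short TypeScript sketch below recomputes that mean for every option from the radar values quoted in this article; the equal weighting of the four criteria is an assumption, and the variable names are illustrative.

```typescript
// Radar scores as quoted in each option's wrap-up.
const radars = [
  { name: "Solitary tests", speed: 5, precision: 5, cost: 3, accuracy: 1 },
  { name: "Sociable tests (emulated Lambda)", speed: 3, precision: 4, cost: 3, accuracy: 3 },
  { name: "Sociable tests", speed: 4, precision: 4, cost: 4, accuracy: 3 },
  { name: "Integration tests", speed: 2, precision: 3, cost: 3, accuracy: 4 },
  { name: "End-to-end tests", speed: 1, precision: 1, cost: 1, accuracy: 5 },
];

// Unweighted mean of the four criteria, sorted from best to worst.
const ranking = radars
  .map((radar) => ({
    name: radar.name,
    mean: (radar.speed + radar.precision + radar.cost + radar.accuracy) / 4,
  }))
  .sort((a, b) => b.mean - a.mean);

for (const entry of ranking) {
  console.log(`${entry.name}: ${entry.mean.toFixed(2)}/5`);
}
```

Sociable tests come first with (4 + 4 + 4 + 3) / 4 = 3.75, and end-to-end tests last with (1 + 1 + 1 + 5) / 4 = 2.00. A mean hides the trade-off, though: end-to-end tests are the only option scoring 5 on accuracy, which is why they keep a place in the strategy.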

Jérôme Van Der Linden

Senior Solution Architect @AWS - software craftsman, agile and devops enthusiastic, cloud advocate. Opinions are my own.