Serverless observability made easy with Lambda Powertools for Java — Tracing

Published in

My Local Farmer Engineering

5 min read · May 10, 2022

This post describes how we simplified our code and improved logging, tracing and metrics collection in our API and Lambda functions, using Lambda Powertools for Java. This first part is focused on tracing.

Disclaimer
I Love My Local Farmer is a fictional company inspired by customer interactions with AWS Solutions Architects. Any stories told in this blog are not related to a specific customer. Similarities with any real companies, people, or situations are purely coincidental. Stories in this blog represent the views of the authors and are not endorsed by AWS.

Last year, we published an article about monitoring and metrics and how we implemented them in our API for deliveries. Thanks to the CloudWatch embedded metric format (EMF), we were able to easily collect business and technical metrics within our Lambda functions.

Recently, we faced some issues with our API and needed to dig into the logs to troubleshoot the problem. Unfortunately, the logs didn’t have the information we needed and the collected metrics were not really helpful either. We decided to enrich our logs and to enable request tracing with AWS X-Ray.

We also heard about Lambda Powertools, a library that actually aims to simplify logging, tracing and collecting metrics in Lambda functions. We decided to give it a try and see if it could help us diagnose and solve our issue.

This blog post describes the integration of Lambda Powertools for Java into the Delivery API and the outcomes. This first part is dedicated to tracing.

Tracing

Even though the Delivery API only contains a few components, we wanted to introduce distributed tracing and get insights on the latency of the different parts of the application: the API Gateway REST API, several Lambda functions and the communication with the database.

On AWS, tracing is provided by X-Ray: it enables visualisation of requests between the different AWS services we use and helps troubleshoot performance issues. Thanks to a correlation id (called “trace id” in X-Ray) passed between all components of the application, we can see where requests go, how long each call to a service takes, and where the bottlenecks are in the whole chain. X-Ray does more than add a trace id to the logs: it comes with a console where we can visualise the traces and get latency metrics.
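
To make the correlation id more concrete: X-Ray propagates it between services in the X-Amzn-Trace-Id HTTP header, and Lambda exposes the same value to the function in the _X_AMZN_TRACE_ID environment variable. The sketch below only illustrates that format. The TraceHeader class is a hypothetical helper written for this post, not part of the X-Ray SDK or Lambda Powertools, and the value used in the usage note is a made-up id in the documented shape.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative parser for the X-Ray tracing header (hypothetical helper, not an AWS API). */
final class TraceHeader {
    private final Map<String, String> fields;

    private TraceHeader(Map<String, String> fields) {
        this.fields = fields;
    }

    /** Splits a value such as "Root=...;Parent=...;Sampled=1" into key/value pairs. */
    static TraceHeader parse(String headerValue) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : headerValue.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                fields.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
            }
        }
        return new TraceHeader(fields);
    }

    /** The trace id shared by every component that handles the request. */
    String traceId() {
        return fields.get("Root");
    }

    /** The id of the calling segment, or null when the header does not carry one. */
    String parentId() {
        return fields.get("Parent");
    }

    /** True when the request was selected for sampling upstream. */
    boolean sampled() {
        return "1".equals(fields.get("Sampled"));
    }
}
```

For example, TraceHeader.parse("Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1").traceId() returns the Root value. Inside a Lambda function, the input could come from System.getenv("_X_AMZN_TRACE_ID"), which returns null outside of Lambda, so a null check would be needed before parsing.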

Tracing can be enabled at the function level, using CDK:

FunctionProps.builder()
    .runtime(Runtime.JAVA_11)
    .code(...)
    .timeout(Duration.seconds(29))
    .memorySize(2048)
    .handler("...")
    .role(role)
    .tracing(Tracing.ACTIVE)
    .build());

AWS Lambda manages the integration with AWS X-Ray. It sends the trace data asynchronously, so it doesn’t increase the duration of your function.

And here is the configuration for the REST API exposed by API Gateway, which also manages the integration with X-Ray:

CfnApiProps.builder()
    .stageName(apiStageName)
    .definitionBody(openapiSpecAsObject)
    .tracingEnabled(true)
    .build());

Once tracing was enabled, we got the following graph, a.k.a. the service map, in the X-Ray console. It displays the different services our Delivery API is using, with the average response time, the number of transactions and the health of each one:

Some users were complaining about long waiting times. To troubleshoot this issue, we needed to go deeper in the traces. To get more details, especially about the communication with RDS, we needed to instrument our code with the X-Ray SDK for Java. This is where Lambda Powertools helped us.

To trace SQL queries, the X-Ray SDK provides interceptors based on the Tomcat JdbcInterceptor, but we cannot use them in our application as we don’t use Tomcat. We can also instrument the code programmatically, but it is a bit verbose and not really convenient.
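
To give a sense of that verbosity, here is a sketch of the pattern that manual instrumentation follows: open a subsegment, run the code, flag any exception, and always close the subsegment. With the X-Ray SDK this is done with AWSXRay.beginSubsegment and AWSXRay.endSubsegment. To keep the example free of dependencies, the ManualTracer class below is a hypothetical in-memory stand-in that only records a name, a duration and a fault flag; it is not the SDK and sends nothing to X-Ray.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical in-memory stand-in for a tracing recorder; NOT the X-Ray SDK. */
final class ManualTracer {

    /** One recorded subsegment: its name, how long it took, and whether it failed. */
    static final class SubsegmentRecord {
        final String name;
        final long durationNanos;
        final boolean fault;

        SubsegmentRecord(String name, long durationNanos, boolean fault) {
            this.name = name;
            this.durationNanos = durationNanos;
            this.fault = fault;
        }
    }

    private final List<SubsegmentRecord> records = new ArrayList<>();

    /** Runs the body inside a named subsegment and always closes it, even on failure. */
    <T> T trace(String name, Supplier<T> body) {
        long start = System.nanoTime();      // corresponds to "begin subsegment"
        boolean fault = false;
        try {
            return body.get();
        } catch (RuntimeException e) {
            fault = true;                     // corresponds to "add the exception to the subsegment"
            throw e;
        } finally {
            // corresponds to "end subsegment": runs on success and on failure
            records.add(new SubsegmentRecord(name, System.nanoTime() - start, fault));
        }
    }

    /** Subsegments recorded so far, in completion order. */
    List<SubsegmentRecord> records() {
        return records;
    }
}
```

Every traced method needs this kind of wrapping, for example tracer.trace("Book_Delivery_Transaction", () -> runQuery()), where runQuery is a placeholder for the database call. This is the boilerplate that an annotation-based approach removes.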

Lambda Powertools comes with an annotation that really simplifies things. To get annotations working, Lambda Powertools needs compile-time weaving, so we add the AspectJ post-compile-weaving plugin and the powertools-tracing dependency to our build.gradle file:

plugins {
  id 'io.freefair.aspectj.post-compile-weaving' version '6.4.1'
}

dependencies {
  aspect 'software.amazon.lambda:powertools-tracing:1.12.1'
}

Then in the code, we can simply add the @Tracing annotation on the handleRequest method:

@Tracing
public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent input, Context context) {
  //...
}

We can also add this annotation to any method within our code to record a subsegment, a finer-grained recording that provides more precise timing information about a specific piece of code: for example, a data processing algorithm or a call to a third-party system. In our case, we’ve added it to all the methods where we execute a query on the database:

@Tracing(segmentName = "Book_Delivery_Transaction")
public Delivery bookDelivery(Integer farmId, Integer slotId, Integer userId) throws SQLException {
   //...
}

Looking at X-Ray traces, we can now drill down into the details of a trace:

X-Ray traces with segments and subsegments

At the bottom of the screen, we have the different segments (in bold) and subsegments of the trace. Now we can analyse which parts of our code take the longest and try to optimise the portions of our system that create the most latency. We also have a trace id that will be very useful later (in metrics and logs).

Using annotations and metadata, we can collect even more information. In our case, we collect the farm id, the user id and, when available, the slot id as annotations. Stack traces and larger JSON payloads can be collected as metadata. This came in handy: the extra context helped us better understand what happened when an issue occurred. With Lambda Powertools, an annotation is added with the following code:

TracingUtils.putAnnotation("farmId", farmId);

Be careful not to add any confidential data, such as personally identifiable information (PII), here. This information is then accessible by clicking on the subsegment. Note that Lambda Powertools automatically adds an annotation whenever there is a cold start:

Trace annotations with additional context (business data and cold start)

Conclusion

By adding tracing to our API, we’ve been able to investigate some latency issues we had, and to improve our code. The service map also provides a great visualisation of our application health and architecture. And the tracing module of Lambda Powertools really simplified the addition of subsegments by reducing the boilerplate code inherent to X-Ray.

The second part of this article will focus on metrics and logging.

See the source code on GitHub.

Written by Jérôme Van Der Linden