Packaging & deploying Java11-based Lambda functions

Jérôme Van Der Linden · Published in AWS Tip · Mar 27, 2023

Why yet another article on how to build and deploy Java-based Lambda functions? There is already the doc, this one with AWS SAM, this one with AWS CDK, and many others. Well, none of them talks about Java 11 in particular, and most importantly, none of them talks about multi-release JAR files… In this blog post, I’ll explain what happens to your JAR/ZIP on Lambda and what that means for the packages you need to build: their content, their structure, and how you can achieve this with SAM and CDK.

Once upon a time, an error occurred

It all started with the following error:

Exception in thread "main" java.lang.ExceptionInInitializerError
Caused by: java.lang.UnsupportedOperationException: No class provided, and an appropriate one cannot be found.
    at org.apache.logging.log4j.LogManager.callerClass(LogManager.java:555)
    at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:580)
    at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:567)
    at app.App.<clinit>(App.java:11)

The crazy thing is not so much the error itself (yet another log4j error) as the fact that it does not happen when using SAM but does happen with CDK (obviously with the same Java code and the same pom.xml).

The Lambda function code

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import software.amazon.lambda.powertools.logging.Logging;

public class LambdaFunctionHandler implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {
    Logger log = LogManager.getLogger(); // <-- the error happens here

    @Logging(logEvent = true)
    public APIGatewayProxyResponseEvent handleRequest(final APIGatewayProxyRequestEvent input, final Context context) {
        // do some stuff and return...
    }
}

The pom.xml

I won’t paste the complete pom, just the interesting parts, mainly the maven-shade-plugin configuration. If you don’t know this plugin, it builds a fat JAR containing your own compiled code plus all the dependencies, unpacked and merged together. Also note the use of a specific transformer for log4j, due to some errors log4j raises when used from a shaded JAR.

<properties>
    <!-- JAVA 11 -->
    <maven.compiler.source>11</maven.compiler.source>
    <maven.compiler.target>11</maven.compiler.target>
    <!-- recent log4j -->
    <log4j.version>2.20.0</log4j.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-api</artifactId>
        <version>${log4j.version}</version>
    </dependency>
    <!-- ... other dependencies -->
</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.8.1</version>
            <configuration>
                <source>${maven.compiler.source}</source>
                <target>${maven.compiler.target}</target>
            </configuration>
        </plugin>

        <!-- such as described in the doc: https://docs.aws.amazon.com/lambda/latest/dg/java-package.html -->
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.4.1</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <transformers>
                            <transformer implementation="com.github.edwgiz.maven_shade_plugin.log4j2_cache_transformer.PluginsCacheFileTransformer"/>
                        </transformers>
                        <createDependencyReducedPom>false</createDependencyReducedPom>
                    </configuration>
                </execution>
            </executions>
            <dependencies>
                <dependency>
                    <groupId>com.github.edwgiz</groupId>
                    <artifactId>maven-shade-plugin.log4j2-cachefile-transformer</artifactId>
                    <version>2.15</version>
                </dependency>
            </dependencies>
        </plugin>
    </plugins>
</build>

The SAM template

This is actually a very standard one: you can generate it with sam init by selecting template #7 (“Hello World Example With Powertools”).

HelloWorldFunction:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: HelloWorldFunction
    Handler: helloworld.LambdaFunctionHandler::handleRequest
    Runtime: java11
    MemorySize: 1024
    Timeout: 29
    Tracing: Active
    Environment:
      Variables:
        POWERTOOLS_LOG_LEVEL: INFO
        POWERTOOLS_LOGGER_LOG_EVENT: true
        POWERTOOLS_METRICS_NAMESPACE: sam-java-powertools
        JAVA_TOOL_OPTIONS: -XX:+TieredCompilation -XX:TieredStopAtLevel=1

The CDK code

⚠️ Do not use the following code as it does not work with the previous pom & Java code…

// command used to build the lambda function
List<String> functionPackagingInstructions = Arrays.asList(
        "/bin/sh",
        "-c",
        "mvn clean install " +
        "&& cp target/HelloWorldFunction-1.0-SNAPSHOT.jar /asset-output/"
);

// docker image options used to build the lambda function
BundlingOptions.Builder builderOptions = BundlingOptions.builder()
        .image(Runtime.JAVA_11.getBundlingImage())
        .user("root")
        .outputType(BundlingOutput.ARCHIVED); // <- use the archive (JAR file)

// the function itself
Function function = new Function(this, "Function", FunctionProps.builder()
        .runtime(Runtime.JAVA_11)
        .code(Code.fromAsset("HelloWorldFunction", AssetOptions.builder()
                .bundling(builderOptions
                        .command(functionPackagingInstructions)
                        .build())
                .build()))
        .handler("helloworld.LambdaFunctionHandler")
        .memorySize(1024)
        .timeout(Duration.seconds(29))
        .logRetention(RetentionDays.ONE_WEEK)
        .environment(Map.ofEntries(
                new AbstractMap.SimpleEntry<>("POWERTOOLS_LOG_LEVEL", "INFO"),
                new AbstractMap.SimpleEntry<>("POWERTOOLS_LOGGER_LOG_EVENT", "true"),
                new AbstractMap.SimpleEntry<>("POWERTOOLS_METRICS_NAMESPACE", "cdk-java-powertools"),
                new AbstractMap.SimpleEntry<>("JAVA_TOOL_OPTIONS", "-XX:+TieredCompilation -XX:TieredStopAtLevel=1")))
        .build());

What is happening?

SAM does not care about your maven-shade-plugin

If you thought SAM was using your pom.xml to build the Lambda package, that’s not (completely) true! SAM has its own process. Looking at the sam build logs, you will see:

Running JavaMavenWorkflow:CopySource
Running JavaMavenWorkflow:MavenBuild
Running JavaMavenWorkflow:MavenCopyDependency
Running JavaMavenWorkflow:MavenCopyArtifacts
Running JavaMavenWorkflow:CleanUp
Running JavaMavenWorkflow:JavaCopyDependencies

And actually, this is because SAM is using a custom Maven Lambda Builder with the following operations:

  1. CopySource: Copy source project to scratch directory
  2. MavenBuild: Build and package with mvn clean install
  3. MavenCopyDependency: Copy dependencies (from the maven repository to the scratch directory) with mvn dependency:copy-dependencies -DincludeScope=runtime -Dmdep.prependGroupId=true
  4. MavenCopyArtifacts: Copy the compiled classes from target/classes to the artifact directory, and copy the dependencies (all the JAR files retrieved in step 3) to the artifact/lib directory.

SAM will zip this artifact directory during sam deploy and use it as the final package for your function. You can read this blog post (“Building serverless Java applications with the AWS SAM CLI”) for further details.

Two important things here:

- SAM uses the compiled classes (target/classes) but not the package (JAR) built by Maven and the shade plugin.

- Dependencies are copied to a lib directory. This is the key! We’ll come back to it later…

Imitate SAM with CDK

We can actually mimic the SAM behaviour with CDK, using the following snippet (the rest of the code remains the same as before):

List<String> functionPackagingInstructions = Arrays.asList(
        "/bin/sh",
        "-c",
        "mvn clean install " +
        "&& mvn dependency:copy-dependencies -DincludeScope=runtime -Dmdep.prependGroupId=true " +
        "&& mkdir /asset-output/lib " +
        "&& cp -r target/classes/. /asset-output/ " +
        "&& cp -r target/dependency/. /asset-output/lib "
);

BundlingOptions.Builder builderOptions = BundlingOptions.builder()
        .command(functionPackagingInstructions)
        .image(Runtime.JAVA_11.getBundlingImage())
        .user("root")
        .outputType(BundlingOutput.NOT_ARCHIVED); // <- the content is not an archive (JAR) here

Note the bundling output type NOT_ARCHIVED, which lets you provide all the files in the asset-output directory, not just a JAR/ZIP (as with ARCHIVED). CDK will then zip the content of this directory, just like SAM does.

With this code, packaging works just like with SAM: the JAR generated by the maven-shade-plugin is ignored, and the error mentioned at the beginning does not happen.

Multi-release JAR files

In the introduction, I mentioned multi-release JAR files. What are they, and why do we care about them here?

What are multi-release JAR files?

As per the doc:

A multi-release JAR file allows for a single JAR file to support multiple major versions of Java platform releases. For example, a multi-release JAR file can depend on both the Java 8 and Java 9 major platform releases, where some class files depend on APIs in Java 8 and other class files depend on APIs in Java 9. This enables library and framework developers to decouple the use of APIs in a specific major version of a Java platform release from the requirement that all their users migrate to that major version. Library and framework developers can gradually migrate to and support new Java features while still supporting the old features.

Concretely, it means that within the same JAR file, you can provide some code that depends on Java 9 or higher, while remaining compatible with Java 8. The Java 9 code will only be used on a JVM with version 9+.

To do so, you create a project with the following structure:

src/
└── main/
├── java/
| └── com/
| └── jvdl/
| └── myapp/
| ├── App.java
| └── MySuperHelper.java
└── java9/
└── com/
└── jvdl/
└── myapp/
└── MySuperHelper.java
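
To make this concrete, here is what the two versions of MySuperHelper could look like (a minimal sketch of my own; the class name is just the placeholder from the tree above). The default version sticks to Java 8 APIs, while the copy under src/main/java9 uses the Java 9 ProcessHandle API, much like log4j does in its ProcessIdUtil:

// src/main/java/com/jvdl/myapp/MySuperHelper.java (Java 8 compatible)
package com.jvdl.myapp;

import java.lang.management.ManagementFactory;

public class MySuperHelper {
    public long getPid() {
        // Java 8 has no direct API for the current PID: parse the "pid@hostname" JVM name
        String jvmName = ManagementFactory.getRuntimeMXBean().getName();
        return Long.parseLong(jvmName.split("@")[0]);
    }
}

// src/main/java9/com/jvdl/myapp/MySuperHelper.java (used on a JVM 9+)
package com.jvdl.myapp;

public class MySuperHelper {
    public long getPid() {
        // Java 9+ provides ProcessHandle, no string parsing needed
        return ProcessHandle.current().pid();
    }
}

Both versions must expose exactly the same public API, since they are two implementations of the same class.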

After compiling and building the JAR (for example with maven), you get the following content (see the versions directory under META-INF):

com/
└── jvdl/
└── myapp/
├── App.class
└── MySuperHelper.class
META-INF/
├── MANIFEST.MF
└── versions
└── 9
└── com/
└── jvdl/
└── myapp/
└── MySuperHelper.class

And the MANIFEST.MF has an important property:

Multi-Release: true
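
By the way, if you want to check whether the JVM considers a given JAR multi-release, a few lines of Java are enough (a little utility of my own, not an official tool):

import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.zip.ZipFile;

public class CheckMultiRelease {
    public static void main(String[] args) throws IOException {
        // Open the JAR in runtime-versioned mode so that META-INF/versions is honoured
        try (JarFile jar = new JarFile(new File(args[0]), true, ZipFile.OPEN_READ, Runtime.version())) {
            System.out.println(jar.getName() + " -> Multi-Release: " + jar.isMultiRelease());
        }
    }
}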

Why is it important here?

Extract from the Log4j documentation:

As of version 2.9.1 Log4j supports Java 9 but will still work in Java 7 or 8. In this version log4j-api is packaged as a multi-release jar and supports the use of the StackWalker and Process APIs.

log4j-api is packaged as a multi-release jar and contains the following content (truncated):

META-INF
├── MANIFEST.MF
└── versions
└── 9
└── org
└── apache
└── logging
└── log4j
└── util
├── Base64Util.class
├── ProcessIdUtil.class
├── StackLocator.class
└── internal
└── DefaultObjectInputFilter.class

The interesting one here is StackLocator, which is actually used by the getLogger method where we got the error at the beginning:

// in LogManager.java
public static Logger getLogger() {
    return getLogger(StackLocatorUtil.getCallerClass(2));
}

// in StackLocatorUtil.java
public static Class<?> getCallerClass(final int depth) {
    return stackLocator.getCallerClass(depth + 1);
}

If interested, you can have a look at the StackLocator code: Java 8 & Java 9.
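
The gist of it (a simplified sketch, not the actual log4j source): on Java 9+, the StackWalker API gives direct and efficient access to the caller class, something Java 8 could only approximate through reflection or stack trace parsing.

// Simplified sketch of the Java 9+ approach, not the real log4j code
public final class StackLocatorSketch {
    // RETAIN_CLASS_REFERENCE is required to call getDeclaringClass() on the frames
    private static final StackWalker WALKER =
            StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    public static Class<?> getCallerClass(final int depth) {
        // Skip `depth` frames and return the declaring class of the next one
        return WALKER.walk(frames -> frames.skip(depth)
                .findFirst()
                .map(StackWalker.StackFrame::getDeclaringClass)
                .orElse(null));
    }
}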

OK, so in order to have our Java 11 code work with this multi-release log4j JAR, we have to make our own JAR a multi-release JAR too. To do so, we need Multi-Release: true in the MANIFEST.MF.

Coming back to our initial config using the maven-shade-plugin, we need to do this (also see this stackoverflow post):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.4.1</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- Use the Manifest transformer to add the Multi-Release flag -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <manifestEntries>
                            <Multi-Release>true</Multi-Release>
                        </manifestEntries>
                    </transformer>

                    <transformer implementation="com.github.edwgiz.maven_shade_plugin.log4j2_cache_transformer.PluginsCacheFileTransformer"/>
                </transformers>
                <createDependencyReducedPom>false</createDependencyReducedPom>
            </configuration>
        </execution>
    </executions>
    <dependencies>
        <dependency>
            <groupId>com.github.edwgiz</groupId>
            <artifactId>maven-shade-plugin.log4j2-cachefile-transformer</artifactId>
            <version>2.15</version>
        </dependency>
    </dependencies>
</plugin>

But unfortunately, it is still not enough! Let me explain why…

How AWS Lambda handles JAR/ZIP files

First let’s see what’s happening with the SAM package (ZIP file):

The ZIP file is simply unpacked into the $LAMBDA_TASK_ROOT directory. The Java Lambda runtime also adds the lib directory to the classpath. The log4j-api JAR is kept intact and used as is by the JVM (meaning the JVM can read its MANIFEST.MF and the Multi-Release flag).
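
If you want to see this for yourself, a couple of lines in any handler will do (a throwaway debug sketch, nothing more):

// Quick debug sketch: print where Lambda unpacked the package and what the JVM has on its classpath
public class ClasspathDump {
    public static void dump() {
        System.out.println("Task root: " + System.getenv("LAMBDA_TASK_ROOT"));
        System.out.println("Classpath: " + System.getProperty("java.class.path"));
    }
}

You should see /var/task as the task root, and the JARs under /var/task/lib on the classpath.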

Now let’s have a look at our fat JAR: the code and the unpacked libraries sit at the same level in the JAR.

Just like the ZIP file, the JAR file is extracted in the $LAMBDA_TASK_ROOT directory. Unfortunately, as it is not a JAR file anymore, all the content under the META-INF directory is ignored by the JVM: the MANIFEST.MF is not read, the Multi-Release flag is ignored, and all the classes in the versions directory are ignored too, resulting in the issue we had at the beginning where log4j didn’t work properly.

Yes, that means multi-release JAR files do not work in Lambda. Or, to be more precise, they do not work if they are simply unpacked in the $LAMBDA_TASK_ROOT directory.

So let’s try something… Let’s put our fat JAR (created by the maven-shade-plugin) in the lib directory and zip that:

List<String> functionPackagingInstructions = Arrays.asList(
        "/bin/sh",
        "-c",
        "mvn clean install " +
        "&& mkdir /asset-output/lib " +
        "&& cp target/Function-1.0-SNAPSHOT.jar /asset-output/lib "
);

BundlingOptions.Builder builderOptions = BundlingOptions.builder()
        .command(functionPackagingInstructions)
        .image(Runtime.JAVA_11.getBundlingImage())
        .user("root")
        .outputType(BundlingOutput.NOT_ARCHIVED); // <- will zip the lib directory...

Doing that, our JAR file (a multi-release fat JAR, to be precise) ends up under lib, is added to the classpath, and works as expected.

Conclusion

In this article, we saw how AWS Lambda handles Java packages. A best practice, to avoid any issue, is to deploy your code to the $LAMBDA_TASK_ROOT directory and the libraries to the $LAMBDA_TASK_ROOT/lib directory. By the way, this is also recommended in the doc:

Reduce the time it takes Lambda to unpack deployment packages authored in Java by putting your dependency .jar files in a separate /lib directory. This is faster than putting all your function’s code in a single jar with a large number of .class files.

Thus, dependencies, even if they are multi-release JARs, are added to the classpath and loaded normally by the JVM.

Following this recommendation, I would also advise not to use the maven-shade-plugin, which somehow breaks everything by unpacking all the dependencies into the same folder. If you work with SAM, that’s OK, because SAM doesn’t care about it. If you work with CDK or another framework, just use the snippet I gave above to mimic SAM, or use the maven-assembly-plugin to create a ZIP file with the right layout.


Senior Solution Architect @AWS - software craftsman, agile and devops enthusiast, cloud advocate. Opinions are my own.