Coincidentally, Apache Kafka 4.0 was released last week, too. So let’s download it and use it for our experiments. Unpack the distribution and format a directory for the Kafka files:
tar xvf kafka_2.13-4.0.0.tgz
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format --standalone -t $KAFKA_CLUSTER_ID -c config/server.properties
Building an AOT cache is a two-step process. First, a list of all the classes which should go into the archive needs to be generated. This list is then used for creating the archive itself. This feels a bit more convoluted than it should be, and indeed the JEP mentions that simplifying this is on the roadmap.
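Schematically, the workflow looks like this for an arbitrary application (the application name, main class, and file names are placeholders of my own choosing):

```shell
# Step 1: run the application with a representative training workload,
# recording which classes get loaded into an AOT configuration file
java -XX:AOTMode=record -XX:AOTConfiguration=app.aotconf -cp app.jar com.example.App

# Step 2: create the AOT cache from that configuration;
# the application itself is not executed in this step
java -XX:AOTMode=create -XX:AOTConfiguration=app.aotconf -XX:AOTCache=app.aot -cp app.jar com.example.App

# Subsequent runs then point the JVM at the cache
java -XX:AOTCache=app.aot -cp app.jar com.example.App
```

We'll go through exactly these steps for Kafka below, passing the flags via its launcher scripts.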
Create the class list like so:
export EXTRA_ARGS="-XX:AOTMode=record -XX:AOTConfiguration=kafka.aotconf" (1)
bin/kafka-server-start.sh config/server.properties
(1) The EXTRA_ARGS variable can be used to pass additional arguments to the JVM when launching Kafka; here, it specifies that the list of classes for the AOT cache should be recorded in the file kafka.aotconf.
As an aside, Kafka has completely parted ways with ZooKeeper as of the 4.0 release and exclusively supports KRaft for cluster coordination. By using the server.properties file, our single broker runs in the so-called "combined" mode, so it has both the "broker" and "controller" roles. Very nice to see how simple things have become here over the years!
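For reference, combined mode is controlled via the process.roles setting in config/server.properties; the relevant lines look roughly like this (values taken from my understanding of the Kafka 4.0 defaults and may differ in detail):

```properties
# Run the "broker" and "controller" roles within the same process ("combined" mode)
process.roles=broker,controller
node.id=1
# Bootstrap endpoint of the controller quorum for this single-node set-up
controller.quorum.bootstrap.servers=localhost:9093
```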
Once Kafka has started, open a separate shell window. Create a topic in Kafka, then produce and consume a couple of messages like so:
bin/kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092
Created topic my-topic.
bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092
>hello
>world
<Ctrl + C>
bin/kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server localhost:9092
hello
world
<Ctrl + C>
Processed a total of 2 messages
This shows the trade-off involved when creating AOT cache files: strictly speaking, we don’t have to produce and consume messages here, but doing so will in all likelihood trigger the loading of classes which otherwise would only be loaded and linked at runtime. It may be a good idea to monitor which classes get loaded via JDK Flight Recorder, thus making sure you are indeed capturing the relevant set when creating the AOT cache file.
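One way to do that monitoring, as a sketch: JFR’s jdk.ClassLoad event is disabled by default due to its overhead, so it needs to be switched on explicitly (the recording file name is my own choice):

```shell
# Record class-loading events during a run which exercises the workload
export EXTRA_ARGS="-XX:StartFlightRecording=jdk.ClassLoad#enabled=true,filename=classload.jfr"
bin/kafka-server-start.sh config/server.properties

# After stopping the broker, list the loaded classes from the recording
jfr print --events jdk.ClassLoad classload.jfr
```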
Stop the broker by hitting <Ctrl + C>
in the session where you started it.
If you take a look at the kafka.aotconf file, you’ll see that it essentially is a long list of classes to be cached,
as well as other class-related metadata.
The comment at the top still hints at the history of Leyden’s AOT support being built on top of CDS:
# NOTE: Do not modify this file.
#
# This file is generated via the -XX:DumpLoadedClassList=<class_list_file> option
# and is used at CDS archive dump time (see -Xshare:dump).
#
java/lang/Object id: 0
java/io/Serializable id: 1
java/lang/Comparable id: 2
java/lang/CharSequence id: 3
java/lang/constant/Constable id: 4
java/lang/constant/ConstantDesc id: 5
java/lang/String id: 6
java/lang/reflect/AnnotatedElement id: 7
java/lang/reflect/GenericDeclaration id: 8
java/lang/reflect/Type id: 9
java/lang/invoke/TypeDescriptor id: 10
...
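To get a rough idea of how many classes were captured, you can count the non-comment lines of the file. A quick sketch, using a tiny sample file in the same format (the sample contents are mine, abridged from the listing above):

```shell
# A small sample file in the kafka.aotconf format
cat > sample.aotconf <<'EOF'
# NOTE: Do not modify this file.
#
java/lang/Object id: 0
java/lang/String id: 6
EOF

# Count the class entries, skipping comment lines
grep -vc '^#' sample.aotconf
```

Running the same grep against the actual kafka.aotconf file yields the real number of recorded classes.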
Next, let’s try and create the actual AOT cache file.
To do so, specify the -XX:AOTMode=create
option.
Note that the application is not actually executed during this process;
instead, the JVM will only create the AOT cache file and exit again:
export EXTRA_ARGS="-XX:AOTMode=create -XX:AOTConfiguration=kafka.aotconf -XX:AOTCache=kafka.aot" (1)
bin/kafka-server-start.sh config/server.properties
(1) Create the AOT cache using the previously created configuration file.
Uh, oh, something isn’t quite working as expected:
java.lang.IllegalArgumentException: javax.management.NotCompliantMBeanException: com.sun.management.UnixOperatingSystemMXBean: During -Xshare:dump, module system cannot be modified after it's initialized
at java.management/javax.management.StandardMBean.<init>(StandardMBean.java:270)
at java.management/java.lang.management.ManagementFactory.addMXBean(ManagementFactory.java:882)
at java.management/java.lang.management.ManagementFactory.lambda$getPlatformMBeanServer$1(ManagementFactory.java:474)
...
This message was a bit confusing to me: I don’t think I’m interacting with the Java module system in any way? So I sent a message to the leyden-dev mailing list, where I learned that this may be triggered by starting the JVM’s JMX agent. While I was not doing that actively, it is enabled by default by the kafka-run-class.sh launcher script coming with the Kafka distribution. So let’s disable JMX diagnostics and try again:
export KAFKA_JMX_OPTS=" "
bin/kafka-server-start.sh config/server.properties
Some of the classes are skipped for different reasons, but overall, things look much better this time:
[0.908s][warning][cds] Preload Warning: Verification failed for org.apache.logging.log4j.core.async.AsyncLoggerContext
[2.307s][warning][cds] Skipping org/slf4j/Logger: Old class has been linked
[2.307s][warning][cds,resolve] Cannot aot-resolve Lambda proxy because org.slf4j.Logger is excluded
[2.613s][warning][cds ] Skipping jdk/internal/event/Event: JFR event class
[2.615s][warning][cds ] Skipping org/apache/logging/slf4j/Log4jLogger: Unlinked class not supported by AOTClassLinking
[2.615s][warning][cds ] Skipping org/apache/logging/slf4j/Log4jLoggerFactory: Unlinked class not supported by AOTClassLinking
...
AOTCache creation is complete: kafka.aot
A tad concerning that Log4j’s AsyncLoggerContext
class fails verification, but we’ll leave analysis of that for another time.
The AOT cache file has a size of 66 MB in this case.
It is considered an implementation detail and as such is subject to change between Java versions.
Now let’s see what impact using the AOT cache has on Kafka’s start-up time.
To do so, simply specify the name of the cache file when running the application:
export EXTRA_ARGS="-XX:AOTCache=kafka.aot"
bin/kafka-server-start.sh config/server.properties
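With this invocation, the JVM silently falls back to a regular start if the cache cannot be used (for instance, after switching to a different JDK build). If you would rather notice such a fallback than silently lose the start-up benefit, JEP 483 also defines -XX:AOTMode=on, which turns an unusable cache into a hard error; a sketch:

```shell
# Fail at start-up instead of silently ignoring an unusable AOT cache
export EXTRA_ARGS="-XX:AOTCache=kafka.aot -XX:AOTMode=on"
bin/kafka-server-start.sh config/server.properties
```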
I’ve measured the start-up time by comparing the timestamp of the very first log message emitted by Kafka to the timestamp of the message saying "Kafka Server started", always starting from a freshly formatted Kafka logs directory and flushing the page cache in between runs. Averaged over five runs, this took 285 ms on my machine (a 2023 MacBook Pro with M3 Max processor and 48 GB shared memory). In comparison, Kafka took 690 ms to start without the archive, i.e. the AOT cache makes for a whopping 59% reduction of start-up time in this scenario.
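A sketch of that measurement set-up on macOS (the log directory path assumes the Kafka default of /tmp/kraft-combined-logs, and purge requires admin rights):

```shell
# Fresh Kafka logs directory and cold page cache before each run
rm -rf /tmp/kraft-combined-logs
bin/kafka-storage.sh format --standalone -t $KAFKA_CLUSTER_ID -c config/server.properties
sudo purge

export EXTRA_ARGS="-XX:AOTCache=kafka.aot"
bin/kafka-server-start.sh config/server.properties
# Start-up time: timestamp of the "Kafka Server started" log message
# minus the timestamp of the very first log line
```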
When building the AOT cache, you can also disable AOT class loading and linking by specifying the -XX:-AOTClassLinking
option,
effectively resulting in the same behavior you’d get when using AppCDS on earlier Java versions.
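The flag is applied when creating the cache; a sketch, with the cache file name being my own choice to keep it apart from the fully linked one:

```shell
# Create a cache with class loading/linking disabled (AppCDS-like behavior)
export EXTRA_ARGS="-XX:AOTMode=create -XX:AOTConfiguration=kafka.aotconf -XX:AOTCache=kafka-nolinking.aot -XX:-AOTClassLinking"
bin/kafka-server-start.sh config/server.properties
```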
This results in a Kafka start-up time of 327 ms on my laptop,
i.e. the lion’s share of the improvement in the case at hand indeed originates from reading and parsing the class files ahead of time,
with AOT loading and linking only yielding a relatively small additional improvement.
Finally, I’ve also measured how long it takes to start the Kafka native binary in a Docker container (see KIP 974),
which took 118 ms, i.e. less than half of the time it took with the AOT cache.
Keep in mind though that this image is considered experimental and not ready for production,
whereas there shouldn’t be any concern of that kind when running Kafka with the AOT cache on the JVM.