Summary
Introduce strictly-initialized fields in the Java Virtual Machine.
Such fields must be initialized before they are read, thus default values such
as 0 or null are never observed.
For strictly-initialized fields that are final, the same value is always
observed.
This is a preview VM feature, available for use
by compilers that emit class files.
Goals
-
Offer designers of JVM-based programming languages a model for field initialization which has stronger integrity guarantees than the present model.
-
Give these designers the flexibility to choose, for each static and instance field in a class, whether to opt in to the new model or continue with the present model.
Non-Goals
-
It is not a goal to introduce new Java language features, such as a strictly-initialized modifier for fields.
-
It is not a goal to change javac compilation strategies in order to impose strict field initialization on existing Java source code.
Motivation
The Java Platform specifies that every variable is initialized before use,
ensuring that a program can never read from uninitialized memory.
If a field in a class — whether a static field or an instance field — is not
initialized explicitly then it is initialized implicitly before it is used, by
being set to a default value.
This value is always some form of zero: the number 0, the boolean false, or a
null reference.
Default values are a mixed blessing. They provide a straightforward safety net, ensuring that a program never observes uninitialized memory, but they can often be misinterpreted as legitimate data rather than as a signal that nothing has yet been written.
For example, a method may read a null value from a field and then pass that on
to other methods and constructors, only to trigger a NullPointerException
somewhere far from where the field was read.
JDK 14 improved the messages in such exceptions to make it
easier to pinpoint the source of the error in a specific line of code, but these
messages cannot direct you back to the initialization bug that supplied the
null in the first place.
The Java Platform also specifies that variables declared final cannot be
mutated, ensuring that any two reads of a final variable produce the same
value.
For final fields, however, this rule does not apply while the class or instance
is being initialized.
A program may thus read different values at different times as the fields are
set to their intended values.
Field initialization bugs in practice
The following example illustrates the problems of unexpected default values and
inconsistent final fields.
In these classes, the final field App.appID may be read by code in the Log
class before it is assigned its proper value.
When that happens, different program components end up working with conflicting
field values.
class App {
    public static final long appID = Log.currentPID(); // [1], [4], [6]
    public static void main() {
        IO.println("App[" + appID + "] has started");
        // ...
        Log.log("Completed 'main'");
    }
}
class Log { // [2]
    private static final String prefix = "App[" + App.appID + "]: "; // [3]
    public static void log(String msg) {
        IO.println(prefix + msg);
    }
    public static long currentPID() {
        return ProcessHandle.current().pid(); // [5]
    }
}
When the class App is run from the command line, the output is something like:
App[96052] has started
App[0]: Completed 'main'
The discrepancy between ID numbers arises because the invocation of
Log.currentPID() in the App class [1] triggers initialization of the Log
class [2], and during that class's initialization, the default 0 value of the
appID field is read [3] and embedded into the prefix string.
After the Log class is initialized, the call to its currentPID method from
the App class [4] proceeds, producing the current process's ID number [5],
which is finally assigned to App.appID [6].
That assignment is, however, too late for the prefix field.
In complex systems, these sorts of bugs are difficult to recognize and diagnose.
One subtlety is that the order of initialization matters: If the Log class is
initialized first, the discrepancy is not observed.
Another subtlety is that the circular dependency between the classes App and
Log is easy to create by mistake and easy to overlook later; if the utility
method currentPID were declared in some other class, the circularity would not
exist and everything would behave as expected.
Most kinds of Java variables do not suffer from these problems. A local variable must be explicitly assigned before it is read, and a final local variable may only be assigned once. Fields are unique in their reliance upon default values.
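The initialization-order hazard described in this section can be reproduced deterministically. The following sketch mirrors the App and Log example but substitutes the fixed number 42 for the process ID so that the result is predictable. The names DemoApp, DemoLog, and InitOrderDemo are illustrative and do not appear in the original example.

```java
// Illustrative stand-ins for App and Log; the constant 42 replaces the process ID.
class DemoApp {
    // Initializing DemoApp calls into DemoLog, which triggers DemoLog's initialization.
    static final long appID = DemoLog.currentID();
}

class DemoLog {
    // Runs while DemoApp is still being initialized in the same thread,
    // so appID still holds its default value, 0.
    static final String prefix = "App[" + DemoApp.appID + "]: ";

    static long currentID() {
        return 42L;
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        System.out.println("appID  = " + DemoApp.appID);  // DemoApp is initialized first
        System.out.println("prefix = " + DemoLog.prefix); // captured the default value
    }
}
```

When DemoApp is initialized first, appID ends up as 42 while prefix has already captured "App[0]: ", which is the same discrepancy seen in the output above.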
A strict approach to field initialization
We propose an alternative approach to initializing fields, both non-final and
final.
Instead of every field being initialized to a default value when it is created,
we alter the JVM to ensure that some fields, designated strictly-initialized,
are explicitly initialized in bytecode before they are allowed to be read.
Compilers such as javac are responsible for choosing which fields are
designated strictly-initialized based on the language features used in source
code.
We call this strict field initialization because it imposes additional
restrictions on the code that initializes fields.
Strict field initialization makes it impossible to have unexpected default values and inconsistent final fields. Every read from a strictly-initialized field observes a previously-written value and, if the field is final, every read observes the same value. These properties are what we already intuitively expect from fields; strict field initialization promotes these properties from mere intuitions to actual integrity guarantees, enforced by the JVM.
Strict field initialization improves integrity
Strict field initialization lays the foundation for two new Java language features:
-
Value classes are new kinds of classes whose instances lack identity and can never be mutated. It is essential that the final instance fields of a value class instance always be observed to have the same value.
-
Null-restricted fields are fields that can never store null. It is essential that these fields, both static and instance, not use null as a default value. They must be explicitly initialized with a non-null value before they can be read.
As shown above, the process of field initialization can be delicate. The JVM must not impose new initialization behavior upon existing programs since they could depend upon the existing behavior. New language features, by contrast, can define new rules and behaviors for field initialization and then adopt strict field initialization. As the language evolves and new features are adopted, program components will gradually be hardened against field initialization bugs.
Description
A strictly-initialized field does not have a default value.
It cannot be read before it has been explicitly initialized and, if it is final,
all reads produce the same value.
Compilers mark fields that are subject to strict initialization with a new flag
in the class file, ACC_STRICT_INIT (0x0800).
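For strictly-initialized fields, the JVM enforces the invariants just stated: a field is never read before it has been explicitly set, and a final field is never observed to change value.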
The invariants of strictly-initialized fields give the JVM new opportunities to optimize uses of those fields. For example, the HotSpot JVM's JIT compiler will treat strictly-initialized final fields as trusted. A trusted final field is known to never change, so once a value has been read from it, subsequent reads can reuse that same value. As a result, JIT-compiled code has fewer interactions with memory and may run faster.
Below, we review the class initialization process in the JVM and discuss new rules for strictly-initialized static fields in more depth. We then review the instance initialization process and discuss new rules for strictly-initialized instance fields.
This is a preview VM feature, disabled by default
The ACC_STRICT_INIT flag denoting a strictly-initialized field is recognized
only in class files with a preview version number (XX.65535), and only when
preview features are enabled at run time.
To enable preview features at run time, use the --enable-preview command-line
option:
$ java --enable-preview Main
Value classes, a new Java language feature, rely upon strict field
initialization: Compilers mark all the fields of value classes as
ACC_STRICT_INIT.
To program with value classes, you must enable preview features at both compile
time and run time in order to enable both value classes and strict field
initialization.
Strict field initialization is a standalone feature in the JVM.
It does not assume that value classes exist, and it can be used by compilers of
non-Java languages.
Regardless of the compiler, class files with fields marked as ACC_STRICT_INIT
can be loaded only if preview features are enabled at run time.
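A tool that processes class files can tell whether a file carries a preview version number by inspecting its first eight bytes: the magic number, then minor_version, then major_version. The sketch below assumes only that standard header layout. The class name PreviewVersionCheck, its methods, and the major version 70 used in main are hypothetical and chosen for illustration.

```java
import java.nio.ByteBuffer;

class PreviewVersionCheck {
    static final int MAGIC = 0xCAFEBABE;
    static final int PREVIEW_MINOR_VERSION = 65535;

    /** Returns true if the class file header carries minor version 65535. */
    static boolean isPreviewClassFile(byte[] classFile) {
        ByteBuffer header = ByteBuffer.wrap(classFile); // big-endian by default, like class files
        if (classFile.length < 8 || header.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = header.getShort() & 0xFFFF; // minor_version precedes major_version
        return minor == PREVIEW_MINOR_VERSION;
    }

    /** Builds only the first eight bytes of a class file, for demonstration. */
    static byte[] header(int major, int minor) {
        return ByteBuffer.allocate(8)
                .putInt(MAGIC)
                .putShort((short) minor)
                .putShort((short) major)
                .array();
    }

    public static void main(String[] args) {
        // 70 is an arbitrary major version used purely for illustration.
        System.out.println(isPreviewClassFile(header(70, 65535)));
        System.out.println(isPreviewClassFile(header(70, 0)));
    }
}
```

The check says nothing about whether the JVM will accept the file; that still depends on preview features being enabled at run time.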
Class initialization today
Whenever a class is loaded by the JVM, it must be initialized.
In bytecode, a class or interface can declare a class initialization method,
named <clinit>, for this purpose.
The class initialization method is free to execute arbitrary code.
Usually, class initialization includes setting all of the class's static fields
to appropriate initial values; it may also involve interactions with global
state.
In Java source code, a class's initialization method is not written directly; it is, rather, an aggregation of the class's static field initializers and static initializer blocks.
Each class in a hierarchy may have its own <clinit> method.
Every superclass must be initialized before executing the <clinit> method of a
subclass.
A class whose initialization has begun but not yet completed is considered larval. It is developing, but not yet fully formed.
The JVM tracks the initialization state of each class at run time. In today's JVM (see JVMS §5.5), a class's initialization state is one of:
-
Uninitialized: The class is loaded, but initialization has not yet started.
-
Larval (within a particular thread): The class is currently being initialized.
-
Initialized: The class has successfully completed initialization, and can be used without restriction.
-
Erroneous: The class failed initialization and may not be used.
The <clinit> method runs while the class is in the larval state.
The class is not yet initialized at this point, but its fields and methods can
be freely accessed by code running in the current thread.
If the <clinit> method completes successfully, the class transitions to the
initialized state.
If an exception is thrown, the class transitions to the erroneous state and can
never become initialized.
The constraints on class initialization are enforced dynamically, at run time.
For example, each getstatic instruction checks the initialization state of the
resolved field's class.
If the class is not initialized, but is in the larval state in another thread,
then the getstatic instruction blocks until initialization completes.
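The transition to the erroneous state can be observed from ordinary Java code. In this sketch, the class names Fragile and ErroneousStateDemo are illustrative. The first attempt to use Fragile runs its class initialization method, which fails; every later attempt finds the class already erroneous.

```java
class Fragile {
    // The static field initializer becomes part of <clinit> and always fails.
    static final int VALUE = compute();

    static int compute() {
        throw new IllegalStateException("initialization failed");
    }
}

class ErroneousStateDemo {
    /** Attempts to read Fragile.VALUE and reports the kind of error the JVM raises. */
    static String probe() {
        try {
            return "read " + Fragile.VALUE;
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe()); // first use: <clinit> runs and throws
        System.out.println(probe()); // later use: the class is already erroneous
    }
}
```

The first probe reports ExceptionInInitializerError, which wraps the original exception, and the second reports NoClassDefFoundError, because initialization is never attempted again.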
Strict initialization of static fields
To implement strict initialization of static fields, we enhance the larval class initialization state to track whether each static field of the class has been set, and whether each static field of the class has been read.
When executing a putstatic or getstatic instruction, if the resolved field
is declared by a class in the larval state in the current thread, the state is
updated to record that the field has been set (by putstatic) or read (by
getstatic).
This occurs even if the field is accessed from another method or class, and even
if the field is accessed through a subclass.
A field declared with the ConstantValue attribute is always considered set.
With this information, the JVM can enforce the invariants of strictly-initialized static fields:
-
If a getstatic instruction attempts to read from a strictly-initialized field declared by a class in the larval state, and that field is not yet set, then the JVM throws an exception, indicating that the field cannot yet be read.
-
If a putstatic instruction attempts to write to a strictly-initialized final field declared by a class in the larval state, and that field has already been read, then the JVM throws an exception, indicating that the field can no longer be set.
-
Just before a class transitions to the initialized state, its larval state is checked to ensure that every strictly-initialized static field has been set; if not, the JVM throws an exception, indicating one of the fields that must be explicitly set during class initialization.
(In some complex cases, such as during exception handling, a static final field may be written multiple times during initialization. This is allowed, but only the ultimate value of the field will be readable.)
The above rules are enforced even if a static field is read or written
reflectively during class initialization via, e.g., the
java.lang.reflect.Field or java.lang.invoke.VarHandle APIs.
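The bookkeeping behind these rules can be modeled in a few lines of Java. This is a hypothetical model for illustration only, not the HotSpot implementation: the class LarvalClassState, its method names, and the use of IllegalStateException are all assumptions, since the text above does not name the exception the JVM throws.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the enhanced larval class state; not the real JVM. */
class LarvalClassState {
    private final Set<String> strictFields;      // all strictly-initialized static fields
    private final Set<String> strictFinalFields; // the subset of those that are final
    private final Set<String> set = new HashSet<>();
    private final Set<String> read = new HashSet<>();

    LarvalClassState(Set<String> strictFields, Set<String> strictFinalFields) {
        this.strictFields = Set.copyOf(strictFields);
        this.strictFinalFields = Set.copyOf(strictFinalFields);
    }

    /** Models getstatic: a strict field must already be set. */
    LarvalClassState onGetStatic(String field) {
        if (strictFields.contains(field) && !set.contains(field)) {
            throw new IllegalStateException(field + " cannot yet be read");
        }
        read.add(field);
        return this;
    }

    /** Models putstatic: a strict final field must not have been read yet. */
    LarvalClassState onPutStatic(String field) {
        if (strictFinalFields.contains(field) && read.contains(field)) {
            throw new IllegalStateException(field + " can no longer be set");
        }
        set.add(field);
        return this;
    }

    /** Models the check made just before the class becomes initialized. */
    void beforeInitialized() {
        for (String field : strictFields) {
            if (!set.contains(field)) {
                throw new IllegalStateException(field + " must be set during class initialization");
            }
        }
    }

    /** Helper for demonstrations: true if the action violates one of the rules. */
    static boolean rejects(Runnable action) {
        try {
            action.run();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

In this model, repeated writes to a final field are accepted as long as no read has intervened, which matches the parenthetical note above about a static final field being written multiple times during initialization.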
Instance initialization today
Whenever a class instance is created with the new bytecode, that instance must
be initialized.
In bytecode, a class can declare multiple instance initialization methods,
named <init>, for this purpose.
These methods are free to execute arbitrary code.
Through a chain of <init> method invocations, every class in an inheritance
hierarchy defines what constitutes an initialized class instance.
Usually, instance initialization includes setting all of the object's instance
fields to appropriate initial values; it may also involve interactions with the
static fields of the class, or other global state.
In Java source code, instance initialization methods are mainly expressed with
constructors, and delegation between constructors is expressed with super(...)
and this(...) calls.
Instance initialization methods may also include code from a class's instance
field initializers and instance initializer blocks.
Each class in a hierarchy has at least one <init> method, and that method
must, at some point before it completes, delegate to another <init> method of
either the current class or its superclass.
This recursion bottoms out at Object::<init>.
An instance whose initialization has begun but not yet completed is, like a class, considered larval. It is developing, but not yet fully formed.
Like classes, instances have an initialization state, although this is expressed only indirectly in the JVM Specification. Today, an object's initialization state is one of:
-
Uninitialized: The object has been created by a new instruction, but initialization has not yet started.
-
Early larval: The object is currently being initialized, and limited operations are available.
-
Late larval: The object is currently being initialized, but is sufficiently mature that it can be used without restriction.
-
Initialized: The object has successfully completed initialization.
-
Erroneous: The object failed initialization and may not be used.
An <init> method begins execution in the early-larval state.
Most operations, including method invocations, are not allowed on an object in
the early-larval state, and the object may not be shared with other code.
However, its fields may be assigned with putfield.
At some point, another <init> method is invoked and the initialization process
continues recursively until it reaches Object::<init>.
At that point, the instance transitions to the late-larval state and, one by
one, the recursively invoked <init> methods complete their execution and
return.
In the late-larval state, use of the object, including its fields and methods,
is unrestricted; the object may even be shared across threads.
The object is considered initialized once the outermost <init> method returns
successfully.
Alternatively, any <init> call in the stack might fail with an exception; in
that case, the object transitions to the erroneous state and can never become
initialized.
The constraints on instance initialization are enforced statically, by the bytecode verifier. Verification determines a type state for each instruction, which is either restricted (for code operating on an instance in the early-larval state) or unrestricted (for code operating on an instance in the late-larval and initialized states, and for code in static methods).
For instructions with restricted type states, the verifier prevents most
operations on the current object.
It also ensures that an unrestricted type state can be reached only via a chain
of recursively delegating <init> calls that eventually reaches
Object::<init>.
The return instruction, which makes a newly constructed object available to
the caller of <init>, is only allowed in an unrestricted type state.
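The freedom granted in the late-larval state is what allows a field to be read before its constructor has assigned it. The following sketch uses the illustrative class names Base and Derived. The superclass constructor runs after Object::<init> has completed, so it may invoke a method that a subclass overrides, and that method reads a subclass field that still holds its default value.

```java
class Base {
    final String seenDuringConstruction;

    Base() {
        // The implicit super() call has already reached Object::<init>, so 'this'
        // is late-larval and may be used freely, including via virtual calls.
        seenDuringConstruction = describe();
    }

    String describe() {
        return "base";
    }
}

class Derived extends Base {
    private final String name;

    Derived(String name) {
        super();          // Base() runs before 'name' is assigned
        this.name = name;
    }

    @Override
    String describe() {
        return "name=" + name;
    }
}

class LateLarvalDemo {
    public static void main(String[] args) {
        Derived d = new Derived("x");
        System.out.println(d.seenDuringConstruction); // value observed by Base()
        System.out.println(d.describe());             // value observed after construction
    }
}
```

The two reads of the final field name disagree: the superclass constructor observes "name=null", while code running after construction observes "name=x".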
Strict initialization of instance fields
To implement strict initialization of instance fields, we enhance the early-larval instance initialization state to track whether each instance field of the class has been set.
In the verifier, this is expressed with a restricted type state that carries a
list of all the current class's strictly-initialized instance fields that have
not yet been set.
A putfield on the current class instance in a restricted type state removes
the named field from the list.
The enhanced type state supports the following rules to enforce the invariants of strictly-initialized instance fields:
-
An invokespecial of a superclass <init> method, applied to the current class instance in a restricted type state, requires that the list of unset fields be empty. (If the invocation is of another <init> method of the same class, there is no such requirement: the invoked method is responsible for setting the fields.)
-
A putfield instruction writing to a strictly-initialized final field of the current class is only allowed in a restricted type state. (In contrast, putfield is allowed throughout the body of an <init> method for final fields that are not strictly initialized.)
It has never been permitted to use getfield on an instance in a restricted
type state.
Thus, there is no rule for getfield analogous to the getstatic rule for
static fields, and no need to track whether final fields have been read.
Jumps between restricted and unrestricted type states are not allowed. Jumps between different restricted type states are allowed, as long as the jump is to a type state in which fewer fields are set.
These verification rules ensure that all strictly-initialized fields of an object are set while it is in an early-larval state, before any reads can occur, and that no strictly-initialized final fields are mutated once the object enters the late-larval state. When the verified code executes, there is no need for additional run-time checks to enforce the initialization invariants.
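The enhanced type state and the rules built on it can be modeled as a small immutable value. This is a hypothetical sketch, not the actual verifier: the class RestrictedTypeState and its method names are assumptions, and the jump rule is interpreted here as "every field that is unset at the jump source is also unset at the target", which permits a target with the same or fewer fields set.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of a restricted type state; not the real bytecode verifier. */
final class RestrictedTypeState {
    // Strictly-initialized instance fields of the current class that are not yet set.
    private final Set<String> unsetFields;

    RestrictedTypeState(Set<String> unsetFields) {
        this.unsetFields = new TreeSet<>(unsetFields);
    }

    /** Models putfield on the current instance: the named field leaves the list. */
    RestrictedTypeState putField(String name) {
        Set<String> remaining = new TreeSet<>(unsetFields);
        remaining.remove(name);
        return new RestrictedTypeState(remaining);
    }

    /** Models invokespecial of a superclass <init>: the list must be empty. */
    boolean maySuperInit() {
        return unsetFields.isEmpty();
    }

    /** Models a jump between restricted states: the target may not assume more set fields. */
    boolean mayJumpTo(RestrictedTypeState target) {
        return target.unsetFields.containsAll(unsetFields);
    }
}
```

Because each operation returns a new state, the model mirrors how verification assigns a separate type state to each instruction rather than mutating one shared record.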
In a class file, the StackMapTable attribute expresses the
expected incoming type state for a jump target.
In the past, a restricted type state has been expressed simply by including the
special type uninitializedThis in the list of local variables.
But when a class has strictly-initialized fields, the type state may also need
to indicate whether each field has been set.
This is accomplished with a new kind of StackMapTable frame entry:
early_larval_frame {
    u1 frame_type = EARLY_LARVAL; /* 246 */
    u2 number_of_unset_fields;
    u2 unset_fields[number_of_unset_fields];
        // array of NameAndType constants
    base_stack_map_frame base_frame;
        // any other kind of stack frame
}
Alternatively, if a stack frame has any other frame_type but mentions
uninitializedThis, the stack frame is implicitly restricted, with unset fields
inferred as whatever fields were unset in the previous frame.
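Reading the leading portion of an early_larval_frame follows directly from the layout shown above. The sketch below assumes only that layout; the class EarlyLarvalFrameReader and its method are hypothetical. It stops before base_frame, whose parsing depends on the other frame kinds and is omitted.

```java
import java.nio.ByteBuffer;

class EarlyLarvalFrameReader {
    static final int EARLY_LARVAL = 246;

    /**
     * Reads frame_type, number_of_unset_fields, and unset_fields, returning the
     * constant-pool indexes and leaving the buffer positioned at base_frame.
     */
    static int[] readUnsetFields(ByteBuffer frame) {
        int frameType = frame.get() & 0xFF;               // u1 frame_type
        if (frameType != EARLY_LARVAL) {
            throw new IllegalArgumentException("not an early_larval_frame: " + frameType);
        }
        int count = frame.getShort() & 0xFFFF;            // u2 number_of_unset_fields
        int[] indexes = new int[count];
        for (int i = 0; i < count; i++) {
            indexes[i] = frame.getShort() & 0xFFFF;       // u2 index of a NameAndType constant
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Two unset fields at constant-pool indexes 5 and 9; 255 stands in for the base frame.
        byte[] bytes = {(byte) 246, 0, 2, 0, 5, 0, 9, (byte) 255};
        ByteBuffer frame = ByteBuffer.wrap(bytes);
        int[] unset = readUnsetFields(frame);
        System.out.println(unset.length + " unset fields; base frame starts at offset " + frame.position());
    }
}
```

ByteBuffer is big-endian by default, which matches the byte order of multi-byte items in class files.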
Strictly-initialized final fields cannot be mutated by deep reflection
Some applications and frameworks use deep reflection, as embodied in the
setAccessible and set methods of the java.lang.reflect.Field API, to
manipulate an object's private or final fields after instance initialization
completes.
In JDK 26, the mutation of final fields by deep reflection is permitted but
causes a warning; in a future release, those who need this capability will have
to enable it explicitly at startup.
(See JEP 500 for more information.)
The mutation of strictly-initialized final fields by deep reflection is
inconsistent with the invariants of strict field initialization: Different reads
of the same final field could observe different values.
The setAccessible method therefore categorizes these fields as non-modifiable,
just as it does for static final fields and the final fields of record classes.
Attempting to set a strictly-initialized final field always throws an
IllegalAccessException.
Using --enable-final-field-mutation=... will not enable mutation of these
non-modifiable fields.
To set a strictly-initialized final instance field of a class, you must employ one of the class's constructors, which has the exclusive ability to assign to the field.
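The behavior of non-modifiable fields can already be seen with record classes, which the text above names as an existing case. In this sketch, which requires a JDK with record classes, the names DemoPoint and RecordMutationDemo are illustrative. It shows the existing treatment of record fields, on the assumption stated above that strictly-initialized final fields are categorized the same way.

```java
import java.lang.reflect.Field;

record DemoPoint(int x, int y) {}

class RecordMutationDemo {
    /** Tries to overwrite a record's final field and reports what happens. */
    static String tryToMutate() {
        DemoPoint p = new DemoPoint(1, 2);
        try {
            Field x = DemoPoint.class.getDeclaredField("x");
            x.setAccessible(true); // suppresses access checks, but the field stays non-modifiable
            x.setInt(p, 99);
            return "mutated to " + p.x();
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryToMutate());
    }
}
```

The attempt fails with IllegalAccessException even though setAccessible(true) succeeded, leaving the constructor as the only way to give the field a value.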
Strictly-initialized fields require custom deserialization
Object deserialization, as embodied in the ObjectInputStream API, skips the
usual execution of an <init> method in the class being instantiated.
Instead, the API does its own construction via reflective library code.
Much like deep reflection, this capability bypasses the verification-based
enforcement of constraints on strictly-initialized instance fields, and cannot
be used for classes that declare these fields.
The ObjectOutputStream::writeObject and ObjectInputStream::readObject
methods therefore throw an InvalidClassException if a class being serialized
or deserialized declares a strictly-initialized instance field and the class is
not a record class.
To avoid this exception, implement the writeReplace and
readResolve methods.
Doing so causes a replacement object to be serialized and deserialized in place
of the object with strictly-initialized fields.
(We anticipate a future enhancement to serialization which allows you to
designate construction code that ObjectInputStream::readObject can use to
safely create new instances from the data in a serialization stream.
This process will rely on regular constructor invocation, and so will be
compatible with strictly-initialized instance fields.)
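The writeReplace and readResolve approach is commonly realized as a serialization proxy. The sketch below is one possible arrangement, not a prescribed one: the class Celsius, its nested Proxy, and the roundTrip helper are hypothetical. Celsius is an ordinary class in today's terms; the comments mark the field that would be strictly initialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class Celsius implements Serializable {
    private static final long serialVersionUID = 1L;
    private final double degrees; // imagine this field is strictly initialized

    Celsius(double degrees) {
        this.degrees = degrees;
    }

    double degrees() {
        return degrees;
    }

    /** Serialization writes the proxy instead of this object. */
    private Object writeReplace() {
        return new Proxy(degrees);
    }

    /** Plain carrier whose fields are not strictly initialized. */
    private static final class Proxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double degrees;

        Proxy(double degrees) {
            this.degrees = degrees;
        }

        /** Deserialization rebuilds the real object through its constructor. */
        private Object readResolve() {
            return new Celsius(degrees);
        }
    }

    /** Serializes and deserializes a value in memory, for demonstration. */
    static Celsius roundTrip(Celsius original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Celsius) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only the proxy is ever created by the serialization machinery's reflective construction; the Celsius instance that comes back is produced by a regular constructor call.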
Supporting changes
-
In the java.lang.reflect.Field class, the existing accessFlags method and a new isStrictInit method reflect the presence of the ACC_STRICT_INIT flag on fields.
-
The java.lang.classfile API supports the ACC_STRICT_INIT access flag on fields and early_larval_frame entries in StackMapTable attributes. When a StackMapTable is automatically generated for an <init> method, it properly encodes the status of strictly-initialized instance fields.
-
The javap tool displays the ACC_STRICT_INIT modifier and early_larval_frame entries; it also displays the implicit unset fields of other StackMapTable entries.
-
The AsmTools utilities similarly support the ACC_STRICT_INIT flag and early_larval_frame entries.
Alternatives
-
Fields that have a ConstantValue attribute, a longstanding feature of the JVM, can be thought of as already being strictly initialized: The given value is assigned to the field before any user code can attempt to read the field. But the attribute only works on static fields with a primitive type or type String, and, unsurprisingly, can only assign constant values. Many use cases for strict field initialization need to allow initial values to be derived from constructor parameters or computed with general-purpose bytecode.
-
In JDK 21, the javac compiler began to issue warnings to discourage invocations of instance methods from superclass constructors. These warnings help prevent late-larval objects from being shared for general use before their fields have been properly initialized:
class Parent {
    Parent() {
        super();
        // warning: 'this' may not be fully initialized:
        OtherClass.foo(this);
    }
}
class Child extends Parent {
    String s;
    Child(String s) {
        super();
        this.s = s;
    }
}
Warnings about the handling of late-larval objects are useful, but warnings can be ignored, and a subclass author cannot always control the coding conventions enforced in a superclass. Strict field initialization instead requires that fields be assigned while the object is in the early-larval state, before there is any possibility of leaking the object to outside code.
-
In some situations, you may wish to dynamically guarantee that a field is initialized before it is read, but without being forced to compute the field's value at initialization time. Rather than adding such complexity to the JVM, this kind of behavior is best provided via libraries.
For example, you can use a lazy constant to model a final variable with initialization code that executes on-demand, at the first attempt to read it:
class Constants {
    final LazyConstant<String> s = LazyConstant.of(() -> lazyInitializer());
}
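To make the library-based approach concrete, here is a minimal hand-written stand-in that runs its initializer on the first read and never again. It is not the LazyConstant API and makes no claim about how that API is implemented; the names SimpleLazy and LazyDemo are hypothetical, and the synchronized method is simply the most straightforward way to keep the sketch thread-safe.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a lazily initialized constant; not the LazyConstant API. */
final class SimpleLazy<T> {
    private final Supplier<? extends T> initializer;
    private T value;
    private boolean initialized;

    SimpleLazy(Supplier<? extends T> initializer) {
        this.initializer = initializer;
    }

    /** Runs the initializer on the first call only; later calls return the cached value. */
    synchronized T get() {
        if (!initialized) {
            value = initializer.get();
            initialized = true;
        }
        return value;
    }
}

class LazyDemo {
    static int computations = 0; // counts how often the initializer runs

    static final SimpleLazy<String> GREETING = new SimpleLazy<>(() -> {
        computations++;
        return "hello";
    });
}
```

Every caller of GREETING.get() observes the same value, and the field holding the SimpleLazy object can itself be assigned eagerly, so the laziness lives entirely in library code rather than in the JVM's field initialization rules.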
Risks and Assumptions
-
New JVM features are costly. We anticipate that there will be multiple meaningful use cases for strict field initialization, which together will justify its cost. This depends, however, on the success of new language features that rely on the new integrity guarantees, such as those discussed earlier. It also depends on developers being willing to adopt alternatives to the traditional top-to-bottom instance initialization sequence.
-
There is a small risk that existing tools may set the ACC_STRICT_INIT flag on a field by mistake. The access flag value 0x0800 was historically used to indicate strictfp methods, which opted in to special strict floating-point semantics that became obsolete in Java 17. The chance of confusion is low, however, since strictfp is relevant only in class files of version 60 or earlier, while ACC_STRICT_INIT is relevant only in class files of version XX.65535.