Java: Native Memory Handling is Getting Easier
JDK 19, 20 and 21 ship a preview API called 'Foreign Function & Memory API' that makes interacting with native memory and native libraries way easier. Before, you had to use JNI, JNA or JNR to interact with native libraries.
Further, interacting with 'native' memory was clunky. You either used ByteBuffers,
which have a dated API, a maximum addressable size of 2 GByte, and rely on the GC
for 'freeing' unless you do ugly reflection hacks. Or you went down the Unsafe
route.
Anyway, if your project allows it, I would heavily recommend the new APIs. If it is a project you deploy in your backend, you can control the JDK. If it is a new library, I would still consider starting with the new APIs, so that the library can be kick ass by the time it is mature.
Hello MemorySegment
Here is the first snippet: we allocate some native memory and write hello world to it:
var helloWorld = "Hello World";
// Size the buffer by the UTF-8 byte count (not the char count) + 0 byte terminator.
var bufferLen = helloWorld.getBytes(java.nio.charset.StandardCharsets.UTF_8).length + 1;
var m = MemorySegment.allocateNative(bufferLen, SegmentScope.auto());
// this writes C-style string with a 0 byte terminator.
m.setUtf8String(0, helloWorld);
// reads a C-style string with 0 byte terminator
String readBackString = m.getUtf8String(0);
System.out.println(readBackString);
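One caveat on sizing the buffer: a C-style string needs room for the UTF-8 bytes plus the terminator, and the number of UTF-8 bytes only equals String.length() for plain ASCII text. A minimal sketch using only standard JDK calls; the class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Bytes needed for a C-style copy of the string: UTF-8 payload plus the 0 terminator.
    static int cStringSize(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 1;
    }

    public static void main(String[] args) {
        String ascii = "Hello World";
        String umlauts = "Gr\u00fc\u00dfe"; // 5 chars, but two of them take 2 bytes in UTF-8

        System.out.println(ascii.length() + 1);   // 12
        System.out.println(cStringSize(ascii));   // 12
        System.out.println(umlauts.length() + 1); // 6, too small for the native buffer
        System.out.println(cStringSize(umlauts)); // 8
    }
}
```

For "Hello World" both ways of counting agree, which is why a length()-based size only breaks once non-ASCII text shows up.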
Ok, this doesn't look that exciting, but note:
- All indexes are longs, no more 2 GByte limit! Use those 100s of GBytes of memory in a server, or memory map large files.
- Not stateful: there is no .flip() like in ByteBuffers. It acts like a plain section of memory.
- There are bounds checks enforced.
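For contrast, here is a small sketch of the ByteBuffer statefulness that a MemorySegment avoids. It uses only standard java.nio calls; the class and method names are hypothetical:

```java
import java.nio.ByteBuffer;

class ByteBufferFlipDemo {
    // Writes two ints, flips, then reads the first one back through the buffer's cursor.
    static int writeThenRead() {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16); // capacity is an int, hence the 2 GByte ceiling
        buffer.putInt(42);      // relative put: advances the position to 4
        buffer.putInt(7);       // position is now 8
        buffer.flip();          // limit = 8, position = 0: switch from writing to reading
        return buffer.getInt(); // relative get: reads at position 0
    }

    // Same usage, but the flip() is forgotten.
    static int readWithoutFlip() {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        buffer.putInt(42);
        buffer.putInt(7);
        return buffer.getInt(); // reads at position 8, which is still zero-filled
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead());   // 42
        System.out.println(readWithoutFlip()); // 0
    }
}
```

Forgetting the flip() does not fail loudly, it just reads the wrong bytes. With explicit offsets on every access there is no cursor to get out of sync.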
Free the memory
How does the allocated memory get freed? In the above example, it's freed by the GC.
Are we right back to the issues with ByteBuffer, where you do not know when memory is freed?
Nope, we have more control this time. allocateNative takes a scope:
// Freed by the GC (worst case never!)
var m = MemorySegment.allocateNative(bufferLen, SegmentScope.auto());
// Never freed, until the JVM stops
var m = MemorySegment.allocateNative(bufferLen, SegmentScope.global());
For more control, you can use an Arena, which gives an explicit lifetime.
I think this is what you should do whenever possible.
try (var arena = Arena.openConfined()) {
    var m = MemorySegment.allocateNative(6L * 1024 * 1024 * 1024, arena.scope());
    // use the memory
} // segment region deallocated here
However, this has one major limitation: the memory segment can only be accessed from the thread which created it. This ensures that when the scope closes, no other thread is looking at that memory.
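The thread confinement can be observed directly. The sketch below is an assumption-laden variant: it targets JDK 22 or newer, where the API is final and the names differ from the preview used in this post (Arena.ofConfined() instead of Arena.openConfined(), allocation via arena.allocate(...)). The class and method names are made up:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.concurrent.atomic.AtomicReference;

class ConfinedArenaDemo {
    // Returns the exception a second thread gets when it touches a confined segment.
    static Throwable accessFromOtherThread() {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(8);
            segment.set(ValueLayout.JAVA_BYTE, 0, (byte) 1); // fine: we are the owner thread

            Thread other = new Thread(() -> {
                try {
                    segment.get(ValueLayout.JAVA_BYTE, 0);   // not the owner thread
                } catch (RuntimeException e) {
                    failure.set(e);
                }
            });
            other.start();
            try {
                other.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return failure.get();
    }

    public static void main(String[] args) {
        System.out.println(accessFromOtherThread());
    }
}
```

On JDK 22+ the other thread should see a WrongThreadException rather than reading the byte.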
There is also an Arena which can be shared across threads:
try (var arena = Arena.openShared()) {
    var m = MemorySegment.allocateNative(6L * 1024 * 1024 * 1024, arena.scope());
    // use the memory, it can be accessed from other threads.
} // segment region deallocated here
In my opinion it is better than the global and auto scopes.
However, it does involve more JVM machinery than the openConfined() scope:
the JVM ensures that any concurrent thread accessing this memory gets a
proper exception if the memory gets closed, and doesn't segfault or wander
off into undefined behavior.
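The "proper exception instead of a segfault" guarantee can be sketched in a few lines. This is an illustration only, written against JDK 22 or newer (an assumption: there the shared arena is created with Arena.ofShared() and allocation goes through the arena itself). It only shows the simple single-threaded case of a use after close; the names are hypothetical:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class ClosedArenaDemo {
    // Returns true if reading a segment after its arena was closed is rejected.
    static boolean failsAfterClose() {
        MemorySegment segment;
        try (Arena arena = Arena.ofShared()) {
            segment = arena.allocate(ValueLayout.JAVA_LONG); // 8 bytes, suitably aligned
            segment.set(ValueLayout.JAVA_LONG, 0, 123L);
        } // native memory is released here

        try {
            segment.get(ValueLayout.JAVA_LONG, 0); // use after free, caught by the JVM
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAfterClose());
    }
}
```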
Reading/Writing
Once you have a MemorySegment, you can read and write values to it like so:
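MemorySegment m = ...;
long offset = ...; // byte offset into the segment
m.set(ValueLayout.JAVA_INT, offset, 42); // write an int
int aInt = m.get(ValueLayout.JAVA_INT, offset); // read it back
long aLong = m.get(ValueLayout.JAVA_LONG, offset);
// etc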
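A self-contained round trip might look like the sketch below. It assumes JDK 22 or newer, where the API is final and the arena allocates directly (arena.allocate(size, alignment) rather than MemorySegment.allocateNative). Class and method names are invented for the example:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class ReadWriteDemo {
    // Writes an int at offset 0 and a long at offset 8, then reads both back.
    static long roundTrip() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment m = arena.allocate(16, 8); // 16 bytes, 8-byte aligned
            m.set(ValueLayout.JAVA_INT, 0, 42);
            m.set(ValueLayout.JAVA_LONG, 8, 1_000_000_000_000L);
            int aInt = m.get(ValueLayout.JAVA_INT, 0);
            long aLong = m.get(ValueLayout.JAVA_LONG, 8);
            return aInt + aLong;
        }
    }

    // Returns true if an access past the end of the segment is rejected.
    static boolean outOfBoundsIsRejected() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment m = arena.allocate(16, 8);
            try {
                m.get(ValueLayout.JAVA_INT, 16); // bytes 16..19 do not exist
                return false;
            } catch (IndexOutOfBoundsException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());             // 1000000000042
        System.out.println(outOfBoundsIsRejected()); // true
    }
}
```

Offsets are always in bytes, regardless of the layout being read or written.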
So far the API looks like ByteBuffers made great again.
However, the API, as the name says, is mostly concerned with memory.
So, by default it uses the underlying machine's endianness.
If you need a specific endianness, for example because your memory actually backs a file or a network buffer,
then you have to specify it on writes/reads:
var INT_LITTLE_ENDIAN = ValueLayout.JAVA_INT.withOrder(ByteOrder.LITTLE_ENDIAN);
int aInt = m.get(INT_LITTLE_ENDIAN, 0);
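To see what the byte order actually changes, the following sketch writes an int big-endian and reads the same four bytes back little-endian. It is written for JDK 22 or newer as an assumption (final API, Arena.ofConfined() and arena.allocate(layout)); withOrder itself is used as in the snippet above. The class name and constants are hypothetical:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;

class EndiannessDemo {
    static final ValueLayout.OfInt INT_BIG_ENDIAN =
            ValueLayout.JAVA_INT.withOrder(ByteOrder.BIG_ENDIAN);
    static final ValueLayout.OfInt INT_LITTLE_ENDIAN =
            ValueLayout.JAVA_INT.withOrder(ByteOrder.LITTLE_ENDIAN);

    // Stores the value big-endian, then interprets the same bytes as little-endian.
    static int writeBigReadLittle(int value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment m = arena.allocate(ValueLayout.JAVA_INT);
            m.set(INT_BIG_ENDIAN, 0, value);
            return m.get(INT_LITTLE_ENDIAN, 0);
        }
    }

    public static void main(String[] args) {
        int swapped = writeBigReadLittle(0x01020304);
        System.out.println(Integer.toHexString(swapped)); // 4030201
    }
}
```

The result is the byte-swapped value, the same as Integer.reverseBytes would give, and it is independent of the machine the code runs on because both accesses name their order explicitly.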
Alignment
Another thing from memory land: if you run this code, it will throw:
int aInt = m.get(ValueLayout.JAVA_INT, 1);
// throws: java.lang.IllegalArgumentException: Misaligned access at address
By default, access to a MemorySegment needs to be aligned to the size of the value. If you want unaligned access, you need to specify that:
// Start from JAVA_INT (not JAVA_BYTE) and relax its alignment to 8 bits, i.e. one byte.
var INT_BYTE_ALIGNED = ValueLayout.JAVA_INT.withBitAlignment(8);
int aInt = m.get(INT_BYTE_ALIGNED, 1);
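Both behaviors can be checked side by side. The sketch assumes JDK 22 or newer, where the API is final and, unlike the preview used in this post, alignment is given in bytes and a ready-made ValueLayout.JAVA_INT_UNALIGNED constant exists. Class and method names are made up:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class AlignmentDemo {
    // Returns true if an aligned int read at an odd offset is rejected.
    static boolean alignedReadAtOddOffsetFails() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment m = arena.allocate(16, 8); // base address is 8-byte aligned
            try {
                m.get(ValueLayout.JAVA_INT, 1);      // base + 1 is not 4-byte aligned
                return false;
            } catch (IllegalArgumentException e) {
                return true;
            }
        }
    }

    // Writes and reads an int at offset 1 using a layout without alignment constraint.
    static int unalignedRoundTrip(int value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment m = arena.allocate(16, 8);
            m.set(ValueLayout.JAVA_INT_UNALIGNED, 1, value);
            return m.get(ValueLayout.JAVA_INT_UNALIGNED, 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(alignedReadAtOddOffsetFails()); // true
        System.out.println(unalignedRoundTrip(42));        // 42
    }
}
```

The segment is allocated with an explicit 8-byte alignment so that the odd offset, and not a lucky or unlucky base address, decides whether the access is aligned.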
Explore: There is More!
Note that there is more to this API. I didn't experiment with these parts yet:
- Specify memory layouts (think C structs & unions).
- Call into C libraries, including C functions calling back into Java.
- Tooling to create bindings from C headers.