May 15, 2023

Java: Native Memory Handling is Getting Easier

JDK 19, 20 and 21 have a new preview API called 'Foreign Function & Memory API' that makes interacting with native memory and native libraries way easier. Before, you had to use JNI, JNA or JNR to interact with native libraries.

Further, interacting with 'native' memory was clunky. You either used ByteBuffers, which have a dated API, a maximum addressable size of 2 GByte, and rely on the GC for 'freeing' unless you do ugly reflection hacks. Or you went down the Unsafe route.

Anyway, if your project allows it, I would heavily recommend the new APIs. If it is a project you deploy in your backend, you can control the JDK. If it is a new library, I would still consider starting with the new APIs, so that the library can be kick-ass by the time it is mature.

Figure 1. Memory Hunger

Hello MemorySegment

Anyway, here is the first snippet: we allocate some native memory and write hello world to it:

        var helloWorld = "Hello World";
        // Size in UTF-8 bytes (not chars) + 1 for the 0 byte terminator.
        // Needs java.nio.charset.StandardCharsets.
        var bufferLen = helloWorld.getBytes(StandardCharsets.UTF_8).length + 1;
        var m = MemorySegment.allocateNative(bufferLen, SegmentScope.auto());
        // this writes a C-style string with a 0 byte terminator
        m.setUtf8String(0, helloWorld);
        // reads a C-style string with a 0 byte terminator
        String readBackString = m.getUtf8String(0);
        System.out.println(readBackString);

Ok, this doesn’t look that exciting, but note:

  1. All indexes are longs, no more 2 GByte limit! Use those 100s of GBytes of memory in a server or memory map large files.

  2. No stateful .flip() like in ByteBuffers. It acts like a plain section of memory.

  3. There are bounds checks enforced.
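
To make the contrast concrete, here is a small sketch of the ByteBuffer behavior the list above is comparing against. It deliberately uses only the old java.nio.ByteBuffer API, so it runs on any JDK without preview flags. The class and method names (ByteBufferQuirks and friends) are made up for this illustration.

```java
import java.nio.ByteBuffer;

// Illustration only: shows the ByteBuffer quirks that MemorySegment avoids.
final class ByteBufferQuirks {

    // After a relative put, the buffer's internal cursor has moved.
    static int positionAfterPutInt() {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        buffer.putInt(42);
        return buffer.position(); // the cursor now sits behind the 4 written bytes
    }

    // To read back what was just written, the buffer must be flipped first.
    static int writeThenRead(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        buffer.putInt(value);   // position moves from 0 to 4
        buffer.flip();          // limit = 4, position = 0: switch to "reading"
        return buffer.getInt(); // reads the value written above
    }

    // allocateDirect(int) takes an int capacity, so this is the hard upper bound.
    static long maxBufferBytes() {
        return Integer.MAX_VALUE; // 2^31 - 1 bytes, just under 2 GByte
    }

    public static void main(String[] args) {
        System.out.println(positionAfterPutInt());
        System.out.println(writeThenRead(42));
        System.out.println(maxBufferBytes());
    }
}
```

With a MemorySegment there is no cursor to keep track of: every access names its offset explicitly, and that offset is a long.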

Free the memory

How does the allocated memory get freed? In the above example, it's freed by the GC. Are we right back to the issues with ByteBuffer, where you do not know when memory is freed? Nope, we have more control this time. allocateNative takes a scope:

// Freed by the GC (worst case never!)
var gcManaged = MemorySegment.allocateNative(bufferLen, SegmentScope.auto());
// Never freed, until the JVM stops
var neverFreed = MemorySegment.allocateNative(bufferLen, SegmentScope.global());

For more control, you can use an Arena, which gives the memory an explicit lifetime. I think this is what you should do whenever possible.

try (var arena = Arena.openConfined()) {
    var m = MemorySegment.allocateNative(6L * 1024 * 1024 * 1024, arena.scope());

    // use the memory
} // segment region deallocated here

However, this has one major limitation: the memory segment can only be accessed from the thread which created it. This ensures that when the scope closes, no other thread is looking at that memory.

try (var arena = Arena.openShared()) {
    var m = MemorySegment.allocateNative(6L * 1024 * 1024 * 1024, arena.scope());

    // use the memory, it can be accessed from other threads.
} // segment region deallocated here

The Arena above, opened with openShared(), can be shared across threads. In my opinion it is better than the global and auto scopes. However, it does involve more JVM machinery than the openConfined() variant: the JVM ensures that any concurrent thread accessing this memory gets a proper exception if the arena gets closed, and doesn’t segfault or wander off into undefined behavior.
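
To make the confinement rule concrete, here is a toy model of it. This is not the real Arena and allocates no native memory. ToyConfinedArena and its helper methods are invented for this sketch; it only mimics the two checks a confined arena has to make before each access: "same thread?" and "still open?". The real API reports these cases with its own exception types; the toy uses IllegalStateException for both to stay short.

```java
// Toy model only: NOT the real Arena. Names are made up for illustration.
final class ToyConfinedArena implements AutoCloseable {
    private final Thread owner = Thread.currentThread();
    private volatile boolean open = true;

    // Every (pretend) memory access would call this first.
    void checkAccess() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("accessed from a foreign thread");
        }
        if (!open) {
            throw new IllegalStateException("arena already closed");
        }
    }

    // This is what try-with-resources calls at the end of the block.
    @Override
    public void close() {
        checkAccess();
        open = false; // a real arena would free the native memory here
    }

    // Helper: returns true if the action throws IllegalStateException.
    static boolean fails(Runnable action) {
        try {
            action.run();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Helper: runs the access check on a second thread and reports whether it failed.
    static boolean failsOnOtherThread(ToyConfinedArena arena) {
        boolean[] failed = new boolean[1];
        Thread other = new Thread(() -> failed[0] = fails(arena::checkAccess));
        other.start();
        try {
            other.join(); // join makes the write to failed[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failed[0];
    }

    public static void main(String[] args) {
        ToyConfinedArena arena = new ToyConfinedArena();
        System.out.println(fails(arena::checkAccess)); // false: owner thread, still open
        System.out.println(failsOnOtherThread(arena)); // true: foreign thread
        arena.close();
        System.out.println(fails(arena::checkAccess)); // true: closed
    }
}
```

Because only one thread can ever touch the memory, the confined check is a cheap comparison. A shared arena cannot rely on that, which is where the extra JVM machinery comes in.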

Reading/Writing

Once you have a MemorySegment, you can read and write values to it like so:

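MemorySegment m = ...;
long offset = ...; // byte offset, aligned for the type being accessed
int aInt = m.get(ValueLayout.JAVA_INT, offset);
long aLong = m.get(ValueLayout.JAVA_LONG, offset);
// writing works the same way
m.set(ValueLayout.JAVA_INT, offset, 42);
// etc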

So far, the API looks like ByteBuffers made great again. However, the API, as the name says, is mostly concerned with memory. So, by default it uses the underlying machine's endianness. If you need a specific endianness, for example because your memory actually backs a file or network data, then you have to specify it on reads/writes:

var INT_LITTLE_ENDIAN = ValueLayout.JAVA_INT.withOrder(ByteOrder.LITTLE_ENDIAN);
int aInt = m.get(INT_LITTLE_ENDIAN, 0);
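
If you wonder what the byte order actually changes, this sketch lays out the same int in both orders. It uses a plain ByteBuffer as a stand-in so it runs on any JDK without preview flags; ByteOrder is the same class that withOrder() takes. EndiannessDemo and encode() are names I made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Illustration only: shows how one int is laid out in memory per byte order.
final class EndiannessDemo {

    // Returns the 4 bytes of `value` as they would sit in memory for `order`.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        // What the machine uses by default; depends on where this runs.
        System.out.println("native order:  " + ByteOrder.nativeOrder());
        // Least significant byte first.
        System.out.println("little endian: " + Arrays.toString(encode(value, ByteOrder.LITTLE_ENDIAN)));
        // Most significant byte first.
        System.out.println("big endian:    " + Arrays.toString(encode(value, ByteOrder.BIG_ENDIAN)));
    }
}
```

A file written on one machine and read with the native order on another only round-trips if both happen to agree, which is why a fixed order is the safer choice for anything that leaves the process.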

Alignment

Another thing from memory land: if you run this code, it will throw:

int aInt = m.get(ValueLayout.JAVA_INT, 1);

// throws: java.lang.IllegalArgumentException: Misaligned access at address

By default, access to MemorySegment needs to be aligned. If you want unaligned access, you need to specify that:

// an int layout that only requires 1 byte (8 bit) alignment
var INT_BYTE_ALIGNED = ValueLayout.JAVA_INT.withBitAlignment(8);
int aInt = m.get(INT_BYTE_ALIGNED, 1);
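
Conceptually, the alignment rule boils down to "the accessed address must be a multiple of the layout's alignment". Here is a minimal sketch of that check. AlignmentCheck and isAligned() are hypothetical names, and the check inside the JDK may differ in detail. It assumes the alignment is a power of two and that the segment itself starts at an aligned address, so the offset alone decides.

```java
// Illustration only: the arithmetic behind an alignment check.
final class AlignmentCheck {

    // True if `address` is a multiple of `byteAlignment`.
    // Assumes byteAlignment is a power of two (1, 2, 4, 8, ...).
    static boolean isAligned(long address, long byteAlignment) {
        return (address & (byteAlignment - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isAligned(0, 4)); // true: offset 0 works for a 4 byte int
        System.out.println(isAligned(1, 4)); // false: offset 1 is the misaligned case
        System.out.println(isAligned(1, 1)); // true: with 1 byte alignment any offset works
    }
}
```

This is why lowering the layout's alignment to 8 bits makes the access at offset 1 legal: every address is a multiple of one byte.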

Explore, there is more!

Note that there is more to this API. I haven’t experimented with these parts yet:

  - Specify memory layouts (think C structs & unions).

  - Call into C libraries, including C functions calling back into Java.

  - Tooling to create bindings from C headers.

Tags: Java Development