I went into this talk expecting a talk on Remote Function Calls.
https://www.youtube.com/watch?v=kUFysMkMS00
This talk, however, covers JEP-442 (the Foreign Function & Memory API), which offers a better alternative to JNI (Java Native Interface).
I had never directly worked with JNI before watching this talk. Therefore, it was outside my comfort zone and challenging while, at the same time, interesting to learn about the current limitations and future improvements.
I will most likely never have to interact with native code from Java. However, it is useful knowledge to have in my back pocket should the need arise.
I have no doubt that third parties will create libraries that simplify the FFM API further by adding syntactic sugar.
The Current Solution - JNI
The current solution to calling libraries in another language is implementing a JNI adapter.
Java → JNI → Native Library
This may seem like a simple solution. However, implementing a JNI adapter involves implementing -
- A header with the function definition
- An implementation of the function that calls the native library
- A Java class containing the method that matches the function
The above approach is expensive to maintain, while data transfer to and from the JNI method is cumbersome.
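To make the three pieces above concrete, here is a minimal sketch of what the Java side of such an adapter might look like. The class and method names (PointMath, distance) are made up for illustration and do not come from the talk; the header and the C implementation are only described in comments because they need a C toolchain.

```java
// Hypothetical JNI adapter; every name in this sketch is an assumption.
class PointMath {
    // 1. The header (generated with `javac -h`) would declare:
    //    JNIEXPORT jdouble JNICALL Java_PointMath_distance(JNIEnv *, jclass, jdouble, jdouble);
    // 2. A C implementation of that function would call the native library.
    // 3. The Java class declares a native method that must match the C function.
    static native double distance(double x, double y);

    // Returns true when the native half of the adapter cannot be found.
    static boolean nativeSideMissing() {
        try {
            distance(3d, 4d);
            return false;
        } catch (UnsatisfiedLinkError e) {
            // No library was loaded with System.loadLibrary, so linking fails.
            return true;
        }
    }
}
```

If any of the three pieces drifts out of sync, the mismatch only shows up at run time, which is part of why this approach is expensive to maintain.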
Data Transfer
In the current solution, data can be transferred via direct buffers (direct ByteBuffers), which wrap off-heap memory that the C/C++ code accesses through JNI functions.
Unfortunately, there is no way to explicitly free or unmap this memory, and a single buffer is limited to 2 GB because it is addressed with int indices.
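As a small illustration of the direct buffer approach, the sketch below stores a two-double point off-heap using the standard java.nio API. The class and method names are my own and only serve the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class DirectBufferDemo {
    // Writes a point (x, y) into off-heap memory and reads it back.
    static double[] roundTrip(double x, double y) {
        // The capacity parameter is an int, which is where the 2 GB ceiling comes from.
        ByteBuffer point = ByteBuffer.allocateDirect(8 * 2).order(ByteOrder.nativeOrder());
        point.putDouble(0, x);
        point.putDouble(8, y);
        // There is no free(): the memory is released only after the buffer is garbage collected.
        return new double[] { point.getDouble(0), point.getDouble(8) };
    }

    static boolean isOffHeap() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }
}
```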
Frameworks
Several frameworks provide syntactic sugar around the development of JNI functions. However, these frameworks have limitations and can make deployments more convoluted.
JEP-442 Foreign Function & Memory API
Memory API - Accessing Native Memory
Memory Segment
JEP-442 introduces MemorySegment, a construct providing access to a contiguous region of memory. A segment can be backed by two types of memory -
- Heap - This is memory inside the Java Heap
- Native - This is memory outside the Java Heap
A memory segment will have the following characteristics -
- Size - every segment has a defined size, so out-of-bounds access is rejected
- Lifetime - once the memory has been freed, the segment is no longer accessible
- Confinement - optionally, a segment can be restricted to a single thread, which prevents data races
MemorySegment point = Arena.ofAuto().allocate(8 * 2);
point.set(ValueLayout.JAVA_DOUBLE, 0, 3d);
point.set(ValueLayout.JAVA_DOUBLE, 8, 4d);
point.get(ValueLayout.JAVA_DOUBLE, 16); // ERROR - SINCE OUT OF BOUNDS
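The sketch below checks the size characteristic and the heap/native distinction for myself. It assumes JDK 22 or later, where the FFM API is final; the class and method names are mine.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class SegmentDemo {
    // Returns true if reading past the end of a 16-byte segment is rejected.
    static boolean outOfBoundsRejected() {
        // 16 bytes, 8-byte aligned, so that doubles can be read at offsets 0 and 8.
        MemorySegment point = Arena.ofAuto().allocate(8 * 2, 8);
        try {
            point.get(ValueLayout.JAVA_DOUBLE, 16);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // A segment backed by a Java array lives inside the Java heap.
    static boolean arraySegmentIsNative() {
        return MemorySegment.ofArray(new double[2]).isNative();
    }

    // A segment allocated by an arena lives outside the Java heap.
    static boolean arenaSegmentIsNative() {
        return Arena.ofAuto().allocate(16, 8).isNative();
    }
}
```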
Challenges of Managing Memory in Java
Managing memory in Java involves a tradeoff between flexibility and safety, which the presenter illustrates with a diagram. The flexibility provided by a language like C comes at the price of safety. Similarly, the safety provided by a language like Rust comes at the cost of flexibility.
Java does an excellent job of balancing the two via its garbage collector for on-heap memory.
Managing off-heap memory, however, does have challenges -
- An on-heap Java buffer can point to a big chunk of off-heap memory
- To clean up memory, the garbage collector has to construct reachability graphs. This is expensive for off-heap memory deallocation
- The garbage collector cannot track whether off-heap memory is still in use by native code
Arena
To address the challenges listed in the previous section, JEP-442 introduces the concept of an Arena.
An Arena is a construct used to model the lifecycle of one or more MemorySegment instances.
There are four types -
- Global - This has an unbounded lifetime (its memory is never freed) and can be accessed by multiple threads.
- Automatic - This has a lifetime managed by the garbage collector and can be accessed by multiple threads.
- Confined - This has an explicit bounded lifetime and is only used by a single thread.
- Shared - This has an explicit bounded lifetime but can be accessed by multiple threads.
MemorySegment leakedPoint = null;
try (Arena offHeap = Arena.ofConfined()) {
    MemorySegment point = leakedPoint = offHeap.allocate(8 * 2);
    ...
} // At this point, the Arena is closed and its memory is freed
leakedPoint.get(ValueLayout.JAVA_DOUBLE, 0); // ERROR - no use after free
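To see the lifetime and confinement rules in action, I put together the following sketch. It assumes JDK 22 or later; the class and method names are my own.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class ArenaDemo {
    // Returns true if a segment can no longer be read once its arena is closed.
    static boolean useAfterCloseRejected() {
        MemorySegment leaked;
        try (Arena arena = Arena.ofConfined()) {
            leaked = arena.allocate(16, 8);
            leaked.set(ValueLayout.JAVA_DOUBLE, 0, 3d);
        }
        try {
            leaked.get(ValueLayout.JAVA_DOUBLE, 0);
            return false;
        } catch (IllegalStateException e) {
            return true; // the arena is closed, so the segment is no longer alive
        }
    }

    // Reads a segment of the given arena from a second thread, then closes the arena.
    // Returns true if that second thread was refused access.
    static boolean otherThreadRejected(Arena arena) {
        try (arena) {
            MemorySegment segment = arena.allocate(16, 8);
            boolean[] rejected = { false };
            Thread reader = new Thread(() -> {
                try {
                    segment.get(ValueLayout.JAVA_DOUBLE, 0);
                } catch (WrongThreadException e) {
                    rejected[0] = true;
                }
            });
            reader.start();
            reader.join();
            return rejected[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Passing Arena.ofConfined() to otherThreadRejected should report a rejection, while Arena.ofShared() should not.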
VarHandle
In the section on Memory Segments, we saw that we had to specify byte offsets (0 and 8) by hand when setting the x and y values of the point.
JEP-442 makes use of VarHandle, which I think of as akin to a Function.
A MemoryLayout defines the properties of an object and usually maps one to one with a structure in the native code (for example, a C struct).
Then, a VarHandle can be used to define a handler for a property in the MemoryLayout.
This handler makes it easier to interact with a MemorySegment without having to work out the offset of each property, as shown in the example below -
MemoryLayout POINT_2D = MemoryLayout.structLayout(
    ValueLayout.JAVA_DOUBLE.withName("x"),
    ValueLayout.JAVA_DOUBLE.withName("y")
);
// groupElement selects a named member of the struct layout
VarHandle xHandle = POINT_2D.varHandle(PathElement.groupElement("x"));
VarHandle yHandle = POINT_2D.varHandle(PathElement.groupElement("y"));
try (Arena offHeap = Arena.ofConfined()) {
    MemorySegment point = offHeap.allocate(POINT_2D);
    // 0L is the base offset of the struct within the segment
    xHandle.set(point, 0L, 3d);
    yHandle.set(point, 0L, 4d);
}
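As a quick check that the handle really hides the manual offset, the sketch below writes y through a VarHandle and reads it back with the raw offset. It assumes JDK 22 or later, where the var handle takes an extra base offset argument; the class and method names are mine.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

class PointLayoutDemo {
    static final MemoryLayout POINT_2D = MemoryLayout.structLayout(
            ValueLayout.JAVA_DOUBLE.withName("x"),
            ValueLayout.JAVA_DOUBLE.withName("y"));

    static final VarHandle Y_HANDLE = POINT_2D.varHandle(PathElement.groupElement("y"));

    // The layout knows where y lives, so the offset never has to be hard-coded.
    static long yOffset() {
        return POINT_2D.byteOffset(PathElement.groupElement("y"));
    }

    // Writes y through the VarHandle, then reads it back using the raw byte offset.
    static double writeThenReadY(double value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT_2D);
            Y_HANDLE.set(point, 0L, value);
            return point.get(ValueLayout.JAVA_DOUBLE, yOffset());
        }
    }
}
```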
Foreign Function Access
Foreign Function access builds on the new memory concepts introduced in the previous section.
Method Handles
Downcalls - Using Java to invoke an external function
In the following example, a MethodHandle (very much like a VarHandle) is used to obtain a reference to an external function -
MemorySegment distanceAddress = SymbolLookup.loaderLookup()
    .find("distance").orElseThrow(); // find returns an Optional
MethodHandle distanceHandle = Linker.nativeLinker().downcallHandle(
    distanceAddress,
    FunctionDescriptor.of(JAVA_DOUBLE, POINT_2D));
This reference can then be used to invoke the external function (within an Arena) -
double dist = (double) distanceHandle.invokeExact(point); // invokeExact needs the cast to match the descriptor's return type
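Since the distance function above lives in a library I do not have, here is a self-contained downcall sketch that calls strlen from the C standard library instead. It assumes JDK 22 or later and a 64-bit platform (size_t is modelled as JAVA_LONG); the class and method names are mine.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

class StrlenDemo {
    // Calls the C function `size_t strlen(const char *)` on the given text.
    static long strlen(String text) {
        Linker linker = Linker.nativeLinker();
        MemorySegment address = linker.defaultLookup().find("strlen").orElseThrow();
        MethodHandle handle = linker.downcallHandle(
                address,
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        try (Arena arena = Arena.ofConfined()) {
            // Copies the text off-heap as a NUL-terminated UTF-8 string.
            MemorySegment cString = arena.allocateFrom(text);
            return (long) handle.invokeExact(cString);
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable.
            throw new IllegalStateException(t);
        }
    }
}
```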
The SymbolLookup shown in the example above comes in three flavors -
- LoaderLookup - searches libraries loaded by the current class loader
- DefaultLookup - searches the standard libraries associated with the native linker
- LibraryLookup - searches a library whose lifecycle is managed by an arena
SymbolLookup also lets you chain the above three lookups via composition.
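A minimal sketch of such a composition, assuming JDK 22 or later: the loader lookup is tried first and the default lookup acts as a fallback. The class and method names are mine.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.SymbolLookup;

class LookupDemo {
    // Tries libraries loaded by the current class loader first, then the standard libraries.
    static SymbolLookup chained() {
        return SymbolLookup.loaderLookup().or(Linker.nativeLinker().defaultLookup());
    }

    static boolean found(String symbolName) {
        return chained().find(symbolName).isPresent();
    }
}
```

With no native library loaded by the class loader, a lookup of strlen falls through to the default lookup.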
The native linker shown in the example above also provides a helper method, canonicalLayouts(), that exposes the canonical layouts for common C types.
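A small sketch of how the canonical layouts can be queried, assuming JDK 22 or later. The class and method names are mine, and the method assumes the requested C type name is present in the map.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;

class CanonicalLayoutDemo {
    // Looks up the layout the native linker uses for a C type name such as "int" or "size_t".
    static long sizeOf(String cTypeName) {
        MemoryLayout layout = Linker.nativeLinker().canonicalLayouts().get(cTypeName);
        return layout.byteSize();
    }
}
```

This avoids hard-coding platform-dependent sizes, for example for long or size_t.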
reinterpret
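A pointer returned from native code (for example, the result of malloc) arrives in Java as a zero-length MemorySegment, because the API cannot know how much memory it points to. The reinterpret method on a MemorySegment lets you give that segment its real size, and optionally an arena and a cleanup action, so that it references all the allocated memory.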
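MemorySegment point = (MemorySegment) makePointHandle.invokeExact();
// reinterpret takes the new size in bytes; real code must wrap invokeExact in a try/catch, since it throws Throwable
point = point.reinterpret(POINT_2D.byteSize(), offHeap, ms -> freePointHandle.invokeExact(ms));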
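The makePoint and freePoint functions above belong to a library I do not have, so the sketch below uses malloc and free from the C standard library to show the same idea end to end. It assumes JDK 22 or later and a 64-bit platform (size_t is modelled as JAVA_LONG); the class, record, and method names are mine.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

class ReinterpretDemo {
    record Result(long sizeBefore, long sizeAfter, double y) {}

    static final Linker LINKER = Linker.nativeLinker();
    static final MethodHandle MALLOC = LINKER.downcallHandle(
            LINKER.defaultLookup().find("malloc").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG));
    static final MethodHandle FREE = LINKER.downcallHandle(
            LINKER.defaultLookup().find("free").orElseThrow(),
            FunctionDescriptor.ofVoid(ValueLayout.ADDRESS));

    // Cleanup action: hands the memory back to the C allocator.
    static void free(MemorySegment segment) {
        try {
            FREE.invokeExact(segment);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static Result demo() {
        try (Arena arena = Arena.ofConfined()) {
            // malloc returns a bare pointer, so Java sees a zero-length segment.
            MemorySegment raw = (MemorySegment) MALLOC.invokeExact(16L);
            long sizeBefore = raw.byteSize();
            // Attach the real size, tie the lifetime to the arena, and free on close.
            MemorySegment point = raw.reinterpret(16, arena, ReinterpretDemo::free);
            point.set(ValueLayout.JAVA_DOUBLE, 8, 4d);
            return new Result(sizeBefore, point.byteSize(), point.get(ValueLayout.JAVA_DOUBLE, 8));
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```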
Upcalls - Passing Java code to an external function
The example in the talk showed how a Java method that compares ints can be passed as a function pointer to an external function.
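I did not copy the code from the talk, so the following is my own sketch of the idea using qsort from the C standard library, which takes a comparator function pointer. It assumes JDK 22 or later and a 64-bit platform (size_t is modelled as JAVA_LONG); the class and method names are mine.

```java
import java.lang.foreign.AddressLayout;
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class QsortDemo {
    // The Java code that the native qsort calls back into.
    static int compareInts(MemorySegment a, MemorySegment b) {
        return Integer.compare(a.get(ValueLayout.JAVA_INT, 0), b.get(ValueLayout.JAVA_INT, 0));
    }

    static int[] sort(int... values) {
        Linker linker = Linker.nativeLinker();
        try (Arena arena = Arena.ofConfined()) {
            // void qsort(void *base, size_t count, size_t size, int (*compare)(const void *, const void *))
            MethodHandle qsort = linker.downcallHandle(
                    linker.defaultLookup().find("qsort").orElseThrow(),
                    FunctionDescriptor.ofVoid(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG,
                            ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
            MethodHandle comparator = MethodHandles.lookup().findStatic(
                    QsortDemo.class, "compareInts",
                    MethodType.methodType(int.class, MemorySegment.class, MemorySegment.class));
            // Each pointer handed to the comparator points at a single int.
            AddressLayout intPointer = ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT);
            // The upcall stub is the function pointer; it stays valid until the arena closes.
            MemorySegment comparatorPointer = linker.upcallStub(
                    comparator,
                    FunctionDescriptor.of(ValueLayout.JAVA_INT, intPointer, intPointer),
                    arena);
            MemorySegment array = arena.allocateFrom(ValueLayout.JAVA_INT, values);
            qsort.invokeExact(array, (long) values.length,
                    ValueLayout.JAVA_INT.byteSize(), comparatorPointer);
            return array.toArray(ValueLayout.JAVA_INT);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```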
jextract
jextract --output classes --target-package org.stdlib /usr/include/stdlib.h
jextract is a tool that generates the FFM (Foreign Function & Memory) handles and layouts for a library when given its header file.
Warnings
The assumption is that all calls to foreign functions or native methods are unsafe. Thus, these are currently reported as warnings at run time. The warnings can be suppressed by specifying which modules are allowed to access native functions (via the --enable-native-access command-line option). In later releases of Java, these warnings are expected to become errors.