I'm trying to use jnim to compile a shared library which can be used from Java via JNI. I have a minimal example in this debug branch.
On the Java side the code is:
public class NativeTest {
  public native static void run();
  public static void main(String[] args) {
    System.loadLibrary("nativetest");
    NativeTest.run();
  }
}
On the Nim side it is just:
import jnim
proc Java_NativeTest_run*(env: JNIEnvPtr, obj: jobject) {. cdecl, exportc, dynlib .} =
  system.setupForeignThreadGc()
  echo "Printing from JNI..."
In general this works nicely, but unfortunately I'm getting random segfaults, with a <5% probability in this example.
I'm compiling it using nim --app:lib -o:libnativetest.so --threads:on c native.nim. In my first attempt I omitted the system.setupForeignThreadGc() call (and did not enable threads). @yglukhov suggested that something might be wrong with the TLS initialization, so adding that call was the first thing that came to mind.
To rule out that it is a general JVM bug, I was doing the same using a C implementation. This does not seem to cause segfaults, so I think I might be doing something wrong here. Any ideas how to fix this?
The gdb stacktrace isn't particularly helpful to me either:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fc2700 (LWP 14740)]
0x00007fffe10002b4 in ?? ()
(gdb) where
#0 0x00007fffe10002b4 in ?? ()
#1 0x0000000000000246 in ?? ()
#2 0x00007fffe1000160 in ?? ()
#3 0x00007ffff7392bf0 in VM_Operation::_names () from /home/fabian/bin/jdk1.8.0_74/jre/lib/amd64/server/libjvm.so
#4 0x00007ffff7fc1980 in ?? ()
#5 0x00007ffff6ec5d2d in VM_Version::get_processor_features() () from /home/fabian/bin/jdk1.8.0_74/jre/lib/amd64/server/libjvm.so
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
1 events.c: No such file or directory.
I have never tried jnim until now, but I can see that your method run is static, while your implementation still has an object argument. As far as I know, only non-static methods receive a this argument (for a static native method, JNI passes the class as a jclass in that position instead).
I don't know how you arrived at the interface you wrote in Nim, but I would create a pipeline that generates a C header from the Java class via javah and then processes that header with c2nim, so that you do not get this type of error.
Actually I was doing that; I only made a mistake while simplifying the problem for demonstration purposes. On the master branch I have set up the whole SBT tool chain using the sbt-jni plugin, which comes with a javah interface for Scala. In my original version I obtained the correct signature from that. By the way, I'm actually pretty happy with how well Nim integrates with the Scala+SBT world.
I just verified: using the correct signature for the simplified example does not make a difference.
@Krux02: Everything prepared here: https://github.com/bluenote10/JniScalaNimExample
The master branch contains an SBT + Scala setup, including a customized build task (coupled to "compile") and source monitoring so that ~test / ~compile / ~run are fully functional. The debug branch is the minimal example (in Java for better reproducibility) which does segfault for me. However, @yglukhov has already confirmed that he is not getting segfaults on macOS with this example.
@Araq: That was also a suggestion by @yglukhov, but unfortunately it also crashes. Any pointers in which direction you would continue testing?
Hi, I wrote JNI sample code. Try this.
import jnim
import dynlib
proc Java_HelloJNI_hello*(env: JNIEnvPtr, me:jobject, arg:jstring ): jstring {.cdecl,exportc,dynlib.} =
  # Initialize the Nim runtime on the first call; the JVM does not call NimMain itself.
  if theEnv == nil :
    {.emit: """
    NimMain();
    """.}
    theEnv = env
  # Borrow the UTF-8 bytes of the Java string and release them on exit.
  var argNim:cstring = env.GetStringUTFChars(env,arg,nil)
  defer:
    env.ReleaseStringUTFChars(env,arg,argNim)
  var ss = "Hello " & $argNim
  return env.NewStringUTF(env,ss)
It varies a lot: When running the Scala example from the SBT shell it is maybe 1 out of 5. For the Java example on the debug branch I'm using the run.sh which simply performs 1000 iterations and terminates on a crash. I'm usually crashing within <200 iterations, sometimes in the first 10, max was maybe 400.
When replacing the .so by the plain C library I'm reaching 1000 iterations without a crash. But I'm still not sure if it may be a JVM bug, which is only triggered by something in the Nim implementation because:
I tried the sbt task, and no crash after more than 20 times running the run task. I simply cannot reproduce the error. I would like to have a run script for that task, too.
Btw, what is your current platform?
I am on Arch Linux, and I am on the latest development branch of Nim at the time of this writing.
I'm on Ubuntu 14.04, but with a more recent JVM (1.8.0_74-b02). Nim version is the last tagged release 0.15.2.
Regarding the repeating: You could simply use that other run.sh script and replace the java command by sbt run. Or if you want to avoid the load up time: Add addSbtPlugin("com.github.tkawachi" % "sbt-repeat" % "0.0.1") to the project/plugins.sbt and use repeat 100 run in the SBT shell.
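The repeat-until-crash loop that run.sh is described as performing can also be sketched in C, which makes it easy to tell a signal death apart from an ordinary nonzero exit code. This is only a minimal sketch: run_once and first_crash are hypothetical helper names, not part of the repository, and the sketch assumes a POSIX system with fork, execvp, and waitpid.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv once in a child process.
   Returns the terminating signal number if the child was killed by a signal,
   0 if it exited on its own (regardless of exit code), -1 on fork/wait failure. */
static int run_once(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Hypothetical helper: repeats the command up to max_runs times.
   Returns the 1-based iteration of the first signal death, 0 if every run
   exited on its own, or -1 if a child could not be started or waited for.
   If signal_out is non-NULL it receives the signal number of the crash. */
static int first_crash(char *const argv[], int max_runs, int *signal_out) {
    assert(max_runs >= 0);
    for (int i = 1; i <= max_runs; i++) {
        int sig = run_once(argv);
        if (sig < 0) return -1;
        if (sig > 0) {
            if (signal_out) *signal_out = sig;
            return i;
        }
    }
    return 0;
}
```

A caller would pass the same command line that run.sh uses, for example an argv of {"java", "NativeTest", NULL} plus whatever library path option the setup needs (that option is an assumption here), and print the iteration returned by first_crash(argv, 1000, &sig).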
Just when I thought that there wouldn't be an error anymore, I got one, but the stack trace is weird:
> last compile:run
[debug] javaOptions: List(-Djava.library.path=src/native)
[info] Running scalanim.Main
java.lang.RuntimeException: Nonzero exit code returned from runner: 1
at scala.sys.package$.error(package.scala:27)
at sbt.BuildCommon$$anonfun$toError$1.apply(Defaults.scala:2128)
at sbt.BuildCommon$$anonfun$toError$1.apply(Defaults.scala:2128)
at scala.Option.foreach(Option.scala:236)
at sbt.BuildCommon$class.toError(Defaults.scala:2128)
at sbt.Defaults$.toError(Defaults.scala:39)
at sbt.Defaults$$anonfun$runTask$1$$anonfun$apply$46$$anonfun$apply$47.apply(Defaults.scala:769)
at sbt.Defaults$$anonfun$runTask$1$$anonfun$apply$46$$anonfun$apply$47.apply(Defaults.scala:767)
at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
at sbt.std.Transform$$anon$4.work(System.scala:63)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
at sbt.Execute.work(Execute.scala:237)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
It is all sbt classes; the main isn't in the stack trace at all. Maybe you know better, but to me that is very strange.
Yes, the traceback from within SBT indeed looks strange. To some degree this is a result of the forked VM. When running standalone or also with the pure Java example, there is no Java traceback at all, because the JVM segfaults straight away.
I was just stumbling over this comment:
There is one catchy situation in JNI code: when such a code blocks SIGSEGV signal e.g. because it blocks all signals (quite common approach in threaded C code how to ensure that only main thread will process signals) AND it calls 'back' Java VM (aka callback) then it can result in quite random SIGSEGV-triggered aborting of the process. And there is virtually nothing wrong - SIGSEGV is actually triggered by Java VM in order to detect certain conditions in memory (it acts as memory barrier … etc) and it expects that such a signal will be handled by Java VM. Unfortunately when SIGSEGV is blocked, then 'standard' SIGSEGV reaction is triggered => VM process crashes.
I have no idea how signal handling works in detail and will have to read up on it, but maybe it is something along these lines: the JVM tries to "communicate memory barriers via SIGSEGV" (wtf?) and the Nim signal handler catches the signal instead. Is there an easy way to test this? Like disabling Nim's signal handler altogether?
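One way to narrow this down, before disabling anything, is to inspect the SIGSEGV state from native code: whether the signal is blocked in the calling thread, and whether a custom handler is installed. The following is a minimal sketch in C, not something taken from jnim or the JVM. The helper names are hypothetical. It uses sigprocmask, which on Linux operates on the calling thread's mask; pthread_sigmask is the portable choice in a multithreaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if SIGSEGV is blocked in the caller's signal mask. */
static bool sigsegv_is_blocked(void) {
    sigset_t current;
    sigemptyset(&current);
    /* With a NULL new set, the call only reports the current mask. */
    int rc = sigprocmask(SIG_SETMASK, NULL, &current);
    assert(rc == 0);
    (void)rc;
    return sigismember(&current, SIGSEGV) == 1;
}

/* Hypothetical helper: reports the process-wide SIGSEGV disposition as
   "default", "ignored", or "custom" (some handler function is installed). */
static const char *sigsegv_disposition(void) {
    struct sigaction sa;
    /* With a NULL new action, the call only queries the current one. */
    int rc = sigaction(SIGSEGV, NULL, &sa);
    assert(rc == 0);
    (void)rc;
    if (sa.sa_flags & SA_SIGINFO) return "custom";
    if (sa.sa_handler == SIG_DFL) return "default";
    if (sa.sa_handler == SIG_IGN) return "ignored";
    return "custom";
}

/* Hypothetical helper: prints both facts with a label, so the state can be
   compared at different points, e.g. before and after library initialization. */
static void report_sigsegv_state(const char *label) {
    printf("%s: blocked=%d disposition=%s\n",
           label, (int)sigsegv_is_blocked(), sigsegv_disposition());
}
```

Calling report_sigsegv_state from a small C JNI function before and after the Nim library has been loaded and initialized would show whether the handler changes hands. It cannot tell which component installed a custom handler, only that the disposition differs.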