Hi all,
Trying to learn Nim's FFI by importing some OpenSSL functionality into my Nim code. The code I ended up with performs well, with timings comparable to shasum. Then I went on to make it a bit more "OS neutral" by adding the proper library checks from Nim's openssl.nim, and it immediately became slow. I don't understand how that could happen, because all we do extra is some additional checks (isn't it?)
Fast: https://gist.github.com/vimal7370/6d8e4246c5a13ed53a86e3b767a33fd7
Slow: https://gist.github.com/vimal7370/2ed516b2f02218d401acb96eca456649
Can you compile both of your tests with -d:danger and with --passC:-flto ?
Just to ensure that the reason is not related to additional debugging code or to different function inlining.
Still got similar results:
$ nim c -d:ssl -d:danger --passC:-flto trysha1.nim
$ nim c -d:ssl -d:danger --passC:-flto trysha2.nim
$ time ./trysha1
24fd6a704e0d80c4b4f9a3d17ce0db23f258a8cdcfa1eb28d7803b7d1811ee96
real 0m0.618s
user 0m0.570s
sys 0m0.037s
$ time ./trysha2
24fd6a704e0d80c4b4f9a3d17ce0db23f258a8cdcfa1eb28d7803b7d1811ee96
real 0m3.422s
user 0m3.320s
sys 0m0.064s
That is strange.
Try to echo the actual const values like DLLSSLName for both cases to ensure that really in both cases the same lib version is used.
Hi Stefan,
DLLSSLName in trysha2.nim is: "(.1.1|.38|.39|.41|.43|.44|.45|.46|.47|.10|.1.0.2|.1.0.1|.1.0.0|.0.9.9|.0.9.8|)"
So I changed the following line in trysha1.nim:
{.push callconv:cdecl, dynlib:"libssl.dylib", importc.}
to
{.push callconv:cdecl, dynlib:"libssl(.1.1|.38|.39|.41|.43|.44|.45|.46|.47|.10|.1.0.2|.1.0.1|.1.0.0|.0.9.9|.0.9.8|).dylib", importc.}
and the results remained the same (trysha1: 0m0.597s, trysha2: 0m3.214s). Have I followed your instruction correctly? What else could be the reason?
What I meant was for your slow version:
echo DLLSSLName
assert(DLLSSLName == "libssl.dylib")
Another test: In your fast version you have
{.push callconv:cdecl, dynlib:"libssl.dylib", importc.}
and in your slow version
{.push callconv:cdecl, dynlib:DLLSSLName, importc.}
So replace
#{.push callconv:cdecl, dynlib:DLLSSLName, importc.}
with
{.push callconv:cdecl, dynlib:"libssl.dylib", importc.}
I assume that you ran both tests multiple times, so that one test is not loading the file from a slow medium while the next reads it from a fast buffer. And no virus checker can be involved?
Another critical point:
const sslVersion {.strdefine.}: string = ""
I have not fully understood what the strdefine does, can it have some effect for other modules? All your other const values should have no effect, you use DLLSSLName* to export that name, maybe try without export marker, but that can not really be a problem.
I think we have to wait for mratsim.
Hi Stefan,
Actually, there is no DLLSSLName involved in the fast one (trysha1.nim), as it is loading "libssl.dylib" hard-coded.
The value of DLLSSLName in the slow one (trysha2.nim) ends up being: "(.1.1|.38|.39|.41|.43|.44|.45|.46|.47|.10|.1.0.2|.1.0.1|.1.0.0|.0.9.9|.0.9.8|)"
As per your 2nd test, if I simply use:
{.push callconv:cdecl, dynlib:"libssl.dylib", importc.}
in trysha2.nim (slow one), it isn't slow anymore, and I get the exact results which I get from the fast one.
What do you think of this? Also, yes, tests were run multiple times and the results are always consistent. No virus checker involved.
You have for the slow case
{.push callconv:cdecl, dynlib:DLLSSLName, importc.}
with
DLLSSLName == "(.1.1|.38|.39|.41|.43|.44|.45|.46|.47|.10|.1.0.2|.1.0.1|.1.0.0|.0.9.9|.0.9.8|)"
Only numbers in that string? Can that work?
Well, at least the ORed numbers seem to pick a slow lib version.
The core devs may comment on that.
Hi Stefan,
> Well, at least the ORed numbers seem to pick a slow lib version.
I would also like to believe the same, but if that was true, then when I use the same DLLSSLName string in the fast version, it should become slow.
I tried that (https://forum.nim-lang.org/t/6692#41524) and the fast version still remained fast, which makes me wonder whether the remaining OS-related checks are the culprit in any way?
Thanks a lot for your help, much appreciated.
Hi Stefan,
> Well, at least the ORed numbers seem to pick a slow lib version.
You are right. I dug further to check whether I have multiple versions of OpenSSL, and indeed I do.
Using dtruss, I came to realise that the fast one is loading libssl.0.9.8.dylib, whereas the slow one loads libssl.39.dylib (as per the order in that list), and this .39 one is too slow.
All good now. It is just an older version which was the culprit.
> it is clear that this check for the presence of multiple libraries is taking almost 3 seconds
If your assumption is correct then you would get an offset in running time of about 3 seconds independent of which file you process -- a tiny one or a very large one.
My assumption is more that a slow library version is chosen, so that you get a factor of about five when you test with different file sizes.
When it is indeed an offset of 3 seconds indicating that only selecting a lib takes that much time, then core devs should really consider improving it.
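The offset-versus-factor reasoning above can be sketched as a small check. This is only an illustration: the `classify` and `near` helpers, the 20% tolerance, and the timings in the usage lines are all made up for the example and are not measurements from this thread.

```nim
type Verdict = enum
  fixedOffset, scalingFactor, unclear

proc near(a, b: float, tol = 0.2): bool =
  ## True when a and b differ by at most `tol` relative to the larger one.
  abs(a - b) <= tol * max(abs(a), abs(b))

proc classify(fastSmall, slowSmall, fastLarge, slowLarge: float): Verdict =
  ## Compare the slow-vs-fast gap and ratio at two input sizes.
  let gapSmall = slowSmall - fastSmall
  let gapLarge = slowLarge - fastLarge
  let ratioSmall = slowSmall / fastSmall
  let ratioLarge = slowLarge / fastLarge
  if near(gapSmall, gapLarge) and not near(ratioSmall, ratioLarge):
    fixedOffset      # same extra seconds regardless of size: startup cost
  elif near(ratioSmall, ratioLarge) and not near(gapSmall, gapLarge):
    scalingFactor    # same slowdown ratio regardless of size: slower hashing
  else:
    unclear

# Hypothetical timings in seconds (small file, then large file).
echo classify(0.06, 3.06, 0.6, 3.6)   # constant gap of about 3 s
echo classify(0.06, 0.33, 0.6, 3.3)   # constant ratio of about 5.5
```

Timing both binaries on a tiny file and on a large one gives the four inputs; a roughly constant gap points at library selection, a roughly constant ratio at a slower library.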
> My assumption is more that a slow library version is chosen, so that you get a factor of about five when you test with different file sizes.
You are correct. I indeed had multiple versions, and it chose a very old version of openssl library when using the below code:
{.push callconv:cdecl, dynlib:"libssl(.1.1|.38|.39|.41|.43|.44|.45|.46|.47|.10|.1.0.2|.1.0.1|.1.0.0|.0.9.9|.0.9.8|).dylib", importc.}
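To see which file names such a pattern stands for, a minimal sketch using `libCandidates` from the standard `std/dynlib` module can list them. The pattern below is shortened for readability and is not the exact string from openssl.nim. As far as I understand, the names are tried in the listed order and the first one that loads is used, so which library you get depends on what is installed on the machine.

```nim
import std/dynlib

# Shortened pattern for illustration; the list in the thread is much longer.
const pattern = "libssl(.1.1|.39|.0.9.8|).dylib"

var candidates: seq[string]
# Expand the (a|b|c) alternatives into concrete file names, in order.
libCandidates(pattern, candidates)

for name in candidates:
  echo name
```

Note the trailing empty alternative in the pattern: it yields plain `libssl.dylib` as the last candidate, which is why the hard-coded name and the pattern can end up loading different files.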
And that libssl line, which loads all versions in an incorrect but voodoo-working order, continues to haunt Nim.
To force a specific OpenSSL version, use -d:sslVersion=1.0.0
> const sslVersion {.strdefine.}: string = ""
> I have not fully understood what the strdefine does
It allows you to pass -d:sslVersion=somestring and the compiler will treat the line as if it were
const sslVersion = "somestring"
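As a small sketch of how a `strdefine` constant can drive the library name: the `libName` constant and the way the name is assembled here are assumptions for illustration, not a copy of what openssl.nim does.

```nim
# Overridable at compile time with -d:sslVersion=<value>; empty by default.
const sslVersion {.strdefine.}: string = ""

when sslVersion != "":
  # A version was forced on the command line: build one exact file name.
  const libName = "libssl." & sslVersion & ".dylib"
else:
  # No override given: fall back to the unversioned name.
  const libName = "libssl.dylib"

echo libName
```

Compiled with `nim c -d:sslVersion=1.0.0 demo.nim` (the file name is hypothetical), `libName` becomes `libssl.1.0.0.dylib`; without the define it stays `libssl.dylib`. The define only affects code that reads `sslVersion`, and the decision is made at compile time, so there is no run-time cost.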