nimforum mirror - atomics: Why is interlockedCompareExchange8 "safe"?

monster [2017-11-12T13:39:52+01:00]

Hi,

I'm trying to understand what goes on in "lib\system\atomics.nim". This is in part because I'm missing atomicLoadN/atomicStoreN on Windows, and I'm trying to work out how to implement that myself. I've just stumbled upon this declaration (atomics.nim, line #220):

 proc interlockedCompareExchange8(p: pointer; exchange, comparand: byte): byte
   {.importc: "_InterlockedCompareExchange64", header: "<intrin.h>".}

At first, I thought using _InterlockedCompareExchange64 was a bug, but then I found out that there is no _InterlockedCompareExchange8.

So, I guess exchange and comparand get cast to __int64, and the return value just gets cast to byte. So far so good.

But _InterlockedCompareExchange64 assumes p points to an __int64 value, and so will overwrite the 8 bytes at that location.

How can that not go horribly wrong?
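
A minimal sketch of the concern, not the code from atomics.nim. It assumes GCC/Clang `__atomic` builtins as stand-ins for the MSVC intrinsics, a little-endian target, and hypothetical names (`slot_t`, `cas8`, `cas8_via_cas64`) chosen only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Eight adjacent bytes. Only b[0] is the flag we mean to update; b[1..7]
   stand in for whatever happens to live next to it. The union is forced
   to 8-byte alignment so the wide access is at least aligned. */
typedef union {
    uint8_t b[8];
    _Alignas(8) uint64_t whole;
} slot_t;

/* Byte-wide CAS: reads and writes *p only. Returns the previous value,
   as the Interlocked* family does. */
uint8_t cas8(uint8_t *p, uint8_t exchange, uint8_t comparand) {
    uint8_t expected = comparand;
    __atomic_compare_exchange_n(p, &expected, exchange, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected; /* unchanged on success, current value on failure */
}

/* 64-bit CAS on the same address with the byte arguments widened to
   64 bits, imitating what the quoted declaration ends up calling. */
uint8_t cas8_via_cas64(slot_t *s, uint8_t exchange, uint8_t comparand) {
    assert(((uintptr_t)s & 7u) == 0);  /* wide CAS needs 8-byte alignment */
    uint64_t expected = comparand;     /* zero-extended: upper 7 bytes are 0 */
    __atomic_compare_exchange_n(&s->whole, &expected, (uint64_t)exchange, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return (uint8_t)expected;          /* truncated back to a byte */
}
```

Under these assumptions, the wide CAS behaves like the byte CAS only while the seven neighbouring bytes are zero. If any neighbour is non-zero, the 64-bit comparison fails and nothing is written, yet the truncated return value can still equal the comparand, so the caller cannot tell that the update was skipped. The wide access also reads seven bytes beyond the one the caller passed in.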

cdome [2017-11-12T22:40:10+01:00]

Actually, _InterlockedCompareExchange8 does exist. It was silently added in VS2012 and is not mentioned in the docs.

jwollen [2017-11-13T04:54:15+01:00]

I had various issues with the vcc version too. Just made a pull request.

Araq [2017-11-13T10:13:57+01:00]

Thank you, you rock! :-)

monster [2017-11-13T12:57:33+01:00]

@cdome Great!

@Araq So is (or was, since I assume the '8' version will be used from now on) the call safe? I cannot imagine it would have been used at all if it caused random memory overwrites. Maybe the 'compare' part of the call makes this safe? But then, why even bother offering 8/16/32 versions in the Windows API? Just because it's faster to access only as much memory as you need?

Araq [2017-11-13T15:58:04+01:00]

Can't really remember but it was likely something like "argh this needs to compile now and I know my data is aligned at an 8 byte boundary". :-)

mikra [2017-11-15T15:43:32+01:00]

Within atomics.nim the gcc API is very different from (and better than) the vcc one. A consolidation would be nice. AtomicLoad could be replaced by atomicInc or atomicDec, but unfortunately AtomicStore could not. And doing a CAS is not the same as AtomicStore. If you need two atomics for a specific operation, it's not atomic anymore :-(

monster [2017-11-15T20:25:41+01:00]

@mikra That is basically what I was trying to do; have a single API for all platforms. I could share it, once it works, but having a unified API in the standard library would surely be beneficial to many users.

On second thought, I'm not sure I want anyone to think I have any clue about the Windows atomics APIs; pthreads makes perfect sense to me, but the Windows calls are just weird. I'm just guessing what does what based on the online M$ docs.

EDIT: @mikra I finally finished typing my "unified atomics" module, and realized that Nim was actually compiling with some kind of gcc/clang under the hood. It even seems to have support for the pthread API, totally against my expectations. If the default compiler under Windows does have pthread support, maybe it would be simple for you to write your own missing atomic procs? Since I'm going to have to use vcc in the end (due to independent "technical reasons"), it doesn't actually help me.

mikra [2017-11-15T23:04:30+01:00]

@monster Agreed, the Windows API is weird :-). Unfortunately I don't have much experience with vcc (thread model and so on); I just need atomicLoad and atomicStore. Also, I don't know what happens if you use the Nim compiler (built with gcc) and then vcc to compile your app. It actually works for me.

I am now using interlockedAnd (for atomicLoad) and interlockedExchange (for atomicStore). See here (at lines 75/180). It works perfectly for me.

https://github.com/mikra01/timerpool/blob/master/timerpool.nim

Both native calls are missing from atomics.nim. What do you think?

monster [2017-11-15T23:23:32+01:00]

@mikra I can't say yet if my code will work, as I have to get Nim to use vcc first (found this thread), but my approach is somewhat different. Firstly, I tried to always use the "right size" call, by delegating to the appropriate Windows method using "when sizeof(T) == 8: ..." style code. Secondly, I also used "exchange" to replace "store", like you; I could not find anything better, but I have seen people on Stack Overflow saying you should just set it non-atomically and call a fence afterward. Maybe it works, but I didn't like that solution. Thirdly, I think "load" is better replaced by using "_InterlockedOr"; (x | 0) makes more sense to me than (x & F...).

What I still haven't understood is why both "_InterlockedOr64_acq" and "InterlockedOr64Acquire" (for example) seem to exist, doing the same thing.
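
The load and store substitutions described above can be sketched roughly as follows. This is only an illustration: it assumes GCC/Clang `__atomic_fetch_or` and `__atomic_exchange_n` as stand-ins for `_InterlockedOr` and `_InterlockedExchange`, uses C11 `_Generic` in place of the "when sizeof(T) == 8" dispatch, and every function and macro name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* "Load" as a read-modify-write that changes nothing: x | 0 == x.
   fetch_or returns the value held before the OR. */
int32_t load32_via_or(int32_t *p) {
    return __atomic_fetch_or(p, 0, __ATOMIC_SEQ_CST);
}
int64_t load64_via_or(int64_t *p) {
    return __atomic_fetch_or(p, 0, __ATOMIC_SEQ_CST);
}

/* "Store" as an exchange whose previous value is thrown away. */
void store32_via_exchange(int32_t *p, int32_t v) {
    (void)__atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
}
void store64_via_exchange(int64_t *p, int64_t v) {
    (void)__atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
}

/* Pick the right width from the pointer type, so a 32-bit variable is
   never touched with a 64-bit operation. */
#define load_via_or(p) _Generic((p), \
    int32_t *: load32_via_or,        \
    int64_t *: load64_via_or)(p)

#define store_via_exchange(p, v) _Generic((p), \
    int32_t *: store32_via_exchange,           \
    int64_t *: store64_via_exchange)((p), (v))

/* Usage: round-trip a value through both helpers. */
void roundtrip_example(void) {
    int64_t x = 0;
    store_via_exchange(&x, 42);
    assert(load_via_or(&x) == 42);
}
```

One consequence of this design is that the "load" is really a write cycle, so it only works on writable memory and is full-fence in this sketch.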

mikra [2017-11-16T08:31:47+01:00]

@monster You are absolutely right. The or-solution is much better for the atomicLoad substitution. As for your question, have a look at https://docs.microsoft.com/en-us/cpp/intrinsics/intrinsics-available-on-all-architectures ; it seems to me that the "_acq" functions are ARM-platform specific.

monster [2017-11-18T15:37:17+01:00]

@mikra Well, I'll eat a broom! M$ is too lazy to provide any kind of specific fences on x64 Windows except "full fences" (read-write). I (stupidly) wasted hours trying to replicate the pthread atomic API using functions that only exist on ARM/Itanium Windows. :( A search shows there are some pthread compatibility APIs out there for Windows, but AFAIK, none of them support the atomic part of pthreads. On the plus side, the API will be massively reduced now. Alternatively, I could just give up on C entirely; it seems M$ has support for C++11, including the atomic part.

mikra [2017-11-18T20:16:15+01:00]

@monster Yep; I also dug into the M$ docs a bit. Here is an overview (see the concurrency part): https://msdn.microsoft.com/en-us/library/hh567368.aspx

If you would like to compile with cpp, "just" include <atomic>: https://docs.microsoft.com/de-de/cpp/standard-library/atomic but I haven't tried it so far.

For me, the following proprietary M$ solution works (x64, Windows 10, VS2017 Community Edition): https://github.com/nim-lang/Nim/issues/6760
