nimforum mirror - Creating a seq or openarray on unmanaged memory

snej [2020-05-26T19:52:55+02:00]

I'm wrapping a C function, a getter that returns a {const void*, size_t} struct pointing into memory managed by the C library. The obvious wrapper proc would copy the bytes into a seq[byte] and return that.

However, this C library is a high-performance key-value store (a relative of LMDB) that gets a lot of its speed by using memory-mapping and avoiding copying. So I want to avoid copying the bytes.

The only collection I've found that lets me do this is openarray, and I've found the functions that let me create one from a cstring and a length, but the manual says openarray can only be used as a parameter. That doesn't seem to be enforced, however: I can declare an object type containing an openarray without errors.

I'm thinking of doing something like this:

type Transaction* = ref object
  ...
type Result* = object
  bytes*: openarray[byte]
  owner: Transaction

proc get(t: Transaction, key: string): Result =
  ...

The owner field of the Result holds onto the Transaction object, keeping it alive so the bytes remain valid. (The C API only guarantees the availability of the data during the transaction it was accessed in.)

Is this OK, or something that could cause trouble?

—Jens

spip [2020-05-26T20:33:20+02:00]

I don't have much experience with inter-library memory sharing, but have a look at "Keeping track of memory".

mratsim [2020-05-26T21:51:49+02:00]

openarray as a value doesn't work at the moment: https://github.com/nim-lang/RFCs/issues/178

You can store it as ptr UncheckedArray[byte] + len.

To use it, either index it directly as you would a pointer in C or, if you interface with Nim libraries, use the zero-cost conversion to openarray: toOpenArray(ptr UncheckedArray[T], start, stopInclusive).
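A minimal sketch of the ptr + len suggestion above. The names ByteView and sum are hypothetical, and a local array stands in for memory that the C library would own; only toOpenArray comes from the system module.

```nim
# Hypothetical names: ByteView and sum are illustrative, not from any library.
type
  ByteView = object
    data: ptr UncheckedArray[byte]  # borrowed pointer; the GC does not track it
    len: int                        # number of valid bytes behind `data`

proc sum(bytes: openArray[byte]): int =
  ## Stands in for any Nim proc that accepts openArray[byte].
  for b in bytes:
    result += int(b)

# A local array stands in for memory owned by the C library.
var backing = [byte 1, 2, 3, 4]
let view = ByteView(
  data: cast[ptr UncheckedArray[byte]](addr backing[0]),
  len: backing.len)

# Direct indexing, as with a C pointer: no bounds checks.
let third = view.data[2]

# Zero-copy view for openArray-based APIs; the last index is inclusive.
let total = sum(toOpenArray(view.data, 0, view.len - 1))
```

Nothing here ties the lifetime of the pointer to the memory behind it, which is the safety concern raised in the next reply.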

snej [2020-05-26T22:02:58+02:00]

I want to provide a (reasonably) safe interface, and returning an UncheckedArray clearly wouldn't be safe.

The other approach I'm thinking of is to make the proc take a function parameter, and pass the openarray to the callback function. It makes the call site a bit ugly, but it'll be safe.
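The callback approach could look roughly like the sketch below. Everything in it is an assumption made for illustration: withValue, addUp, and the store field are hypothetical, and a seq stands in for the memory the C getter would return.

```nim
type
  Transaction = ref object
    # Stand-in for the real handle; `store` plays the role of C-owned memory.
    store: seq[byte]

proc withValue(t: Transaction, key: string,
               body: proc (bytes: openArray[byte])) =
  ## Hypothetical getter: passes the value for `key` to `body` without copying.
  ## A real wrapper would call the C getter here and use its pointer and size.
  if t.store.len == 0:
    return
  let p = cast[ptr UncheckedArray[byte]](addr t.store[0])
  # The openArray exists only for the duration of this call.
  body(toOpenArray(p, 0, t.store.len - 1))

let txn = Transaction(store: @[byte 10, 20, 30])
var total = 0

proc addUp(bytes: openArray[byte]) =
  # Reads the bytes in place; nothing is copied into a seq.
  for b in bytes:
    total += int(b)

withValue(txn, "some-key", addUp)
```

Because the openArray is only ever a parameter, it cannot outlive the call, which is what makes this shape safe at the cost of a less convenient call site.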

jasonfi [2020-05-27T05:38:28+02:00]

I have just been researching something similar for string slices. I found this RFC: https://github.com/nim-lang/RFCs/issues/12 which led me to this issue: https://github.com/nim-lang/RFCs/issues/178. Right now openarray is the only way to do this, but better approaches are coming soon.

cdome [2020-05-27T11:54:36+02:00]

You will need this proc from system:

proc toOpenArray*[T](x: ptr UncheckedArray[T]; first, last: int): openArray[T] {.
    magic: "Slice".}

snej [2020-05-27T19:11:49+02:00]

> Wrap the ptr UncheckedArray, len pair in an object

I could do that, but this object would be a second-class citizen since it's neither an array nor seq nor string. I could implement [] and len, but it still wouldn't work with e.g. sequtils, right? I guess what I'm saying is that Nim doesn't seem to have a collection/sequence abstraction the way Swift, Rust, C++, Python etc. do.

mratsim [2020-05-27T20:59:52+02:00]

Only Rust has a memory-safe zero-copy collection that can be stored in a type (its slice type).

C++ and D can somewhat emulate that with ranges, but I don't think they have proper lifetime/escape analysis; you would need a borrow checker for that.

Nim has openarray, but it cannot be stored; the escape hatch is ptr UncheckedArray[T].

If you are really concerned about zero-copy, you should use only the "it" templates of sequtils (mapIt, filterIt and friends); they work with any indexable type.

Also, zero-copy and the functional style of sequtils are usually at odds (though zero-functional helps a lot); if performance or allocations are really critical, use a for loop.
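As a rough illustration of the last two points, the sketch below gives a hypothetical ByteView wrapper the len, [] and items routines and then uses it with mapIt and with a plain loop. The wrapper and its field names are assumptions, and a local array again stands in for C-owned memory.

```nim
import std/sequtils

type
  ByteView = object            # hypothetical wrapper around ptr + length
    data: ptr UncheckedArray[byte]
    size: int

proc len(v: ByteView): int = v.size

proc `[]`(v: ByteView, i: int): byte =
  # Checked access, unlike raw UncheckedArray indexing.
  doAssert i >= 0 and i < v.size, "index out of range"
  v.data[i]

iterator items(v: ByteView): byte =
  for i in 0 ..< v.size:
    yield v.data[i]

var backing = [byte 1, 2, 3, 4]
let view = ByteView(data: cast[ptr UncheckedArray[byte]](addr backing[0]),
                    size: backing.len)

# mapIt reads the view in place but allocates a new seq for its result.
let doubled = view.mapIt(int(it) * 2)

# A plain loop reads in place and allocates nothing.
var total = 0
for b in view:
  total += int(b)
```

The input is never copied in either case; the difference is that mapIt still builds an output seq, which is the tension between zero-copy and the sequtils style mentioned above.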

snej [2020-05-28T19:41:03+02:00]

> Only Rust has a memory-safe zero-copy collection that can be stored in a type (its slice type).

Not exactly. Only Rust has this plus compile-time safety checks and zero runtime overhead. But Go's slices are memory-safe and zero-copy and can be stored, for example. Go accomplishes this by having the slice type contain a hidden reference to the object that owns the memory; thus the GC ensures that a slice can't outlive its backing store.

That would be pretty easy to support in Nim. I could easily make such an object myself, it just wouldn't work the same as an array/seq/string, as I said above. But thanks for the pointers [sic] to the "it" templates and zero-functional; I'll check those out.
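A sketch of such an object, under stated assumptions: OwnedBytes, fetch, and the buffer field are hypothetical, and a seq inside the Transaction stands in for the memory the C library keeps valid during a transaction.

```nim
type
  Transaction = ref object
    buffer: seq[byte]   # stand-in for data the C library keeps valid

  OwnedBytes = object   # hypothetical name
    data: ptr UncheckedArray[byte]
    size: int
    owner: Transaction  # keeps the transaction alive while the view exists

proc get(t: Transaction, key: string): OwnedBytes =
  ## Hypothetical getter; a real one would look up `key` through the C API.
  result.owner = t
  result.size = t.buffer.len
  if result.size > 0:
    result.data = cast[ptr UncheckedArray[byte]](addr t.buffer[0])

proc len(v: OwnedBytes): int = v.size

proc `[]`(v: OwnedBytes, i: int): byte =
  # Checked access into the borrowed memory.
  doAssert i >= 0 and i < v.size, "index out of range"
  v.data[i]

proc fetch(): OwnedBytes =
  # After this proc returns, the transaction is reachable only via the view.
  let txn = Transaction(buffer: @[byte 7, 8, 9])
  get(txn, "some-key")

let value = fetch()
```

The owner field only keeps the Nim-side Transaction object from being collected; a real wrapper would still have to make sure the underlying C transaction is not ended explicitly while views into it exist.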

Araq [2020-05-28T20:17:33+02:00]

> But Go's slices are memory-safe and zero-copy and can be stored, for example.

They are technically memory-safe but prone to aliasing bugs anyway.

Mirror of forum.nim-lang.org

6380 :: Creating a seq or openarray on unmanaged memory