Based on system.nim code:
const arrayDummySize = when defined(cpu16): 10_000 else: 100_000_000
type UncheckedArray {.unchecked.} [T] = array[0..arrayDummySize, T]
var x: ptr UncheckedArray[int8]
If you want the safety (and convenience) of a pure Nim array[..., ...] or seq[...] and can accept a performance hit (small in many cases), you could use an approach like this:
proc arrayGet[T](arr: ptr T, index: cint): T {.importc: "#[#]", nodecl.}
proc arraySet[T](arr: ptr T, index: cint, val: T) {.importc: "#[#] = #", nodecl.}
proc copyArrayC[T, U](a: openarray[T], num: int, conv: proc (x: T): U): ptr U =
  assert(a.len >= num)
  result = createU(U, num)
  for i in 0..<num:
    arraySet(result, i.cint, a[i].conv())
proc cleanupArrayC[T, U](a: var openarray[T], aC: ptr U, num: int, conv: proc (y: U): T) =
  for i in 0..<num:
    a[i] = arrayGet(aC, i.cint).conv()
  discard resize(aC, 0)
proc intToCint(i: int): cint = i.cint
proc cintToInt(ci: cint): int = ci.int
proc intsToCints(a: openarray[int], num: int): ptr cint = copyArrayC(a, num, intToCint)
proc cintsToInts(a: var openarray[int], aC: ptr cint, num: int) = cleanupArrayC(a, aC, num, cintToInt)
As I'm sure you can see, this easily generalises to float/cfloat or indeed any other Nim/C array type conversions. Also, beware that this frees your C array. Remove the resize() call if that's not what you want.
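For comparison, here is a minimal sketch of the same round trip written against a recent Nim, where `UncheckedArray[T]` is a built-in type and the `importc: "#[#]"` helpers are not needed. The names `toCints` and `fromCints` are made up for this illustration, and the buffer is allocated with Nim's `alloc` purely so the example is self-contained; a real C library would hand you its own pointer.

```nim
# Sketch only: assumes a recent Nim with built-in UncheckedArray[T].
# toCints / fromCints are hypothetical names, not part of any library.

proc toCints(a: openArray[int]): ptr UncheckedArray[cint] =
  ## Allocates an unmanaged cint buffer and copies `a` into it.
  ## The caller must release it with dealloc.
  result = cast[ptr UncheckedArray[cint]](alloc(max(a.len, 1) * sizeof(cint)))
  for i in 0 ..< a.len:
    result[i] = a[i].cint          # range-checked int -> cint conversion

proc fromCints(buf: ptr UncheckedArray[cint], n: int): seq[int] =
  ## Copies `n` elements out of the unmanaged buffer into a managed seq.
  result = newSeq[int](n)
  for i in 0 ..< n:
    result[i] = buf[i].int

let src = [1, 2, 3]
let buf = toCints(src)
let back = fromCints(buf, src.len)
dealloc(buf)                       # the buffer is not garbage collected
doAssert back == @[1, 2, 3]
```

The element-by-element loop is what makes the int/cint width difference safe; it is also where the (usually small) performance cost comes from.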
The approach suggested by @def works too, although I personally dislike having unchecked arrays scattered everywhere. Otherwise, why not just program in C?
In the rare case that cintsToInts (say) causes a performance bottleneck, we can still:
N.B. You could (possibly) make copyArrayC/cleanupArrayC even faster by rewriting them as templates taking conv: untyped.
Thinking about this a bit more, {.unchecked.} arrays aren't needed at all. We can't safely pass them around to procs that will call len(), in any case.
If you must have (effectively) zero performance overhead, why not implement a wrapper type for C arrays? For instance,
type CArray[T] = object
  data: ptr T
  size: int
# Forward declarations only; indexing takes a single int index.
proc `[]`[T](ca: CArray[T], i: int): T
proc `[]=`[T](ca: var CArray[T], i: int, val: T)
You could define len, items, pairs, and replacements for everything from sequtils if you felt inclined. There may be other syntactic sugar that could be applied. Unfortunately, Nim doesn't (AFAIK) provide a way to re-use all existing openarray (i.e. array and seq) code for such a type. Perhaps concepts will one day provide an answer.
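A rough sketch of what such a wrapper might look like follows. It assumes a recent Nim with built-in `UncheckedArray[T]`; `initCArray` is a hypothetical constructor invented for this example, and the wrapper only borrows the memory (it never frees it).

```nim
# Sketch only: a non-owning, bounds-checked view over C memory.
type
  CArray[T] = object
    data: ptr UncheckedArray[T]   # borrowed buffer, not owned by the wrapper
    size: int                     # number of elements

proc initCArray[T](p: ptr T, size: int): CArray[T] =
  ## Hypothetical constructor: wraps an existing buffer without copying.
  CArray[T](data: cast[ptr UncheckedArray[T]](p), size: size)

proc len[T](ca: CArray[T]): int = ca.size

proc `[]`[T](ca: CArray[T], i: int): T =
  assert i >= 0 and i < ca.size, "CArray index out of bounds"
  ca.data[i]

proc `[]=`[T](ca: CArray[T], i: int, val: T) =
  assert i >= 0 and i < ca.size, "CArray index out of bounds"
  ca.data[i] = val

iterator items[T](ca: CArray[T]): T =
  for i in 0 ..< ca.size:
    yield ca.data[i]

# A Nim array stands in for the C buffer here.
var backing = [10'i32, 20'i32, 30'i32]
let view = initCArray(addr backing[0], backing.len)
view[1] = 99'i32                  # writes through to the underlying memory
var total = 0
for x in view:
  total += x.int
doAssert backing[1] == 99
doAssert total == 139
```

Because the view writes straight through the pointer, there is no copy; the bounds checks are plain asserts, so they disappear in builds where assertions are turned off.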
I have absolutely no idea what def's suggestion would even do. You define an arbitrary array size, then you create an array of that size. And that's it? Where does that convert an unbounded array of bytes (and its length) into a bounded list?
As for mbaulch's first suggestion, that seems sensible. You'd have to be pretty stupid to use large blobs in a database, and their data only lasts as long as the statement hasn't been reset, so converting them is definitely what I had in mind.
I'm guessing arrayGet and arraySet are some sort of magic macro-ish things, that get translated into "result = carray[i]" (except carray is an opaque pointer, far as Nim knows)? I would wonder if that couldn't be made more efficient. Shouldn't there be some sort of use of memcpy, that overwrites a block of data in managed memory?
mbaulch's second suggestion is... well, interesting at least. Not really useful for my purposes. I'd wonder how you would tell it to free the underlying C array when you're done with the wrapper object.
Anyway, thanks for answering.
"Shouldn't there be some sort of use of memcpy, that overwrites a block of data in managed memory?"
You could probably do this (AFAIK) for int and float arrays. It would rely on the representation of Nim arrays at the backend. There are a few caveats:
I'm not motivated to find out, because anything I learn could become quickly out of date. My understanding of the compiler, and of the guarantees Nim makes about its backend representations are both quite limited. For these reasons, I'd avoid this technique. You may feel differently.
"I'd wonder how you would tell it to free the underlying C array when you're done with the wrapper object."
You're right. The underlying C array isn't freed. You'd have to do that manually.
"You could probably do this (AFAIK) for int and float arrays."
What? Oh, no, no. I just wanted to only convert it to a byte array. I can scan that for more complex structure afterwards. I didn't mean a C library that passed me an array of integers where endianness matters, just raw bytes.
That's why I said int8 specifically.
"You're right. The underlying C array isn't freed. You'd have to do that manually."
You could possibly do something with move semantics...
proc `=destroy`(ca: var CArray) =
  if ca.data != nil:
    discard resize(ca.data, 0)
proc `=`(ca: var CArray, src: var CArray) =
  ca.data = src.data
  src.data = nil
...or something. But again, in my case the C array is freed, and I wanted to copy it anyway, so I wanted to make it at least an array of bytes that Nim could understand.
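For what it's worth, the destructor idea might be sketched like this on a newer Nim (1.4 or later is assumed, where the copy hook is spelled `=copy`). `OwnedCArray`, `takeOwnership`, `cMalloc` and `cFree` are names invented for this example; the only real C functions involved are `malloc` and `free`. Copying is disabled outright, so the wrapper can only be moved.

```nim
# Sketch only: an owning wrapper that frees the C buffer when it goes away.
# Assumes Nim 1.4+; all names below are hypothetical.
proc cMalloc(size: csize_t): pointer {.importc: "malloc", header: "<stdlib.h>".}
proc cFree(p: pointer) {.importc: "free", header: "<stdlib.h>".}

type
  OwnedCArray[T] = object
    data: ptr UncheckedArray[T]
    size: int

proc `=destroy`[T](ca: var OwnedCArray[T]) =
  # Release the C allocation exactly once, when the owner is destroyed.
  if ca.data != nil:
    cFree(ca.data)

# Forbid copies so two wrappers can never free the same pointer.
proc `=copy`[T](dest: var OwnedCArray[T], src: OwnedCArray[T]) {.error.}

proc takeOwnership[T](p: pointer, size: int): OwnedCArray[T] =
  ## Wraps a malloc'd buffer; the wrapper becomes responsible for freeing it.
  OwnedCArray[T](data: cast[ptr UncheckedArray[T]](p), size: size)

proc demo(): int =
  let blob = takeOwnership[byte](cMalloc(16), 16)
  blob.data[0] = 42
  result = int(blob.data[0]) + blob.size
  # blob's destructor is expected to run when this proc returns

doAssert demo() == 58
```

Exactly when the destructor runs depends on the memory management mode in use, so treat this as an illustration of the shape rather than a guarantee.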
Aah. You did say "byte blob". I focussed on the int8, and that's why I thought endianness mattered. If raw bytes are all you need, and you are happy to handle endian issues yourself, copying into managed memory should be okay. Forgot about `=destroy`. Neat idea.
Good luck!
If you want to return a seq[int8], then I agree that copying is the best way to go, since the seq type has value semantics. If you get this value in order to modify it, then you should use the unchecked array approach.
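The copy itself can be sketched without touching the seq header at all, by pointing `copyMem` at the address of the first element. This is only a sketch: `toBytes` is a made-up name, and a stack array stands in for the buffer a C library would return.

```nim
# Sketch only: copy raw bytes from an unmanaged pointer into a managed seq.
proc toBytes(src: pointer, size: int): seq[byte] =
  ## Returns a new seq holding a copy of `size` bytes starting at `src`.
  result = newSeq[byte](size)
  if size > 0:                      # result[0] does not exist for an empty seq
    copyMem(addr result[0], src, size)

# Stand-in for a C-provided blob: 16 bytes of 'Q'.
var cbuf: array[16, char]
for c in cbuf.mitems:
  c = 'Q'

let blob = toBytes(addr cbuf[0], cbuf.len)
doAssert blob.len == 16
doAssert blob[3] == byte(ord('Q'))
```

Once the copy is made, the seq is independent of the source buffer, so the C side is free to reset or release its memory.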
I remember that in Go I once wrote a wrapper that took the pointer and size and made a Go slice out of them without copying. That has the advantage that it really behaves like the C version: modifications to the data take effect in the C library. It also has the advantage that you work with a bounds-checked slice type. But that only worked because passing a slice in Go does not copy or take ownership of the underlying data.
I don't think this would be possible in Nim, because a seq owns its data, meaning that once the seq is gone, Nim wants to free the content.
Well, here's my latest attempt. It works... assuming the (C backend) header for a seq[] doesn't stop being TGenericSeq. It copies the buffer into the seq's raw data area, after making the sequence large enough. Obviously it only works in general for sequences of 8-bit items, like seq[int8] or seq[char].
Just ignore the "makebuf" function. I just did that to get a C generated buffer to play with.
{.emit: """
#include <assert.h>
void memcpySeq(void** dest, void* src, int len) {
  TGenericSeq* seq = ((TGenericSeq*)*dest);
  assert(len <= seq->len);
  memcpy(*dest+
         sizeof(TGenericSeq), // header
         src, len);
  seq->len = len;
}
""".}
# proc memcpy[T](dest: array[0..T,int8], src: pointer, len: int) {.importc: "memcpy",header: "<string.h>".}
proc memcpy[T](dest: var seq[T], src: pointer, len: int) {.importc: "memcpySeq",header: "<string.h>".}
from macros import getType,kind,typeKind,toStrLit,`$`
import macros
# can't do a template, since we need to check the type of kind...
macro onebyte(kind: expr): string =
  case kind.kind
  of nnkSym:
    echo("we got a symbol: ",kind)
  else:
    assert(false)
  # assert(sizeof(kind.getType) == 1) sigh...
  let skind = $kind
  assert(skind == "char" or
         skind == "int8" or
         skind == "uint8",
         "blobs can only have 1 byte items");
  ""
template toBlob(kind, src, size: typed): expr =
  discard onebyte(kind)
  var dest = newSeq[kind](size);
  memcpy(dest,src,size);
  dest
template toBlob(src, size: typed): expr =
  toBlob(char, src, size)
# just a little (terrible) C for example
{.emit: """
#include <stdlib.h>
#include <string.h>
int makebuf(void** dest) {
  *dest = malloc(0x10);
  memset(*dest,'Q',0x10);
  return 0x10;
}""".}
proc makebuf(dest: var pointer): int {.importc: "makebuf",nodecl.}
var a: pointer;
let c = makebuf(a);
echo("length of data is ",c);
var b = toBlob(a,c)
import typetraits
assert(b.type.name == "seq[char]")
assert(b[3] == 'Q',"the elements are not the same!")
assert(b.len == c,"The sequences are not the same length!")
echo(b)
# not sure why this is considered unsafe...
echo(cast[seq[int8]](b))
# but eh
echo(toBlob(int8,a,c))
# echo(toBlob(int,a,c))
Keeping in mind that nim won't generate the TGenericSeq structure unless there's a sequence in that very module...
Well, it's a hack, but at least it works.
{.emit: """
#include <assert.h>
void memcpySeq(void** dest, void* src, int len) {
  memcpy(((char*)*dest)+
         sizeof(TGenericSeq), // header
         src, len);
}
""".}
# proc memcpy[T](dest: array[0..T,int8], src: pointer, len: int) {.importc: "memcpy",header: "<string.h>".}
proc memcpy[T](dest: var seq[T], src: pointer, len: int) {.importc: "memcpySeq",header: "<string.h>".}
from macros import getType,kind,typeKind,toStrLit,`$`
import macros
# can't do a template, since we need to check the type of kind...
macro onebyte(kind: expr): string =
  case kind.kind
  of nnkSym:
    echo("we got a symbol: ",kind)
  else:
    assert(false)
  # assert(sizeof(kind.getType) == 1) sigh...
  let skind = $kind
  assert(skind == "char" or
         skind == "int8" or
         skind == "uint8",
         "blobs can only have 1 byte items");
  ""
template toBlob*(kind, src, size: typed): expr =
  discard onebyte(kind)
  var dest = newSeq[kind](size);
  memcpy(dest,src,size);
  dest
template toBlob*(src, size: typed): expr =
  toBlob(char, src, size)
# just a little (terrible) C for example
when defined(test):
  {.emit: """
  #include <stdlib.h>
  #include <string.h>
  int makebuf(void** dest) {
    *dest = malloc(0x10);
    memset(*dest,'Q',0x10);
    return 0x10;
  }""".}
  import typetraits
  proc example() =
    proc makebuf(dest: var pointer): int {.importc: "makebuf",nodecl.}
    var a: pointer;
    let c = makebuf(a);
    echo("length of data is ",c);
    var b = toBlob(a,c)
    assert(b.type.name == "seq[char]")
    assert(b[3] == 'Q',"the elements are not the same!")
    assert(b.len == c,"The sequences are not the same length!")
    echo(b)
    # not sure why this is considered unsafe...
    echo(cast[seq[int8]](b))
    # but eh
    echo(toBlob(int8,a,c))
    # echo(toBlob(int,a,c))
  example()
else:
  var q: seq[int8]; # ensure we can access TGenericSeq