Hello,
Is it possible to attach a codegenDecl pragma to a const? I'm getting a "Cannot attach a custom pragma to 'myFile'" error:
const myFile {.codegenDecl:"$# __attribute__((section(\".myFile\")))".} = staticRead("myFile.tar.xz")
In my use case, I need to embed a large file in the executable. The section holding the file will later be stripped from the executable using objcopy --remove-section, but I can't seem to declare the const with __attribute__((section())). Perhaps you know of a way to achieve this without codegenDecl?
Thanks!
codegenDecl is under-specified and should be improved.
In the meantime this works, however ugly:
when defined case2:
  from strutils import escape
  let myFile2 {.importc, nodecl.}: cstring
  {.emit: [""" __attribute__((section("DATA,.myFile"))) const char* myFile2 = """, staticRead("/tmp/z02.txt").escape.static , ";" ].}
  echo myFile2
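One C-level detail worth keeping in mind with the emit above: the attribute is attached to the pointer variable myFile2, so as far as C is concerned it is the pointer, not the string bytes, that is placed in the named section. A minimal sketch of declaring the data as an array instead, so the bytes themselves land in the section. The names EMBED_SECTION, my_file and my_file_len, as well as both section names, are placeholders invented for this illustration:

```c
#include <stddef.h>

/* Section naming differs per object format: Mach-O expects
   "segment,section", ELF takes a plain name. Both names are placeholders. */
#if defined(__APPLE__)
#  define EMBED_SECTION "__DATA,__myfile"
#else
#  define EMBED_SECTION ".myfile"
#endif

/* An array (not a pointer), so the bytes themselves go into the section.
   'used' keeps the compiler from discarding the otherwise unreferenced data. */
__attribute__((section(EMBED_SECTION), used))
static const unsigned char my_file[] = "placeholder contents";

/* Payload size, excluding the NUL that the string literal appends. */
static size_t my_file_len(void)
{
    return sizeof my_file - 1;
}
```

This only shows the declaration shape for GCC/Clang; whether the section can then be removed safely with objcopy is a separate question.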
Thanks for the answer! Unfortunately my file is quite big (50 MB), so the VM bails out on the escape call.
Do you reckon there is a way to make the string literal of the binary work? I have tried various methods with C preprocessor macros, like:
let omni_tar {.importc, nodecl.}: cstring
{.emit: """#define MULTI_LINE_STRING(...) #__VA_ARGS__"""}
{.emit: ["""__attribute__((section("DATA,.omni_tar"))) const char* omni_tar = MULTI_LINE_STRING(""", staticRead("build/omni.tar.xz").static, ");" ].}
which does not work, since the binary data contains unescaped closing parentheses ")", so the macro invocation ends before the end of the file.
There are better ways to embed large binary data in an executable.
I think it'd be worthwhile adding APIs to make this easier (both to store and retrieve); this can be useful, and it's tricky (but doable) to make it work cross-platform (hence the need for an API to implement it).
eg, see https://csl.name/post/embedding-binary-data/ (I tried it on OSX and it does work):
nasm -fmacho64 $timn_D/tests/nim/all/t12414.asm -o /tmp/z01b.o
{.emit:"""
extern const char cat_start;
extern const char cat_end;
extern const int cat_size;
""".}
# nasm -fmacho64 $timn_D/tests/nim/all/t12414.asm -o /tmp/z01b.o
{.link: "/tmp/z01b.o".}
let cat_start {.importc, nodecl.}: ptr char
let cat_end {.importc, nodecl.}: ptr char
let cat_size {.importc, nodecl.}: cint
proc main() =
  let n = cast[int](cat_end.unsafeAddr) - cast[int](cat_start.unsafeAddr)
  doAssert n == cat_size.int
  for i in 0..<cat_size.int: # or: UncheckedArray, etc
    let c = cast[ptr char](cast[int](cat_start.unsafeAddr) + i)[]
    echo (i, c)
main()
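For comparison, the same start/end arithmetic can be sketched directly in C. blob_copy is a hypothetical helper name, not part of any library. It assumes the caller passes the addresses of the two marker symbols (for example &cat_start and &cat_end from the declarations above):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes that lie between two marker symbols.
   start and end are the ADDRESSES of the markers, not their values.
   Returns the payload size, or 0 if it does not fit in dst_cap bytes. */
static size_t blob_copy(const char *start, const char *end,
                        char *dst, size_t dst_cap)
{
    /* Subtract as integers, mirroring the cast[int] arithmetic above. */
    size_t n = (size_t)((uintptr_t)end - (uintptr_t)start);
    if (n > dst_cap) {
        return 0;
    }
    memcpy(dst, start, n);
    return n;
}
```

A call would look like blob_copy(&cat_start, &cat_end, buf, sizeof buf), with buf being a caller-provided buffer.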
Thanks again,
Unfortunately I need to find a portable, multiplatform solution. I now have a "builder" file that writes the contents of the binary to a txt file, which is then read in using the emit approach. However, escape does not seem to be the right function here: it triples the size of the txt file, and it does not represent the file the way staticRead does (judging by the resulting C code). Is there a way to get, as a string, the exact C representation that staticRead produces?
builder.nim
import strutils
let omni_tar = readFile("build/omni.tar.xz")
writeFile("omni_tar.txt", omni_tar.escape("", ""))
writeFile("omni_tar_len.txt", $omni_tar.len)
reader.nim
{.emit:"""STRING_LITERAL(omni_tar_xz, "omni.tar.xz", 11);""".}
{.emit:["""__attribute__((section(".omni_tar,\"aw\""))) STRING_LITERAL(omni_tar,"""", staticRead("omni_tar.txt").static, "\",", staticRead("omni_tar_len.txt").static, ");"].}
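One likely reason escape is a poor fit here: it emits \xHH hex escapes, and in C a hex escape has no fixed length, so a hex digit that happens to follow is absorbed into it. Fixed three-digit octal escapes avoid that ambiguity and are safe for any byte, including quotes, parentheses and NUL. A minimal sketch of such an encoder for a builder step, written in C. escape_octal is a hypothetical helper name, and the caller is assumed to provide a large enough output buffer:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write n bytes of src into dst as three-digit octal
   escapes ("\ooo"). dst must hold at least 4*n + 1 bytes.
   Returns the number of characters written, excluding the final NUL. */
static size_t escape_octal(const unsigned char *src, size_t n, char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        /* Always three digits, so a following '0'..'7' byte can never be
           read as part of the previous escape. */
        out += (size_t)sprintf(dst + out, "\\%03o", (unsigned)src[i]);
    }
    dst[out] = '\0';
    return out;
}
```

Every input byte becomes four characters in this scheme, so the intermediate text file is four times the size of the binary, while the compiled literal is back to one byte per input byte.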
For anyone coming across this: here is a solution that is working flawlessly for me. I split the const that loads the binary into its own module. Then I redefine the STRING_LITERAL macro so that it includes the __attribute__((section)) that I need. This way the redefinition only affects the one string contained in that module, which is the const:
#Redefine STRING_LITERAL to include __attribute__((section)).
#It needs to be in its own module, or it will override STRING_LITERAL for every string literal
{.emit:
"""
#define STRING_LITERAL(name, str, length) \
  __attribute__((section(".omni_tar,\"aw\""))) static const struct { \
    TGenericSeq Sup; \
    NIM_CHAR data[(length) + 1]; \
  } name = {{length, (NI) ((NU)length | NIM_STRLIT_FLAG)}, str}
"""
.}
#Embed the tar.xz file
const omni_tar_xz_file* = staticRead("build/omni.tar.xz")
#Throw / catch the exception where needed
type OmniStripException* = ref object of CatchableError
#Keep the write function local so that the const will be defined in this module, instead of being
#copied over to where it's used! writeFile will raise an exception after 'strip -R .omni_tar' has been used
proc omniUnpackTarXz*() =
  try:
    writeFile("omni.tar.xz", omni_tar_xz_file)
  except:
    raise OmniStripException()
This is not a good solution for embedding large binaries, as the string encoding blows up by a factor of 4X.
Really, the correct way is to build upon what I wrote in https://forum.nim-lang.org/t/8117#52130 and make it cross-platform and wrapped in an easy-to-use API; it's possible and not necessarily that hard.
I don't really see the 4x factor increase, but rather a 1 to 1 mapping with my approach.
The 45 MB .tar.xz file takes up exactly that size in the resulting executable. The size was blowing up when using the escape approach that was mentioned earlier.
"I don't really see the 4x factor increase, but actually a 1 to 1 mapping with my approach."
the 3X or 4X increase is in the C file generated, since it contains escape sequences, eg:
STRING_LITERAL(TM__1m9cQUY9abSjxCA8ws9b1wH9ag_4, "\317\372\355\376\007\000\
the object file and binary will be 5MB:
du -sh /Users/timothee/git_clone/nim/timn/build/nimcache/@mt12666b.nim.c.o
4.9M /Users/timothee/git_clone/nim/timn/build/nimcache/@mt12666b.nim.c.o
The fact that this approach requires encoding/decoding and blows up the intermediate size by 3-4X is not good if your embedded data is large (it affects memory, intermediate disk size and compile times);
the other approach I suggested didn't have this problem.