I know that some libraries, such as the following zlib.nim, can compress a string.
https://github.com/nim-lang/zip/blob/master/zip/zlib.nim
But I can't write the compressed string in source code because it is not ASCII or valid Unicode; it may contain illegal characters. So I wonder how to do it. Should I restrict which characters the compressed string may use? Or should I escape some characters? Both make the source code bigger, and I want to avoid that...
const s = r"hogehoge" # this is compressed string of non-ascii code
load(s) # I can write it by macro
If it's not valid Unicode, your best bet is to store it in a separate file and use staticRead to load it into a string at compile time.
const s = staticRead("myfile.bin")
load(s)
If I understand correctly, you would like to include a non-printable byte, like 0x15 (ASCII NAK), in a string literal in source code. If so, you can do it using "\x" escapes like this:
const s = "hello\x15\x2f"
This does not increase the (binary) code size.
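For illustration, here is a minimal sketch of what the escaped literal above contains. Each \xNN escape takes 4 characters in the source file but only 1 byte in the resulting string.

```nim
# Minimal sketch: non-printable bytes written with \x escapes.
const s = "hello\x15\x2f"

# 5 printable characters plus 2 escaped bytes.
echo s.len        # 7
echo ord(s[5])    # 21, i.e. 0x15 (ASCII NAK)
echo s[6]         # / (0x2f)
```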
You can combine exelotl's method with compression so that the embedded asset takes even less space. A good example is:
https://github.com/guzba/supersnappy/blob/master/examples/compiletime.nim
Thanks for these replies! I am looking into exelotl's method.
What I want to do is gather all the compressed data (source code) into one .nim file. I take part in AtCoder (a programming contest) using Nim. In this contest, only a single file can be submitted; multiple files are not allowed. The contest server then compiles the submitted code and runs the tests. Since there is a limit on the size of the submitted source (not the size of the binary), I want to compress the code.
I want to write the contents of the zip file in some form (not necessarily as a string). I didn't know how to escape non-printable bytes in a string; that is solved now. But it may make the code size larger...
I am also looking for ways to write them other than as a raw string.
If your data is somehow mostly ASCII, then the 4x cost of "\x1b" might be fine. But base64-encoding the compressed data is probably ideal, and base64 is in the stdlib.
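To put rough numbers on that trade-off, here is a small sketch. It assumes a hypothetical 30-byte payload in which every byte needs escaping. With \x escapes each byte costs 4 source characters, while base64 costs 4 characters per 3 bytes.

```nim
import std/base64

# Hypothetical payload: 30 bytes, all outside printable ASCII.
var raw = newString(30)
for i in 0 ..< raw.len:
  raw[i] = chr(0x80 + i)

# Source-size cost if every byte is written as \xNN.
let escapedLen = raw.len * 4

# Source-size cost if the payload is base64-encoded instead.
let b64 = encode(raw)

echo escapedLen   # 120
echo b64.len      # 40

# Decoding the base64 text gives back the original bytes.
doAssert decode(b64) == raw
```

Real compressed data usually contains some printable bytes too, so the actual escape overhead would be somewhat below 4x.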
Here is my code. It compiles. I want to compile it with the commented-out part instead, which performs the same operation at compile time. But I couldn't get that to work even when I added the {.compiletime.} pragma...
const libz = "libz.so.1"
type
  Uint* = cuint
  Ulong* = culong
  Ulongf* = culong
  Pulongf* = ptr Ulongf
  Pbyte* = cstring
  Pbytef* = cstring
  Allocfunc* = proc(p: pointer, items: Uint, size: Uint): pointer{.cdecl.}
  FreeFunc* = proc(p, address: pointer){.cdecl.}
  InternalState*{.final, pure.} = object
  ZStream*{.final, pure.} = object
    nextIn*: Pbytef
    availIn*: Uint
    totalIn*: Ulong
    nextOut*: Pbytef
    availOut*: Uint
    totalOut*: Ulong
    msg*: Pbytef
    state*: ptr InternalState
    zalloc*: Allocfunc
    zfree*: FreeFunc
    opaque*: pointer
    dataType*: cint
    adler*: Ulong
    reserved*: Ulong
const
  ZLIB_VERSION = "1.2.11"
  Z_NO_FLUSH = 0
  Z_OK = 0
  Z_STREAM_END = 1
  Z_BUF_ERROR = -5
  Z_NO_COMPRESSION* = 0
  MAX_WBITS = 15
proc inflate*(strm: var ZStream, flush: cint): cint{.cdecl, dynlib: libz, importc: "inflate".}
proc inflateEnd*(strm: var ZStream): cint{.cdecl, dynlib: libz, importc: "inflateEnd".}
proc inflateInit2u*(strm: var ZStream, windowBits: cint, version: cstring, streamSize: cint): cint{.cdecl, dynlib: libz, importc: "inflateInit2_".}
proc inflateInit2(strm: var ZStream, windowBits: cint): cint = inflateInit2u(strm, windowBits, ZLIB_VERSION, sizeof(ZStream).cint)
proc uncompress*(sourceBuf: cstring, sourceLen: Natural): string =
  assert (not sourceBuf.isNil) and sourceLen >= 0
  var z: ZStream
  var d = ""
  var sbytes, wbytes = 0
  z.availIn = 0
  var wbits = MAX_WBITS + 32
  var status = inflateInit2(z, wbits.cint)
  if status != Z_OK: assert false
  while true:
    z.availIn = (sourceLen - sbytes).Uint
    if sourceLen - sbytes <= 0: break
    z.nextIn = sourceBuf[sbytes].unsafeaddr
    while true:
      if wbytes >= d.len:
        let n = if d.len == 0: sourceLen*2 else: d.len*2
        if n < d.len: discard inflateEnd(z); assert false
        d.setLen(n)
      let space = d.len - wbytes
      z.availOut = space.Uint; z.nextOut = d[wbytes].addr; status = inflate(z, Z_NO_FLUSH)
      if status.int8 notin {Z_OK.int8, Z_STREAM_END.int8, Z_BUF_ERROR.int8}: discard inflateEnd(z); assert false
      wbytes += space - z.availOut.int
      if not (z.availOut == 0): break
    if (status == Z_STREAM_END): break
  discard inflateEnd(z)
  if status != Z_STREAM_END: assert false
  d.setLen(wbytes)
  swap result, d
proc uncompress*(sourceBuf: string):string = uncompress(sourceBuf, sourceBuf.len)
import base64
const s = "eJxLTc7IV1DySM3JyVcIzy/KSVECADp4BiA="
#static:
#  var sd = s.decode
#  echo uncompress(sd)
var sd = s.decode
echo uncompress(sd)
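One likely reason the static: version fails is that uncompress calls into libz through importc/dynlib procs, and the compile-time VM normally cannot call foreign functions, with or without {.compiletime.}. Steps written in pure Nim are a different matter. The sketch below assumes that base64's decode can be evaluated in a const context on your Nim version. It runs only the decoding step at compile time and leaves inflating to run time or to a pure-Nim decompressor.

```nim
import std/base64

const s = "eJxLTc7IV1DySM3JyVcIzy/KSVECADp4BiA="

# Evaluated by the compile-time VM; no libz call is involved here.
const sd = decode(s)

static:
  # The first byte of a zlib stream is the CMF header byte (0x78 here).
  doAssert sd[0] == '\x78'

echo sd.len
```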
If you need it to be in 1 file, maybe you could try something like this?
#[
testing, put your dirty unprintable string here
]#
const s = staticRead("main.nim")[3..49]
echo s
Output:
testing, put your dirty unprintable string here
The compressed code is embedded into the block comment. The program staticRead's itself and takes a slice to get only the stuff inside the comment. You are then free to decompress it and use it to generate the real code at compile time.
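A variation on the same idea, as a rough sketch: instead of hard-coding the file name and the slice indices, the file can locate its own path with currentSourcePath() and find the payload between two markers. The BEGIN/END markers are made up for this example, and the sketch assumes Unix line endings and a payload that contains neither marker.

```nim
#[BEGIN
payload goes here
END]#
import std/strutils

const
  # Read this very source file at compile time.
  src = staticRead(currentSourcePath())
  startTag = "#[BEGIN\n"
  # The string literals in this file use the escape \n rather than a real
  # newline, so the first match is the comment block itself.
  a = src.find(startTag) + startTag.len
  b = src.find("\nEND]#", a)
  payload = src[a ..< b]

echo payload   # payload goes here
```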
Wow! This method is fantastic!
Thank you very much!
Well, I thought it was a neat concept, but I actually don't think it's necessary. It seems like you can just straight up put invalid Unicode in a string literal.
const s = """
�����������
"""
echo s
Works for me. I filled out the string with FFFFF... in a hex editor, which I believe is invalid. The compiler doesn't care, and successfully compiles it anyway.
I tried your code. The string with FFFFF... did indeed compile. But unfortunately, the general case of a string output by zip failed. Your method of commenting it out with #[ ]# also failed...
Moreover, I noticed that code containing these invalid characters cannot be pasted into this thread.
The compiler doesn't like something, maybe it's the null bytes?
I had to
echo "echo \"Hello, World\"" | gzip | base64
#H4sIAAAAAAACA0tNzshXUPJIzcnJ11EIzy/KSVHiAgBv5X7/FAAAAA==
import macros
macro exec(s:static string):untyped = s.parseStmt
exec:
  staticExec "cut -c 2- in.nim | base64 -d | zcat"
Would cheating by using external utilities (zcat, cut, base64) work?
Thank you! I tried it in the code test in the AtCoder environment and found that it works! I felt that writing the decompression code in Nim makes the submitted source code a bit larger. So your method of calling external utilities is better!
By the way, your shell command in staticExec outputs the following messages in addition to the Nim code. I am studying the commands to fix them.
base64: invalid input
echo "Hello, World"
gzip: stdin: decompression OK, trailing garbage ignored
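A likely cause of both messages is that cut -c 2- runs over every line of in.nim, not only the comment line, so base64 -d also receives the rest of the source and gzip sees extra bytes after the compressed stream. One possible fix is to feed only the payload line into the pipeline. The sketch below does this with sed and a made-up #PAYLOAD: marker. It assumes sed, base64 and zcat are available on the compiling machine, and that the file path contains no single quotes.

```nim
#PAYLOAD:H4sIAAAAAAACA0tNzshXUPJIzcnJ11EIzy/KSVHiAgBv5X7/FAAAAA==
import std/macros

# Turn a compile-time string into code.
macro exec(s: static string): untyped = parseStmt(s)

# sed prints only the line that starts with the marker, with the marker
# removed, so base64 and zcat never see the rest of this file.
const code = staticExec(
  "sed -n 's/^#PAYLOAD://p' '" & currentSourcePath() & "' | base64 -d | zcat")

exec(code)   # the decompressed code is: echo "Hello, World"
```

If the payload is always the first line, head -n 1 in front of the original cut would presumably work as well.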