I recently watched a presentation on duckdb. It is an embeddable SQL OLAP database management system. Similar to SQlite but optimized for analytics.
I'm curious if anyone has been using this db with Nim?
> C++11, no dependencies, single file build
So building bindings shouldn't be too hard. You can check Nimterop, c2nim tools to automate it.
I tested it and simple connection works fine via python more advanced usage with nim needs consideration of how to pass strings from nim to duckdb so that thet appear single quoted after being passed through nimpy.
pip install duckdb==0.2.3 nimble install nimpy
nimduck.nim
import nimpy
let duckdb = pyImport("duckdb")
let os = pyImport("os")
let conn = duckdb.connect()
let cursor = conn.cursor()
discard conn.execute("CREATE TABLE test_table (i INTEGER, j STRING)")
# add some data
discard conn.execute("""INSERT INTO test_table VALUES (1, 'one')""")
discard conn.execute("""INSERT INTO test_table VALUES (1, 'two')""")
discard conn.execute("""INSERT INTO test_table VALUES (1, 'three')""")
discard conn.execute("""INSERT INTO test_table VALUES (1, 'four')""")
# as pandas df
echo (conn.execute("SELECT * FROM test_table").fetchdf())
echo()
echo()
# as numpy array
echo (conn.execute("SELECT * FROM test_table").fetchnumpy())
I have done a similar approach with Firebird3.0 server and their latest python firebird driver , which now works nicely for simple queries. Firebird can also work as embedded server and packs much more power than sqlite3.
This sounds to noob ears like mine that it might be possible to use the existing nim sqlite libraries?
Possibly. The issues with doing this, is that you're going to create conflict with code that include sqlite.
I have no real exp in C and C++
The API used is still a C-API that you will to understand. This can be a good learning exercise, the API doesn't look too convoluted. Understand the C examples and translate them in Nim.
writing a wrapper seems a bit beyond me at the moment though it is something I do hope to learn about eventually.
Creating a C wrapper from a single header with no dependencies is quite easy. Nimterop litterally does the job for you. Here, if you have trouble getting started : https://github.com/Clonkk/duckdb_wrapper .
Note that I don't plan to maintain this or write any tests or documentation, so you would be wise to create your own clean repository for a serious usage.
The following code works.. my understanding is that memory doesn't need to be allocated or freed:
var res: duckdb_result
if duckdb_query(con, "SELECT * FROM integers", addr(res)) == DuckDBError:
echo "Error"
# 3 Rows
for row in 0..<3:
# 2 Col
for col in 0..<2:
var val = duckdb_value_int64(addr(res), col.idx_t, row.idx_t)
echo val
This code however, produces unpredictable results:
var res : duckdb_result
if duckdb_query(con, "SELECT * FROM integers", addr(res)) == DuckDBError:
echo "Error"
# 3 Rows
for row in 0..<3:
# 2 Col
for col in 0..<2:
let val = duckdb_value_varchar(addr(res), col.csize_t, row.csize_t)
echo val
dealloc(val)
The documentation with it from nimterop says "! Converts the specified value to a string. Returns nullptr on failure or NULL. The result must be freed with free."
I suspect I may not be properly allocating and freeing memory. Any suggestions is appreciated.
Nim's dealloc is not C's free! You need to call C's free:
proc free(p: pointer) {.importc: "free", header: "<stdlib.h>".}
free(p)
@Araq Thank you.
I should have provided the generated proc. It returns a cstring, so would the gc automatically take care of this?
proc duckdb_value_varchar*(result: ptr duckdb_result; col: idx_t; row: idx_t): cstring {.
importc, cdecl, impduckdbDyn.}
! Converts the specified value to a string. Returns nullptr on failure or NULL. The result must be freed
It returns a cstring, so would the gc automatically take care of this?
Definitely not.
Won't compile because free expects a pointer.
However, I changed free to:
proc free(cstr: ptr cstring) {.importc: "free", header: "<stdlib.h>".}
Then free(addr(val)) doesn't fail and life goes on or so I hope.var val = duckdb_value_varchar(addr(res), col.csize_t, row.csize_t)
free(addr(val))
This code is wrong because you are passing the address of a variable to free. You have to pass val to free. Changing free 's parameter from pointer to ptr cstring does not fix the problem. Don't you get any runtime error in free(addr(val))?
It should be like this as Araq said:
proc free(p: pointer) {.importc: "free", header: "<stdlib.h>".}
# duckdb_value_varchar returns cstring
var val = duckdb_value_varchar(addr(res), col.csize_t, row.csize_t)
echo val
free(val)
It seems writing C library binding without knowing C is dangrous as C compiler cannot checks some errors like this. I think you need to learn C language. When you learn about a C library, you need to read documents written for C programmer. Most of C library bindings for Nim only provides Nim example codes, don't provide detailed documents of each procedures for Nim programmer. When you got a bug in C library, you need to read C code to find it and have to write a minimum and runnable C code that reproduce the bug when you report about the bug to library auther.
Just when I thought I was making progress :)... so any suggestions on how to make free(val) compile?
I apprecaite the warnings. Trust me, I would rather use one of DuckDBs client APIs. But I also agree with @Clonk that it could be a good learning experience.
what @demotomohiro said.
a cstring is a pointer to char, Nim is stricter about that than C, so you need to free(cast[pointer](val)) alternatively your free overload should be:
proc free(cstr: cstring) {.importc: "free", header: "<stdlib.h>".}
# Just a function returning `char *`
{.emit:"""char *duckdb_value_varchar(int x) {
char *result = malloc(20); // 20 should be enough for any `int`
sprintf(result, "%d", x);
return result;
}""".}
proc duckdb_value_varchar*(x: cint): cstring {.importc, cdecl.}
proc free(p: pointer) {.importc: "free", header: "<stdlib.h>".}
var val = duckdb_value_varchar(12)
echo val
free(val)
A dummy example using strdup as a C-function that returns an allocated char * that needs to be free'd:
proc free(p: pointer) {.importc: "free", header: "<stdlib.h>".}
proc strdup(c: cstring) : cstring {.importc: "strdup", header: "<stdlib.h>".}
proc main() =
var foo: cstring = "abcdefg"
# strdup in C does a memory allocation
# I just use this to emulate a proc that return an allocated pointer
var bar = strdup(foo)
echo bar
free(bar)
when isMainModule:
main()
Let's check with Valgrind that there is no memory leak :
==9640== Memcheck, a memory error detector ==9640== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==9640== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info ==9640== Command: ./simpletest ==9640== abcdefg ==9640== ==9640== HEAP SUMMARY: ==9640== in use at exit: 0 bytes in 0 blocks ==9640== total heap usage: 3 allocs, 3 frees, 1,048 bytes allocated ==9640== ==9640== All heap blocks were freed -- no leaks are possible ==9640== ==9640== For lists of detected and suppressed errors, rerun with: -s ==9640== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The examples above seem to work for me and appreciate them. I'm on windows right now wrt to valgrind. I appreciate that it is hard to figure out what is happening without more code and that this thread may be getting worn out.
... same as code from prev snippet
var foo: cstring = "abcdefg"
var bar = strdup(foo)
free(bar) # works no prob.
var val = duckdb_value_varchar(addr(res), 0.idx_t, 0.idx_t)
free(val) # SIGSEGV: Illegal storage access.
I see *duckdb_value_varchar uses strdup.
char *duckdb_value_varchar(duckdb_result *result, idx_t col, idx_t row) {
Value val = GetCValue(result, col, row);
return strdup(val.ToString().c_str());
}
https://github.com/duckdb/duckdb/blob/master/src/main/duckdb-c.cpp