Hi all, I am very new to Nim, but it seems highly interesting for data science. My newbie question: is there any comprehensive tutorial or documentation (with best practices etc.) on how to write a wrapper for ML libraries that otherwise expose a C++ API? E.g.
XGBoost: https://xgboost.readthedocs.io/en/stable/c++.html
CatBoost: https://catboost.ai/en/docs/concepts/c-plus-plus-api
LightGBM: https://lightgbm.readthedocs.io/en/latest/C-API.html
I already know that there is a PyTorch wrapper (I am not sure how up-to-date or complete it is), but I would happily learn how to write bindings for other ML/DL libraries, too. Thanks for your support!
XGBoost has a C API: https://github.com/dmlc/xgboost/blob/master/include/xgboost/c_api.h
I've never used CatBoost though.
it seems highly interesting for data science
I agree :)
is there any comprehensive tutorial or documentation (with best practices etc.), how-to write a wrapper for ML libraries, which otherwise expose a C++ API?
Not that I know of.
For XGBoost there is already a wrapper. From what I can tell it binds the DLL, but I'm not sure how the wrapper was generated: https://github.com/jackhftang/xgboost.nim
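For anyone wondering what hand-written bindings to a C API look like, here is a minimal sketch. It deliberately binds two C standard library functions rather than XGBoost, so it compiles without any extra library installed; the names `c_strlen`, `c_atoi`, `byteLen` and `parseCInt` are made up for illustration. A wrapper that binds a DLL would typically use Nim's `dynlib` pragma in place of `header`, but the overall shape (raw `importc` procs plus a thin Nim layer on top) stays the same.

```nim
# Raw bindings: each proc maps one-to-one onto a C function.
# importc gives the C symbol name, header tells Nim which #include to emit.
proc c_strlen(s: cstring): csize_t {.importc: "strlen", header: "<string.h>".}
proc c_atoi(s: cstring): cint {.importc: "atoi", header: "<stdlib.h>".}

# Thin Nim layer (hypothetical names): hide the C types from callers,
# so the rest of the program only sees string and int.
proc byteLen*(s: string): int =
  int(c_strlen(cstring(s)))

proc parseCInt*(s: string): int =
  int(c_atoi(cstring(s)))

when isMainModule:
  echo byteLen("xgboost")
  echo parseCInt("42")
```

Keeping the raw procs separate from the Nim-facing ones is a design choice, not a requirement: it makes it easier to regenerate the raw layer with a tool later without touching the public API.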
There was a time (6 years ago) when I was interested in reimplementing Sklearn, but unfortunately I'm too busy with other projects.
However, here is how I would go about it.
This is something that Nvidia recently had to do with rapids.ai and CuDF. They did it within the past 3 years, so there are a lot of lessons learned there, as well as strategies to minimize time-to-market.
The main issue in porting Sklearn is getting lost in the wild wild west of contributions with varying degrees of maturity.
I'm currently exploring this topic. I found this page very useful as an introduction: https://livebook.manning.com/book/nim-in-action/chapter-8/18
I'm also experimenting with automatic wrapper generators. Only Futhark for the moment, but I will try c2nim and nimterop as well.
My only knowledge of C is an introduction from more than 20 years ago, but I managed to create bindings for raylib (https://github.com/raysan5/raylib) and get the example there to run, so it's definitely doable :)
I think