Hey, I recently opened a PR that adds a 'bitops' module. It essentially adds procs for optimized bit manipulation, using compiler intrinsics when possible.
However, I can only test on Linux x64 with gcc/clang. Preliminary support for other compilers is included but probably needs some testing and polishing.
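To give a rough idea of what "intrinsics when possible" means, here is a minimal sketch in C. It is not code from the PR: the name `count_set_bits` is made up for illustration, and it only covers the gcc/clang case plus a portable fallback.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only; not taken from the PR. */
static inline int count_set_bits(uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
    /* gcc and clang expose a population-count builtin, which the compiler
       can lower to a dedicated instruction when the target has one. */
    return __builtin_popcount(x);
#else
    /* Portable fallback: clear the lowest set bit until none remain. */
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
#endif
}
```

Other compilers would need their own branch in the `#if` chain, which is the part I cannot test myself.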
Reviews, suggestions, and criticism are welcome, either here in the forum thread or on GitHub.
See the PR at: link