Hello, I'm a noob trying to learn low-level programming from Handmade Hero. I really like Nim and I'm trying to learn how to do C-style things in Nim; is that the wrong approach? I'm trying to translate this code into Nim but it's not working. If anyone has any tips I'd be very happy to hear them. Maybe I'm approaching this the wrong way? Thank you very much.
static void
RenderWeirdGradient(game_bitmap_buffer *BitmapBuffer,
                    int XOffset, int YOffset, int ZOffset)
{
    int Width = BitmapBuffer->Width;
    int Height = BitmapBuffer->Height;
    int Pitch = Width * BitmapBuffer->BytesPerPixel;
    uint8 *Row = (uint8 *)BitmapBuffer->Memory;
    for(int Y = 0;
        Y < Height;
        ++Y)
    {
        uint32 *Pixel = (uint32 *)Row;
        for(int X = 0;
            X < Width;
            ++X)
        {
            uint8 Red = (uint8)(X + XOffset);
            uint8 Green = (uint8)(Y + YOffset);
            uint8 Blue = (uint8)(sin(ZOffset) + 255);
            *Pixel++ = ((Red << 16) | (Green << 8) | Blue);
        }
        Row += Pitch;
    }
}
And here is my Nim attempt:
var bytesPerPixel = 4
var bitmapMemorySize = (bitmapWidth * bitmapHeight) * bytesPerPixel
bitmapMemory = VirtualAlloc(nil, cast[SIZE_T](bitmapMemorySize), MEM_COMMIT, PAGE_READWRITE)
var pitch = width * bytesPerPixel
var row: ptr uint8 = cast[ptr uint8](bitmapMemory)
for y in 0 ..< bitmapHeight:
  var pixel: ptr uint32 = cast[ptr uint32](row)
  echo pixel.repr
  for x in 0 ..< bitmapWidth:
    var red = cast[uint8](x)
    var green = cast[uint8](y)
    var blue = cast[uint8](0)
    pixel[] = ((red shl 16'u8) or green shl 8) or blue
I tried doing the same thing a few years ago. For the most part the translation is pretty smooth, but I got stuck when Casey started working on the sound stuff - I think because I couldn't find a good wrapper for DirectSound back then. I'm not sure if there is one now.
Can you describe what isn't working, or what errors you are getting? One thing I notice is that you aren't incrementing your pixel pointer with each increment of x, or advancing your row by the pitch with each y.
Also, I used a seq instead of allocating the bitmap array manually.
Like I said, I went a bit further than this, but I think this code is from around the same place, and should work:
import
  winim/lean,
  osproc,
  ptrmath

var
  globalRunning = false
  bitmapInfo: BITMAPINFO
  bitmapMemory: seq[uint32]
  bitmapWidth: int32
  bitmapHeight: int32
  bitmapPointer: pointer

let bytesPerPixel = 4

proc renderWeirdGradient(xOffset, yOffset: int) =
  var pitch = bitmapWidth * bytesPerPixel
  var row = cast[ptr uint8](bitmapPointer)
  for y in 0 ..< bitmapHeight:
    var pixel = cast[ptr uint32](row)
    for x in 0 ..< bitmapWidth:
      let blue = cast[uint8](x + xOffset)
      let green = cast[uint8](y + yOffset)
      pixel[] = uint32(green) shl 8 or blue
      pixel += 1
    row += pitch
and this is what I used for ptrmath (credit to Jehan from this thread https://forum.nim-lang.org/t/1188#7366):
template `+`*[T](p: ptr T, off: int): ptr T =
  cast[ptr type(p[])](cast[ByteAddress](p) +% off * sizeof(p[]))

template `+=`*[T](p: ptr T, off: int) =
  p = p + off

template `-`*[T](p: ptr T, off: int): ptr T =
  cast[ptr type(p[])](cast[ByteAddress](p) -% off * sizeof(p[]))

template `-=`*[T](p: ptr T, off: int) =
  p = p - off

template `[]`*[T](p: ptr T, off: int): T =
  (p + off)[]

template `[]=`*[T](p: ptr T, off: int, val: T) =
  (p + off)[] = val
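Roughly, this is how the seq and bitmapPointer tie together; treat it as an illustrative sketch rather than my exact code (the proc name is made up and the BITMAPINFO bookkeeping is omitted):

# illustrative: let the seq own the pixel storage and expose a raw view of it
proc resizeDIBSection(width, height: int32) =
  bitmapWidth = width
  bitmapHeight = height
  bitmapMemory.setLen(width * height)      # no VirtualAlloc/VirtualFree needed
  if bitmapMemory.len > 0:
    bitmapPointer = addr bitmapMemory[0]   # raw pointer for StretchDIBits and renderWeirdGradient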
Hello, thanks for your reply! Here is my code; it doesn't paint the bitmap blue as it's supposed to. This is my first time writing Nim like this, so I'm not sure my code is clean.
import winim/lean

template `+`*[T](p: ptr T, off: int): ptr T =
  cast[ptr type(p[])](cast[ByteAddress](p) +% off * sizeof(p[]))

template `+=`*[T](p: ptr T, off: int) =
  p = p + off

template `-`*[T](p: ptr T, off: int): ptr T =
  cast[ptr type(p[])](cast[ByteAddress](p) -% off * sizeof(p[]))

template `-=`*[T](p: ptr T, off: int) =
  p = p - off

template `[]`*[T](p: ptr T, off: int): T =
  (p + off)[]

template `[]=`*[T](p: ptr T, off: int, val: T) =
  (p + off)[] = val
#GLOBAL VARIABLE !!!!!!!
var Running {.global.}: bool = false
var bitmapInfo: BITMAPINFO
var bitmapMemory: pointer
var bitmapWidth: int32
var bitmapHeight: int32

proc Win32ResizeDIBSection(width: int32, height: int32) =
  if bitmapMemory != nil:
    VirtualFree(bitmapMemory, 0, MEM_RELEASE)
  bitmapWidth = width
  bitmapHeight = height
  bitmapInfo.bmiHeader.biSize = cast[DWORD](sizeof(bitmapInfo.bmiHeader))
  bitmapInfo.bmiHeader.biWidth = bitmapWidth
  bitmapInfo.bmiHeader.biWidth = -bitmapHeight
  bitmapInfo.bmiHeader.biPlanes = 1
  bitmapInfo.bmiHeader.biBitCount = 32
  bitmapInfo.bmiHeader.biCompression = BI_RGB
  #bitmapInfo.bmiHeader.biSizeImage = 32
  var bytesPerPixel = 4
  var bitmapMemorySize = (bitmapWidth * bitmapHeight) * bytesPerPixel
  bitmapMemory = VirtualAlloc(nil, cast[SIZE_T](bitmapMemorySize), MEM_COMMIT, PAGE_READWRITE)
  var pitch = width * bytesPerPixel
  var row = cast[ptr uint8](bitmapMemory)
  for y in 0 ..< bitmapHeight:
    #var pixel = cast[ptr uint32](row)
    var pixel = cast[ptr uint8](row)
    for x in 0 ..< bitmapWidth:
      var red = cast[uint8](x)
      var green = cast[uint8](y)
      var blue = cast[uint8](0)
      #pixel[] = ( ( ( red shl 16 ) or green shl 8 ) or blue )
      pixel[] = 255
      pixel += 1
      pixel[] = 0
      pixel += 1
      pixel[] = 0
      pixel += 1
      pixel[] = 0
      pixel += 1
    row += pitch
proc Win32UpdateWindow(deviceContext: HDC, windowRect: RECT, x: int, y: int, width: int, height: int) =
  var windowWidth = windowRect.right - windowRect.left
  var windowHeight = windowRect.bottom - windowRect.top
  #echo windowWidth, windowHeight
  StretchDIBits(deviceContext,
                #[x, y, width, height,
                  x, y, width, height,]#
                0, 0, bitmapWidth, bitmapHeight,
                0, 0, windowWidth, windowHeight,
                bitmapMemory,
                bitmapInfo,
                DIB_RGB_COLORS, SRCCOPY)
proc Win32MainWindowCallback(Window: HWND, Message: UINT, WParam: WPARAM, LParam: LPARAM): LRESULT {.stdcall.} =
  var Result: LRESULT = 0
  case Message:
  of WM_SIZE:
    echo "WM_SIZE"
    OutputDebugStringA("WM_SIZE\n")
    var clientRect: RECT
    GetClientRect(Window, addr(clientRect))
    var width = clientRect.right - clientRect.left
    var height = clientRect.bottom - clientRect.top
    Win32ResizeDIBSection(width, height)
  of WM_DESTROY:
    echo "WM_DESTROY"
    #TODO(ibra): fix this global variable
    Running = false
    OutputDebugStringA("WM_DESTROY\n")
  of WM_CLOSE:
    echo "WM_CLOSE"
    #TODO(ibra): fix this global variable
    Running = false
    OutputDebugStringA("WM_CLOSE\n")
  of WM_ACTIVATEAPP:
    echo "WM_ACTIVATEAPP"
    OutputDebugStringA("WM_ACTIVATEAPP\n")
  of WM_PAINT:
    echo "WM_PAINT"
    var paint: PAINTSTRUCT
    var deviceContext: HDC = BeginPaint(Window, paint)
    var x = paint.rcPaint.left
    var y = paint.rcPaint.top
    var width = paint.rcPaint.right - paint.rcPaint.left
    var height = paint.rcPaint.bottom - paint.rcPaint.top
    # var Operation {.global.} = WHITENESS
    # PatBlt(deviceContext, x, y, width, height, Operation)
    # if Operation == WHITENESS:
    #   Operation = BLACKNESS
    # else:
    #   Operation = WHITENESS
    var clientRect: RECT
    GetClientRect(Window, clientRect)
    Win32UpdateWindow(deviceContext, clientRect, x, y, width, height)
    EndPaint(Window, paint)
  else:
    Result = DefWindowProc(Window, Message, WParam, LParam)
  return Result
proc main =
  #MessageBox(0, "Hello, world !", "Nim is Powerful", 0)
  var
    instance = GetModuleHandle(nil)
    windowclass: WNDCLASS
    appname = "NimMadeHero"
    message: MSG
  windowclass.style = CS_OWNDC or CS_HREDRAW or CS_VREDRAW
  windowclass.lpfnWndProc = Win32MainWindowCallback
  windowclass.hInstance = instance
  windowclass.lpszClassName = appname
  if RegisterClass(addr(windowclass)) != 0:
    var windowhandle: HWND = CreateWindowEx(0,
                                            windowclass.lpszClassName,
                                            appname,
                                            WS_OVERLAPPEDWINDOW or WS_VISIBLE,
                                            CW_USEDEFAULT,
                                            CW_USEDEFAULT,
                                            CW_USEDEFAULT,
                                            CW_USEDEFAULT,
                                            0,
                                            0,
                                            instance,
                                            nil)
    #ShowWindow(windowhandle, SW_SHOW)
    #UpdateWindow(windowhandle)
    if windowhandle != 0:
      Running = true
      while Running:
        #echo "loop"
        var messageResult = GetMessage(message, 0, 0, 0)
        if messageResult:
          TranslateMessage(message)
          DispatchMessage(message)
        else:
          #TODO(ibra): LOGGING
          discard
    else:
      #TODO(ibra): LOGGING
      discard

main()
I decided to simplify and go back a bit, just to troubleshoot this part by setting the pixel bytes to a solid color, but it's still not working. Maybe I should try to use a seq like you have, instead of allocating manually? Btw, you could just wrap the parts of the DirectSound library you need and use only those few procs; it's quite a big lib (~2000 LOC). Did you end up continuing Handmade Hero?
Using an UncheckedArray is easier than pointer math for raw buffers:
let bitmapWidth = 5'i32
let bitmapHeight = 6'i32
var bitmapMemory = cast[ptr UncheckedArray[int32]](alloc0(bitmapWidth * bitmapHeight * sizeof(int32)))
for y in 0 ..< bitmapHeight:
  for x in 0 ..< bitmapWidth:
    bitmapMemory[x + y * bitmapWidth] = (x shl 16) or (y shl 8)
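Just remember that memory from alloc0 isn't traced, so free it yourself once you're done with the buffer:

dealloc(bitmapMemory)   # alloc0'd memory isn't managed by the GC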
OK, I obviously don't know what the hell I'm doing.
Any resources for understanding this low-level style of Nim programming?
I was thinking maybe I should just do the low-level stuff in C and then bind that to Nim and make a DSL.
I like http://zevv.nl/nim-memory/ for understanding Nim's memory model. Reading the generated C code can be enlightening as well, and of course the docs/manual. And keep asking questions here, or on IRC/Discord/Reddit/StackOverflow.
I'd recommend staying in Nim for the low-level stuff; your DSL will be more powerful if it can access all aspects of your code.
Thanks shirleyquirk for the encouragement and the link! I played around with the stuff and this is what I got so far.
# object is stored on the stack
# ref is traced
# ptr is manual

proc echoSize[T](x: T) =
  echo $sizeof(x) & " byte(s), " & $(sizeof(x)*8) & "-bits"

proc mutate(x: ptr int) =
  x[] += 1

proc main() =
  var x: int = 19
  echo typeof(x)
  echo "x = " & $x
  echoSize x
  echo " "
  var p = addr(x)
  mutate(p)
  echo typeof(p)
  echo "p = " & $p[]
  echoSize p
  echo p.repr
  var y = cast[ptr UncheckedArray[int]](alloc0(3 * sizeof(int)))
  #echo typeof(y)
  #echo y.addr.repr
  for i in 0 ..< 3:
    y[i] = i*2 + 2
    echo y[i]
  echo y.repr
  var q: ptr int = cast[ptr int](y)
  #q[] *= 2
  echo q[]
  #var loc = cast[int](y)
  #loc += 1
  #y = cast[ptr UncheckedArray[int]](loc)
  q = cast[ptr int](cast[int](q) + sizeof(int))
  echo q[]

main()
Yes, the dangerous stuff is intentionally ugly in Nim; it screams "unsafe".
Bear in mind you've got a whole runtime behind you, ready and willing to do your memory management, so unless it's absolutely necessary (i.e. you're interfacing with C code), avoid using ptr and try value types or ref types instead; you'll have a much nicer time.
If you know the buffer size at compile time, use an array.
If it's a resizable array, just use a seq.
If you truly need manual memory management, make yourself a constructor and at least an '=destroy' proc (there's a minimal sketch below); read over https://nim-lang.org/docs/destructors.html to get an idea of how that works.
Pointer math is considered harmful and shouldn't be necessary unless you're writing an allocator or something. If you find yourself reaching for it, there is almost certainly a cleaner, less error-prone way.
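For illustration, a minimal sketch of what that can look like (the PixelBuffer type and its fields are made up for this example):

type
  PixelBuffer = object                       # made-up example type that owns a raw buffer
    len: int
    data: ptr UncheckedArray[uint32]

proc initPixelBuffer(len: int): PixelBuffer =
  PixelBuffer(len: len,
              data: cast[ptr UncheckedArray[uint32]](alloc0(len * sizeof(uint32))))

proc `=destroy`(b: var PixelBuffer) =
  # runs automatically when b goes out of scope
  if b.data != nil:
    dealloc(b.data)
    b.data = nil
  # you'd also want to forbid or define `=copy`; the destructors doc covers that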
Well, I don't want the runtime at all. Not because it is slow; on the contrary, it is very fast, but manual is faster (check out Araq's FOSDEM 2020 talk where he compares ARC to a memory pool).
That's my main target: allocate a giant pool and subdivide it (which I am still learning about).
I will also need some assembly intrinsics later on for stuff like rdtsc() or SIMD.
I wish it wasn't intentionally ugly.
> I wish it wasn't intentionally ugly.
So write 4 helper procs/templates to make it as sexy as you need it.
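For instance, something like this (purely illustrative):

# purely illustrative sugar over the casts; pick whatever names read best to you
template u8(x: untyped): uint8 = cast[uint8](x)
template u32(x: untyped): uint32 = cast[uint32](x)

# the gradient write then reads:
#   pixel[] = u32(red) shl 16 or u32(green) shl 8 or u32(blue)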
> check out Araq's FOSDEM 2020 talk where he compares ARC to a memory pool
Yes, for a program that does allocations and deallocations and nothing else. Maybe your program does something useful instead...
And code like
var pitch = width * bytesPerPixel
var row = cast[ptr uint8](bitmapMemory)
for y in 0 ..< bitmapHeight:
  #var pixel = cast[ptr uint32](row)
  var pixel = cast[ptr uint8](row)
  for x in 0 ..< bitmapWidth:
    var red = cast[uint8](x)
    var green = cast[uint8](y)
    var blue = cast[uint8](0)
    #pixel[] = ( ( ( red shl 16 ) or green shl 8 ) or blue )
    pixel[] = 255
    pixel += 1
    pixel[] = 0
is not "efficient"; it's what you write when you have no idea that OpenGL/DirectX exist, libraries which try hard to run these snippets on the GPU. It's not efficient for the hardware, it's not efficient to debug, and it's not efficient to write. You'll spend your time debugging subtle crashes and not get anything done.
> That's my main target: allocate a giant pool and subdivide it (which I am still learning about).
Er, if you "subdivide" it, you turn it into a more general-purpose allocator, and by then it's not faster than the other general-purpose allocators... Most likely it's slower, as it isn't an O(1) allocator with proven bounds on fragmentation.
Language runtimes and graphic libraries are not written by idiots who don't understand the superiority of "handmade hero"'s approach to software development. I do understand it, it's called "superstition" and simply doesn't work as well as science.
That looks very cool! Did you make it? I really like the environment!
Yeah, you can actually go very far with the default SDL2 renderer, but once you require shaders you need to add something like GLAD and let SDL2 handle only the windowing and inputs (...). Btw, you can go really far with only a software renderer, but of course you will hit a plateau at some point.
> once you require shaders you need to add something like GLAD and let SDL2 handle only the windowing and inputs (...)
Not necessarily. If you use Vladar4's excellent SDL2 wrapper, you can use SDL GPU to get actual shaders with an SDL-friendly API.
Thanks amalek, that looks interesting indeed, thanks for sharing!
I managed to rewrite the ugly pointer code using a seq on the first try! I love Nim!
proc resizeTextureAuto(renderer: var Renderer, width: int, height: int) =
  if globalTexture != nil:
    destroyTexture(globalTexture)
  if pixelSeq.len != 0:
    pixelSeq.setLen(0)
  globalTexture = createTexture(renderer, PIXELFORMAT_ABGR8888,
                                TEXTUREACCESS_STREAMING,
                                globalWidth, globalHeight)
  for y in 0 ..< globalHeight.int:
    for x in 0 ..< globalWidth.int:
      var red = cast[uint8](x)
      var green = cast[uint8](y)
      var blue = cast[uint8](0)
      var compute = (cast[uint32](red) shl 16) or (cast[uint32](green) shl 8) or (cast[uint32](blue shl 0))
      pixelSeq.add(compute)
How can I make sure I'm getting maximum performance when writing code that uses traced data types? Is there a set of rules/guidelines?
I also noticed that move semantics come into play when I want to move this data around in the codebase.
Thanks!
There is a set of rules, but as Nim's optimizer gets more aggressive, the set of rules becomes more complex.
Compile with --gc:orc and let Nim's optimizer deal with it. To see what the optimizer does, use --expandArc:functionNameHere.
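For example, something like nim c --gc:orc --expandArc:renderWeirdGradient main.nim (the proc and file names here are just placeholders) prints that proc's body after the ARC/ORC transformation, so you can see which copies and destroys got injected.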
But you still don't need "maximum performance". You need simple code that doesn't crash.
A side question: the loop could be written using let instead of var. Should one really care about this at trivial places like this in order to maximize performance, or is the compiler clever enough anyway?
for y in 0 ..< globalHeight.int:
  for x in 0 ..< globalWidth.int:
    let red = cast[uint8](x)
    let green = cast[uint8](y)
    let blue = cast[uint8](0)
    let compute = (cast[uint32](red) shl 16) or (cast[uint32](green) shl 8) or (cast[uint32](blue shl 0))
    pixelSeq.add(compute)
In the new ARC/ORC runtime let vs var makes no difference. In the old runtime let can be faster, but don't worry about it.
What you can worry about is pixelSeq.add in the nested loop, preallocate the pixelSeq via pixelSeq = newSeqOfCap[TypeHere](globalHeight.int * globalWidth.int).
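Something along these lines, assuming pixelSeq is a seq[uint32] as in the snippet above:

pixelSeq = newSeqOfCap[uint32](globalWidth.int * globalHeight.int)
for y in 0 ..< globalHeight.int:
  for x in 0 ..< globalWidth.int:
    # capacity is already reserved, so these adds never reallocate
    pixelSeq.add((cast[uint32](cast[uint8](x)) shl 16) or (cast[uint32](cast[uint8](y)) shl 8))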