Hey guys, I've been tinkering with FigDraw. It's split out of the OpenGL rendering engine in Figuro, which itself was a modification of the one in Fidget.
The main goal is to support using SDFs for drawing the 2D GUI primitives. It still supports the texture atlas and images. It's gone pretty well thanks to @treeform's very handy Shady library.
My next step is to do some benchmarking and memory usage tests. A big limit in Figuro was the memory usage of the rendering engine. Even using 9-patch for rectangles, corners, shadows, etc., the texture atlas could grow pretty large. 9-patches on shadows introduce annoying artifacts as well. So far, SDFs look like they should be much more memory efficient, and fast too.
The 2D render library is designed to run on its own thread with arc/orc/atomicArc. The core Fig render nodes are value objects passed via a few seqs through a ring buffer channel. It supports z-index layering and multiple roots, though that stuff needs documenting.
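For anyone unfamiliar with the idea, here is a minimal CPU-side sketch of a rounded-rectangle SDF. It is not FigDraw's shader code; the `Vec2` type and the `sdRoundedBox` and `coverage` names are made up for illustration, and the formula is the commonly used rounded-box distance with a single corner radius.

```nim
import std/math

type Vec2 = tuple[x, y: float]

proc sdRoundedBox(p, halfSize: Vec2, radius: float): float =
  ## Signed distance from `p` to a rounded box centred at the origin.
  ## Negative inside, zero on the edge, positive outside.
  let
    qx = abs(p.x) - halfSize.x + radius
    qy = abs(p.y) - halfSize.y + radius
    outside = hypot(max(qx, 0.0), max(qy, 0.0))
    inside = min(max(qx, qy), 0.0)
  outside + inside - radius

proc coverage(dist: float, softness = 1.0): float =
  ## Maps a distance to a 0..1 alpha. A larger `softness` widens the falloff,
  ## so the same distance function can give a crisp edge or a soft shadow.
  clamp(0.5 - dist / softness, 0.0, 1.0)

when isMainModule:
  let half: Vec2 = (x: 50.0, y: 20.0)
  echo sdRoundedBox((x: 0.0, y: 0.0), half, 8.0)   # centre of the box
  echo sdRoundedBox((x: 50.0, y: 0.0), half, 8.0)  # on the right edge
  echo coverage(sdRoundedBox((x: 60.0, y: 0.0), half, 8.0), softness = 24.0)
```

The point is that size, corner radius and shadow softness are just parameters to a function evaluated per pixel, so nothing has to be pre-rendered into an atlas when they change.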
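Until that is documented, here is a rough single-threaded sketch of the general shape: batches of value-type nodes moved through a fixed-capacity ring, then drawn in z-index order. The `Fig`, `FrameRing`, `tryPush` and `tryPop` names are hypothetical, not FigDraw's API, and a real cross-thread version would need atomics or a lock around `head` and `count`.

```nim
import std/algorithm

const RingSize = 4

type
  Fig = object              # hypothetical value-type render node
    zIndex: int
    x, y, w, h: float

  FrameRing = object
    slots: array[RingSize, seq[Fig]]
    head, count: int

proc tryPush(r: var FrameRing, frame: sink seq[Fig]): bool =
  ## Producer side: returns false when the ring is full, so the caller can
  ## drop or retry the frame instead of blocking.
  if r.count == RingSize:
    return false
  r.slots[(r.head + r.count) mod RingSize] = frame
  inc r.count
  true

proc tryPop(r: var FrameRing, frame: var seq[Fig]): bool =
  ## Consumer side: moves the oldest frame out of the ring.
  if r.count == 0:
    return false
  frame = move(r.slots[r.head])
  r.head = (r.head + 1) mod RingSize
  dec r.count
  true

when isMainModule:
  var ring: FrameRing
  discard ring.tryPush(@[Fig(zIndex: 2, w: 10.0, h: 10.0),
                         Fig(zIndex: 0, w: 100.0, h: 50.0)])
  var frame: seq[Fig]
  if ring.tryPop(frame):
    # Lower z-index first, so higher layers paint over them.
    for fig in frame.sortedByIt(it.zIndex):
      echo "draw z=", fig.zIndex
```

Because the nodes are plain values in a seq, handing a frame over is a move of the seq rather than a copy of a ref-object tree.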

Made it to version 0.2.0 with some big performance improvements!
The default rendering mode now uses SDFs, with pure texture mode as a backup. The FPS advantage of SDFs grows with the number of fig nodes rendered. I'm now running at 60+ FPS with 6000 elements, where the texture-based approach runs at ~30 FPS and crashes quite a lot. I'm testing on both an AMD embedded GPU and a MacBook Pro M3.
The bigger win however is memory usage! It's essentially flat regardless of the number of fig nodes. That matters most for shadows and corners with changing sizes, where a texture-atlas-based approach has to re-render the shadows and corners constantly. This meant Figuro and Fidget both needed lots of GPU memory to run with fancy styling.
Using SDFs for the main 2D GUI elements leaves the atlas memory for images and text glyphs, and it also looks better (e.g. trying to use 9-patch with shadows plagued me with 1-pixel artifacts).
Here's the FPS for the demo shown below. It's fully dynamic, with almost all of the shadows and node corners dynamically changing sizes. Note that making 6001 fig elements only takes ~2ms!
fps: 66.7533297876409 | elems: 6001 | makeRenderTree avg(ms): 2.044776119402985 | renderFrame avg(ms): 4.014925373134329
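To put those numbers in perspective, here is the arithmetic as a small Nim snippet using the values from the log line above. What fills the rest of the frame interval (vsync, buffer swap, event handling) isn't stated in the log, so that part is left open.

```nim
import std/strformat

const
  fps = 66.7533297876409
  makeRenderTreeMs = 2.044776119402985
  renderFrameMs = 4.014925373134329

let
  frameIntervalMs = 1000.0 / fps                 # time between frames
  cpuMs = makeRenderTreeMs + renderFrameMs       # measured work per frame
  share = 100.0 * cpuMs / frameIntervalMs

echo &"frame interval: {frameIntervalMs:.2f} ms"
echo &"tree + render:  {cpuMs:.2f} ms"
echo &"share of frame: {share:.0f}%"
```

So building the tree and rendering it take about 6 ms of a roughly 15 ms frame interval.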

There's still some work to do. SDFs don't seem to be working with OpenGL ES mode, but that could be machine specific.
Note: It really is fun vibe coding with OpenAI/Codex and Nim. Just ask it about obscure OpenGL stuff and point it in the right direction.
BTW, my goal is to get it working and well tested as a simple and stable back-end for any 2D GUIs in the Nim ecosystem. Maybe adding Metal and Vulkan shaders as well.
I've removed the windowing dependencies so it's just a single renderFrame call now. FigDraw currently creates and manages the OpenGL context, though. That could probably change. Also I had to drop Shady as it broke on my MacBook.
I also want to add a C API to FigDraw so other languages can use it. Honestly I'm a bit underwhelmed by the 2D GUI renderers out there, aside from Rive, which is complicated to compile and to use but can handle real vector graphics.
> I also want to add a C API to FigDraw so other languages can use it.
Meh, C programmers don't use non-C-stuff anyway, what's the point.
Why not? It’d be like 30 lines of code. Probably won’t get any C devs, but hey it’d be proof of what’s possible in a small GPU based 2D GUI renderer. It’d make it easy to make a small dynlib too.
Who knows, maybe it’d get included in some benchmark somewhere.

Similar to before but prints the current FPS in the corner (this is with -d:debug):

Any other core features to add?
I'm unsure about doing SDF fonts. Looks like Valve used 64x64 textures for their fonts, however 2D UIs don't go from 12pt to 48pt fonts dynamically. That said, I'm unsure it'd be more memory efficient if the fonts are statically sized most of the time.
Though it'd be interesting to convert arbitrary vector scenes to Valve-style SDF distance masks that could then be rendered quickly at many sizes. That'd let it compete with ThorVG, Skia, etc. Set up Pixie to handle the vector APIs, then rasterize the result and store it as a precomputed signed distance field. Tempting...
@ASVI could you try out FigDraw v0.21.2?
I added a couple of kinds of sub-pixel font rendering techniques (on OpenGL/Vulkan), mainly for Linux/Windows. It's hard to see much benefit on my monitor on Wayland as it's still a 4k monitor and my eyes aren't as young as they used to be. ;)
Siwin Wayland now supports fractional scaling (PR23)! Lack of fractional scaling was a large source of font fuzziness on Wayland in my case as the compositor would re-scale the app at a pixel level.
- `FIGDRAW_TEXT_LCD_FILTERING=1`
- `FIGDRAW_TEXT_SUBPIXEL_POSITIONING=1`
- `FIGDRAW_TEXT_SUBPIXEL_GLYPH_VARIANTS=1`
Running examples/siwin_text.nim or examples/windy_text.nim lets you toggle the different options at runtime using some keys.
I think this should be closer to a proper lcd implementation:
```nim
proc applyLcdFilter*(image: var Image) =
  ## Applies FreeType's default 5-tap LCD filter horizontally.
  # It works, but to get a real LCD filter Pixie needs an LCD render mode that
  # doesn't use the alpha channel, or it has to make the image 3x the size and
  # downscale it (but that kills performance).
  # The real purpose of the LCD filter is to prevent color fringing after
  # correct rasterization.
  if image.width <= 0 or image.height <= 0:
    return
  let src = image.data
  var filtered = newSeq[type(src[0])](src.len)
  let maxX = image.width - 1
  for y in 0 ..< image.height:
    let rowStart = y * image.width
    for x in 0 ..< image.width:
      var sumR, sumG, sumB: int32
      # We actually need all three pixels, since that gives 9 subpixels
      # (of which 5 are enough for each channel).
      let
        oldPixel = src[rowStart + max(x - 1, 0)] # maybe incorrect for image edges
        currPixel = src[rowStart + x]
        nextPixel = src[rowStart + min(x + 1, maxX)]
      let sp = [
        oldPixel.r.int32, oldPixel.g.int32, oldPixel.b.int32,
        currPixel.r.int32, currPixel.g.int32, currPixel.b.int32,
        nextPixel.r.int32, nextPixel.g.int32, nextPixel.b.int32
      ]
      # We need to sum subpixels, not whole pixels. Each channel's 5-tap
      # window is centred on its own subpixel in `sp` (R = 3, G = 4, B = 5).
      # `lcdFilterWeights` is defined elsewhere; the `shr 8` below expects
      # the five weights to sum to 256.
      for i, weight in lcdFilterWeights:
        sumR += sp[i + 1] * weight
        sumG += sp[i + 2] * weight
        sumB += sp[i + 3] * weight
      let idx = rowStart + x
      filtered[idx] = src[idx]
      filtered[idx].r = uint8((sumR + 128'i32) shr 8)
      filtered[idx].g = uint8((sumG + 128'i32) shr 8)
      filtered[idx].b = uint8((sumB + 128'i32) shr 8)
      # assert src[idx].a == 255
      filtered[idx].a = 255 # there's no real concept of alpha in LCD rendering
  image.data = move(filtered)
```

The text started looking crisp right after fractional scaling support was added. .. Overall, the text looks good now.
Excellent! Yeah fractional scaling really helps.
The only annoying thing at the moment is that window resizing lags. I’ve checked, and the main bottleneck is generateGlyphImages, which takes up 70-80% of the render time (2900 ms). This happens as the atlas size grows; once it reaches 8192 or more, rasterization becomes very slow.
Thanks for checking into that. Definitely some optimizations to look into, especially for Vulkan.
Note though that both windy_text and siwin_text adjust the UI scale based on the window width. That triggers regenerating fonts at every point size!
I was using it to stress test the fonts. Normal applications don't do that, and shouldn't fill up the atlas unless they use lots of font sizes. For example running Neonim over many days remains very stable memory wise but only uses one font size.
I'm not sure if the texture atlas filling is incremental (like it is in Flutter Impeller, Skia, etc.) - it feels as if it's trying to process the entire massive texture instead of just a small portion.
Font glyphs are generated individually and uploaded incrementally. Though Vulkan is a beast and it could be doing something weird.
Besides incrementality, MSDF rendering might help: a single small texture atlas is generated on the CPU from the font and then scaled via a shader.
I did try an MSDF font rendering path and didn't like it. It had annoying artifacts. It was very slow compared to generating glyphs with Pixie. Likely due to not having SIMD and/or the cost of the SDF field over a 64x64 glyph. It was a bit of a letdown. It seemed expensive on the GPU as well.
My thinking currently is to make the atlas font handling smarter. For example tracking and clearing out unused font glyph sizes at the end of each rendering pass. Though that might require a bigger atlas redesign.
Any ideas on that?
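One possible shape for that tracking, as a hedged sketch: record when each (font, size) pair was last drawn, and at the end of a frame report the pairs that have been idle too long so the atlas can free their glyphs. All names here (`SizeKey`, `FontSizeTracker`, `touch`, `endFrame`) and the idle threshold are made up for illustration; nothing below is FigDraw's current code.

```nim
import std/[tables, hashes]

type
  SizeKey = tuple[fontId: int, sizeQ: int]  # size in 1/64 pt so it hashes exactly

  FontSizeTracker = object
    frame: int
    lastUsed: Table[SizeKey, int]           # frame number of the last draw

proc touch(t: var FontSizeTracker, key: SizeKey) =
  ## Call whenever text is drawn with this font at this size.
  t.lastUsed[key] = t.frame

proc endFrame(t: var FontSizeTracker, maxIdleFrames = 120): seq[SizeKey] =
  ## Returns the sizes not drawn for more than `maxIdleFrames` frames and
  ## forgets them, so the caller can release their glyphs from the atlas.
  for key, last in t.lastUsed:
    if t.frame - last > maxIdleFrames:
      result.add key
  for key in result:
    t.lastUsed.del key
  inc t.frame

when isMainModule:
  var tracker: FontSizeTracker
  tracker.touch((fontId: 0, sizeQ: 12 * 64))
  for _ in 0 ..< 200:
    tracker.touch((fontId: 0, sizeQ: 14 * 64))  # only 14pt stays in use
    for stale in tracker.endFrame():
      echo "can free glyphs for ", stale
```

A grace period instead of evicting on the very next frame avoids thrashing when a size flickers in and out, but the freed atlas space still has to be reusable, which is where the bigger atlas redesign would come in.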
> Font glyphs are generated individually and uploaded incrementally. Though Vulkan is a beast and it could be doing something weird.
This is actually quite strange: latencies on Vulkan were quite low (around 7-8 ms). The main load was in generateGlyphImages for the large atlas file.
> My thinking currently is to make the atlas font handling smarter. For example tracking and clearing out unused font glyph sizes at the end of each rendering pass. Though that might require a bigger atlas redesign.

Maybe I found the problem: the grow() function in vulkan_context.nim does `ctx.entries.clear()` and `ctx.atlasPixels = newImage(...)`, which means the glyphs get re-rendered.
Also, vkCmdCopyBufferToImage uses the full atlasSize instead of copying only the changed glyphs. Even the first copy should only cover the rect of the generated glyphs, which is smaller than the atlas.
So main requirements for font atlas:
> I did try an MSDF font rendering path and didn't like it. It had annoying artifacts. It was very slow compared to generating glyphs with Pixie. Likely due to not having SIMD and/or the cost of the SDF field over a 64x64 glyph. It was a bit of a letdown. It seemed expensive on the GPU as well.

> This is actually quite strange: latencies on Vulkan were quite low (around 7-8 ms). The main load was in generateGlyphImages for the large atlas file.

Ah that makes sense. It iterates over all the text and generates glyphs per Rune.
Though 2800 ms seems really long. Are you running with -d:release? Pixie render path can run fairly slow in debug mode.
> Maybe I found the problem: the grow() function in vulkan_context.nim does `ctx.entries.clear()` and `ctx.atlasPixels = newImage(...)`, which means the glyphs get re-rendered. UPD: it seems to have some caching, but it simply doesn't work. Anyway, it's harder than pages.

Hmmm, looks like there's a recreateAtlasGpu after that which copies the image? Though yeah, no idea if it's broken. Atlas resizing should be pretty rare with SDF GUI elements, so I haven't tested it as much.
Still Vulkan runs ~10-15% slower than OpenGL on my Wayland/FreeBSD machine. So any ideas welcome!
Moving to lifetime objects to handle buffers might help too, at least by cleaning up the code. Moving to memory pools helped the Metal backend performance a lot.

> Also, vkCmdCopyBufferToImage uses the full atlasSize instead of copying only the changed glyphs. Even the first copy should only cover the rect of the generated glyphs, which is smaller than the atlas.

If I understand correctly, it's not just copying the sub-image like it should be?
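To make the difference concrete, here is a small sketch that accumulates a dirty rectangle over the glyphs added since the last upload and compares its size to a full-atlas copy. The `Rect`, `DirtyRegion`, `mark` and `byteSize` names are hypothetical, and 4 bytes per pixel is an assumption about the atlas format. In Vulkan the resulting rect would go into the `imageOffset` and `imageExtent` of the `VkBufferImageCopy` region passed to vkCmdCopyBufferToImage.

```nim
type
  Rect = object
    x, y, w, h: int

  DirtyRegion = object
    valid: bool   # false until the first glyph is marked
    r: Rect

proc mark(d: var DirtyRegion, g: Rect) =
  ## Grows the dirty region to the bounding box of everything marked so far.
  if not d.valid:
    d = DirtyRegion(valid: true, r: g)
  else:
    let
      x0 = min(d.r.x, g.x)
      y0 = min(d.r.y, g.y)
      x1 = max(d.r.x + d.r.w, g.x + g.w)
      y1 = max(d.r.y + d.r.h, g.y + g.h)
    d.r = Rect(x: x0, y: y0, w: x1 - x0, h: y1 - y0)

proc byteSize(r: Rect, bytesPerPixel = 4): int =
  ## Bytes to upload for `r`, assuming a tightly packed 4-byte pixel format.
  r.w * r.h * bytesPerPixel

when isMainModule:
  var dirty: DirtyRegion
  dirty.mark Rect(x: 0, y: 0, w: 64, h: 64)    # two new 64x64 glyphs
  dirty.mark Rect(x: 64, y: 0, w: 64, h: 64)
  echo "dirty upload: ", dirty.r.byteSize, " bytes"
  echo "full atlas:   ", Rect(w: 8192, h: 8192).byteSize, " bytes"
```

Under those assumptions two new 64x64 glyphs are a 32 KiB upload, while an 8192x8192 atlas is 256 MiB per copy.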
> Use pages (i.e. a seq of font texture atlases, or texture arrays). It's better than realloc, since realloc is hard on the GPU.

I was thinking more of modifying an image hash table. Though pages sound like a better solution.
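A rough sketch of the pages idea, assuming a simple shelf (row) packer per page: when the last page is full, a new page is appended and the existing pages are never resized or re-rendered. `PagedAtlas`, `Page`, `Placement`, `tryPlace` and `place` are hypothetical names; on the GPU each page could map to one texture or one layer of a texture array.

```nim
type
  Placement = object
    page, x, y: int

  Page = object
    penX, penY, rowH: int   # shelf packer state for one page

  PagedAtlas = object
    pageSize: int
    pages: seq[Page]

proc tryPlace(p: var Page, size, w, h: int, pos: var Placement): bool =
  ## Tries to place a w*h glyph on this page; commits only on success.
  var next = p
  if next.penX + w > size:          # row is full: start a new shelf
    next.penY += next.rowH
    next.penX = 0
    next.rowH = 0
  if w > size or next.penY + h > size:
    return false
  pos.x = next.penX
  pos.y = next.penY
  next.penX += w
  next.rowH = max(next.rowH, h)
  p = next
  true

proc place(a: var PagedAtlas, w, h: int): Placement =
  ## Places a glyph, appending a fresh page when the last one is full.
  doAssert w <= a.pageSize and h <= a.pageSize
  if a.pages.len == 0 or not a.pages[^1].tryPlace(a.pageSize, w, h, result):
    a.pages.add Page()
    discard a.pages[^1].tryPlace(a.pageSize, w, h, result)
  result.page = a.pages.high

when isMainModule:
  var atlas = PagedAtlas(pageSize: 128)
  for i in 0 ..< 5:
    echo atlas.place(64, 64)
```

With a 128-pixel page, the first four 64x64 glyphs fill page 0 and the fifth lands on a new page 1, without touching the glyphs already placed.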
> I also tried rendering fonts through MSDF, and I got these artifacts:

It's annoying. There seem to be issues with rendering MSDF and adjusting for the intensity range. That's probably the cause of the artifacts, due to how the RGB layers overlap.