nimforum mirror - How to find bottlenecks?

severak [2022-01-28T13:09:34+01:00]

Hi, I am writing an OpenStreetMap renderer in Nim. Basically, it's a server that takes vector tiles from an MBTiles file and renders them (on the fly) to PNG images using the pixie library.

It works well for a project that is just a week old, but I ran into performance issues. Some part of it seems to be really slow, but I don't know how to figure out which one. Is there a way to measure how much time each part of the program takes?

Also, as this is only my second serious project in Nim, I probably messed up some parts of the tile decoder. If anybody is willing to look at my code (https://github.com/severak/lunarender3/blob/main/src/tile.nim) and spot some obvious mistakes, any help would be appreciated.

The source code is here, if anybody is interested: https://github.com/severak/lunarender3

ynfle [2022-01-28T13:28:56+01:00]

Try https://github.com/treeform/hottie

Yardanico [2022-01-28T13:31:36+01:00]

You can also just use any generic C/C++ profiler if you compile with --debugger:native

auxym [2022-01-28T13:57:31+01:00]

Does hottie work on Linux? I read the first line of the README as implying that it only works on Windows.

ynfle [2022-01-28T14:05:16+01:00]

It works on Linux; you need clang and have to pass additional parameters.

Nycto [2022-01-28T15:40:12+01:00]

This guide gives you a few options:

https://nim-lang.org/blog/2017/10/02/documenting-profiling-and-debugging-nim-code.html

treeform [2022-01-28T17:08:08+01:00]

As others mentioned, you can try my profiler https://github.com/treeform/hottie . You can also make benchmark cases with https://github.com/treeform/benchy .

You are using our libs pixie and zippy, thank you! They should be pretty fast, as they beat cairo and zlib in many of our tests.

Some things I noticed going through your code:

Make sure to compile with --d:release or even --d:danger for non-debug builds. Using --gc:orc might help as well.

 font = readFont("Gidole-Regular.ttf")

You seem to load fonts a lot. I recommend loading the font once. Fonts have a character cache that should really help.

You seem to be opening your SQLite db on every request (result = db_sqlite.open(fileName, "", "", "")). Just like with fonts, I would open it once and reuse it.

In your tile code you are parsing HTML colors over and over:

(debugLayerColor[feat.layer])

You could call parseHtmlColor once during load, so that debugLayerColor already holds the Color type you need.

Your imgSize = 1024 tiles appear to be huge; usually they are more like 128 or 256?

I thought you might be reading the whole map every time to generate a small part of it, in which case I would suggest something like https://github.com/treeform/spacy. But it looks like your SQLite db already has the tiles organized:
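A minimal sketch of the parse-once idea, not taken from lunarender. To stay standard-library only it uses a small hand-written parseHexColor as a stand-in for a real color parser, and the layer names, color values and the Rgb type are all made up for illustration:

```nim
import std/[strutils, tables]

type Rgb = tuple[r, g, b: uint8]

# Stand-in for a real color parser; only understands "#rrggbb".
proc parseHexColor(s: string): Rgb =
  doAssert s.len == 7 and s[0] == '#', "expected #rrggbb"
  result.r = uint8(parseHexInt(s[1..2]))
  result.g = uint8(parseHexInt(s[3..4]))
  result.b = uint8(parseHexInt(s[5..6]))

# Hypothetical layer-to-color mapping kept as text (the form that
# forces a parse on every feature if it is used directly).
const debugLayerColorText = {
  "water": "#0000ff",
  "landuse": "#00ff00",
  "roads": "#ff0000"
}

# Parse every color once at startup and keep the parsed values.
proc buildColorTable(): Table[string, Rgb] =
  for (layer, text) in debugLayerColorText:
    result[layer] = parseHexColor(text)

let debugLayerColor = buildColorTable()

# In the render loop this is now a plain table lookup, no parsing.
echo debugLayerColor["water"]
```

Whether this matters in practice depends on how many features a tile has, so it is worth measuring before and after.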

"SELECT tile_data FROM tiles WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?"

Probably good?

You could cache the tiles after you generate them, but maybe something like a CloudFlare cache is better, and just keep your server simple.

I would recommend timing with benchy how long it takes to generate a tile with font and SQLite loading included, and with font and SQLite already loaded beforehand. As you make small improvements, you should see the time going down.

I hope that helps?

Minor nitpick: you are using 4-space indents like Python; Nim's standard is 2. You are also mixing camelCase, snake_case and even camel_Snake_Case; you should probably stick to Nim's standard camelCase.

I was working on an "open sea map" UI viewer. It could be cool to not only have the viewer but also generate the tiles with Nim. Right now I am just using their PNG tiles. https://map.openseamap.org/ is an offshoot of https://www.openstreetmap.org/ . Maybe we could collaborate?
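One way such a comparison could be set up, sketched with only the standard library (std/monotimes) rather than benchy itself. The avgMs helper is a simplified stand-in for a benchmark runner, and loadResource and render are hypothetical placeholders for "read the font / open the db" and "draw a tile":

```nim
import std/[monotimes, times]

# Run `work` `runs` times and return the average duration in milliseconds.
proc avgMs(runs: int, work: proc ()): float =
  let start = getMonoTime()
  for _ in 1 .. runs:
    work()
  (getMonoTime() - start).inNanoseconds.float / 1e6 / runs.float

# Hypothetical stand-in for expensive setup (font, database, ...).
proc loadResource(): seq[int] =
  result = newSeq[int](200_000)
  for i in 0 ..< result.len:
    result[i] = i

# Hypothetical stand-in for the per-request work.
proc render(res: seq[int]): int =
  for i in countup(0, res.high, 1000):
    result += res[i]

var sink = 0                   # keeps results in use
let shared = loadResource()    # loaded once, before any request

proc loadEveryTime() =
  sink += render(loadResource())

proc usePreloaded() =
  sink += render(shared)

let perRequest = avgMs(50, loadEveryTime)
let preloaded = avgMs(50, usePreloaded)
echo "load on every request: ", perRequest, " ms"
echo "loaded beforehand:     ", preloaded, " ms"
```

The absolute numbers mean little here; the point is to time both variants the same way and watch the gap between them as the code changes.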

severak [2022-01-29T13:38:08+01:00]

Thanks for looking into it.

It looks like benchy is the thing I was looking for.

> You seem to be loading your sqlite db on every request

I was not sure whether it was safe to share an SQLite instance between server processes. But I will try it (it's read-only after all).

> In your tile code you are parsing html color over and over

I did not expect this to take much time.

> Your imgSize = 1024 tiles appear to be huge, usually they are like 128 or 256?

For web maps, the standard tile sizes are 256 and 512; lunarender will support both of them. But internally, vector tile objects have coords in 0-4096, so I use 1024 to spot bugs more easily while debugging. Also, for zoom levels above 14, tiles are actually derived (by "zooming in") from those at zoom 14. This will also be implemented later.
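The "zooming in" step can be sketched as plain tile arithmetic. This is not lunarender's code: it assumes XYZ-style tile addressing (x to the right, y downward), and sourceWindow, SourceWindow and maxDataZoom are names made up for illustration. Only the zoom 14 limit and the 0-4096 coordinate range come from the post:

```nim
const
  maxDataZoom = 14   # deepest zoom stored in the tileset
  extent = 4096      # coordinate range inside one vector tile

type SourceWindow = object
  z, x, y: int            # stored tile to read
  originX, originY: int   # top-left corner of the window, in tile coords
  size: int               # side length of the window, in tile coords

# For a requested tile, find the stored ancestor tile and the square
# part of it that covers the requested area.
proc sourceWindow(z, x, y: int): SourceWindow =
  let d = max(0, z - maxDataZoom)   # how many levels past the data
  let size = extent shr d           # each level halves the window
  let mask = (1 shl d) - 1          # position inside the ancestor
  SourceWindow(
    z: z - d, x: x shr d, y: y shr d,
    originX: (x and mask) * size,
    originY: (y and mask) * size,
    size: size)

echo sourceWindow(16, 5, 6)
```

For a request at zoom 16 the window is a quarter of the stored tile's width, so its contents would be scaled up by 4 when drawn.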

> But it looks like you sqllite db has tiles tiles organized

Yes. MBTiles files come with the data already preprocessed and split into vector tiles.

Caching of the resulting tiles is definitely planned and will be supported in lunarender itself (and I plan to have a Caddy server proxy in front of it on the demo server).

> I am was working on an "open sea map" UI viewer it could be cool to not only have the viewer but also the generate the tiles with Nim.

This is definitely possible. Once you are able to process the data into vector tiles (using tilemaker or similar software), it will be possible to render it with lunarender. Styles will be customizable with the Lua language, so it should be flexible. I am interested in the open sea viewer and a possible future collaboration.

severak [2022-02-08T20:37:47+01:00]

Finally I got some time to add a benchmark. It turns out that when compiled with --d:release (without any other changes), decoding is roughly 18 times faster (going by the average times) and fast enough to render the test map for a browser. I wonder what those slow things in debug mode are and whether I am going to miss some of them (I am expecting some crashes).

Compiled with debug mode:


name ............................... min time      avg time    std dv   runs
opening tileset .................... 0.075 ms      0.080 ms    ±0.007  x1000
getting tile from set .............. 0.065 ms      0.068 ms    ±0.008  x1000
decoding tile .................... 994.636 ms   1041.867 ms   ±65.151     x5
draw test image .................. 103.480 ms    109.454 ms    ±7.046    x44

Release mode:


name ............................... min time      avg time    std dv   runs
opening tileset .................... 0.075 ms      0.078 ms    ±0.004  x1000
getting tile from set .............. 0.058 ms      0.062 ms    ±0.007  x1000
decoding tile ..................... 53.434 ms     58.503 ms    ±5.444    x84
draw test image ................... 14.997 ms     16.127 ms    ±1.264   x303

I need to fix some bugs, and there is definitely some room for faster code too, but this speedup is enough for today. Thanks for the suggestions.

severak [2022-02-10T10:22:09+01:00]

it starts looking usable
