Documenting a small discussion we had in Nim's IRC channel. I asked what the correct way of getting a pair of random, non-equal integers would be, other than calling rand once and then calling it again in a while a==b loop, and got a few answers.
In the spirit of unapologetic bikeshedding I wrote some simple scaffolding to test the results of different routines and output them in a gnuplot-friendly format. I'm still casually exploring Nim, and in my experience small practical tasks are the most efficient way to learn. Any thoughts and suggestions are greatly appreciated.
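For reference, the loop approach mentioned above, next to one common alternative, could be sketched roughly like this. The proc names are made up for illustration, and the "shift" variant is not necessarily one of the answers given on IRC; it is just a well-known way to avoid the retry loop.

```nim
import std/random

proc pairByRejection(r: var Rand; lo, hi: int): (int, int) =
  ## Draw `a`, then redraw `b` until it differs from `a`.
  doAssert lo < hi, "need at least two distinct values"
  let a = r.rand(lo..hi)
  var b = r.rand(lo..hi)
  while b == a:
    b = r.rand(lo..hi)
  (a, b)

proc pairByShift(r: var Rand; lo, hi: int): (int, int) =
  ## Draw `b` from a range that is one element shorter, then step over `a`.
  ## Every value other than `a` is equally likely and no retry is needed.
  doAssert lo < hi, "need at least two distinct values"
  let a = r.rand(lo..hi)
  var b = r.rand(lo..(hi - 1))
  if b >= a:
    inc b
  (a, b)

when isMainModule:
  var r = initRand(42)  # fixed seed so runs are repeatable
  echo pairByRejection(r, 0, 9)
  echo pairByShift(r, 0, 9)
```

Both procs should produce uniformly distributed ordered pairs; the second one simply trades the loop for a single comparison.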
The code: https://play.nim-lang.org/#ix=2EmY
Plot gallery: https://imgur.com/a/kHYWPw9
These are the resulting heatmaps. Colors are not consistent between images (pay attention to the temperature scale on the right). The only outlier here is the second image, which represents the results of a "coin toss" strategy: get a number N, then flip a coin to choose whether the second number comes from low..<N or N+1..high.
I saw this earlier today and decided to replace the gnuplot plotting with ggplotnim. I had to fix a small issue with the reversal of discrete axes, but with that done, here's a solution that directly creates the grid of plots shown in the OP in one ggplot call:
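To see why the coin toss stands out, here is a minimal sketch of how I read that strategy, together with a tally printed as a plain matrix. This is not the code from the playground link; the proc name, the seed, the number of draws and the handling of the two edge values (where only one side exists, so the coin is skipped) are all assumptions.

```nim
import std/random

proc pairByCoinToss(r: var Rand; lo, hi: int): (int, int) =
  ## Pick `a`, choose a side of `a` with a fair coin, then draw `b`
  ## uniformly from that side.
  doAssert lo < hi, "need at least two distinct values"
  let a = r.rand(lo..hi)
  # Assumption: at the edges only one side is non-empty, so no coin is flipped.
  let takeLower =
    if a == lo: false
    elif a == hi: true
    else: r.rand(1) == 0
  let b =
    if takeLower: r.rand(lo..(a - 1))
    else: r.rand((a + 1)..hi)
  (a, b)

when isMainModule:
  const n = 10
  const draws = 900_000
  var counts: array[n, array[n, int]]  # counts[a][b], zero-initialized
  var r = initRand(2020)
  for _ in 1..draws:
    let (a, b) = pairByCoinToss(r, 0, n - 1)
    inc counts[a][b]
  # One text row per first number, space-separated, suitable for a matrix plot.
  for row in counts:
    var line = ""
    for c in row:
      line.add($c)
      line.add(' ')
    echo line
```

The coin picks each side with probability 1/2 regardless of how many values that side holds, so pairs on a short side are over-represented. For 0..9, the pair (1, 0) has probability 1/10 * 1/2 = 0.05, while a uniform scheme gives each ordered pair 1/90, roughly 0.011.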
https://gist.github.com/Vindaar/1a16594fe6f2830faa17d6f6d770de4f
which creates the following plot:
https://user-images.githubusercontent.com/7742232/99320865-d892a400-286c-11eb-8f1a-4fb65bf68924.png
(I did not embed it, because it's saved as 1600x1200)
This relies on the changes in this PR: https://github.com/Vindaar/ggplotnim/pull/98. Without them, scale_y_reverse is broken and one cannot set the range of the color scale using scale_fill_continuous.
They already do share the exact same scale, ranging from 900 to 1100. Hence why most are ~yellow.
Or do you mean the opposite, that each should have an independent color scale?
What threw me off is that we have values < 900 and they got clamped to the lowest temperature of the scale.
Which points me to an idea of a feature: separate colour for overshoots and undershoots.
Also, the default temp gradient has lower contrast and discernibility in the lower part than the gnuplot one. It's much harder to see the diff in colour in 900-950 range than in 1000-1050. There probably needs to be a point with more saturated navy somewhere.
What threw me off is that we have values < 900 and they got clamped to the lowest temperature of the scale.
Of course they get clamped to the lowest / highest end of the scale if the scale covers less than the full data range. Clamping the color scale to a fixed range means: "I only care about change in the given range. Data outside is no different from data at the edges". It's literally the same as restricting your x or y axis to a subrange of the data, where (by default) outliers are plotted at the edge they are clamped to, to tell the user "hey, there's more here that's not properly represented".
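As a rough illustration of what clamping to a fixed range does to the values (a sketch only, not ggplotnim's actual implementation; toUnit is a made-up name):

```nim
proc toUnit(v, lo, hi: float): float =
  ## Map `v` to a position in 0.0..1.0 on a color scale fixed to `lo..hi`.
  ## Anything outside the range lands on the nearest end of the scale.
  (clamp(v, lo, hi) - lo) / (hi - lo)

when isMainModule:
  for v in [850.0, 900.0, 1000.0, 1100.0, 1180.0]:
    echo v, " -> ", toUnit(v, 900.0, 1100.0)
```

With a 900 to 1100 scale, 850 and 900 both end up at position 0.0, so they receive the same color, which is exactly the effect described above.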
Which points me to an idea of a feature: separate colour for overshoots and undershoots.
Yes, being able to give outliers specific colors (e.g. white for outliers above and black for those below) would be useful. It's not hard to implement, but it doesn't exist in ggplotnim for now.
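Such a feature might look roughly like the following. To be clear, this is a hypothetical sketch and not a ggplotnim API: the Rgb type, the colorFor proc and the two-stop gradient standing in for a real color map are all invented for illustration.

```nim
type Rgb = tuple[r, g, b: float]

const
  underColor: Rgb = (0.0, 0.0, 0.0)  # black for values below the scale
  overColor: Rgb = (1.0, 1.0, 1.0)   # white for values above the scale
  lowColor: Rgb = (0.0, 0.0, 0.5)    # placeholder start of the gradient
  highColor: Rgb = (1.0, 1.0, 0.0)   # placeholder end of the gradient

proc lerp(a, b: Rgb; t: float): Rgb =
  ## Linear blend between two colors, with `t` in 0.0..1.0.
  (a.r + (b.r - a.r) * t, a.g + (b.g - a.g) * t, a.b + (b.b - a.b) * t)

proc colorFor(v, lo, hi: float): Rgb =
  ## Outliers get their own colors instead of being clamped to the ends.
  if v < lo: underColor
  elif v > hi: overColor
  else: lerp(lowColor, highColor, (v - lo) / (hi - lo))

when isMainModule:
  for v in [850.0, 900.0, 1000.0, 1100.0, 1180.0]:
    echo v, " -> ", colorFor(v, 900.0, 1100.0)
```

The only difference from plain clamping is the two early branches, so values such as 850 become visually distinct from values sitting exactly on the edge of the scale.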
Also, the default temp gradient has lower contrast and discernibility in the lower part than the gnuplot one. It's much harder to see the diff in colour in 900-950 range than in 1000-1050. There probably needs to be a point with more saturated navy somewhere.
That is not a bug, but a feature of the color scale! The scale you see is called Viridis and there's a lot more thought put into it than you might think:
https://www.youtube.com/watch?v=xAoljeRJ3lU
Gnuplot's color scale having a pretty stark contrast at the very low end from black to purple means it's presenting you information that is not there. Your brain sees the contrast and thinks "oh, this is interesting" when in reality there's the same amount of change as on a different part of the color scale covering the same interval. E.g. take the gnuplot plot in the top right corner. The scale there from 0 - 200 changes extremely noticeably. In comparison the range from 600 - 800 is barely visible.
There's two cases where you might not want to treat two ranges of the same width as equal.
A color scale is simply something that maps a third dimension (i.e. the z axis) to something that can be represented in 2D. Thus it is important that it does not fool you. You don't want your x axis to look like "0, 2, 8, 9, 10" either, I imagine.
Hope this was informative. I simply often have to deal with people who abuse color scales for things they are not designed for (or just use bad color scales and wonder why there's a "feature" in their data...).