nimforum mirror - Getting random non-equal int pairs, comparison of routines.

Zoom [2020-11-16T15:14:28+01:00]

Documenting a small discussion we had in Nim's IRC channel. I asked what the correct way of getting a pair of random, non-equal integers would be, besides calling rand once and then drawing the second number in a while a==b loop, and got a few answers.
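For reference, the baseline mentioned above (one draw, then redrawing the second number while a==b) might look like the sketch below. It is written in Python rather than Nim purely for illustration and is not the code from the playground link; `pair_by_rejection` and its parameters are hypothetical names.

```python
import random


def pair_by_rejection(low, high, rng=random):
    """Return (a, b) with a != b, both drawn from the inclusive range low..high.

    Hypothetical helper: draws a once, then redraws b until it differs from a.
    """
    if high <= low:
        raise ValueError("the range must contain at least two distinct values")
    a = rng.randint(low, high)
    b = rng.randint(low, high)
    while b == a:  # reject draws that collide with the first number
        b = rng.randint(low, high)
    return a, b
```

The loop is unbiased, but the number of draws it needs is not fixed, which is what the other routines try to avoid.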

In the spirit of unapologetic bikeshedding I wrote a simple scaffolding to test the results of different routines and output them in a gnuplot-friendly format. I'm still casually exploring Nim, and in my experience small practical tasks are the most efficient way to learn. Any thoughts and suggestions are greatly appreciated.

The code: https://play.nim-lang.org/#ix=2EmY

Plot gallery: https://imgur.com/a/kHYWPw9

The gallery shows the resulting heatmaps. Colors are not consistent between images (pay attention to the temperature scale on the right). The only outlier here is the second image, which represents the results of a "coin toss" strategy: get a number N, then flip a coin to choose whether the second number comes from low..<N or N+1..high.
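One way to see why the coin toss stands out is to compute the exact probability of every pair. The Python sketch below does that with fractions. It is an illustration only, not the code from the playground link, and it assumes that when N sits at an edge of the range (so one side is empty) the other side is always used; `coin_toss_pair_probabilities` is a hypothetical name.

```python
from fractions import Fraction


def coin_toss_pair_probabilities(low, high):
    """Exact probability of each ordered pair (a, b) under the coin-toss strategy.

    Assumption: if one side of a is empty, the non-empty side is always chosen.
    """
    n = high - low + 1
    probs = {}
    for a in range(low, high + 1):
        below = range(low, a)            # low..<a
        above = range(a + 1, high + 1)   # a+1..high
        sides = [s for s in (below, above) if len(s) > 0]
        for side in sides:
            # pick a, then pick this side, then pick b uniformly within the side
            p = Fraction(1, n) * Fraction(1, len(sides)) * Fraction(1, len(side))
            for b in side:
                probs[(a, b)] = p
    return probs


probs = coin_toss_pair_probabilities(0, 4)
print(probs[(1, 0)], probs[(1, 4)])  # prints "1/10 1/30"
```

With five values a uniform routine would give every ordered pair probability 1/20. Here the pair on the short side of N is three times as likely as a pair on the long side, because the coin gives both sides equal weight regardless of their size.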

treeform [2020-11-16T19:48:10+01:00]

Wow, Nikki's max substitution is pretty clever. You generate the first number normally but generate the second number from a range that is one smaller. When you get a-b=0 on the second number, you go for the highest value instead. This effectively makes the second a number from the full range, but without the 0s.
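A minimal Python sketch of the max substitution as described in this post follows. It is not Nikki's original Nim code; `substitute_max` and `pair_by_max_substitution` are hypothetical names chosen for illustration.

```python
import random


def substitute_max(a, b_raw, high):
    """If the raw second draw collides with a, replace it with the top value."""
    return high if b_raw == a else b_raw


def pair_by_max_substitution(low, high, rng=random):
    """Return (a, b) with a != b from the inclusive range low..high, in two draws."""
    if high <= low:
        raise ValueError("the range must contain at least two distinct values")
    a = rng.randint(low, high)          # first number: full range
    b_raw = rng.randint(low, high - 1)  # second number: range is one smaller
    return a, substitute_max(a, b_raw, high)
```

Enumerating every combination of a and b_raw for a small range yields each ordered pair with a != b exactly once, so no pair is favoured and no retry loop is needed.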

Vindaar [2020-11-17T00:41:14+01:00]

I saw this earlier today and decided to replace the gnuplot plotting with ggplotnim. I had to fix a small issue with the reversal of discrete axes, but with that done, here's a solution that directly creates the grid of the plots shown in the OP in one ggplot call:

https://gist.github.com/Vindaar/1a16594fe6f2830faa17d6f6d770de4f

which creates the following plot:

https://user-images.githubusercontent.com/7742232/99320865-d892a400-286c-11eb-8f1a-4fb65bf68924.png

(I did not embed it, because it's saved as 1600x1200)

This relies on the changes in this PR: https://github.com/Vindaar/ggplotnim/pull/98. Without it, scale_y_reverse is broken and one cannot set the range of the color scale using scale_fill_continuous.

Zoom [2020-11-17T17:08:58+01:00]

That's super cool. Is it possible to modify the results so all the graphs would share the same temperature scale?

Vindaar [2020-11-17T17:34:32+01:00]

They already do share the exact same scale, ranging from 900 to 1100. Hence why most are ~yellow.

Or do you mean the opposite, that each has an independent color scale?

Zoom [2020-11-18T23:01:20+01:00]

What threw me off is that we have values < 900 and they got clamped to the lowest temperature of the scale.

Which points me to an idea of a feature: separate colour for overshoots and undershoots.

Also, the default temp gradient has lower contrast and discernibility in the lower part than the gnuplot one. It's much harder to see the diff in colour in 900-950 range than in 1000-1050. There probably needs to be a point with more saturated navy somewhere.

Vindaar [2020-11-19T10:18:08+01:00]

> What threw me off is that we have values < 900 and they got clamped to the lowest temperature of the scale.

Of course they get clamped to the lowest / highest end of the scale if the scale is chosen smaller than the full data range. Clamping the color scale to a fixed range means: "I only care about change in the given range. Data outside is no different than data at the edges". It's literally the same as restricting your x or y axis to a subrange of the data, where (by default) outliers are plotted at the edge they are clamped to, to tell the user "hey, there's more here that's not properly represented".
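The clamping described here can be sketched in a few lines of Python. This only illustrates the idea and is not how ggplotnim implements it; `clamp_to_scale` and `color_position` are hypothetical helpers, and the 900 to 1100 range is the one mentioned earlier in the thread.

```python
def clamp_to_scale(value, scale_min, scale_max):
    """Pull values outside the scale back to the nearest end of the scale."""
    return max(scale_min, min(scale_max, value))


def color_position(value, scale_min, scale_max):
    """Map a value to a position in 0..1 along the color scale (0 = lowest color)."""
    clamped = clamp_to_scale(value, scale_min, scale_max)
    return (clamped - scale_min) / (scale_max - scale_min)


for count in (850, 900, 1000, 1100, 1200):
    print(count, color_position(count, 900, 1100))
# prints:
# 850 0.0
# 900 0.0
# 1000 0.5
# 1100 1.0
# 1200 1.0
```

A count of 850 ends up with the same color as 900, which is exactly the effect noticed in the plots.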

> Which points me to an idea of a feature: separate colour for overshoots and undershoots.

Yes, having the option to give outliers specific colors (e.g. white for outliers above and black for those below) is a useful option. Not hard to implement, but doesn't exist in ggplotnim for now.

> Also, the default temp gradient has lower contrast and discernibility in the lower part than the gnuplot one. It's much harder to see the diff in colour in 900-950 range than in 1000-1050. There probably needs to be a point with more saturated navy somewhere.

That is not a bug, but a feature of the color scale! The scale you see is called Viridis and there's a lot more thought put into it than you might think:

https://www.youtube.com/watch?v=xAoljeRJ3lU

Gnuplot's color scale having a pretty stark contrast at the very low end from black to purple means it's presenting you information that is not there. Your brain sees the contrast and thinks "oh, this is interesting" when in reality there's the same amount of change as on a different part of the color scale covering the same interval. E.g. take the gnuplot plot in the top right corner. The scale there from 0 - 200 changes extremely noticeably. In comparison the range from 600 - 800 is barely visible.

There are two cases where you might not want to treat two ranges of the same width as equal.

1. When you want to consider relative changes in particular. In this case the solution is not to apply a broken color scale and "hope and pray" that it highlights the right thing in your data, but rather to transform your data such that a linear color scale highlights the correct change. This is an important distinction, because the transformation of your data is under your control.

2. When you do not care about changes in values, but about different categories, e.g. 0, 0 < x < 200, 200 < x < 400, ... In this case one has to apply a discrete label to the data and use the label to plot the categories, instead of again relying on a continuous color scale and hoping that it properly represents the categories one has in mind.
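Both cases can be sketched in Python before any plotting happens. The helpers below are hypothetical and not part of ggplotnim or gnuplot. The log transform is one common choice for relative changes, and the category edges assume non-negative integer counts with inclusive upper edges (the post leaves open where a value of exactly 200 belongs).

```python
import math


def to_log_scale(values):
    """Case 1: transform the data so equal ratios become equal steps."""
    return [math.log2(v) for v in values]  # assumes all values are > 0


def category_label(count, width=200):
    """Case 2: attach a discrete label such as '0' or '0 < x <= 200' to a count.

    Assumption: counts are non-negative integers and upper edges are inclusive.
    """
    if count < 0:
        raise ValueError("counts are assumed to be non-negative")
    if count == 0:
        return "0"
    upper = ((count + width - 1) // width) * width  # round up to the next edge
    return f"{upper - width} < x <= {upper}"


logged = to_log_scale([100, 200, 400, 800])
steps = [b - a for a, b in zip(logged, logged[1:])]  # each doubling is about 1.0
labels = [category_label(c) for c in (0, 150, 200, 201)]
```

After the transform, a linear color scale treats each doubling as the same step. With the labels, a discrete scale gets one color per category instead of a gradient.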

A color scale is simply something that maps a third dimension (i.e. the z axis) to something that can be represented in 2D. Thus it is important that it does not fool you. You don't want your x axis to look like "0, 2, 8, 9, 10" either, I imagine.

Hope this was informative. I simply often have to deal with people who abuse color scales for things they are not designed for (or just use bad color scales and wonder why there's a "feature" in their data...).

Mirror of forum.nim-lang.org

7102 :: Getting random non-equal int pairs, comparison of routines.