So, I'm wondering if there is any localization-friendly sorting library in Nim?
In Ruby, I've been using twitter_cldr and lots of different ones, which seem to be working great.
Basically, let's put it like that:
E.g. if the strings are in Spanish:
So, any ideas?
Of course you may wrap and use external C libs.
But Nim has sort procs where you can pass a cmp() proc.
So what you really need is a Unicode-aware cmp() proc. Nim has a unicode module, see https://nim-lang.org/docs/unicode.html, which provides cmpRunesIgnoreCase() and <=%. You could try those.
With the unidecode package (the unicode module is not sufficient on its own) you can get closer:
import algorithm, unicode, unidecode
echo sorted(@["one", "óne", "pir"], cmp=cmpRunesIgnoreCase) # @["one", "pir", "óne"]
doAssert "óne".unidecode == "one"
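proc cmpUnidecode(a, b: string): int =
  # Compare the ASCII transliterations first, so "óne" sorts next to "one".
  result = system.cmp[string](a.unidecode, b.unidecode)
  if result == 0:
    # Tie-break on the original strings so the order stays deterministic.
    return cmpRunesIgnoreCase(a, b)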
echo sorted(@["one", "óne", "pir"], cmp=cmpUnidecode) # @["one", "óne", "pir"]
This does not really support locales the way localeCompare does in JavaScript, or possibly the libraries you are used to in Ruby, but it is better than nothing.
An alternative would be to call strcoll() from the C standard library.
Of course this only works if the encoding of your locale setting matches the encoding of the strings passed to strcoll(). This used to be a problem on Windows, which didn't support UTF-8 locales. But now Windows 10 seems to support UTF-8 locales (I haven't tried it myself).
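A minimal sketch of the strcoll() route, assuming a POSIX-like system whose environment selects a UTF-8 locale (for example es_ES.UTF-8). The importc declarations and the cmpLocale name are my own for illustration and do not come from any Nim library. The resulting order depends entirely on the active locale, so no expected output is shown.

```nim
import algorithm

# Hand-written bindings to the C standard library (assumed declarations).
var LC_ALL {.importc, header: "<locale.h>".}: cint
proc setlocale(category: cint, locale: cstring): cstring {.importc, header: "<locale.h>".}
proc strcoll(a, b: cstring): cint {.importc, header: "<string.h>".}

# An empty locale name asks the C runtime to use the locale from the
# environment (LANG, LC_ALL, ...). setlocale returns nil on failure.
if setlocale(LC_ALL, "").isNil:
  echo "warning: could not apply the environment's locale"

proc cmpLocale(a, b: string): int =
  ## Hypothetical helper: compares two strings using the collation
  ## rules of the current C locale.
  result = strcoll(a.cstring, b.cstring).int

echo sorted(@["pir", "óne", "one"], cmp = cmpLocale)
```

With the default "C" locale, strcoll() behaves like a plain byte comparison, so the accented string still ends up last. A collating locale has to be installed and selected for this to make a difference.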