Would you accept a PR with some useful functions for UTF-8 strings: uppercase, lowercase, word wrapping, and a test for uppercase? In fact, it would duplicate almost every function from strutils. Currently unicode supports case conversion only for a single rune, not for a whole string.
The naming conventions are also an open question. Currently there is only one function in the unicode module that does string processing (reversed). Its name looks very similar to reverse, but reverse does not support UTF-8 strings while reversed does.
Some functions are very easy to implement, since the unicode module already has all the necessary API:
import unicode

# Add these functions to the unicode module?

proc utf8Capitalize(s: string): string =
  ## Return a copy of the string with its first character capitalized;
  ## the rest of the string is left unchanged.
  if s.len == 0:
    return s
  let firstRune = s.runeAt(0)
  let firstRuneUp = firstRune.toUpper
  if firstRuneUp == firstRune:
    return s
  result = newStringOfCap(s.len)
  result.add(firstRuneUp.toUTF8)
  # Skip the original first rune by its own byte length: the uppercase
  # form may be encoded with a different number of bytes.
  result.add(s[s.runeLenAt(0)..s.high])

proc utf8Lower(s: string): string =
  ## Return a copy of the string with all the cased characters converted to
  ## lowercase.
  result = newStringOfCap(s.len)
  for rune in s.runes:
    result.add(rune.toLower.toUTF8)

proc utf8Upper(s: string): string =
  ## Return a copy of the string with all the cased characters converted to
  ## uppercase.
  result = newStringOfCap(s.len)
  for rune in s.runes:
    result.add(rune.toUpper.toUTF8)

echo utf8Capitalize("привет")
echo utf8Lower("Привет")
echo utf8Upper("Привет")
Actually, every .toUTF8 call creates a small portion of garbage and triggers the GC very frequently. I guessed this would be very slow, and quick testing uncovered a hidden hell of UTF-8 support: the UTF-8 version of toLower is 30 times slower than the strutils one.
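One possible way to avoid the temporary strings is to write each converted rune straight into the result buffer. The sketch below assumes your Nim version's unicode module provides the fastToUTF8Copy template. The name utf8LowerNoTemp is hypothetical, and I have not measured whether this closes the gap with strutils.

```nim
import unicode

proc utf8LowerNoTemp(s: string): string =
  ## Hypothetical variant of utf8Lower: each lowercased rune is encoded
  ## directly into the result, so no intermediate string is created per rune.
  result = newStringOfCap(s.len)
  var pos = 0  # byte offset of the next write into result
  for rune in s.runes:
    # fastToUTF8Copy grows result as needed and advances pos
    # by the number of bytes it wrote.
    fastToUTF8Copy(rune.toLower, result, pos)

echo utf8LowerNoTemp("Привет")
```

The same loop with toUpper in place of toLower would give the uppercase variant.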
UPD. Compared with GLib's g_utf8_strup: it is still 10 times slower than the ASCII version. I'm not sure that putting all the eggs in one basket is a good idea.