Hello again =)!
I'm doing advent of code to improve my Nim and encountered a problem today trying to create a string type with specific constraints. I'd like to understand type constraints if they exist, not just for this problem but generally.
Today's problem deals with hands of poker represented as 5-character strings containing any of the following cards/chars {'A', 'K', 'Q', 'J', 'T', '9', ..., '2'}. So a valid hand would look like 'AKT33', for example, while an invalid hand would look like 'AKT333' (6 chars) or 'ABCDE' (wrong character set).
I tried with concepts:
type
  HandLike = concept x
    x is string
    len(x) == 5
    HashSet(x) == {"A", "K", "Q", "T", "9", "8", "7", "6", "5", "4", "3", "2"}
But after reading this question, in which they have a very similar problem (DNA strings), it seems that's not what concepts are for.
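For what it's worth, here is a minimal sketch of what a concept does check: it constrains which types are accepted based on the operations they support, not which values a string may hold. The names `HasLen` and `describe` are made up for illustration.

```nim
type
  # Matches any type whose values support `len` returning an int.
  HasLen = concept x
    x.len is int

proc describe(x: HasLen): string =
  "length " & $x.len

doAssert describe("AKT33") == "length 5"
doAssert describe(@[1, 2, 3]) == "length 3"
# An int has no `len`, so the call is rejected at compile time.
doAssert not compiles(describe(42))
```

Note that "AKT33" and "ABCDEFG" are both accepted here, because the concept only looks at the type `string`.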
My second idea is to do something like:
type
  Card = enum
    CA, CK, CQ, CJ, CT, C9, C8, C7, C6, C5, C4, C3, C2, C1
  Hand = object
    card1, card2, card3, card4, card5: Card
But it's going to require a lot of boilerplate to convert, compared to just applying constraints on a string type. Also, in a hypothetical case where I have more than 5 cards, for example to represent a 52-card deck, it starts to become unwieldy.
Their answer to the DNA problem was just to parse it with a distinct string, but I'd like to encode the constraints directly into the type. Is that possible?
I could just link my solution for this Advent of Code day. But I wouldn't want to spoil part 2 for you, so here is the snippet:
type
  Card = enum
    C2 = (2, "2"), C3 = "3", C4 = "4", C5 = "5", C6 = "6", C7 = "7", C8 = "8", C9 = "9",
    T, J, Q, K, A
  Hand = array[5, Card]
Enum values have a string and ordinal value associated to them. Here I am assigning string "2" and value 2 to enum C2. C2 = (2, "2")
Notice how I only assign an ordinal value to the first item, because every item after it will have a value 1 higher than the previous one: C3 = 3, C4 = 4, etc.
Also notice that I didn't change the string values of T, J, Q, K, A, because by default the string value of an item is its name: T = "T", J = "J", etc.
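To make that concrete, here is a small check of the ordinal and string values, using the same `Card` enum as above. The assertions are only an illustration, not part of the original solution.

```nim
type
  Card = enum
    C2 = (2, "2"), C3 = "3", C4 = "4", C5 = "5", C6 = "6", C7 = "7", C8 = "8", C9 = "9",
    T, J, Q, K, A

# Ordinals continue from the explicit starting value of C2.
doAssert ord(C2) == 2
doAssert ord(C9) == 9
doAssert ord(T) == 10
doAssert ord(A) == 14

# `$` uses the custom string when one is given, otherwise the field name.
doAssert $C3 == "3"
doAssert $K == "K"

# Enum values are ordered, so cards compare by rank directly.
doAssert C9 < T and K < A
```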
But it's going to require a lot of boilerplate to convert
Not really. Later in the code I parse input for hand string with a for loop and parseEnum():
import std/strutils # for parseEnum
func parseHand(hand: string): Hand =
  for i, c in hand:
    result[i] = parseEnum[Card]($c)
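As a rough sketch of how this behaves on good and bad input: `parseEnum` raises a `ValueError` for a character that is not a card. The length check and the `isValidHand` helper below are my own additions (assumptions, not in the original solution), so that a 6-character string fails with the same exception instead of an index error.

```nim
import std/strutils

type
  Card = enum
    C2 = (2, "2"), C3 = "3", C4 = "4", C5 = "5", C6 = "6", C7 = "7", C8 = "8", C9 = "9",
    T, J, Q, K, A
  Hand = array[5, Card]

func parseHand(hand: string): Hand =
  # Hypothetical addition: reject wrong lengths before indexing into result.
  if hand.len != 5:
    raise newException(ValueError, "expected 5 cards, got " & $hand.len)
  for i, c in hand:
    result[i] = parseEnum[Card]($c)

proc isValidHand(s: string): bool =
  # Hypothetical helper: true when s parses as a Hand.
  try:
    discard parseHand(s)
    result = true
  except ValueError:
    result = false

doAssert parseHand("AKT33") == [A, K, T, C3, C3]
doAssert not isValidHand("ABCDE")   # 'B' is not a card
doAssert not isValidHand("AKT333")  # six cards
```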
And making a set of cards from array[5, Card] is just as easy as with strings:
import std/setutils
let setOfCards = [T, C8, K, Q, C5].toSet()
Also in an hypothetical where I have more than 5 cards, for example to represent a 52 cards deck...
Then you could just change array[5, Card] to array[52, Card].
If you need more context you can read the full solution here.
Thank you, that's helpful! Indeed that's not as much boilerplate as I expected.
I did solve today's problem already, I just didn't take advantage of the type system and got an error from it at some point that could have been avoided.
I did:
for line in lines("input.txt"):
  let info = line.split(" ")
  var (hand, bid) = (info[0], info[1].parseInt())
  let score = getScore(line) # <- spot the error here
  # I'm passing line instead of hand, and since my hand is just a string just like line,
  # it's happy to do its work and output a wrong value without erroring
Still I wonder, on a general basis, if there is a way to enforce more esoteric constraints (some that might not fit in existing types) in a type?
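For the specific mix-up above, even a bare distinct type is enough to turn it into a compile error. This is only a sketch: the body of `getScore` is a placeholder, not the real scoring, and nothing is validated yet.

```nim
type Hand = distinct string

proc getScore(hand: Hand): int =
  # Placeholder scoring (hypothetical): just counts the cards.
  string(hand).len

let line = "AKT33 765"
let hand = Hand(line[0 ..< 5])

doAssert getScore(hand) == 5
# Passing the raw line no longer type-checks.
doAssert not compiles(getScore(line))
```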
something has to parse it. This is AdventOfParsing after all.
if you know all the input strings at compile time one way could be:
import macros
# the valid card characters: A, K, Q, J, T and 2..9
const validCard: set[char] = {'A','K','Q','J','T','2'..'9'}
type
  Hand = distinct string
macro validate(s: static string) =
  if (s.len!=5):
    error "length must be 5"
  for i in s:
    if (not (i in validCard)):
      error "invalid card: " & i
proc toHand(s:static string):Hand =
  static: validate(s)
  Hand(s)
proc `$`(x:Hand):string = "Hand: " & string(x)
echo toHand("A3QJ4")
static: assert not compiles(toHand("ABCDE"))
to do your enum solution you might:
import std/[enumutils,sequtils]
macro generateEnum(validCards:static set[char]):type =
  result = nnkEnumTy.newNimNode().add(newEmptyNode())
  for c in validCards:
    result.add nnkEnumFieldDef.newNimNode().add(ident ("C" & c), newLit(c))
  #echo result.treeRepr
type
  Card = generateEnum({'A','J','Q','K','T','2'..'9'})
  Hand2 = array[5,Card]
#still need to validate the input string,
proc toHand2(s:static string):Hand2 =
  static: validate(s)
  for i,c in s:
    result[i] = Card(c)#warning: conversion to enum with holes is unsafe
echo toHand2("A3QJ4") # [CA, C3, CQ, CJ, C4]
I ended up with a solution that uses strings only, with regards to the advent problem:
import std/sequtils
import std/strutils
import std/enumerate
import std/tables
import std/algorithm
var
  input: string
  hands, lines: seq[string]
  handTypes = initTable[string, int]()
  bids = initTable[string, int]()
  hand: string
  bid, score: int
input = readFile("input_7.txt")
input = input.replace("A", "Z").replace("K", "Y").replace("Q", "X").replace("J", "1").replace("T", "V")
lines = input.split("\n")
proc getScore(hand: string): int =
  var counts = newCountTable(hand)
  if '1' in counts and len(counts)>1:
    var numJs = counts['1']
    counts.del('1')
    var l = largest(counts) # We removed Js, so it won't be the largest
    counts[l.key] = counts[l.key] + numJs
  var score = case largest(counts).val:
    of 1:
      1
    of 2:
      if len(counts) == 3: 3 else: 2
    of 3:
      if len(counts) == 2: 5 else: 4
    of 4:
      6
    of 5:
      7
    else:
      0
  assert score != 0
  return score
for line in lines:
  let info = line.split(" ")
  var (hand, bid) = (info[0], info[1].parseInt())
  let score = getScore(hand)
  hand = $score & hand # This enables us to sort by score
  hands.add(hand)
  bids[hand] = bid # All hands are unique
assert len(hands) == len(bids) # assert all hands are indeed unique
var result: int = 0
sort(hands)
for i, hand in enumerate(hands):
  result += (i+1) * bids[hand]
echo result
But on a more general basis I'm not satisfied with a parser, as they suggest in the DNA question. If the constraints aren't encoded in the type, what guarantee do you have that someone doesn't just bypass the parser and hand you an unvalidated variable?
Like in this snippet:
import std/strutils
import std/sets
import std/setutils
type Hand = distinct string
proc validateHand(input: string): Hand =
  assert len(input) == 5, "Wrong hand size: " & input & " is " & $len(input) & " chars."
  var forbiddenChars: set[char] = input.toSet() - {'A', 'K', 'Q', 'J', 'T', '9', '8', '7', '6', '5', '4', '3', '2'}
  assert len(forbiddenChars) == 0, "Hand includes forbidden cards: " & $forbiddenChars
  return Hand(input)
var a: Hand = validateHand("AKT33")
var b: Hand = Hand("ABCDEFGHIJKL")
echo cast[string](a)
echo cast[string](b)
lol. fair. me too.
i agree the Hand(string) constructor for distinct types is an attractive nuisance. we can get rid of that. but you'll have to have the self-control not to use cast.
what you can do is, put your Hand type, and its constructor in a separate module, and only export the type and its constructor.
#hand.nim
type Hand* = object
  val: string
proc init*(t:typedesc[Hand],s:string):Hand = #...validate...#
then when you
#main.nim
import hand
let x = Hand(val: "INVALID") #static error, the field 'val' is not accessible
let y = Hand.init("INVALID") #runtime error
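A possible way to fill in the validation step, as a sketch only: the `validCards` set, the error messages and the `$` accessor are my assumptions, not part of the snippet above. Since a private field is only hidden from other modules, the "field is not accessible" error can't be shown in a single file, so the self-test just exercises `init`.

```nim
# hand.nim (sketch; validation rules and messages are assumptions)
const validCards = {'A', 'K', 'Q', 'J', 'T', '2'..'9'}

type Hand* = object
  val: string  # not exported: other modules cannot write Hand(val: ...)

proc init*(t: typedesc[Hand], s: string): Hand =
  ## The only exported way to build a Hand.
  if s.len != 5:
    raise newException(ValueError, "hand must have 5 cards, got " & $s.len)
  for c in s:
    if c notin validCards:
      raise newException(ValueError, "invalid card: " & c)
  Hand(val: s)

proc `$`*(h: Hand): string =
  ## Read-only view of the underlying string.
  h.val

when isMainModule:
  doAssert $(Hand.init("AKT33")) == "AKT33"
  doAssertRaises(ValueError):
    discard Hand.init("ABCDE")
```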
Yes, this is the right way of doing it.
As @shirleyquirk pointed out in his earlier reply, you would need to validate inputs at the boundary of your API. This makes intuitive sense when we look at things like a web API, which needs to make sure that one user can't access another user's data, for example, or that a user can't set their own name to an empty string.
Your scenario appears to be trying to protect users from doing something wrong by accident (you will never manage to prevent malice as long as they're importing a local module). The solution is, as @shirleyquirk displayed, to not export the constructs required for doing bad things. If you want the user to be able to read or write fields (but only with valid values) it's trivial to set up getters and setters for your fields in your module.
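A minimal sketch of such a getter and setter, assuming a private field called `raw` and accessors called `cards` (all names here are made up for illustration). Giving the accessor a different name from the field means `h.cards = ...` goes through the validating setter even inside the defining module.

```nim
const validCards = {'A', 'K', 'Q', 'J', 'T', '2'..'9'}

type Hand* = object
  raw: string  # private field, never exported

proc isValidHand(s: string): bool =
  # Exactly five characters, all of them valid cards.
  if s.len != 5:
    return false
  for c in s:
    if c notin validCards:
      return false
  true

proc initHand*(s: string): Hand =
  if not isValidHand(s):
    raise newException(ValueError, "invalid hand: " & s)
  Hand(raw: s)

proc cards*(h: Hand): string =
  ## Getter: read access to the validated string.
  h.raw

proc `cards=`*(h: var Hand, s: string) =
  ## Setter: only accepts values that pass validation.
  if not isValidHand(s):
    raise newException(ValueError, "invalid hand: " & s)
  h.raw = s

when isMainModule:
  var h = initHand("AKT33")
  h.cards = "QQQJA"
  doAssert h.cards == "QQQJA"
  doAssertRaises(ValueError):
    h.cards = "ABCDE"
  doAssert h.cards == "QQQJA"  # a rejected write leaves the hand unchanged
```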
And @Araq, I agree that for this example it doesn't make much sense to keep the string around instead of parsing to an enum. But for more complex types where you don't want to pay the penalty of demarshaling or converting data types it makes more sense. For example a library like my jsonschema which validates that a given JSON object follows a schema and returns a distinct JsonNode type with only accessors for the defined fields.
For example a library like my jsonschema which validates that a given JSON object follows a schema and returns a distinct JsonNode type with only accessors for the defined fields.
Can you publish my JSON schema implementation?
@shirleyquirk That's great, I didn't know about the requiresInit pragma, thanks! I looked for something like a private constructor on Google but those were the wrong keywords. It's too bad about the casting, I wish there were a way to forbid it.
@Araq I'm thinking of data science workflows I've had to work on in the past. In that case I'd want something very easy to manipulate that doesn't require a lot of conversion back and forth. But point taken on a general basis.
@PMunch Indeed, I'm just looking to enforce good code by default. I read your jsonschema, and it sounds in line with the kind of things I had in mind. I couldn't grok the implementation because I'm not fluent enough in AST manipulation, so it's bookmarked for later =)!
As @shirleyquirk suggested, you could use an object with a string field. That would be very helpful if you need to be treating the value like a string often.
This would be what a full string-compatible example would look like, complete with implicit conversion:
type HandLike* {.requiresInit.} = object
  ## String of 5 characters containing only {'A', 'K', 'Q', 'J', 'T', '2', '3', '4', '5', '6', '7', '8', '9'}
  val: string
proc hl*(str: string): HandLike =
  ## Converts a `string` to `HandLike`
  if str.len != 5:
    raise newException(ValueError, "HandLike must be 5 characters long")
  for i in 0 ..< str.len:
    let c = str[i]
    # 'J' is included here to match the card set given in the question
    if c notin {'A', 'K', 'Q', 'J', 'T', '2'..'9'}:
      raise newException(ValueError, "HandLike cannot contain \"" & c & "\" characters")
  return HandLike(val: str)
# Using the `converter` keyword, we can implicitly convert `HandLike` to `string` without any extra steps
converter `$`*(this: HandLike): string =
  return this.val
proc firstCard*(hand: HandLike): char =
  ## Returns the first card of a hand as a `char`
  # As you can see, we can use the `[]` operator on `hand` because it's automatically being converted to `string` for us
  return hand[0]
let myHand = hl"AQT25"
echo firstCard(myHand) # 'A'
requiresInit is kinda like the deleted default constructor from c++, but cleverer. it prevents the situation where you can
var h: Hand
echo h #oh no, it's an empty string that invalidates our preconditions
but still allows
var h: Hand
h = initHand("AQJ54")
the 'private constructor' in nim is produced by not exporting the member variable. that's the absence of the '*' on val, which makes the default Hand(val: "...") constructor not work from other modules
casting is always gonna be a thing, this is a systems language, and sometimes we need to tell the compiler 'shut up i know what im doing'. Nim is more strictly typed than other languages, but less strict about what you can cast to what. you are explicitly throwing the type system out the window when you use cast.