Hi, everybody!
I was trying to implement epoll based event driven state machines engine in nim-lang and encountered somewhat strange problem with adding objects to a sequence. See comments in the code below:
import edsm
proc sm_init_enter(me: ptr StageMachine) =
Message(src: me, dst: me, code: M0.uint8).`>--`
proc sm_work_enter(me: ptr StageMachine) =
echo "Hello!"
me.signals[0].enable
# me.timers[0].start(1000.msec)
proc sm_work_T0(me, src: ptr StageMachine, data: pointer) =
echo "tick-tack!"
# re-enable timer
proc sm_work_S0(me, src: ptr StageMachine, data: pointer) =
# this will terminate event loop (dst is nil)
Message(src: me).`>--`
proc sm_work_leave(me: ptr StageMachine) =
echo "Goodbye!"
var
sm = StageMachine(name: "the-machine")
tm = sm.addTimer(1000.msec)
sg = sm.addSignal(EDSM_SIGINT)
let init: ptr Stage = sm.addStage("INIT", sm_init_enter, nil) # state #0
let work: ptr Stage = sm.addStage("WORK", sm_work_enter, sm_work_leave) # state #1
#init.addReflex(M0, work) # does not work... WHY?!?
#echo sm.stages[0].reflexes # prints '@[]'
# this works as expected
addReflex(sm.stages[0].addr, M0, work) # transition INIT -> WORK
echo sm.stages[0].reflexes # prints '@[(kind: rkTransition, nextStage: ...)]'
# but this also works as needed!
work.addReflex(tm, sm_work_T0) # action on timer
work.addReflex(sg, sm_work_S0) # action on signal
echo sm.stages[1].reflexes
# prints '@[(kind: rkAction, action: ...), (kind: rkAction, action: ...)]'
echo sm
sm.run
loop()
In brief: I have two pointers to some objects (init and work, these are pointers to machine states) and the problem is that init.addReflex() does not work, but work.addReflex() works as expected. Edsm module imported in the beginning is a little bit lengthy, but if anyone would like to take a look at it, i will post it.
I don't think we're going to be able to help you without the code.
I'd recommend using ref or var instead of ptr.
Seems like this. I changed addStage so that it returns nothing and then:
sm.addStage("INIT", sm_init_enter, nil)
sm.addStage("WORK", sm_work_enter, sm_work_leave)
let init = me.stages[0].addr
let work = me.stages[1].addr
After this change everything is ok. Thanx for the answers See my edit for an answer without the continued ptr footguns.
Nim has safer reference types for a reason, you don't need to write C anymore.
I don't know how addStage is implemented so cannot be sure.
It was:
proc addStage*(me: var StageMachine, sn: string, ef: EnterFunc, lf: LeaveFunc): ptr Stage =
let s = Stage(name: sn, enter: ef, leave: lf)
me.stages.add(s)
return me.stages[me.stages.high].addr
> Maybe a reallocation happened when
Yes, add can relocate
addStage was called, which made init a invalid pointer, because sm.stages was already moved.
And despite this init still points to somewhere and nothing fatal happens during addReflex. The crash occured later when attempting to access 'reflexes' of machine in INIT state
See my edit for an answer without the continued ptr footguns.
Initially I tried to use refs everywhere, but... I've been studying nim for a couple of weeks only and after some struggling with compiler decided to try raw pointers
Nim has safer reference types for a reason, you don't need to write C anymore.
Yes, I know, but it's not very easy to get rid of 20 years C-habits in 2 weeks:)
See my edit for an answer without the continued ptr footguns.
Wow. You wrote almost exactly the same code as I have, see:
type
Action = proc(me: ptr StageMachine, src: ptr StageMachine, data: pointer)
EnterFunc = proc(me: ptr StageMachine)
LeaveFunc = proc(me: ptr StageMachine)
ReflexKind = enum rkAction, rkTransition
EventSourceKind* = enum eskStageMachine, eskIo, eskSignal, eskTimer, eskLast
InternalMessage* = enum M0, M1, M2, M3, M4, M5, M6, M7, M8
Reflex = object
case kind: ReflexKind
of rkAction: action: Action
of rkTransition: nextStage: ptr Stage
... and so on. But what is 'wrong' in having nextStage (for example) as 'ptr Stage'? I keep all stages for a given instance of SM in heap (in a sequence) and reflex of kind 'transition' holds a pointer to some stage. I was trying to implement epoll based event driven state machines engine in nim-lang and
Well... for the moment I have working example with timers, signals and reading from stdin. See here
what is 'wrong' in having nextStage (for example) as 'ptr Stage'? I keep all stages for a given instance of SM in heap (in a sequence)
As you discovered, a pointer to a member of a seq can become invalid when the seq is resized (or destroyed).
With Stage as a ref type instead, the seq is an array of pointers under the hood, which get copied across when the seq gets resized, and any references you're still holding when the seq is destroyed will be kept alive.
var
s = @[123]
p = addr s[0]
s.add 321
s[0] = 0
# `p` might not point to s[0]
echo p[]
Just using a pointer that you get before changing seq length is unsafe
Yes, yes, yes, now it's absolutely obvious for me, and I just take addresses of sequence elements after the sequence is filled completely and (in case of my code) will never change anymore.
As you discovered, a pointer to a member of a seq can become invalid when the seq is resized
This remembered me one case that happened a loooong time ago. I reallocated some array of some structs, but forgot to change some pointers, which pointed to the elements of that array.
But! In case of C reallocation is explicit (I did it myself) and there is nobody to blame except myself :)
With Stage as a ref type instead, the seq is an array of pointers under the hood
So, if we have something like
type
SomeObj = object
...
var someSeq = seq[SomeObj]
then
var obj = SomeObj()
someSeq.add(obj)
will add the object itself to the sequence.
But if we have
type
SomeObj = ref object
...
var someSeq = seq[SomeObj]
then
var obj = SomeObj()
someSeq.add(obj)
will
Right?
BTW, did I get it right that seq is a nim's synonym for, say, D's 'dynamic array'?
In the 2nd case, obj is the name of a variable on the stack, with an address (of an object allocated on > the heap) stored in it. You can view ref as auto-managed pointers in other language.
Ok. Then there is a question.
Is doing like this
type
TheObject = object
f1: int
...
TheObjectRef = ref TheObject
TheObjectPtr = ptr TheObject
considered good practice?
According to the style guide, yes.
"When naming types that come in value, pointer, and reference varieties, use a regular name for the variety that is to be used the most, and add a "Obj", "Ref", or "Ptr" suffix for the other varieties. If there is no single variety that will be used the most, add the suffixes to the pointer variants only"
Sounds a little bit fuzzy. I would always use Ptr for ptr, Ref for ref and no suffix for value variant, regardless of which one is the most used.