Hi everyone :) Let's say I have a very "huge" object
```nim
type
  Test = object
    id1, id2, id3, id4, id5, id6, id7, id8, id9, id10, id11, id12, id13, id14,
      id15, id16: uint32
```
I make a seq of Tests
```nim
var buff = newSeq[Test](10000)
```
Then I delete some elements from seq
```nim
for i in 2521..3201:
  buff.delete(i)
```
How I see it: since value objects get copied on every shift, I expected deleting from the seq of objects to be slower than deleting from a seq of refs. But after some profiling I got results like:

```
Time elapsed obj removing: 0.06516200304031372
Time elapsed ref obj removing: 0.09633999317884445
```
I come from a C# background, where I have classes and structs, and I understand that structs are copied when you use a List (the analog of seq?), so the bigger the struct is, the slower it gets to copy it.
What am I missing in Nim? :) Thanks in advance
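For reference, a minimal sketch of the kind of benchmark described above — the original post doesn't show its timing code, so the timing harness here (`std/monotimes`) and the `TestRef` name are my own assumptions:

```nim
import std/monotimes, std/times

type
  Test = object
    id1, id2, id3, id4, id5, id6, id7, id8, id9, id10, id11, id12, id13, id14,
      id15, id16: uint32
  TestRef = ref Test

var buff = newSeq[Test](10000)
var buffRef = newSeq[TestRef](10000)
for i in 0 ..< buffRef.len:
  buffRef[i] = TestRef()   # each ref gets its own heap allocation

# Time the value-object deletions
var t0 = getMonoTime()
for i in 2521 .. 3201:
  buff.delete(i)
echo "obj removing: ", (getMonoTime() - t0).inMicroseconds, " us"

# Time the ref-object deletions
t0 = getMonoTime()
for i in 2521 .. 3201:
  buffRef.delete(i)
echo "ref obj removing: ", (getMonoTime() - t0).inMicroseconds, " us"
```

The loop deletes 681 elements (indices 2521..3201 inclusive), so both seqs end up with 9319 elements.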
A Nim seq indeed stores its entries in one consecutive memory area. If you want to delete an arbitrary element, you have two possibilities: if you need to keep the order, maybe because the seq is sorted, you move all elements after the deleted one a single position toward the front, which is a memory copy. Or, if order is not important, you can move the last element of the seq into the place of the deleted one. In both cases you set the new length to len - 1, of course. We have del() and delete() for this. When elements have to be moved, moving pointers is only 4 or 8 bytes each, while moving value objects means moving the full size of the object each time. So for really large elements, moving pointers is faster, of course. But generally a memcopy is very fast.
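A small sketch of the two deletion strategies just described, on a toy seq:

```nim
var s = @[10, 20, 30, 40, 50]

# delete(): keeps order; shifts every element after index 1
# one slot toward the front (a memory copy)
var ordered = s
ordered.delete(1)
assert ordered == @[10, 30, 40, 50]

# del(): O(1); moves the last element into the hole, so order is lost
var unordered = s
unordered.del(1)
assert unordered == @[10, 50, 30, 40]
```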
Pointers may be faster to move within the seq, but when you access the elements, each pointer is an extra indirection. And generally, when you use pointers, the objects themselves may be scattered across RAM, so there is almost no cache benefit even when you process the data in consecutive order.
Removing something from the middle of a seq means the remaining elements have to be shifted.
If you don't need the ordering, use del.
Try choosing the deletion span randomly instead of using a pre-determined value, or take it from the program's arguments. That ensures the compiler can't optimize the whole deletion loop away.
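One way that randomization might look — the variable names and span bounds here are my own, not from the thread:

```nim
import std/random

randomize()
var buff = newSeq[int](10_000)
let start = rand(0 .. 5000)           # random lower bound
let finish = start + rand(1 .. 1000)  # random upper bound

# Delete repeatedly at the same index: as the seq shrinks,
# the next element slides into position `start`.
for i in start .. finish:
  buff.delete(start)

echo "deleted ", finish - start + 1, " elements, len = ", buff.len
```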
Also, a plain object is stored inline — on the stack, or, inside a seq, directly in the seq's buffer — while a ref object is a separate heap allocation. Avoiding a per-element heap allocation is generally faster.
As Stefan_Salewski wrote,
When you delete reference objects, the object itself that the reference pointed to has to be freed too, at least when there is no other reference keeping it alive. And that freeing costs time too; generally, alloc() and dealloc() calls are expensive.
This is almost surely the reason. You can verify it by making another copy of the ref seq `buff` before starting the deletion, so something still holds the references. You're also likely to see big differences if you measure with --gc:arc, --gc:boehm, or the old mark & sweep GC.
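A sketch of that verification — keep a second copy of the ref seq so the deleted slots' targets stay alive and no deallocation happens during the timed loop (the `Big` type name is my own for illustration):

```nim
type Big = ref object
  data: array[16, uint32]

var buff = newSeq[Big](10_000)
for i in 0 ..< buff.len:
  buff[i] = Big()

# A seq copy shares the refs, so every object gains a second owner.
let keepAlive = buff

for i in 2521 .. 3201:
  buff.delete(i)   # refs are dropped from buff, but the objects survive
```

With `keepAlive` in place, the ref-deletion loop only moves pointers around; any remaining slowdown can't be blamed on freeing the pointed-to objects.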