Hi,
I started a small library which tries to implement D-style array slices (without copying the data - it just keeps a reference and the length). Initial code is here: https://github.com/AdrianV/nimrod-tools - comments and criticism are welcome.
Here is the basic idea:
var test = @[1,2,3,4,5,6,7,8,9].all()
var stest = test.slice(2..4) # [3,4,5] slice of test - nothing copied
var t2 = test[1..6] # [2,3,4,5,6,7] - nothing copied
var t3 = t2[0..1000] # [2,3,4,5,6,7,8,9] slice of t2 - accesses the underlying test, out-of-range bounds are clamped - nothing copied
test[3] = 12 # [1,2,3,12,5,6,7,8,9]
# stest = [3,12,5]
stest[1] = 5 # [3,5,5]
# t2 is still [2,3,12,5,6,7]
I only skimmed your code and don't know if it's safe. ;-)
I'd use some ptr array[0..1_000_000, T] instead of the raw pointer for convenience. Well, I'd use the array if system.shallowCopy for sequences didn't exist...
Your way is still too complicated: ;-)
# since it has reference semantics, I choose 'P'
type
  PSeqSlice[T] {.final, pure, shallow.} = object
    s: seq[T]
    a, b: int

proc slice*[T](fseq: seq[T], x: TSlice[int]): PSeqSlice[T] =
  # create a slice from a seq[T]; we need to copy here as it could be
  # a constant sequence not allocated on the heap:
  result.s = fseq
  result.a = max(0, x.a)
  result.b = min(fseq.high, x.b)

proc `[]`*[T](s: PSeqSlice[T], i: int): var T =
  assert s.a + i <=% s.b  # unsigned compare: also catches a negative i
  result = s.s[s.a + i]
Well, you get the idea... ;-)
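For illustration, here is a self-contained version of the sketch above with a tiny usage example. This is a hypothetical adaptation, not the original code: it uses today's spelling (Slice[int] instead of TSlice[int], no type prefix letters), and note that the {.shallow.} pragma only matters on the refc GC - under ARC/ORC the backing seq is actually copied per slice.

```nim
type
  SeqSlice[T] = object
    s: seq[T]      # backing sequence (shallow-copied on the old refc GC)
    a, b: int      # inclusive bounds into s

proc slice[T](fseq: seq[T], x: Slice[int]): SeqSlice[T] =
  # clamp the requested range to the sequence's bounds
  result.s = fseq
  result.a = max(0, x.a)
  result.b = min(fseq.high, x.b)

proc `[]`[T](s: SeqSlice[T], i: int): T =
  assert s.a + i <= s.b
  s.s[s.a + i]

proc len[T](s: SeqSlice[T]): int = s.b - s.a + 1

let data = @[1, 2, 3, 4, 5, 6, 7, 8, 9]
let sl = data.slice(2..4)
assert sl.len == 3
assert sl[0] == 3 and sl[2] == 5
```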
Yes, that's simpler and close to the solution I first had in mind ;-), but then I wanted access performance equal to normal arrays, without extra costs, therefore I cache the base pointer. But I didn't create a benchmark yet to see if this extra bookkeeping pays off.
edit: did some basic benchmarks against the standard slices. My implementation shows some benefits when the slice length gets longer (which is no surprise).
What I don't fully understand yet is the benefit of the shallow pragma in your example.
edit: found out - the slices themselves are shallow-copied
Anyway, Nimrod's flexibility is fascinating. If you want D-like slices, you can create them quite easily, but you are not forced to use them (and they can be really confusing if you are not used to them).
Well, you get the idea...
Yes ;-)
Now I did some benchmarks. There is no measurable difference between your solution and mine, so I will go with the cleaner one (your suggestion).
I did the following test: fill an array of bytes with 10_000_000 random values, go over the array and create ~1_000_000 slices of length ~10 (simulating a parse), then go over the created slices and sum up all the values inside:
average slice length 10:
  create 1_000_000 slices:
    standard slices from system:       270 ms
    seq slices (both implementations): 140 ms, dmd: 110 ms
Non-copying slicing pays off by a factor of 2. Now the interesting part:
  calc sum over all slices:
    standard slices from system:       370 ms
    seq slices (both implementations): 130 ms, dmd: 60 ms
Slices also seem faster to process, because they are more cache efficient. When you increase the slice length and reduce the number of slices, the effect is even more significant.
average slice length 100:
  create ~100_000 slices:
    standard slices from system:       122 ms
    seq slices (both implementations): 91 ms, dmd: 85 ms
  calc sum over all slices:
    standard slices from system:       250 ms
    seq slices (both implementations): 15 ms, dmd: 50 ms
So the benefit is not in creation (as I had expected), but in processing - more than a factor of 15 in this case. Something I would never have expected.
edit: values for dmd added
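The "calc sum over all slices" step can be sketched roughly like this (an assumed shape, not the actual benchmark code): each slice is just a pair of bounds into the one big buffer, so summing walks that buffer in place, mostly sequentially - which is what makes it cache friendly compared to materialized copies.

```nim
# Rough sketch of the summation step (hypothetical; sumSlices and the
# bounds representation are illustrative, not the author's code).
proc sumSlices(data: seq[byte], bounds: seq[tuple[a, b: int]]): int =
  for (a, b) in bounds:
    for i in a .. b:       # walk the underlying buffer in place
      result += int(data[i])

let buf = @[byte 1, 2, 3, 4, 5, 6]
assert sumSlices(buf, @[(0, 2), (3, 5)]) == 21  # (1+2+3) + (4+5+6)
```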
I haven't looked at the produced C code nor did any measurements, but in general the compiler is still missing an important type specialization optimization for assignments. This is not far away from being implemented and may lead to improvements of a factor of 2-3 for assignment heavy code for types like PSeqSlice. Perhaps we beat DMD then on more of the tests...
Btw, I am not particularly interested in efficient slices:
I found out about Nimrod today (it sure looks great) and this is one of the first things I checked for.
Just wanted to say that for certain types of programs slices really help. Many linear algebra, machine learning, statistical and other number-crunching algorithms can be expressed more succinctly using slice syntax. Since the algorithms themselves are often written in terms of slices or other sub-arrays of the original, the translation from an algorithmic description to code is less error-prone.
This is one of the reasons for NumPy's popularity. Its semantics make a lot of copying unavoidable, so one has to drop down to C, C++ or Cython.
This is just in case it kindles some interest in efficient slices.
I will dig more into the language as time permits.
P.S. Is there a way to receive new posts/follow-ups in one's inbox?
I think Nimrod's term rewriting macros will help these algorithms more than slices will but we'll see what reality has to say about that. ;-)
PS: Sorry, not yet.