nimforum mirror - An Attempt to Access NumPy Array Data from Nim

zarican [2019-01-29T16:59:53+01:00]

Hi, I am new to Nim, but I plan to invest time in learning it and using it for my scientific computing work. NumPy has an important place in Python for computing tasks, and until we have an equivalent (or better) package in Nim, we may need to delegate some work to NumPy (or vice versa). So, as practice, I tried passing a NumPy array to a Nim proc and modifying its data in place.

Sample Python and Nim code is given below:

Python code:

import numpy as np
import time
from copy import deepcopy

import nimpy_numpy as nn

a = np.random.rand(1000,1000)
b = deepcopy(a)
start_time = time.time()
b[b > 0.5] = 1
time_elapsed = time.time() - start_time
print(time_elapsed)
b = deepcopy(a)
start_time = time.time()
nn.clip_up(b, 0.5)
time_elapsed = time.time() - start_time
print(time_elapsed)

Nim code:

import nimpy
import nimpy/raw_buffers
proc clip_up(o: PyObject, thres: float) {.exportpy.} =
  var aBuf: RawPyBuffer
  o.getBuffer(aBuf, PyBUF_WRITABLE or PyBUF_ND)
  let num_of_items : int = int(aBuf.len / aBuf.itemsize)
  for idx in 0..<num_of_items:
    var x  = cast[ptr cdouble](cast[uint](aBuf.buf) + cast[uint](idx * aBuf.itemsize))
    if x[] > thres:
      x[] = 1
  aBuf.release()

nim compilation:


nim c --opt:speed --app:lib --out:nimpy_numpy.so nimpy_numpy

processing time:

numpy: 0.7431097030639648

nim: 0.9256880283355713

I used nimpy and followed the NumPy example in tpyfromnim.nim under nimpy/tests, which is based on the Python buffer protocol.
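For readers who have not met the buffer protocol mentioned above, it can be explored from pure Python. The sketch below is only an illustration and not part of nimpy: it assumes a standard-library array.array of doubles as a stand-in for the NumPy array (NumPy arrays export the same protocol), and clip_up_py is a hypothetical name. It derives the element count as nbytes // itemsize, the same arithmetic as num_of_items in the Nim proc, and writes through a writable view.

```python
from array import array


def clip_up_py(obj, thres):
    """Set every double above `thres` to 1.0 in place; return the element count.

    Hypothetical pure-Python counterpart of the Nim `clip_up` proc.
    """
    view = memoryview(obj)  # asks `obj` for its buffer
    if view.readonly:
        raise ValueError("buffer is not writable")
    if view.format != "d":
        raise TypeError("expected C doubles, got format %r" % view.format)
    n = view.nbytes // view.itemsize
    # Flatten any C-contiguous shape to 1-D: go through bytes, then back to doubles.
    flat = view.cast("B").cast("d")
    for i in range(n):
        if flat[i] > thres:
            flat[i] = 1.0
    return n


data = array("d", [0.1, 0.6, 0.5, 0.9])
print(clip_up_py(data, 0.5))  # 4
print(list(data))             # [0.1, 1.0, 0.5, 1.0]
```

The Python loop is far slower than either NumPy or Nim. It is only meant to show which pieces of buffer metadata (item size, total byte length, writability) are in play.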

Based on my attempt, I am interested in possible answers to the following questions:

Is there a way to make the pointer cast in var x = cast[ptr cdouble](...) generic, i.e. not tied to cdouble?

Is there a simpler way to access an element using indexes rather than ptr casting?

When compiled with the --opt:speed flag, the speeds of NumPy and Nim are comparable, but the Nim implementation is still slower. What are the possible bottlenecks in the implementation?

Can NumPy data be accessed in a simpler way?

Thanks, Zafer

zarican [2019-01-29T20:06:27+01:00]

With the --d:release flag,


nim c --opt:speed --d:release --app:lib --out:nimpy_numpy.so nimpy_numpy

It is now faster than NumPy.

processing time:

numpy: 0.6880900859832764

nim: 0.5004160404205322
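Single-shot measurements like the ones above can shift noticeably between runs. Here is a minimal sketch of a slightly more robust harness that assumes nothing beyond the standard library. best_of is a hypothetical helper: it times several runs with time.perf_counter and keeps the fastest, building a fresh input for each run outside the timed region.

```python
import time


def best_of(func, make_input, repeats=5):
    """Return the fastest wall-clock time (in seconds) of `func` over `repeats` runs.

    `make_input` builds a fresh argument for every run, so in-place
    modifications made by `func` never leak into the next measurement.
    """
    best = float("inf")
    for _ in range(repeats):
        data = make_input()              # not timed
        start = time.perf_counter()
        func(data)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


# Possible usage with the names from the original post
# (needs NumPy and the compiled nimpy_numpy module, so it is left commented out):
#
#   a = np.random.rand(1000, 1000)
#   def numpy_clip(b):
#       b[b > 0.5] = 1
#   print(best_of(numpy_clip, a.copy))
#   print(best_of(lambda b: nn.clip_up(b, 0.5), a.copy))
```

Taking the minimum rather than the mean is a design choice: it filters out runs that were disturbed by other activity on the machine.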

miran [2019-01-29T21:48:39+01:00]

"Numpy in Python has an important place for computing tasks and until we have an equivalent (or better) package in nim, we may need to delegate some work to NumPy (or vice versa)."

I'm not saying it is equivalent to NumPy, but I had a good experience using Neo for some things for which I would otherwise have used NumPy. See if it is useful for your use cases.

mitai [2019-01-29T22:19:06+01:00]

There is also Arraymancer: https://github.com/mratsim/Arraymancer

zarican [2019-01-30T11:53:29+01:00]

Thanks. Indeed, Neo, Arraymancer and NimTorch are feasible alternatives.

Regarding accessing elements using ptr arithmetic, is there a simpler way?

Is there a generic way for pointer casting (instead of ptr cdouble)?

mratsim [2019-02-01T08:56:58+01:00]

Hey there, author of Arraymancer here.

Accessing elements using ptr arithmetic:

Usually, when you think you need pointer arithmetic, what you actually need is pointer indexing, similar to:

let rawBuffer = [0, 1, 2, 3, 4, 5]

# use unsafeAddr if let variable
# or addr if var variable
# (Security so you don't escape with a pointer to an immutable variable
#  by mistake, say when refactoring)
let a = cast[ptr UncheckedArray[int]](rawBuffer[0].unsafeAddr)

echo a[0]
echo a[4]

i.e. ptr UncheckedArray gives you array semantics on pointers.
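For readers coming from the Python side, ctypes offers a rough analogue of this idea: casting a raw address to a typed pointer gives indexing without bounds checks, much like ptr UncheckedArray. The sketch below mirrors the snippet above, assuming a standard-library array.array of 64-bit integers as the raw buffer. It is an illustration only and says nothing about how Nim implements UncheckedArray.

```python
import ctypes
from array import array

raw_buffer = array("q", [0, 1, 2, 3, 4, 5])  # "q" = signed 64-bit integers

# buffer_info() returns (address of the first element, number of elements).
address, count = raw_buffer.buffer_info()

# Typed pointer over the same memory. Indexing it is NOT bounds-checked,
# so a[count] or beyond would touch memory that does not belong to the array.
# raw_buffer must stay alive (and must not be resized) while `a` is in use.
a = ctypes.cast(address, ctypes.POINTER(ctypes.c_longlong))

print(a[0])  # 0
print(a[4])  # 4

a[4] = 40             # writes go straight into raw_buffer's memory
print(raw_buffer[4])  # 40
```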

For pointer arithmetic, if you really need it, you can follow this post, which sets up a safe, well-defined scope in which pointer arithmetic is allowed: https://forum.nim-lang.org/t/1188#7366

template ptrMath*(body: untyped) =
  template `+`*[T](p: ptr T, off: int): ptr T =
    cast[ptr type(p[])](cast[ByteAddress](p) +% off * sizeof(p[]))

  template `+=`*[T](p: ptr T, off: int) =
    p = p + off

  template `-`*[T](p: ptr T, off: int): ptr T =
    cast[ptr type(p[])](cast[ByteAddress](p) -% off * sizeof(p[]))

  template `-=`*[T](p: ptr T, off: int) =
    p = p - off

  template `[]`*[T](p: ptr T, off: int): T =
    (p + off)[]

  template `[]=`*[T](p: ptr T, off: int, val: T) =
    (p + off)[] = val

  body

when isMainModule:
  ptrMath:
    var a: array[0..3, int]
    for i in a.low..a.high:
      a[i] += i
    var p = addr(a[0])
    p += 1
    p[0] -= 2
    echo p[0], " ", p[1]

Alternatively, for lower overhead this is what I use in Laser, the future very low-level machine and deep learning backend of Arraymancer:

# Warning for pointer arithmetics be careful of not passing a `var ptr`
# to a function as `var` are passed by hidden pointers in Nim and the wrong
# pointer will be modified. Templates are fine.

func `+`*(p: ptr, offset: int): type(p) {.inline.}=
  ## Pointer increment
  {.emit: "`result` = `p` + `offset`;".}

+= should be defined in terms of + because var parameters are passed by hidden pointers, and the wrong pointer will be updated if you use "emit".

zarican [2019-02-02T21:22:16+01:00]

Hi @mratsim,

First, you have done a great job with Arraymancer. Kudos!

Thanks a lot for the detailed explanation of how to use ptr arithmetic to access buffer elements. In particular, casting to UncheckedArray is very useful in my case. The other approaches have many use cases too.

I am going to update my NumPy array access example using the new info you provided.
