1

I have a working and functional python code, but I wander if it could be faster.

We have, in total, four input arrays of the same shape with two dimensions. Three of these arrays contain integers (which will be used as indices), the other array contains doubles.

The function to be optimised needs to add the value of the double-containing array to a 3D array, at a position defined by the indices-containing arrays. My code does this as following:

    array_3D[index_one, index_two, index_three] += array_with_doubles

So the question is: is this an efficient way of programming? I'm not sure, but I hope that the [ ] indexing notation could be replaced by something efficient. This function is called a lot, and takes +- 50% of my executing time (according to snakeviz).

A different strategy could be to reduce the dimensions of the 3D-array, although I can imagine that the code would lose a lot in readability.

hpaulj
  • 221,503
  • 14
  • 230
  • 353
RJ45
  • 11
  • 3
  • If the 4 arrays match in size you may not need the `ravel`. But if you the same indexes repeatedly, operating on `.flat` output array might be faster. – hpaulj Aug 31 '15 at 14:57
  • Oh, indeed - the `ravel` function isn't necessary! I'm now trying to do something with `np.take` to operate, since it should (?) be faster than the `[ ... ] ` fancy indexing. – RJ45 Aug 31 '15 at 15:19
  • You wander. I do too bro. – Derrops Aug 31 '15 at 16:05
  • Is `np.put` any faster? – TheBlackCat Aug 31 '15 at 16:32
  • If you were doing `a[i, j, k] = v` you could use `np.put` but since you're using `a[i, j, k] += v`, I'm not sure if you can. And `np.put` seems slower anyway. – askewchan Aug 31 '15 at 16:35
  • 1
    You use `+=`. Are the indexed points unique (each a different point)? Or can be there be repeats? And if repeats, what is supposed to happen? – hpaulj Aug 31 '15 at 16:40

1 Answers1

1

A simpler 2d case:

In [48]: index1=np.array([1,1,2,2,3,3,4,4]);
     index2=np.array([0,2,1,2,3,4,4,5])
In [49]: data=np.arange(1,9)
In [50]: target=np.zeros((5,6))
In [53]: target[index1,index2]=data

In [54]: target
Out[54]: 
array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  2.,  0.,  0.,  0.],
       [ 0.,  3.,  4.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  5.,  6.,  0.],
       [ 0.,  0.,  0.,  0.,  7.,  8.]])

If you 'ravel' the indexes, you can use put or target.flat:

In [51]: flatindex=np.ravel_multi_index((index1,index2),target.shape)
In [52]: flatindex
Out[52]: array([ 6,  8, 13, 14, 21, 22, 28, 29], dtype=int32)
In [58]: np.put(target,flatindex,data)
In [61]: target.flat[flatindex]=data

Some quick time comparisons (for =data, not +=data):

In [63]: timeit target[index1,index2]=data
100000 loops, best of 3: 6.63 µs per loop

In [64]: timeit np.put(target,flatindex,data)
100000 loops, best of 3: 2.47 µs per loop

In [65]: timeit target.flat[flatindex]=data
100000 loops, best of 3: 2.77 µs per loop

In [66]: %%timeit
   ....: flatindex=np.ravel_multi_index((index1,index2),target.shape)
   ....: target.flat[flatindex]=data
   ....: 
100000 loops, best of 3: 7.34 µs per loop

target.flat[]= is the winner - if the raveled index is already available. This could be the case if you are repeatedly applying this calculation with the same index arrays. Keep in mind that timetests on small arrays might not scale the same with big ones.

With += instead, put does not work. flat has a speed advantage, even when the ravel has to be calculated:

In [78]: timeit target[index1,index2]+=data
100000 loops, best of 3: 16.2 µs per loop

In [79]: timeit target.flat[flatindex]+=data
100000 loops, best of 3: 7.45 µs per loop

In [80]: %%timeit                          
flatindex=np.ravel_multi_index((index1,index2),target.shape)
target.flat[flatindex]+=data
   ....: 
100000 loops, best of 3: 13.4 µs per loop

HOWEVER - if the indexes have repeats, and you want all the data values to be added, the problem changes significantly. Direct indexing like this uses buffering, so only the last addition for a point applies.

See this recent SO question for discussion of this buffering issue and alternatives

Vector operations with numpy

Community
  • 1
  • 1
hpaulj
  • 221,503
  • 14
  • 230
  • 353