3

I have the following problem: I have index arrays with repeating indices and would like to add values to an array like this:

grid_array[xidx[:],yidx[:],zidx[:]] += data[:]

However, as I have repeated indices this does not work as it should because numpy will create a temporary array which results in the data for the repeated indices being assigned several times instead of being added to each other (see http://docs.scipy.org/doc/numpy/user/basics.indexing.html).

A for loop like

for i in range(0,n):
    grid_array[xidx[i],yidx[i],zidx[i]] += data[i]

will be way to slow. Is there a way I can still use the vectorization of numpy? Or is there another way to make this assignment faster?

Thanks for your help

numberCruncher
  • 595
  • 1
  • 6
  • 25

4 Answers4

3

How about using bincount?

import numpy as np

flat_index = np.ravel_multi_index([xidx, yidx, zidx], grid_array.shape)
datasum = np.bincount(flat_index, data, minlength=grid_array.size)
grid_array += datasum.reshape(grid_array.shape)
Bi Rico
  • 25,283
  • 3
  • 52
  • 75
2

This is a buffering issue. The .at provides unbuffered action http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.at.html#numpy.ufunc.at

np.add.at(grid_array, (xidx,yidx,zidx),data)
hpaulj
  • 221,503
  • 14
  • 230
  • 353
  • Thanks for making me aware of this. – numberCruncher Mar 20 '15 at 20:59
  • Thanks for the great answer. This helped me a lot! And I think this answer is much better than the currently accepted one... In fact, if I ever get >= 150 points, I will award you 50 as a bounty! :) – PhoemueX May 13 '16 at 14:11
0

For add an array to elements of a nested array you just can do grid_array[::]+=data :

>>> grid_array=np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> data=np.array([3,3,3])
>>> grid_array[::]+=data
>>> grid_array
array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
Mazdak
  • 105,000
  • 18
  • 159
  • 188
-1

I think I found a possible solution:

def assign(xidx,yidx,zidx,data):
    grid_array[xidx,yidx,zidx] += data
    return

map(assign,xidx,yidx,zidx,sn.part0['mass'])
numberCruncher
  • 595
  • 1
  • 6
  • 25
  • `map` just hides the loop. – hpaulj Mar 20 '15 at 15:29
  • I guess it must be handled differently than the for loop, as it runs a lot faster. – numberCruncher Mar 20 '15 at 15:31
  • In my small tests I don't get much difference in time with various looping constructs. – hpaulj Mar 20 '15 at 16:46
  • Using map for in place operations (often called side-effects) is a bad idea. If for no other reason, you should avoid it because in python3 map returns a generator and you'll need to write `for item in map(assign,xidx,yidx,zidx,sn.part0['mass']): pass` which is really ugly. – Bi Rico Mar 20 '15 at 19:34