Depending on the size of the inputs and your time constraints, both methods are worth considering.
Method 1: NumPy Broadcasting
- Operations on two arrays are possible if the arrays are compatible
- Such operations are generally performed together with broadcasting
- In layman's terms, broadcasting means repeating elements along a specified axis
- Conditions for broadcasting
  - The arrays must be compatible
  - Compatibility is decided by their shapes
  - Shapes are compared from right to left
  - Going from right to left, each pair of dimensions must either be equal, or one of them must be 1
  - The smaller array is broadcast (conceptually repeated) over the bigger array
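For concreteness, arrays with the shapes used below could be constructed like this (the specific values are an assumption, chosen to match the result shown further down):

```python
import numpy as np

a = np.array([[-3, -3, -3],
              [ 0,  0,  0]])   # shape (2, 3)
b = np.array([[ 5, 10, 16]])   # shape (1, 3)

# Compare shapes right to left: 3 vs 3 (equal), then 2 vs 1 (one is 1)
# -> the arrays are compatible, so a + b broadcasts b over a's rows
print(a.shape, b.shape)
```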
a.shape, b.shape
((2, 3), (1, 3))
By these rules the arrays are compatible, so they can be added. b is the smaller one, so it is repeated along the first dimension and can be treated as [[ 5, 10, 16], [ 5, 10, 16]]. Note that NumPy does not allocate new memory for this repetition; it is just a view.
a + b
array([[ 2,  7, 13],
       [ 5, 10, 16]])
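One way to see that no new memory is allocated is np.broadcast_to, which returns a read-only view whose repeated axis has stride 0 (a small sketch, assuming b has shape (1, 3) as above):

```python
import numpy as np

b = np.array([[5, 10, 16]])       # shape (1, 3)
bb = np.broadcast_to(b, (2, 3))   # b "repeated" to shape (2, 3)

print(bb)          # both rows read from the same memory as b
print(bb.strides)  # stride 0 along axis 0: stepping to the next row goes nowhere
```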
Method 2: Numba
- Numba provides easy parallelism
- It compiles the function to optimized machine code
- The motivation: NumPy broadcasting is sometimes not good enough. Ufuncs (np.add, np.matmul, etc.) allocate temporary memory during operations, which can be time-consuming if you are already near your memory limits
- Depending on your requirements, with Numba you may be able to skip the temporary allocations and the various checks NumPy performs, which can speed up code for huge inputs. See, for example, "Why are np.hypot and np.subtract.outer very fast?"
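Even without Numba, one way to avoid a fresh temporary on every call is the `out=` argument of NumPy ufuncs, which writes the result into a preallocated buffer (a minimal sketch with made-up array sizes):

```python
import numpy as np

a = np.random.rand(1000, 1000)
b = np.random.rand(1, 1000)
out = np.empty_like(a)   # allocate once, reuse across many calls

np.add(a, b, out=out)    # result lands in `out`; no temporary array per call
```

In a loop that performs the same operation repeatedly, reusing `out` keeps peak memory flat instead of allocating a new result array each iteration.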
import numba as nb
import numpy as np

@nb.njit(parallel=True)
def numba_sum(a, b):   # renamed from `sum` to avoid shadowing the built-in
    s = np.empty(a.shape, dtype=a.dtype)
    # nb.prange hints to Numba which loop to parallelize
    for i in nb.prange(a.shape[0]):
        s[i] = a[i] + b[0]   # b has shape (1, 3), so add its single row
    return s

numba_sum(a, b)