Depending on the size of the inputs and your time constraints, both methods are worth considering.
Method 1: NumPy Broadcasting
- Operations on two arrays are possible if the arrays are compatible
- Such operations are generally performed together with broadcasting
- In layman's terms, broadcasting means repeating elements along a specified axis
- Conditions for broadcasting
  - The arrays must be compatible
  - Compatibility is decided by their shapes
  - Shapes are compared from right to left
  - Going from right to left, each pair of dimensions must either be equal, or one of them must be 1
  - The smaller array is broadcast (conceptually repeated) over the bigger array
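For concreteness, arrays with the shapes used below could be constructed like this (the specific values are an assumption, chosen to match the result shown further down):

```python
import numpy as np

a = np.array([[-3, -3, -3],
              [ 0,  0,  0]])   # shape (2, 3)
b = np.array([[ 5, 10, 16]])   # shape (1, 3)

# Compare shapes right to left: 3 vs 3 (equal), then 2 vs 1 (one is 1)
# -> the arrays are compatible, so a + b broadcasts b over a's rows
print(a.shape, b.shape)
```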
a.shape, b.shape
((2, 3), (1, 3))
By these rules the arrays are compatible, so they can be added. b is the smaller one, so it is repeated along the first dimension and can be treated as [[ 5, 10, 16], [ 5, 10, 16]]. Note that NumPy does not allocate new memory for this repetition; it is just a view.
a + b
array([[ 2,  7, 13],
       [ 5, 10, 16]])
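One way to see that no new memory is allocated is np.broadcast_to, which returns a read-only view whose repeated axis has stride 0 (a small sketch, assuming b has shape (1, 3) as above):

```python
import numpy as np

b = np.array([[5, 10, 16]])       # shape (1, 3)
bb = np.broadcast_to(b, (2, 3))   # b "repeated" to shape (2, 3)

print(bb)          # both rows read from the same memory as b
print(bb.strides)  # stride 0 along axis 0: stepping to the next row goes nowhere
```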
Method 2: Numba
- Numba provides easy parallelism
- It compiles the function to optimized machine code
- The motivation: NumPy broadcasting is sometimes not good enough. Ufuncs (np.add, np.matmul, etc.) allocate temporary memory during operations, which can be time-consuming if you are already near your memory limits
- Depending on your requirements, with Numba you may be able to skip the temporary allocations and the various checks NumPy performs, which can speed up code for huge inputs. See, for example, "Why are np.hypot and np.subtract.outer very fast?"
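Even without Numba, one way to avoid a fresh temporary on every call is the `out=` argument of NumPy ufuncs, which writes the result into a preallocated buffer (a minimal sketch with made-up array sizes):

```python
import numpy as np

a = np.random.rand(1000, 1000)
b = np.random.rand(1, 1000)
out = np.empty_like(a)   # allocate once, reuse across many calls

np.add(a, b, out=out)    # result lands in `out`; no temporary array per call
```

In a loop that performs the same operation repeatedly, reusing `out` keeps peak memory flat instead of allocating a new result array each iteration.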
import numba as nb
import numpy as np

@nb.njit(parallel=True)
def numba_sum(a, b):   # renamed from `sum` to avoid shadowing the built-in
    s = np.empty(a.shape, dtype=a.dtype)
    # nb.prange hints to Numba which loop to parallelize
    for i in nb.prange(a.shape[0]):
        s[i] = a[i] + b[0]   # b has shape (1, 3), so add its single row
    return s

numba_sum(a, b)