0

I am using theano.clip to limit the values of my NumPy array. For example:

import numpy as np
import theano.tensor as T

array = np.array([[  1.,  -1.,  -3.,   1.,   1.],
                  [  3.,  -4.,  -5.,   0.,  -1.],
                  [  8.,  -3.,  -7.,  -3.,  -3.],
                  [  8.,   2.,  -2.,  -3.,  -3.],
                  [  7.,   0.,   0.,   1.,   0.]])

max_val = np.array([2.0]).astype('float32')

T.clip(array, -max_val, max_val).eval()

Output:

array([[ 1., -1., -2.,  1.,  1.],
       [ 2., -2., -2.,  0., -1.],
       [ 2., -2., -2., -2., -2.],
       [ 2.,  2., -2., -2., -2.],
       [ 2.,  0.,  0.,  1.,  0.]])

I want to count how many values were changed by the clipping operation. Is that possible?

blackbug
  • Before or after, whichever is possible. I need to limit the values, but for statistics I also need to calculate how much data is lost by clipping. – blackbug Feb 27 '17 at 12:26

2 Answers

1

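If your array is named a, you can do

np.logical_or(a > max_val, a < -max_val).sum()

No element is counted twice, since -max_val < max_val, so the two conditions cannot both hold. Strict inequalities are used because values equal to a limit are left unchanged by the clip. However, this requires two passes over a.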
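The counting idea in this answer can be sketched on the question's sample array, taking -max_val and max_val as the limits. This is only an illustration: it assumes a plain scalar max_val, and it uses np.clip as a stand-in for T.clip purely to cross-check the count without needing Theano.

```python
import numpy as np

a = np.array([[1., -1., -3., 1., 1.],
              [3., -4., -5., 0., -1.],
              [8., -3., -7., -3., -3.],
              [8., 2., -2., -3., -3.],
              [7., 0., 0., 1., 0.]])
max_val = 2.0  # assumption: scalar limit, same magnitude as in the question

# Elements strictly outside [-max_val, max_val] are the ones clipping changes.
n_clipped = int(np.logical_or(a > max_val, a < -max_val).sum())

# Cross-check: count the positions where the clipped array differs from the input.
# np.clip is used here as a stand-in for T.clip (an assumption for illustration).
clipped = np.clip(a, -max_val, max_val)
n_changed = int((a != clipped).sum())

print(n_clipped, n_changed)
```

Both counts agree because a value changes under clipping exactly when it lies strictly outside the limits.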

P. Camilleri
1

Here's one approach: use np.count_nonzero on a boolean mask that marks the values falling below the lower limit or above the upper limit:

np.count_nonzero((array < -max_val) | (array > max_val))

np.count_nonzero is a good fit here because it counts the True entries of a boolean mask efficiently.

Alternatively, since the lower and upper limits here are just the negative and positive of the same number, a shorter version compares absolute values against that limit:

np.count_nonzero(np.abs(array) > max_val)

Sample run:

In [267]: array
Out[267]: 
array([[ 1., -1., -3.,  1.,  1.],
       [ 3., -4., -5.,  0., -1.],
       [ 8., -3., -7., -3., -3.],
       [ 8.,  2., -2., -3., -3.],
       [ 7.,  0.,  0.,  1.,  0.]])

In [268]: max_val = np.array([2.0]).astype('float32')

In [269]: np.count_nonzero((array < -max_val) | (array > max_val))
Out[269]: 13

In [270]: np.count_nonzero(np.abs(array) > max_val)
Out[270]: 13
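Since the question asks for the count for statistics purposes, the same masks can also be split per side and turned into a fraction of the data. The sketch below is one possible way to do that; the helper name clip_stats and its return layout are hypothetical, not part of NumPy or Theano.

```python
import numpy as np

def clip_stats(values, limit):
    """Hypothetical helper: count values below -limit and above +limit.

    Returns (n_low, n_high, fraction_clipped).
    """
    values = np.asarray(values)
    n_low = int(np.count_nonzero(values < -limit))   # would be raised to -limit
    n_high = int(np.count_nonzero(values > limit))   # would be lowered to +limit
    return n_low, n_high, (n_low + n_high) / values.size

array = np.array([[1., -1., -3., 1., 1.],
                  [3., -4., -5., 0., -1.],
                  [8., -3., -7., -3., -3.],
                  [8., 2., -2., -3., -3.],
                  [7., 0., 0., 1., 0.]])

n_low, n_high, frac = clip_stats(array, 2.0)
print(n_low, n_high, frac)
```

For the sample array this splits the 13 clipped values into those below the lower limit and those above the upper limit.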
Divakar
  • Thanks for the answer. I am currently using `np.where`: `array[np.where(abs(array) > max_val)]`. Which one do you think will be faster? – blackbug Feb 27 '17 at 12:48
  • @blackbug Don't use `np.where`; index with the mask directly: `array[abs(array) > max_val]`. Note that this returns the elements themselves, not their count, so it is a different operation from counting the clipped elements. – Divakar Feb 27 '17 at 12:49
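The distinction drawn in these comments can be illustrated with a short sketch on the question's sample data. It only shows that indexing with the mask and indexing through np.where select the same elements, and that the count comes from the mask itself; it makes no claim about relative speed.

```python
import numpy as np

array = np.array([[1., -1., -3., 1., 1.],
                  [3., -4., -5., 0., -1.],
                  [8., -3., -7., -3., -3.],
                  [8., 2., -2., -3., -3.],
                  [7., 0., 0., 1., 0.]])
max_val = np.array([2.0]).astype('float32')

mask = np.abs(array) > max_val       # boolean array, same shape as `array`

outliers = array[mask]               # the out-of-range elements themselves
via_where = array[np.where(mask)]    # same elements, via explicit index arrays

count = np.count_nonzero(mask)       # the number of elements that would be clipped

print(outliers)
print(count)
```

Here outliers is a 1-D array of the values that would be clipped, while count is the single number the question asks for.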