I have a question similar to one that was already answered, with a slight modification: I have a 2-D NumPy array with values -1, 0, and 1. I want to find the sizes of the clusters of elements with value 1 and of elements with value -1 (separately). Do you have any suggestions on how to do this? Thanks!
2 Answers
Are you trying to count the number of 1s and -1s in the 2-D array?
If so, it is fairly easy. Say
import numpy as np
ar = np.array([1, -1, 1, 0, -1])
Then
n_1 = np.count_nonzero(ar == 1)    # number of elements equal to 1
n_m1 = np.count_nonzero(ar == -1)  # number of elements equal to -1
This also works unchanged when ar is a 2-D array.

tomkaith13

Canrong Qiu
- Thanks for the reply. This is actually not the problem I'm trying to solve. I have a 2-D array with, let's say, a random arrangement of -1, 0, 1 values. If two connected elements in the array have the same value, we say that is a cluster of size 2. For example, [1 0 0 0 \n 1 1 0 0 \n 0 0 0 0] has a cluster of size 3 in it, since 3 ones are connected. I want to find the clusters of type 1 and the clusters of type -1 in my array. – Nir Livne Mar 29 '20 at 12:01
- Do you also want to distinguish between different arrangements of clusters with the same size? The cluster in [1 0 0 0 \n 1 1 0 0 \n 0 0 0 0] has the same size as the one in [1 1 1 0 \n 0 0 0 0 \n 0 0 0 0], but they are arranged differently in the 2-D array. – Canrong Qiu Mar 29 '20 at 12:21
- I don't mind the arrangement, I just want the size of the clusters. There's a good way to find clusters using the reference I've added in the initial question, but I can't figure out how to distinguish the different clusters by their type (-1 or 1) and to ignore the 0's. Thanks for taking the time to help!! – Nir Livne Mar 29 '20 at 13:07
This is my solution:
import numpy as np

arr = np.array([[1, 1, -1, 0, 1],
                [1, 1, 0, 1, 1],
                [0, 1, 0, 1, 0],
                [-1, -1, 1, 0, 1]])
print(arr)

n_rows, n_cols = arr.shape
# True means "not visited yet"
mask_ = np.ones((n_rows, n_cols), dtype=bool)
cluster_size = {}

def find_neighbor(arr, mask, row_index, col_index):
    """Return the unvisited left/right/top/bottom neighbors that hold the
    same value as (row_index, col_index), marking them as visited."""
    index_holder = []
    n_rows, n_cols = arr.shape
    candidates = [(row_index, col_index - 1),  # left
                  (row_index, col_index + 1),  # right
                  (row_index - 1, col_index),  # top
                  (row_index + 1, col_index)]  # bottom
    for r, c in candidates:
        inside = 0 <= r < n_rows and 0 <= c < n_cols
        if inside and mask[r, c] and arr[r, c] == arr[row_index, col_index]:
            mask[r, c] = False
            index_holder.append((r, c))
    return mask, index_holder

for i in range(n_rows):
    for j in range(n_cols):
        if not mask_[i, j]:
            continue  # already assigned to a cluster
        value = arr[i, j]
        mask_[i, j] = False
        index_to_check = [(i, j)]
        kk = 1  # size of the current cluster
        # Grow the cluster outward until no new neighbors are found
        while index_to_check:
            new_indices = []
            for each in index_to_check:
                mask_, temp_index = find_neighbor(arr, mask_, each[0], each[1])
                new_indices += temp_index
                kk += len(temp_index)
            index_to_check = new_indices
        key = (int(value), kk)
        cluster_size[key] = cluster_size.get(key, 0) + 1

print(cluster_size)
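cluster_size is a dictionary whose keys are two-element tuples (a, b): a is the value the cluster is made of (that's what you want to distinguish, right?), and b is the size of the cluster, i.e. the number of connected elements with that value. The dictionary value for each key is the number of clusters with that value and size. Note that clusters of 0 are counted as well, so skip the keys with a == 0 if you only care about 1 and -1.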
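For comparison, the same cluster sizes can probably be obtained with an existing labelling routine. The following is only a minimal sketch, assuming SciPy is installed: it applies scipy.ndimage.label to a boolean mask per value, so the 0's are ignored automatically. The helper name cluster_sizes is made up for this illustration. By default, label connects only left/right/top/bottom neighbors in 2-D, which matches the neighbor rule used in the solution above.

```python
import numpy as np
from scipy import ndimage

def cluster_sizes(arr, value):
    """Sizes of the 4-connected clusters of `value` in a 2-D array.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # Label connected regions of the boolean mask; 0 is the background label
    labels, n_clusters = ndimage.label(arr == value)
    # bincount index 0 counts the background cells, so drop it
    return np.bincount(labels.ravel())[1:].tolist()

arr = np.array([[1, 1, -1, 0, 1],
                [1, 1, 0, 1, 1],
                [0, 1, 0, 1, 0],
                [-1, -1, 1, 0, 1]])

for value in (1, -1):
    print(value, sorted(cluster_sizes(arr, value)))
```

To count diagonal neighbors as connected too, a 3x3 array of ones can be passed as the structure argument of ndimage.label.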

Canrong Qiu
- Wow, that works perfectly! I was thinking that there should be some way to use the existing cluster analysis methods, but yours is great! Thanks! – Nir Livne Mar 30 '20 at 09:20