I have a question similar to one that has already been answered, with a slight modification: I have a 2-D NumPy array with values -1, 0, and 1. I want to find the sizes of the clusters of elements with value 1 and of the clusters of elements with value -1 (separately). Do you have any suggestions on how to do this? Thanks!

Nir Livne

2 Answers

Are you trying to count how many 1s and -1s the 2-D array contains?

If so, then it is fairly easy. Say ar = np.array([1, -1, 1, 0, -1]). Then (this also works for arrays of any shape, including 2-D):

 n_1 = np.count_nonzero(ar == 1)    # number of elements equal to 1
 n_m1 = np.count_nonzero(ar == -1)  # number of elements equal to -1
tomkaith13
Canrong Qiu
  • Thanks for the reply. This is actually not the problem I'm trying to solve. I have a 2-D array with, let's say, a random arrangement of -1, 0, 1 values. If two connected elements in the array have the same value, we say that is a cluster of size 2. For example: [1 0 0 0 \n 1 1 0 0 \n 0 0 0 0] has a cluster of size 3 in it, since 3 ones are connected. I want to find the clusters of type 1 and the clusters of type -1 in my array. – Nir Livne Mar 29 '20 at 12:01
  • Do you also want to distinguish between different arrangements of clusters with the same size? The cluster in [1 0 0 0 \n 1 1 0 0 \n 0 0 0 0] has the same size as the one in [1 1 1 0 \n 0 0 0 0 \n 0 0 0 0], but they are arranged differently in the 2-D array. – Canrong Qiu Mar 29 '20 at 12:21
  • I don't mind the arrangement, I just want the size of the clusters. There's a good way to find clusters using the reference I've added in the initial question, but I can't figure out how to distinguish the different clusters by their type (-1 or 1) and to ignore the 0's. Thanks for taking the time to help!! – Nir Livne Mar 29 '20 at 13:07

This is my solution:

import numpy as np
import copy

arr = np.array([[1,1,-1,0,1],[1,1,0,1,1],[0,1,0,1,0],[-1,-1,1,0,1]])
print(arr)
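# Note: arr.shape is (n_rows, n_cols), so "col" holds the row count and "row" the column count.
col, row = arr.shape
# Mask of cells not yet assigned to a cluster (True = not visited).
mask_ = np.ma.make_mask(np.ones((col,row)))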

cluster_size = {}

def find_neighbor(arr, mask, col_index, row_index):
    index_holder = []

    col, row = arr.shape
    left = (col_index, row_index-1)
    right = (col_index,row_index+1)
    top = (col_index-1,row_index)
    bottom = (col_index+1,row_index)

    left_ = row_index-1>=0
    right_ = (row_index+1)<row
    top_ = (col_index-1)>=0
    bottom_ = (col_index+1)<col

    #print(list(zip([left,right,top,bottom],[left_,right_,top_,bottom_])))
    for each in zip([left,right,top,bottom],[left_,right_,top_,bottom_]):
        if each[-1]:
            if arr[col_index,row_index]==arr[each[0][0],each[0][1]] and mask[each[0][0],each[0][1]]:
                mask[each[0][0],each[0][1]] = False
                index_holder.append(each[0])

    return mask,index_holder

for i in range(col):
    for j in range(row):
        if mask_[i,j] == False:
            pass
        else:
            value = arr[i,j]
            mask_[i,j] = False
            index_to_check = [(i,j)]
            kk=1
            while len(index_to_check)!=0:
                index_to_check_deepcopy = copy.deepcopy(index_to_check)
                for each in index_to_check:
                    mask_, temp_index = find_neighbor(arr,mask_,each[0],each[1])
                    index_to_check = index_to_check + temp_index
                    # print("check",each,temp_index,index_to_check)
                    kk+=len(temp_index)
                for each in index_to_check_deepcopy:
                    del(index_to_check[index_to_check.index(each)])
            if (value,kk) in cluster_size:
                cluster_size[(value,kk)] = cluster_size[(value,kk)] + 1
            else:
                cluster_size[(value,kk)] = 1
print(cluster_size)

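cluster_size is a dictionary. Each key is a two-member tuple (a, b): a is the value the cluster is made of (that is what you want to distinguish, right?) and b is the cluster size, i.e. the number of connected elements with that value. The value stored under each key is the number of clusters with that value and size. Clusters of 0 are counted as well, so skip the keys with a == 0 if only the -1 and 1 clusters are of interest.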
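
For comparison, here is a minimal sketch of how the same sizes might be obtained with an existing labeling routine, assuming SciPy is available. This is not the method used in the code above: it swaps in scipy.ndimage.label, which with its default structure connects only horizontal and vertical neighbors in 2-D. Labeling arr == value separately for each value keeps the 1 and -1 clusters apart and treats the 0s as background. The helper name cluster_sizes is made up for illustration.

```python
import numpy as np
from scipy import ndimage

arr = np.array([[1, 1, -1, 0, 1],
                [1, 1, 0, 1, 1],
                [0, 1, 0, 1, 0],
                [-1, -1, 1, 0, 1]])

def cluster_sizes(arr, value):
    """Return the sorted sizes of all 4-connected clusters of `value`.

    Hypothetical helper for illustration, not part of any library.
    """
    # The boolean mask selects one cluster type; everything else
    # (zeros and the other type) becomes background.
    labeled, n_clusters = ndimage.label(arr == value)
    # bincount counts cells per label; index 0 is the background, so drop it.
    sizes = np.bincount(labeled.ravel(), minlength=n_clusters + 1)[1:]
    return sorted(sizes.tolist())

print(cluster_sizes(arr, 1))   # [1, 1, 4, 5]
print(cluster_sizes(arr, -1))  # [1, 2]
```

If diagonal neighbors should also count as connected, passing structure=np.ones((3, 3)) to ndimage.label is one way to do that.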

Canrong Qiu
  • Wow! That works perfectly! I was thinking that there should be some way to use the existing cluster analysis methods, but yours is great! Thanks! – Nir Livne Mar 30 '20 at 09:20