
Suppose we have an NxN matrix of numbers in which every row and every column is in increasing order, and we want to find whether it contains a value v. One algorithm is to perform a binary search on the middle row to find the elements closest in value to v: M[row,col] < v < M[row,col+1] (if we find v exactly, the search is complete). Since the matrix is sorted, we know that v is larger than all elements in the sub-matrix M[0..row, 0..col] (the top-left quadrant of the matrix), and similarly it's smaller than all elements in the sub-matrix M[row..N-1, col+1..N-1] (the bottom-right quadrant). So we can recursively search the top-right quadrant M[0..row-1, col+1..N-1] and the bottom-left quadrant M[row+1..N-1, 0..col].

The question is: what is the complexity of this algorithm?

Example: Suppose we have the 5x5 matrix shown below and we are searching for the number 25:

0 10 20 30 40
1 11 21 31 41
2 12 22 32 42
3 13 23 33 43
4 14 24 34 44

In the first iteration we perform binary search on the middle row and find that the closest element smaller than 25 is 22 (at row=2, col=2). So now we know 25 is larger than all items in the top-left 3x3 quadrant:

0 10 20
1 11 21
2 12 22

Similarly, we know 25 is smaller than all elements in the bottom-right 3x2 quadrant:

32 42
33 43
34 44

So, we recursively search the remaining quadrants - the top right 2x2:

30 40
31 41

and the bottom left 2x3:

3 13 23
4 14 24

And so on. We essentially divide the matrix into 4 quadrants (which might be of different sizes depending on the result of the binary search on the middle row), and then recursively search two of the quadrants.

OmG
gen-y-s
  • What do you want: time complexity, or efficiency in comparison to other algorithms? – rohankvashisht Jul 26 '15 at 07:10
  • time complexity compared to the well known algorithm of walking the matrix from the top right corner, by comparing the value to the current cell, and moving down if it's larger, or left if it's smaller. This algorithm is O(N), so I was wondering if the recursive partitioning algorithm I described in the original question is better or worse than O(N). – gen-y-s Jul 26 '15 at 08:03
  • Related: [How do I search for a number in a 2d array sorted left to right and top to bottom?](https://stackoverflow.com/q/2457792) – Bernhard Barker Jan 02 '18 at 13:19

6 Answers


The worst-case running time is Theta(n). Certainly this is as good as it gets for correct algorithms (consider an anti-diagonal, with elements less than v above and elements greater than v below). As far as upper bounds go, the bound for an n-row, m-column matrix is O(n log(2 + m/n)), as evidenced by the correct recurrence

                  m-1
f(n, m) = log m + max [f(n/2, j) + f(n/2, m-1 - j)],
                  j=0

where there are two sub-problems, not one. This recurrence is solvable by the substitution method.

        ?
f(n, m) ≤ c n log(2 + m/n) - log(m) - 2  [hypothesis; c to be chosen later]

                  m-1
f(n, m) ≤ log m + max [f(n/2, j) + f(n/2, m-j)]
                  j=0

                    m-1
        ≤   log m + max [  c (n/2) log(2 + j/(n/2)) - log(j) - 2
                         + c (n/2) log(2 + (m-j)/(n/2)) - log(m-j) - 2]
                    j=0

        [fixing j = m/2 by the concavity of log]

        ≤ log m + c n log(2 + m/n) - 2 log(m/2) - 4

        = log m + c n log(2 + m/n) - 2 log(m) - 2

        = c n log(2 + m/n) - log(m) - 2.

Set c large enough that, for all n, m,

c n log(2 + m/n) - log(m) - 2 ≥ log(m),

where log(m) is the cost of the base case n = 1.

David Eisenstat

The complexity of this algorithm will be:

O(log2(n*n)) = O(log2(n))

This is because you are eliminating half of the matrix in one iteration.

EDIT:

Recurrence relation:

Assuming n to be the total number of elements in the matrix,

=> T(n) = T(n/2) + log(sqrt(n))
=> T(n) = T(n/2) + log(n^(1/2))
=> T(n) = T(n/2) + 1/2 * log(n)

Here, a = 1, b = 2. Therefore, c = logb(a) = log2(1) = 0
=> n^c = n^0
Also, f(n) = n^0 * 1/2 * log(n)

According to case 2 of Master Theorem,

T(n) = O((log(n))^2)

Anmol Singh Jaggi
  • You are right! Updated the complexity calculations. – Anmol Singh Jaggi Jul 26 '15 at 12:53
  • The answer is incorrect - the result should be O(sqrt(n)). Please also see my posted answer. The wrong starting point is: T(n)=T(n/2)+... which should actually be T(n)=2*T(n/4)+... - leading to a wrong ("too good") result. – eci Jan 22 '21 at 13:15

If you find your element after n steps, then the searchable range has size N = 4^n. Then, time complexity is O(log base 4 of N) = O(log N / log 4) = O(0.5 * log N) = O(log N).

In other words, your algorithm is two times faster than binary search, which is O(log N).

omikad

A consideration on binary search on matrices:

Binary search on a 2D matrix (and in general on an ND matrix) is nothing different from binary search on a sorted 1D vector, provided the matrix is sorted in row-major order (note that this is a stronger condition than having sorted rows and columns, as in the question). In fact, C for instance stores matrices in row-major fashion (as a concatenation of rows: [[row0],[row1],...,[rowk]]). This means one can use the well-known binary search on the matrix as follows (with complexity log(n*m)):

template<typename T>
bool binarySearch_2D(T target, T** matrix, int rows, int cols){
    int a = 0; int b = rows * cols - 1;   // flat indices into the matrix
    while(a <= b){
        int half = a + (b - a) / 2;       // midpoint, written to avoid overflow
        int r = half / cols;              // row of the flat index
        int c = half % cols;              // column of the flat index
        T v = matrix[r][c];
        if(v == target)
            return true;
        else if(target > v)
            a = half + 1;
        else // target < v
            b = half - 1;
    }
    return false;
}
Davide Spataro

You can use a recursive function and apply the master theorem to find the complexity.

Assume n is the number of elements in the matrix. The cost of one step is a binary search on sqrt(n) elements, and you get two sub-problems, in the worst case of the same size, each with n/4 elements: 2*T(n/4). So we have:

T(n)=2*T(n/4)+log(sqrt(n))

equal to

T(n)=2*T(n/4)+log(n)/2

Now apply master theorem case 1 (a=2, b=4, f(n)=log(n)/2; f(n) is in O(n^(log_b(a) - eps)) = O(n^(1/2 - eps)) for some eps > 0, therefore we have case 1):

=> Total running time T(n) is in O(n^(log_b(a))) = O(n^(log_4(2))) = O(n^(1/2))

or equal to

O(sqrt(n))

which is equal to height or width of the matrix if both sides are the same.

eci

Let's assume that we have the following matrix:

1 2 3
4 5 6
7 8 9

Let's search for value 7 using binary search as you specified:

Search nearest value to 7 in middle row: 4 5 6, which is 6. Hmm we have a problem, 7 is not in the following submatrix:

6
9

So what to do? One solution would be to apply binary search to all rows, which has a complexity of N*log(N). So walking the matrix is a better solution.

Edit:

Recursion relation:

T(N*N) = T(N*N/2) + log(N)

if we normalize the function to one variable with M = N^2:

T(M) = T(M/2) + log(sqrt(M))
T(M) = T(M/2) + log(M)/2

According to Master Theorem Case #2, complexity is

(log(M))^2
=> (2log(N))^2
=> (log(N))^2

Edit 2:

Sorry, I answered your question from my mobile. Now that I think about it, M[0...row-1, col+1...N-1] doesn't make much sense, right? Consider my example: if you search for a value that is smaller than all values in the middle row, you'll always end up with the leftmost number. Similarly, if you search for a value that is greater than all values in the middle row, you'll end up with the rightmost number. So the algorithm can be reworded as follows:

Search middle row with custom binary search that returns 1 <= idx <= N if found, idx = 0 or idx = N+1 if not found. After binary search if idx = 0, start the search in the upper submatrix: M[0...row][0...N]. If the index is N + 1 start the search in the lower submatrix: M[row+1...N][0...N]. Otherwise, we are done.

You suggest that the complexity should be: 2T(M/4) + log(M)/2. But at each step, we divide the whole matrix in two and only process one of the halves. Moreover, if you agree that T(N*N) = T(N*N/2) + log(N) is correct, then you can substitute all N*N expressions with M.

mostruash
  • No, you got the algorithm wrong. After we search for 7 in the middle row, and find 6 as the closest element smaller than 7 (at row=1 col=2), we recursively search M[0..row-1, col+1..N-1] and M[row+1..N-1, 0..col]. The first one is M[0..0, 3..2] which is out-of-bounds, and the second one is M[2..2, 0..2] which is the whole last row. So in the recursive step you search the last row and find 7. – gen-y-s Jul 26 '15 at 11:17
  • Hmm... I think if you use M=N^2, then the recursive formula should be T(M)=2T(M/4)+log(sqrt(M)) – gen-y-s Jul 26 '15 at 12:25