numpy - sort matrix of arrays with another matrix of arrays

Question

I have a matrix called inference, made up of arrays that store itemIds. Each array represents the recommendations for one customer. Here is a snapshot of just one array.

inference[0] = array([ 1, 17,  0, 29, 33, 10, 23, 18,  4, 25, 37, 41, 19,  7, 45, 44, 28,
        5, 21, 30, 27,  6, 16, 32,  3, 46, 47, 11, 24, 35, 39, 15, 22, 31,
       43], dtype=int32)

I also have a matrix called score, made up of arrays that store the prediction score for the itemId at the corresponding index. Each array holds the itemId scores for one customer. Here is a snapshot of just one array.

score[0] = array([ 4.66423448e-01,  3.04435879e-01,  1.20756114e-01,  7.42338740e-03,
        1.00917931e-02,  3.40771784e-02,  2.95762312e-02,  4.64895252e-03,
       -4.86475747e-02, -5.37142403e-03, -2.96056704e-04, -3.23560827e-05,
       -2.89172482e-02, -3.72408911e-02, -6.24527574e-01, -1.06988378e-04,
       -1.80022987e-03, -3.40648238e-02, -2.07088395e-02, -2.53725616e-03,
       -2.20156523e-02, -3.26039633e-02, -5.12802875e-02, -1.61312032e-03,
       -1.99290374e-01, -1.46841628e-04, -8.44907165e-01, -1.73397407e-01,
       -3.57963537e-02, -1.43663881e-03, -1.67909664e-03, -5.75751424e-03,
       -2.39864983e-02, -3.77825587e-03, -9.72822814e-04])

So, for customer 0, the model's prediction score for itemId#1 is 4.66423448e-01, the score for itemId#17 is 3.04435879e-01, and so on.

I would like to sort the inference matrix by the score matrix. For example:

sorted_matrix[0] = array([ 1, 17,  0, 10, 23, 33, 29, 18, 41, 44, 46, 37, 43, 35, 32, 39, 28,
       30, 31, 25, 15, 21, 27, 22, 19,  6,  5, 24,  7,  4, 16, 11,  3, 45,
       47], dtype=int32)

At the array level, I simply did

inference[0][np.argsort(-1 * (score))[0]]

and it works. However, when I try to sort the whole matrix with

new_inference = inference[np.argsort(-score)]

it resulted in a nested matrix, where new_inference[0] is itself a 35x35 matrix, not an array.

What am I doing wrong with np.argsort() here?

You want to sort the 'columns', not the rows? What shape is `new_inference` supposed to have? – hpaulj, Jun 21 '23 at 02:24
The shape of new_inference would be the same as the original inference. I want to sort rows, but only within each row, not comparing values in one row to values in other rows. – DS Park, Jun 21 '23 at 03:07

jared · Accepted Answer · 2023-06-21T16:52:08.197

What you seem to want is each row of inference reordered according to the sort order of the corresponding row of scores. np.argsort(scores, axis=1) gives that per-row order. To use that result to reorder inference, I will use the approach from an answer by hpaulj (linked in the docstring below).

import numpy as np

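def indices_for_2d_sort(a, ascending=True):
    """Return an index tuple that sorts each row of `a` independently.

    Based on https://stackoverflow.com/a/33141247/12131013
    """
    m = 1 if ascending else -1     # negate the values to get descending order
    i = np.argsort(m*a, axis=1)    # per-row sort order, same shape as a
    # pair a column of row numbers with each row's own column order
    return (np.arange(a.shape[0])[:,None], i)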

rng = np.random.default_rng()

M = 5
N = 10
inference = np.tile(np.arange(N), (M, 1))   # each row holds item IDs 0..N-1
for row in inference:                       # shuffle every row in place, independently
    rng.shuffle(row)

scores = rng.random((M, N))

# ascending=False puts the highest score first, as asked in the question
new_inference = inference[indices_for_2d_sort(scores, ascending=False)]
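
To illustrate why indexing with the bare argsort result nests the output, while a row/column index pair does not, here is a minimal sketch. The sizes, the seed, and the names order, rows, nested, by_pair and by_take are assumptions made for this illustration only. It also shows np.take_along_axis, which should give the same result as the index pair without building the row indices by hand.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, only so the sketch is repeatable

# Hypothetical sizes; M >= N so the failing attempt runs instead of raising IndexError
M, N = 6, 4
scores = rng.random((M, N))
inference = np.array([rng.permutation(N) for _ in range(M)])  # item IDs per customer

order = np.argsort(-scores, axis=1)   # per-row descending order, shape (M, N)

# Attempt from the question: every entry of `order` is read as a ROW number of
# `inference`, so each entry pulls in a whole row of length N.
nested = inference[order]
print(nested.shape)                   # (6, 4, 4)

# Pairing a column of row numbers with `order` selects one element per position.
rows = np.arange(M)[:, None]          # shape (M, 1), broadcasts against (M, N)
by_pair = inference[rows, order]
print(by_pair.shape)                  # (6, 4)

# np.take_along_axis performs the same per-row selection.
by_take = np.take_along_axis(inference, order, axis=1)
print(np.array_equal(by_pair, by_take))   # True
```

With the question's data the same effect applies: order[0] holds 35 row numbers, so new_inference[0] ends up as 35 whole rows, i.e. a 35x35 block.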

edited Jun 21 '23 at 16:52

answered Jun 21 '23 at 03:56

jared


Changing np.argsort(a, axis=1) to np.argsort(-a, axis=1) gave me exactly what I wanted. Thank you! – DS Park Jun 21 '23 at 16:44
I take it that you want the sort in descending order then. I've added an option to the function to change that. Then you won't have to deal with that negative sign yourself. – jared Jun 21 '23 at 16:52
