Is any information lost by converting a fully dense array to a sparse matrix?

Question

Let's suppose that A is a (scipy) sparse matrix with tf-idf values and B is a (numpy) array with some additional features of my data.

Each row of A and B corresponds to the same observation.

I want to concatenate these matrices/arrays so that I can pass the result to a sklearn ML model for training, since I do not think that I can pass them separately.

According to this answer (https://stackoverflow.com/a/49420566/9024698), there are two ways to concatenate these arrays:

1. Convert the sparse array (A) to a dense array and then concatenate
2. Convert the fully dense array (B) to a sparse matrix

However, (1) is basically impossible in my case because A is too big.

Therefore, I am thinking of converting my fully dense array (B) to a sparse matrix.

However, my question is do I lose any information by doing this (i.e. by converting a fully dense array to a sparse one)?

This post (How to combine TFIDF features with other features) is related to my post but it does not explicitly give an answer to my question.

Nope, sparse storage is not lossy. You can verify that yourself by creating a sparse matrix from your dense array, converting back (using the `.A` attribute or the `.todense()` method) and comparing to the original array. — Paul Panzer, Aug 05 '19 at 17:17
@PaulPanzer, ok so you mean that in the case of Adense -> Asparse -> Adense_again then Adense and Adense_again are absolutely the same? — Outcast, Aug 05 '19 at 17:19
Yes, exactly. You can even directly compare `Adense==Asparse` and you will get a (dense) array filled with `True`s. — Paul Panzer, Aug 05 '19 at 17:25
@PaulPanzer, ok sounds pretty good, thank you. Although I am not sure if this sparse representation makes any (considerable) difference to my ML model. — Outcast, Aug 05 '19 at 17:30
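
The round-trip check described in the comments above can be sketched roughly as follows. This is only an illustration: the array `B` here is a small made-up stand-in for the real feature array, and `.toarray()` is used for the conversion back to a plain NumPy array.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for the dense feature array B from the question.
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 3))
B[1, 2] = 0.0  # include a zero entry to show that zeros survive as well

B_sparse = sparse.csr_matrix(B)  # dense -> sparse
B_back = B_sparse.toarray()      # sparse -> dense again

# Exact element-wise comparison (no tolerance), plus a dtype check.
print(np.array_equal(B, B_back))
print(B_sparse.dtype == B.dtype)
```

Both checks should print `True`: the values and the dtype are carried over unchanged. Note that a fully dense array stored in a sparse format will typically use more memory than the plain array, since the indices are stored alongside the values.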

score 0 · Answer 1 · answered Aug 05 '19 at 17:25

No, you don't lose any information. Sparse and dense are two different representations of the same data in this case. See https://machinelearningmastery.com/sparse-matrices-for-machine-learning/ for more details.
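
As a minimal sketch of option (2) from the question, assuming `scipy.sparse.hstack` is used for the concatenation: the matrices `A` and `B` below are tiny made-up stand-ins for the tf-idf matrix and the extra features, not the asker's data.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-ins: A for the tf-idf matrix, B for the dense extra features.
A = sparse.csr_matrix(np.array([
    [0.0, 0.7, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.9],
]))
B = np.array([[1.0, 20.0], [2.0, 0.0], [3.0, 40.0]])

# Convert B to sparse and stack column-wise; rows stay aligned by observation.
X = sparse.hstack([A, sparse.csr_matrix(B)], format="csr")

print(X.shape)
# The last two columns of X should hold exactly the values of B.
print(np.array_equal(X[:, -2:].toarray(), B))
```

The combined `X` stays sparse, so the large matrix `A` is never densified. Many scikit-learn estimators accept SciPy sparse input directly, but it is worth checking the documentation of the specific estimator.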

cookiemonster
