Just to explain my use case a bit more: `A` is a sparse matrix with tf-idf values and `B` is an array with some additional features of my data.
I have already split the data into training and test sets, so `A` and `B` in my example cover only the training set. I want to do the same for the test set after this code.
I want to concatenate these matrices/arrays because I then want to pass them to an sklearn ML model to train it, and I do not think that I can pass them separately.
So I tried to do this:
C = np.concatenate((A, B.T), axis=1)
where `A` is a `<class 'scipy.sparse.csr.csr_matrix'>` and `B` is a `<class 'numpy.ndarray'>`.
However, when I try this, I get the following error:
ValueError: zero-dimensional arrays cannot be concatenated
Also, I do not think that using `np.concatenate` on a NumPy array and a sparse matrix is a good idea in my case because
- it is basically impossible to convert my sparse matrix `A` to a dense array because it is too big
- I will lose information (or will I not, actually?) if I convert my fully dense array `B` to a sparse matrix
What is the best way to pass a sparse matrix and a fully dense array, concatenated by rows, to an sklearn ML model?
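
For illustration, here is a minimal sketch of one possible approach, not something confirmed for my real data: wrap the dense block in a sparse matrix and stack it next to `A` with `scipy.sparse.hstack`, so `A` never has to be densified. The small `A` and `B` below are made-up stand-ins (4 samples, 5 tf-idf columns, 2 extra features), and `B` is assumed to have shape `(n_features, n_samples)` because the attempt above uses `B.T`. The round-trip check suggests that converting a dense array to sparse only changes the storage format, not the values.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for the tf-idf matrix: 4 samples x 5 terms, mostly zeros.
A = sparse.csr_matrix(np.array([
    [0.0, 0.3, 0.0, 0.0, 0.7],
    [0.5, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.4, 0.0],
]))

# Hypothetical dense extra features, shape (2, 4), hence the transpose below.
B = np.array([
    [1.0, 0.0, 2.5, 3.0],
    [0.5, 4.0, 0.0, 1.5],
])

# Converting dense to sparse keeps every value; only the storage format changes.
B_sparse = sparse.csr_matrix(B.T)
round_trip_ok = np.array_equal(B_sparse.toarray(), B.T)

# Stack the two blocks side by side (one row per sample) without densifying A.
C = sparse.hstack([A, B_sparse], format="csr")
```

The result `C` is still a sparse CSR matrix with one row per sample. Many sklearn estimators accept sparse input in `fit`, but whether a particular model does should be checked in its documentation.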