How can I convert a data frame into a sparse matrix? For example, see the data frame below. I need to treat the whole data frame as a single factor with levels 1~10 and expand those levels into 10 columns. In each row, a column should change from 0 to 1 when its level appears in V1~V4, giving something like the Expected Outcome.

Take row 1 of the Expected Outcome as an example: columns No1, No2, No5, and No8 equal 1 because V1~V4 contain the numbers 1, 8, 2, and 5.

Update: I have tried the solution in "Create Sparse Matrix from a data frame", but I cannot get it to match my Expected Outcome.

Dataframe

    V1   V2   V3   V4  
1   1    8    2    5
2   6    7    9    3
3   6    2    3    2
4   5    8    9    10
5   4    3    5    1
6   3    9    1    10

Expected Outcome

    V1   V2   V3   V4   No1 No2 No3 No4 No5 No6 No7 No8 No9 No10 
1   1    8    2    5     1   1   0   0   1   0   0   1   0   0
2   6    7    9    3     0   0   1   0   0   1   1   0   1   0
3   6    2    3    4     0   1   1   1   0   1   0   0   0   0  
4   5    8    9    10    0   0   0   0   1   0   0   1   1   1
5   4    3    5    1     1   0   1   1   1   0   0   0   0   0
6   3    9    1    10    1   0   1   0   0   0   0   0   1   1
Intern Ne
  • There was a question like yours here: https://stackoverflow.com/a/26208307/4525807. Try those solutions out. – JellisHeRo Jun 23 '17 at 23:24
  • @JellisHeRo I've tried it, but I cannot get it to produce something like my `Expected Outcome`; that's why I'm asking this question. – Intern Ne Jun 23 '17 at 23:53
  • Is this the question? – Evan Friedland Jun 24 '17 at 00:00
  • Yes. That's right. – Intern Ne Jun 24 '17 at 00:02
  • It won't let me answer the question because it is locked but all you need to do is: `dat <- as.data.frame(matrix(c(1,8,2,5,6,7,9,3,6,2,3,2,5,8,9,10,4,3,5,1,3,9,1,10), byrow = T, ncol = 4)) ; sparse <- as.data.frame(matrix(0, nrow = nrow(dat), ncol = max(dat))) ; for(i in 1:nrow(dat)) sparse[i, as.matrix(dat[i,])] <- 1 ; colnames(sparse) <- paste0("No",full); cbind(data.frame(dat), sparse)` – Evan Friedland Jun 24 '17 at 00:29
  • @EvanFriedland Thanks for the reply. I got this `Error in paste0("No", full) : object 'full' not found` after executing the code. – Intern Ne Jun 24 '17 at 01:09
  • Oh whoops, instead of `full` just use `1:max(dat)` – Evan Friedland Jun 24 '17 at 01:10
  • `dat <- as.data.frame(matrix(c(1,8,2,5,6,7,9,3,6,2,3,2,5,8,9,10,4,3,5,1,3,9,1,10), byrow = T, ncol = 4)) ; sparse <- as.data.frame(matrix(0, nrow = nrow(dat), ncol = max(dat))) ; for(i in 1:nrow(dat)) sparse[i, as.matrix(dat[i,])] <- 1 ; cbind(dat, setNames(sparse, paste0("No",1:max(dat))))` – Evan Friedland Jun 24 '17 at 01:13
  • What if the data is too big? E.g. I get `Error: cannot allocate vector of size 165.5 Gb` when executing `sparse <- as.data.frame(matrix(0, nrow = nrow(dat), ncol = max(dat)))` – Intern Ne Jun 24 '17 at 01:29
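
For the memory error in the last comment, one possible direction is to skip the dense `matrix(0, ...)` allocation and build the indicator matrix directly in sparse form. The following is only a sketch, assuming the `Matrix` package is installed and that every value in V1~V4 is a positive integer usable as a column index; `dat` is the example data from the comments above, and the names `m`, `ij`, and `sparse` are illustrative.

```r
library(Matrix)

# Example data from the comments above (row 3 as given in the "Dataframe" listing).
dat <- as.data.frame(matrix(
  c(1, 8, 2, 5,
    6, 7, 9, 3,
    6, 2, 3, 2,
    5, 8, 9, 10,
    4, 3, 5, 1,
    3, 9, 1, 10),
  byrow = TRUE, ncol = 4
))

m <- as.matrix(dat)

# One (row, level) pair per cell: the row index of the cell and the value it holds.
# unique() drops repeated pairs (e.g. the two 2s in row 3), because sparseMatrix()
# would otherwise add them up and store 2 instead of 1.
ij <- unique(cbind(i = as.vector(row(m)), j = as.vector(m)))

# Only the non-zero cells are stored, so memory grows with the number of
# (row, level) pairs rather than with nrow(dat) * max(dat).
sparse <- sparseMatrix(
  i = ij[, "i"], j = ij[, "j"], x = 1,
  dims = c(nrow(m), max(m)),
  dimnames = list(NULL, paste0("No", seq_len(max(m))))
)

sparse
```

Note that binding `sparse` back onto `dat` with `cbind()` as a data frame would require a dense copy again, so on large data it is probably safer to keep the two objects side by side and index them by row.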

0 Answers