Melt a dummy matrix to a column

Question

If I have a factor variable, say x = factor(c(1, 2, 3)), then I can use the model.matrix function to generate a dummy matrix:

model.matrix(~x + 0)

and I will get a matrix like:

  x1 x2 x3
1  1  0  0
2  0  1  0
3  0  0  1

My question is: if I already have a large dummy matrix, how can I melt it back into a (factor) column?

In other words, is there an inverse function of model.matrix?

assuming your data is named df you can try `apply(df, 1, which.max)`. — Mamoun Benghezal, Mar 31 '15 at 12:33
@MamounBenghezal that won't work for any other vector than `c(1,2,3)` — David Arenburg, Mar 31 '15 at 13:27
@DavidArenburg, why is that? Here is a proof of good faith: `set.seed(1); x <- as.factor(sample(5, 10, replace = T)); mat <- model.matrix(~x-1); par <- as.factor(apply(mat, 1, which.max)); identical(par, x) # TRUE`. This seems to work quite well to me. — Mamoun Benghezal, Mar 31 '15 at 13:37
@MamounBenghezal you are trying too hard, try setting `x <- factor(c(55,3))` or `x <- factor(c(1,1,3))` and then run your code. — David Arenburg, Mar 31 '15 at 13:47
@DavidArenburg, OK, I understand what you are saying, but this is a label-switching issue, and it is easy to get back the original levels by using `levels(par) <- levels(x)`. — Mamoun Benghezal, Mar 31 '15 at 14:05
@MamounBenghezal the only problem with that methodology is that the OP will need to have `x` too. If they already have the `x`, they can skip that whole process and just use it. It seems to me that OP is looking for a solution when they have *only* the model matrix and they are trying to find the `x`. Otherwise there is no sense in that question whatsoever. — David Arenburg, Mar 31 '15 at 19:19
Anyway, it seems like the best solution is `factor(sub("x", "", colnames(modmat)[max.col(modmat)], fixed = TRUE))`. The only problem with it is that you have to know what was the name of the vector that was passed into `model.matrix` (in this case it was `x`) — David Arenburg, Mar 31 '15 at 19:31
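
The comment thread above can be condensed into a small round-trip check. This is only a minimal sketch, assuming the matrix was built from a single factor named `x` with `~ x + 0`, so that every column name is the prefix "x" followed by a level label. The names `modmat`, `idx`, `lab` and `recovered` are illustrative, and the prefix has to be known in advance, as noted in the last comment.

```r
# Example factor whose labels are not 1..k, the case raised in the comments
x <- factor(c(55, 3, 3, 55, 7))

# Full dummy matrix with one column per level: x3, x7, x55
modmat <- model.matrix(~ x + 0)

# Row-wise position of the single 1; this gives level indices, not labels
idx <- max.col(modmat)

# Map the indices to column names and strip the assumed "x" prefix
lab <- sub("^x", "", colnames(modmat)[idx])
recovered <- factor(lab, levels = sub("^x", "", colnames(modmat)))

# Compare the recovered labels with the original ones
identical(as.character(recovered), as.character(x))
```

Using the column names rather than the bare indices is what avoids the label-switching problem: `which.max` or `max.col` alone would return 3, 1, 1, 3, 2 here instead of the original labels.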

Özgür · Answer 1 · 2015-04-30T16:24:18.123

apply is suitable for this.

I will use the cars data from the caret package, which stores the car type as five 0/1 indicator columns rather than as a factor. Let's convert these five columns (convertible, coupe, hatchback, sedan, wagon) into a single factor variable, Type.

library(caret)
data(cars)
head(cars[,-c(1:13)])

  convertible coupe hatchback sedan wagon
1           0     0         0     1     0
2           0     1         0     0     0
3           1     0         0     0     0
4           1     0         0     0     0
5           1     0         0     0     0
6           1     0         0     0     0


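df <- cars[, -c(1:13)]  # keep only the five 0/1 body-type columns
# For each row, take the name of the column that holds the 1
cars$Type <- as.factor(apply(df, 1, function(foo) names(df)[which.max(foo)]))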

head(cars[,-c(1:13)])

  convertible coupe hatchback sedan wagon        Type
1           0     0         0     1     0       sedan
2           0     1         0     0     0       coupe
3           1     0         0     0     0 convertible
4           1     0         0     0     0 convertible
5           1     0         0     0     0 convertible
6           1     0         0     0     0 convertible
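
For larger data, a vectorized variant of the same idea may be worth trying. The sketch below is not from the answer: it swaps the row-wise `apply` for `max.col`, and it runs on a small made-up data frame that only mirrors the column layout shown above (it is not the caret data). The names `dummies` and `type` are hypothetical, and the code assumes every entry is 0 or 1.

```r
# Hypothetical stand-in for the five 0/1 columns of the cars data
dummies <- data.frame(
  convertible = c(0, 0, 1),
  coupe       = c(0, 1, 0),
  hatchback   = c(0, 0, 0),
  sedan       = c(1, 0, 0),
  wagon       = c(0, 0, 0)
)

# Guard: each row must hold exactly one 1 for the inverse to be well defined
stopifnot(all(rowSums(dummies) == 1))

# max.col() returns, for each row, the column index of the largest entry
col_idx <- max.col(as.matrix(dummies), ties.method = "first")

# Turn the indices into a factor whose levels follow the column order
type <- factor(names(dummies)[col_idx], levels = names(dummies))
type
```

The `rowSums` guard matters because both `which.max` and `max.col` silently return a column even for a row of all zeros, which would otherwise assign a wrong level.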
