Questions tagged [model.matrix]
113 questions
20
votes
2 answers
Big matrix to run glmnet()
I am having trouble running a glmnet lasso on a wide data set. My data has N = 50 but p > 49000, all factors. To run glmnet I have to create a model matrix, but I run out of memory when I call model.matrix(formula, data), where formula =…

Flavio Barros
8
votes
1 answer
Warning message - dummy from dummies package
I am using the dummies package to generate dummy variables for categorical variables, some with more than two categories.
testdf<- data.frame(
"A" = as.factor(c(1,2,2,3,3,1)),
"B" = c('A','B','A','B','C','C'),
"C"=…

Max_IT
5
votes
2 answers
Speed up this loop to create dummy columns with data.table and set in R
I have a data table and I want to create a new column for each unique day, and then assign a 1 in each row where the day matches the column name.
I have done this using a for loop, but I was wondering if there is any way to optimise it using…

MidnightDataGeek
5
votes
1 answer
use model.matrix through rpy2?
I prefer Python over R for my work. From time to time I need to use R functions, and I have started trying rpy2 for that purpose.
I tried but failed to find out how to replicate the following with rpy2:
design <- model.matrix(~Subject+Treat)
I have gone as…

xyliu00
4
votes
1 answer
How does model.matrix select levels for interaction terms
model.matrix returns fewer levels if lower-order terms are included with interaction terms. Suppose two factor variables have na and nb levels, respectively. In a complete model matrix with interaction terms,
model.matrix(~ A + B + A:B), shouldn't I have…

Shubham Gupta
4
votes
1 answer
Force model.matrix to follow the order of the terms in the formula in R
Let's create a matrix with fake data:
data_ex <- data.frame(y = runif(5,0,1), a1 = runif(5,0,1), b2 = runif(5,0,1),
c3 = runif(5,0,1), d4 = runif(5,0,1))
> data_ex
y a1 b2 c3 d4
1 0.162 0.221 0.483 0.989…

FR_
4
votes
0 answers
Variable order in interaction terms
I'm trying to fit a number of linear models as shown below. It is important that all interaction terms are sorted lexicographically. Note that the second model is missing the main effect for x.
x = rnorm(100)
y = rnorm(100)
z = x + y +…

rimorob
4
votes
1 answer
Is there a function to return the matching response vector to model.matrix?
In glmnet() I have to specify the raw X matrix and the response vector Y (unlike lm, where you can specify a model formula). model.matrix() will correctly remove incomplete observations from the X matrix, but it doesn't include the response in…

Robert Kubrick
4
votes
1 answer
Melt a dummy matrix to a column
If I have a factor variable, say x = factor(c(1, 2, 3)), then I can use the model.matrix function to generate a dummy matrix:
model.matrix(~x + 0)
and I will get a matrix like:
  x1 x2 x3
1  1  0  0
2  0  1  0
3  0  0  1
My question is that, if I…

Bayes
4
votes
1 answer
Rownames for data.table in R for model.matrix
I have a data.table DT and I want to run model.matrix on it. Each row has a string ID, which is stored in the ID column of DT. When I run model.matrix on DT, my formula excludes the ID column. The problem is, model.matrix drops some rows because…

DavidR
4
votes
2 answers
model.matrix using multiple columns
I'm trying to use multiple columns from a data.frame in a model.matrix.
The data frame looks like this:
df1 <- data.frame(id=seq(1,10,1), zip1=(round(runif(10)*100000,0)), zip2=(round(runif(10)*100000,0))
…

screechOwl
3
votes
1 answer
How can I obtain a minimal data frame of only the variables used in a statistical model in R?
Take the following example:
fit <- lm(Sepal.Length ~ log(Sepal.Width), data = iris)
I would like a copy of iris that only includes the variables that were involved in making fit. I think model.matrix() or model.frame() don't quite do it because of…

cgmil
3
votes
0 answers
Make a model matrix if missing the response variable and where matrix multiplication recreates the predict function
I want to create a model matrix for a test dataset which is missing the response variable, and where I can perfectly replicate the results of calling predict() on the model if building predictions using matrix multiplication. See code below for…

jruf003
3
votes
1 answer
model.matrix Error: $ operator is invalid for atomic vectors
I ran into this error when using 'model.matrix'.
data_A <- data.frame(X1 = c("Y","N"), X2 = c(20,24), Y = c("N","Y"))
data_A
model.matrix("Y ~ X1 + X2", data_A)
Error: $ operator is invalid for atomic vectors
What's causing the problem?

LeGeniusII
3
votes
2 answers
Check that model has only one factor covariate
I am writing an R package whose main function takes a model that may have only a single factor covariate (offsets are allowed). I need to check that the user complies with this rule.
As an example, let's look at the following…

Heidi