When I estimate a linear model in R via the lm() function, it is possible to pass the formula as a character string built from a character vector of variable names (e.g. as described here or here). However, if I apply the same method to the selection() function of the sampleSelection package, I get the following error:
Error in detectModelType(selection, outcome) : argument 'selection' must be a formula in function 'selection()'
Question: Is there a way to pass a character vector of variables into the selection() formula?
Below is a reproducible example that illustrates the problem:
# Example data
N <- 1000
y <- rnorm(N, 2000, 200)
y_prob <- c(rep(0, N / 2), rep(1, N / 2)) == 1
x1 <- y + rnorm(N, 0, 300)
x2 <- y + rnorm(N, 0, 300)
x3 <- y + rnorm(N, 0, 300)
x4 <- y + rnorm(N, 0, 300)
x5 <- y + rnorm(N, 0, 300)
y[1:(N / 2)] <- 0
data <- data.frame(y, x1, x2, x3, x4, x5, y_prob)
x_vars <- setdiff(colnames(data), c("y", "y_prob"))
# Estimate linear model via lm() --> works without any problems
lm(paste("y", "~", paste(x_vars, collapse = " + ")))
# Estimate Heckman model via selection()
library("sampleSelection")
# Passing the formulas as character strings does not work
selection(paste("y_prob", "~", paste(x_vars[1:4], collapse = " + ")),
          paste("y", "~", paste(x_vars[3:5], collapse = " + ")), data)
# Formula has to be written manually
selection(y_prob ~ x1 + x2 + x3 + x4, y ~ x3 + x4 + x5, data)
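One thing that might be worth trying, assuming selection() only needs its arguments to be objects of class formula (which is what the error message suggests), is to convert the strings explicitly with as.formula(), or to build the formula from the character vector with reformulate(). A minimal sketch follows. The names sel_formula and out_formula are introduced here purely for illustration, the seed is arbitrary, and the selection() call is only attempted if sampleSelection is installed:

```r
# Toy data mirroring the example above (seed chosen arbitrarily)
set.seed(1)
N <- 1000
y <- rnorm(N, 2000, 200)
y_prob <- c(rep(0, N / 2), rep(1, N / 2)) == 1
x1 <- y + rnorm(N, 0, 300)
x2 <- y + rnorm(N, 0, 300)
x3 <- y + rnorm(N, 0, 300)
x4 <- y + rnorm(N, 0, 300)
x5 <- y + rnorm(N, 0, 300)
y[1:(N / 2)] <- 0
data <- data.frame(y, x1, x2, x3, x4, x5, y_prob)
x_vars <- setdiff(colnames(data), c("y", "y_prob"))

# as.formula() converts a character string into a formula object
sel_formula <- as.formula(paste("y_prob", "~", paste(x_vars[1:4], collapse = " + ")))

# reformulate() builds a formula directly from a character vector of terms
out_formula <- reformulate(x_vars[3:5], response = "y")

# Both objects now have class "formula", which is what selection() asks for
class(sel_formula)
class(out_formula)

# Assumption: selection() accepts formula objects created this way
if (requireNamespace("sampleSelection", quietly = TRUE)) {
  fit <- sampleSelection::selection(sel_formula, out_formula, data = data)
  print(fit)
}
```

Both as.formula() and reformulate() are part of the stats package, so no extra dependency is needed to build the formulas themselves.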