
When I fit a linear model in R with the lm() function, I can build the formula from a character vector of variable names (e.g. as described here or here). However, when I apply the same approach to the selection() function of the sampleSelection package, I get the following error:

Error in detectModelType(selection, outcome) : argument 'selection' must be a formula in function 'selection()'

Question: Is there a way to pass a character vector of variables into the selection() formula?

Below is a reproducible example that illustrates the problem:

# Example data
N <- 1000
y <- rnorm(N, 2000, 200)
y_prob <- c(rep(0, N / 2), rep(1, N / 2)) == 1
x1 <- y + rnorm(N, 0, 300)
x2 <- y + rnorm(N, 0, 300)
x3 <- y + rnorm(N, 0, 300)
x4 <- y + rnorm(N, 0, 300)
x5 <- y + rnorm(N, 0, 300)
y[1:(N / 2)] <- 0
data <- data.frame(y, x1, x2, x3, x4, x5, y_prob)
x_vars <- setdiff(colnames(data), c("y", "y_prob"))  # all predictor names

# Estimate linear model via lm() --> works without any problems
lm(paste("y", "~", paste(x_vars, collapse = " + ")))

# Estimate Heckman model via selection()
library("sampleSelection")

# Passing a formula built with paste() (a character string) does not work
selection(paste("y_prob", "~", paste(x_vars[1:4], collapse = " + ")), 
      paste("y", "~", paste(x_vars[3:5], collapse = " + ")), data)

# Formula has to be written manually
selection(y_prob ~ x1 + x2 + x3 + x4, y ~ x3 + x4 + x5, data)
Joachim Schork

1 Answer


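Wrap your paste() calls in as.formula(). As the error message says, selection() requires its arguments to be formula objects; unlike lm(), it does not coerce a character string for you: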

selection(as.formula(paste("y_prob", "~", paste(x_vars[1:4], collapse = " + "))), 
  as.formula(paste("y", "~", paste(x_vars[3:5], collapse = " + "))), data)


Call:
 selection(selection = as.formula(paste("y_prob", "~", paste(x_vars[1:4],      collapse = " + "))), outcome = as.formula(paste("y", "~",      paste(x_vars[3:5], collapse = " + "))), data = data) 

Coefficients:
S:(Intercept)           S:x1           S:x2           S:x3           S:x4  O:(Intercept)           O:x3           O:x4           O:x5          sigma  
   -1.936e-01     -5.851e-05      7.020e-05      5.475e-05      2.811e-05      2.905e+02      2.286e-01      2.437e-01      2.165e-01      4.083e+02  
      rho  
1.000e+00  
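As an alternative to paste() plus as.formula(), base R's reformulate() builds the formula object directly from a character vector. This is only a sketch of that variant: the names sel_formula and out_formula are made up for illustration, and the final selection() call is left commented out because it assumes sampleSelection is installed and that data from the question is in scope.

```r
# Predictor names as in the question's example data
x_vars <- c("x1", "x2", "x3", "x4", "x5")

# reformulate() is in the stats package, which is attached by default.
# It joins the term labels with "+" and returns a formula object.
sel_formula <- reformulate(x_vars[1:4], response = "y_prob")
out_formula <- reformulate(x_vars[3:5], response = "y")

print(sel_formula)
print(out_formula)

# Both objects have class "formula", which is what selection() checks for
stopifnot(inherits(sel_formula, "formula"),
          inherits(out_formula, "formula"))

# Assumes library("sampleSelection") is loaded and `data` exists:
# selection(sel_formula, out_formula, data = data)
```

This avoids assembling the "~" and "+" separators by hand, which can be easier to read when the variable lists change often.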
Daniel Anderson