
Using the caret package I've created 10 random cross-validation folds as follows with my analysis dataset:

### Create cross validation folds (k=10). ###
set.seed(123)
library(caret)
folds <- createFolds(dataset$member_id)

I have no problem manually assigning each fold to separate training and testing data frames:

train1 <- dataset[-folds$Fold01,]
test1 <- dataset[folds$Fold01,]
train2 <- dataset[-folds$Fold02,]
test2 <- dataset[folds$Fold02,]
...
train10 <- dataset[-folds$Fold10,]
test10 <- dataset[folds$Fold10,]

I'd like to condense the above code into a more elegant loop. However, the following code only assigns empty datasets to train_1 through train_9:

for(i in 1:9) 
{ 
  assign(paste0("train_",i), dataset[paste0("-folds$Fold0",i),])
}
train_10 <- dataset[-folds$Fold10,];

What am I missing?

RobertF
  • The character string `"-folds$Fold01"` is not the same as the expression `-folds$Fold01`. You're asking R to use a character string as though it were a variable. A common pattern is to use `get()` inside the `[]`. – Brandon Bertelsen Feb 01 '16 at 05:34

1 Answer


Using get() as suggested above might be cleaner, but eval() and parse() also work:

assign(paste0("train_",i), train_missing[-eval(parse(text = paste0("folds$Fold0",i))),])

This parses the string as R code and evaluates it, so the result is the fold's vector of row indices rather than a character string.

EDIT: Moved the minus sign in front of eval and out of the paste statement.
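
For comparison, here is a minimal sketch of the same loop without eval(parse()). It relies on `createFolds()` returning a named list, so one fold can be fetched with `folds[[name]]`. The toy `dataset` and the hand-built `folds` list are stand-ins I made up so the snippet runs without caret; `fold_name` and `test_idx` are illustrative names, not part of the original code.

```r
# Toy stand-ins (assumptions): a small data frame and a fold list shaped like
# the default output of caret::createFolds(), i.e. a list of row indices
# named Fold01 ... Fold10.
set.seed(123)
dataset <- data.frame(member_id = 1:50, x = rnorm(50))
folds <- split(sample(nrow(dataset)), rep(1:10, length.out = nrow(dataset)))
names(folds) <- sprintf("Fold%02d", 1:10)

for (i in 1:10) {
  fold_name <- sprintf("Fold%02d", i)  # zero-padded: "Fold01", ..., "Fold10"
  test_idx  <- folds[[fold_name]]      # look up the list element by its name
  assign(paste0("train_", i), dataset[-test_idx, ])  # all rows except the fold
  assign(paste0("test_", i),  dataset[test_idx, ])   # only the fold's rows
}
```

Because `sprintf("Fold%02d", i)` pads the number, fold 10 no longer needs a separate line after the loop.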

admccurdy
  • Don't use the evil `eval(parse(`. –  Feb 01 '16 at 06:41
  • @Pascal, why is it evil, and why is using get() the better solution? I'm not doubting that you're correct, I'm just curious. `eval(parse(` is something I picked up on these forums and I use it a decent amount, so if there's a better way I'd like to be using it. Could you provide an example using the question code? – admccurdy Feb 01 '16 at 13:20
  • It would appear the get option is `assign(paste0("train_",i), train_missing[-get(paste0("folds$Fold0",i)),])` which is admittedly much cleaner. – admccurdy Feb 01 '16 at 15:08
  • Thanks Adam! The eval(parse(.)) statement worked fine, even if it *is* evil. ;) I tried the assign(paste0("train_",i), train_missing[-get(paste0("folds$Fold0",i)),]) statement but received the error: "Error in get(paste0("-folds$Fold0", i)) : object '-folds$Fold01' not found". – RobertF Feb 01 '16 at 18:03
  • Found this: http://stackoverflow.com/questions/13649979/what-specifically-are-the-dangers-of-evalparse – RobertF Feb 01 '16 at 18:14
  • Also, I suppose using lapply() would be more efficient than a for loop, but one step at a time... – RobertF Feb 01 '16 at 18:29
  • @RobertF glad it helped and thanks for the link, I was certainly guilty of not knowing about get() before. – admccurdy Feb 01 '16 at 18:50
  • Wouldn't it be ironic if the R code behind get() is simply eval(parse())? – RobertF Feb 01 '16 at 18:57
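
Following up on the get() error and the lapply() idea from the comments, here is a rough sketch under stated assumptions. `get()` looks up a single object by its name, and `"folds$Fold01"` is an expression rather than an object name, which would explain the "not found" error. The toy `dataset`, the hand-built `folds` list, and the names `get_failed` and `splits` are hypothetical, chosen so the snippet runs without caret.

```r
# Toy stand-ins (assumptions): same shape as the output of createFolds().
set.seed(123)
dataset <- data.frame(member_id = 1:50, x = rnorm(50))
folds <- split(sample(nrow(dataset)), rep(1:10, length.out = nrow(dataset)))
names(folds) <- sprintf("Fold%02d", 1:10)

# get() only resolves object names; no object is literally called
# "folds$Fold01", so this lookup fails and try() captures the error.
get_failed <- inherits(try(get("folds$Fold01"), silent = TRUE), "try-error")

# lapply() walks the fold list directly, so no names need to be pasted
# together and no variables need to be created with assign().
splits <- lapply(folds, function(idx) {
  list(train = dataset[-idx, ], test = dataset[idx, ])
})

# Each split is then available by fold name, for example:
first_train <- splits$Fold01$train
first_test  <- splits[["Fold01"]]$test
```

Keeping the splits in one list instead of twenty separate variables makes it straightforward to loop over them later, for instance when fitting one model per fold.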