I am trying to recode a variable and am running into a snag that seems simple enough, but I haven't been able to figure it out after quite some time searching the internet, so I appreciate any help you can give.

I have some data that contains NAs. I would like to recode using this data, but I keep running into the error "NAs are not allowed in subscripted assignments." While trying to create an example data set, I'm also getting a warning that a comparison is "not meaningful for factors."

My faux-data has three variables: "var1" and "var2" (character, and sometimes missing) and "var3" (numeric). I want to create a fourth variable that contains the value of "var1" if "var3" is greater than zero, and the value of "var2" if "var3" is less than zero. If var1 or var2 is missing, I want the new variable to also be missing:

var1<-c("A","T",NA,"G","C")
var2<-c("G","A",NA,"A","G")
var3 <-c(-.1,3,-4,5,-3)
df=as.data.frame(cbind(var1,var2,var3))

df$newVar[df$var3>0]=df$var1[df$var3>0]
df$newVar[df$var3<0]=df$var2[df$var3<0]

What I get is a bunch of red:

df$newVar[df$var3>0]=df$var1[df$var3>0]
Error in df$newVar[df$var3 > 0] = df$var1[df$var3 > 0] : 
NAs are not allowed in subscripted assignments
In addition: Warning messages:
1: In Ops.factor(df$var3, 0) : > not meaningful for factors
2: In Ops.factor(df$var3, 0) : > not meaningful for factors
df$newVar[df$var3<0]=df$var2[df$var3<0]
Error in df$newVar[df$var3 < 0] = df$var2[df$var3 < 0] : 
NAs are not allowed in subscripted assignments
In addition: Warning messages:
1: In Ops.factor(df$var3, 0) : < not meaningful for factors
2: In Ops.factor(df$var3, 0) : < not meaningful for factors

Any advice would be appreciated. Thank you.

M. M.
  • What do you want to recode as what? Showing desired output will probably be helpful here. – Simon O'Hanlon Aug 20 '13 at 15:40
  • Have you tried using `ifelse` statements? – Harrison Jones Aug 20 '13 at 15:41
  • possible duplicate of [Recoding over multiple data frames in R](http://stackoverflow.com/questions/18323236/recoding-over-multiple-data-frames-in-r) – Metrics Aug 20 '13 at 15:43
  • Hi, I notice you have *never* voted/accepted an answer. You might want to read the [**about**](http://stackoverflow.com/about) and [**FAQ**](http://stackoverflow.com/faq) sections of the website to help you get the most out of SO. If an answer does solve your problem you may want to *consider* upvoting and/or marking it as accepted to show the question has been answered, by ticking the little green check mark next to the suitable answer. You are **not** obliged to do this, but it helps keep the site clean of unanswered questions and rewards those who take the time to solve your problem. – Simon O'Hanlon Aug 20 '13 at 22:13

2 Answers

Your problem is that you call cbind before creating the data frame. cbind builds a matrix, and a matrix can hold only one type, so all three variables are coerced to character. as.data.frame then turns those character columns into factors, which is why the warnings mention Ops.factor.
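To see the coercion for yourself, here is a minimal sketch using the vectors from the question. The names m, bad and good are just illustrative, not from the original post.

```r
var1 <- c("A", "T", NA, "G", "C")
var2 <- c("G", "A", NA, "A", "G")
var3 <- c(-0.1, 3, -4, 5, -3)

# cbind() returns a matrix, and a matrix holds a single type,
# so the numbers in var3 are converted to character strings
m <- cbind(var1, var2, var3)
typeof(m)               # "character"

bad  <- as.data.frame(m)              # var3 is no longer numeric
good <- data.frame(var1, var2, var3)  # each column keeps its own type

is.numeric(bad$var3)    # FALSE
is.numeric(good$var3)   # TRUE
```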

Instead, just do

df <- data.frame(var1, var2, var3)

Run the same code for newVar and you should get:

  var1 var2 var3 newVar
1    A    G -0.1      2
2    T    A  3.0      4
3 <NA> <NA> -4.0     NA
4    G    A  5.0      3
5    C    G -3.0      2
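Note that the numbers in newVar above are factor level codes, not letters, because data.frame turned strings into factors by default before R 4.0.0. If you want the letters themselves, here is a rough sketch. It assumes stringsAsFactors = FALSE and uses the nested ifelse suggested in the comments, which is my own variation rather than the code in this answer. I'm also assuming a var3 of exactly zero should give NA, since the question doesn't say.

```r
var1 <- c("A", "T", NA, "G", "C")
var2 <- c("G", "A", NA, "A", "G")
var3 <- c(-0.1, 3, -4, 5, -3)

# Keep var1 and var2 as character rather than factor
df <- data.frame(var1, var2, var3, stringsAsFactors = FALSE)

# var1 when var3 > 0, var2 when var3 < 0, NA when var3 == 0 (assumption)
df$newVar <- ifelse(df$var3 > 0, df$var1,
             ifelse(df$var3 < 0, df$var2, NA_character_))

# Force NA whenever either source column is missing, as the question asks
df$newVar[is.na(df$var1) | is.na(df$var2)] <- NA

df$newVar
# [1] "G" "T" NA  "G" "G"
```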
Señor O

You can greatly simplify how you recode your variables. Don't use cbind to build the data frame, as has already been pointed out elsewhere, but you can index a data frame with a two-column matrix of (row, column) subscripts. So we can do something like this:

df <- data.frame( var1 , var2 , var3 )

#  Gives 1 if 'var3' is greater than 0 and 2 otherwise (the numbers of the columns you want!)
ind <- (! df$var3 > 0) + 1
#[1] 2 1 2 1 2

#  For each row, select either column 1 or column 2
df$newVar <- df[ cbind( 1:nrow(df) , ind ) ]
# var1 var2 var3 newVar
#1    A    G -0.1      G
#2    T    A  3.0      T
#3 <NA> <NA> -4.0   <NA>
#4    G    A  5.0      G
#5    C    G -3.0      G
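If the two-column index looks unfamiliar, this toy sketch shows the same mechanism on a plain matrix. The names m and idx are made up for illustration.

```r
# A 3 x 2 character matrix, filled column by column:
# column 1 is a, b, c and column 2 is d, e, f
m <- matrix(c("a", "b", "c", "d", "e", "f"), nrow = 3)

# Each row of idx is one (row, column) pair to pick out
idx <- cbind(1:3, c(2, 1, 2))

m[idx]
# [1] "d" "b" "f"
```

Each row of idx picks exactly one cell, so you get one value per row of the data.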
Simon O'Hanlon