I have a data frame that looks like this:

df <- data.frame(task       = c(1, 2,  3, 4, 5, NA),
                 day        = c(10, 6,  7, 9, 9, 10),
                 deadline   = c(7, 12, 9, 7, 9, NA),
                 completion = c(1, 1,  1, 1, 0, NA))

Now I want to create a dummy variable that shows whether a task was overdue on the day of completion. I wrote the code below for this, but it does not give me the right results.

df$overduetask <- ifelse(df$completion == 1 & df$day > df$deadline, 1, 0)

My reasoning is: if a task was completed (completion == 1) and the day is greater than the deadline, then the task is overdue. The overdue variable I get contains only 0's, which I checked manually and which cannot be true.

Amy
  • Can you please format your data? – akrun May 26 '20 at 21:29
  • What were the 'wrong results'? Can you add the desired outcome for the sample of data you have shown us here? – morgan121 May 26 '20 at 22:15
  • @RAB The newly created variable was filled only with 0's, but I checked manually and found that there are definitely a few rows that classify as overdue. – Amy May 26 '20 at 22:53
  • In the data you have given us, none of the days are greater than the deadline, so they will all be 0. If there are specific instances that are not working, then you need to include them in your question! Help us out here, man. – morgan121 May 26 '20 at 22:55
  • @RAB I made the data up; in my real df I have these cases. I think I have done something wrong with the completion variable, or does it seem correct to you? I will edit this asap. – Amy May 26 '20 at 22:58
  • Well, you have used the wrong data frame in your ifelse statement (using cllw, whatever that is). I assume that's your problem. – morgan121 May 26 '20 at 23:00
  • @RAB I assume the mistake is that maybe it is not clearly defined that I only want R to consider rows where completion is 1? I have rerun everything and still do not get the right results. – Amy May 26 '20 at 23:15
  • Running exactly what you have in your question, I get 1,0,0,1,0,NA. Is this not what you want? – morgan121 May 26 '20 at 23:18
  • It is. Now I am even more confused. – Amy May 26 '20 at 23:19

1 Answer

It works for me:

df$overduetask <- ifelse(df$completion == 1 & df$day > df$deadline, 1, 0)

Did you perhaps misspell it, writing cllw$ instead of df$?

Hi, as I said, it works for me:

eduardo> str(df)
'data.frame':   6 obs. of  5 variables:
 $ task       : num  1 2 3 4 5 NA
 $ day        : num  10 6 7 9 9 10
 $ deadline   : num  7 12 9 7 9 NA
 $ completion : num  1 1 1 1 0 NA
 $ overduetask: num  1 0 0 1 0 NA
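
For reference, here is a self-contained sketch of the same check on the sample data from the question. It also shows why the last value is NA: a missing completion or deadline propagates through `&` and `ifelse()`. The `overdue_no_na` column is a hypothetical addition of mine (it is not in the question), in case you would rather count rows with missing values as "not overdue".

```r
# Sample data copied from the question
df <- data.frame(task       = c(1, 2, 3, 4, 5, NA),
                 day        = c(10, 6, 7, 9, 9, 10),
                 deadline   = c(7, 12, 9, 7, 9, NA),
                 completion = c(1, 1, 1, 1, 0, NA))

# Logical condition: completed AND day past the deadline (NA where inputs are NA)
overdue <- df$completion == 1 & df$day > df$deadline

# Same dummy as in the question: 1 0 0 1 0 NA
df$overduetask <- ifelse(overdue, 1, 0)

# Hypothetical variant: `%in% TRUE` maps NA to FALSE, so NA rows become 0
df$overdue_no_na <- as.numeric(overdue %in% TRUE)

print(df)
```

If your real data produces only 0's with this exact code, the difference has to be in the data itself rather than in the `ifelse()` call.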

I have a suspicion about what your problem could be, since it has happened to me many times in R. When you check completion == 1, the test may be failing because of rounding problems, for example if completion is stored as a double (numeric) whose values are not exactly 1. You can try:

df$overduetask <- ifelse(as.integer(df$completion) == 1 & df$day > df$deadline, 1,0)
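
Since I cannot see your real data, here is a minimal sketch of how I would check the column types before comparing. Everything in it is an assumption: `real_df` is a made-up stand-in for your real data frame, and the idea that the columns were read in as text is only one possible cause. Note also that `as.integer()` truncates toward zero (0.9999999 becomes 0), so the sketch uses `round()` instead.

```r
# Hypothetical stand-in for the real data, with columns read in as text
real_df <- data.frame(day        = c("10", "6", "9"),
                      deadline   = c("7", "12", "9"),
                      completion = c("1", "1", "0"),
                      stringsAsFactors = FALSE)

# Step 1: inspect the column classes; "character" or "factor" is a warning sign
print(sapply(real_df, class))

# Step 2: on character columns, `>` compares strings, not numbers
print(real_df$day > real_df$deadline)

# Step 3: convert to numeric, then build the dummy;
# round() guards against values that are only approximately 1
num_cols <- c("day", "deadline", "completion")
real_df[num_cols] <- lapply(real_df[num_cols], as.numeric)

real_df$overduetask <- ifelse(round(real_df$completion) == 1 &
                                real_df$day > real_df$deadline, 1, 0)
print(real_df)
```

If `sapply(real_df, class)` already reports "numeric" or "integer" for all three columns in your data, this is not the cause and the values themselves need a closer look.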

I hope it helps

  • This was a mistake in the question, sorry about that. My problem is that my output variable consists only of 0's, even though I saw that some rows should classify as overdue. In the example data it is the first row, where day is greater than deadline. – Amy May 26 '20 at 23:03