
I would like to assign a group number to each row based on which values fall below a threshold. I have created a small toy example to illustrate what I need to do in my very large data set. My dataset does have NA values, which should always be the first value in a group. I suspect this is what is causing the problem.

TB <- c(NA, 21706, 297, 1078, 61, 75, 6464, 10649, 3480, 7823, 3233, 83, 3646, 60)
thresh = 316

This is how I would like it formatted, so that values under the threshold are grouped with the value above them. I have included the time series data in the picture for further illustration, but it is not necessary for the code.

I have tried:

test.group <- dat %>%
  mutate(grp = cumsum(TB < 316))

but I only get NA values returned. Any help is appreciated.

ThomasIsCoding
mere

1 Answer


fcumsum from collapse skips NA elements by default (na.rm = TRUE), whereas base cumsum returns NA for every element after the first NA.

library(dplyr)
library(collapse)
# fcumsum() skips the NA row, so the counter keeps running past it
dat %>%
  mutate(grp = fcumsum(TB < 316))

-output

      TB grp
1     NA  NA
2  21706   0
3    297   1
4   1078   1
5     61   2
6     75   3
7   6464   3
8  10649   3
9   3480   3
10  7823   3
11  3233   3
12    83   4
13  3646   4
14    60   5
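
If adding collapse is not an option, the same grouping can probably be reproduced in base R. The sketch below is an assumption-laden alternative, not part of the original answer: it treats NA comparisons as FALSE while accumulating and then puts NA back in the rows where TB is missing. The helper names `naive`, `below`, and `grp` are illustrative only.

```r
# Toy data from the question
dat <- data.frame(TB = c(NA, 21706, 297, 1078, 61, 75, 6464, 10649,
                         3480, 7823, 3233, 83, 3646, 60))
thresh <- 316

# Base cumsum() propagates the leading NA to every later element,
# which is why the original attempt returned only NA
naive <- cumsum(dat$TB < thresh)

# Workaround: count NA comparisons as FALSE while accumulating ...
below <- dat$TB < thresh
below[is.na(below)] <- FALSE
grp <- cumsum(below)

# ... then restore NA in the rows where TB itself is missing
grp[is.na(dat$TB)] <- NA
dat$grp <- grp
```

On the toy data this gives the same grp column as the fcumsum output above. It uses the `thresh` variable rather than the hard-coded 316, so the cutoff only needs to change in one place.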

data

dat <- structure(list(TB = c(NA, 21706, 297, 1078, 61, 75, 6464, 10649, 
3480, 7823, 3233, 83, 3646, 60)), class = "data.frame", row.names = c(NA, 
-14L))
akrun