3

I am trying to calculate cumulative sums and am using mutate to create the new column. I am doing this for multiple columns and each has missing data in different places.

day    water    nitrogen
 1      4        5
 2      NA       6
 3      3        NA
 4      7        NA
 5      2        9
 6      NA       3
 7      2        NA
 8      NA       2
 9      7        NA
10      4        3

I try

mutate(df, sumwater = cumsum(water))%>%
mutate(sumnitrogen = cumsum(nitrogen)) 

or

mutate(df, sumwater = cumsum(water, na.rm = TRUE))%>%
mutate(sumnitrogen = cumsum(nitrogen, na.rm = TRUE))

neither works. I know we can do na.rm=TRUE in summarize functions. Is there a way to get it to work in mutate? I prefer a dplyr answer because in reality this part of a long chain of piping.

Nazer
  • 3,654
  • 8
  • 33
  • 47
  • 1
    While not dplyr specific, good answers abut `cumsum` and `NA` [here](http://stackoverflow.com/questions/25576358/calculate-cumsum-while-ignoring-na-values) – aosmith Oct 05 '15 at 21:51
  • 1
    `cumsum` has no `na.rm` argument, which isn't an issue specific to dplyr. See `?cumsum` for its docs. – Frank Oct 05 '15 at 21:51
  • 2
    You could also do something like `df %>% mutate(sumwater = cumsum(replace(water, is.na(water), 0)), sumnitrogen = cumsum(replace(nitrogen, is.na(nitrogen), 0)))` – David Arenburg Oct 05 '15 at 22:10

2 Answers2

5

You need to choose a replacement value for NA. You could do zero. But if these are true NA's, mean replacement seems to make more sense (or even geometric mean, if values are strictly greater than 0)

detach("package:dplyr")
library(plyr)
library(dplyr)

mutate(df, sumwater = water %>% mapvalues(NA, 0) %>% cumsum)

mutate(df, sumwater = water %>% mapvalues(NA, mean(water) ) %>% cumsum)
bramtayl
  • 4,004
  • 2
  • 11
  • 18
2

Another solution, that avoids loading plyr, is to instead use tidyr::replace_na (see http://tidyr.tidyverse.org/reference/replace_na.html):

library(dplyr)
library(tidyr)

df %>% replace_na(list(water = 0)) %>% mutate(sumwater = cumsum(water))
jerome
  • 143
  • 6