
Consider the following named vector x.

( x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8]) )
# a  b  c  d  e  f  g  h 
# 1  2  0 NA  4 NA NA  6 

I'd like to calculate the cumulative sum of x while ignoring the NA values. Many R functions have an argument na.rm which removes NA elements prior to calculations. cumsum() is not one of them, which makes this operation a bit tricky.

I can do it this way.

y <- setNames(numeric(length(x)), names(x))
z <- cumsum(na.omit(x))
y[names(y) %in% names(z)] <- z
y[!names(y) %in% names(z)] <- x[is.na(x)]
y
# a  b  c  d  e  f  g  h 
# 1  3  3 NA  7 NA NA 13 

But this seems excessive, and makes a lot of new assignments/copies. I'm sure there's a better way.

What better methods are there to return the cumulative sum while effectively ignoring NA values?

Rich Scriven

5 Answers

You can do this in one line with:

cumsum(ifelse(is.na(x), 0, x)) + x*0
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13

Or, similarly:

library(dplyr)
cumsum(coalesce(x, 0)) + x*0
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13 
josliber
    what does `x*0` do here? – Denis Mar 26 '19 at 18:00
    @Denis `x*0` takes value `NA` if the value in `x` is missing and otherwise takes value 0. So adding `x*0` basically just replaces by `NA` whenever the original value was missing. – josliber Mar 26 '19 at 18:37
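
The masking step described in this comment can be looked at in isolation. The following is a minimal sketch using the vector from the question; the names `filled`, `mask`, and `result` are introduced here only for illustration.

```r
x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])

# Replace NA with 0 so cumsum() can run over the whole vector
filled <- ifelse(is.na(x), 0, x)

# x * 0 is 0 where x is observed and NA where x is missing
mask <- x * 0
mask
#  a  b  c  d  e  f  g  h 
#  0  0  0 NA  0 NA NA  0 

# Adding the mask puts NA back at the originally missing positions
result <- cumsum(filled) + mask
result
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13 
```

One caveat worth keeping in mind: `Inf * 0` is `NaN` in R, so this mask assumes the observed values are finite.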

This is an old question, but tidyr offers another solution based on the idea of replacing NA with zero. Note that the NA positions then carry the running total forward instead of staying NA.

require(tidyr)

cumsum(replace_na(x, 0))

#  a  b  c  d  e  f  g  h 
#  1  3  3  3  7  7  7 13 
DJV
    This includes the zero in the calculation of the mean, but I think the post said that he wanted to ignore those values in the calculation. Both things are different. – Liliana Pacheco May 27 '19 at 20:49

Do you want something like this?

x2 <- x
x2[!is.na(x)] <- cumsum(x2[!is.na(x)])

x2

[edit] Alternatively, as suggested in a comment above, you can change the NAs to 0s, take the cumulative sum, and then restore the NAs:

miss <- is.na(x)
x[miss] <- 0
cs <- cumsum(x)
cs[miss] <- NA
# cs is the requested cumsum
lebatsnok
    one-liner doing the same thing: `"[<-"(x, !is.na(x), cumsum(na.omit(x)))` – lebatsnok Aug 30 '14 at 18:24
    Isn't the more readable version of the same thing `x[!is.na(x)] <- cumsum(na.omit(x))`? – Simon O'Hanlon Aug 31 '14 at 00:38
    It's more readable but it's not the same thing. `"[<-"(x, bla...` does what OP asked *without changing x*, your version does subset assignment on x and returns `cumsum(na.omit(x))`. So it's by far not the same thing. - A more readable version of the one-liner, doing the same thing, would be this: `replace(x, !is.na(x), cumsum(na.omit(x)))` – lebatsnok Sep 01 '14 at 06:41
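
To make the distinction in this comment concrete, here is a small sketch of the `replace()` version applied to the vector from the question. The name `cs` is only illustrative; the point is that `x` keeps its raw values afterwards.

```r
x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])

# replace() returns a modified copy; x itself is not changed
cs <- replace(x, !is.na(x), cumsum(na.omit(x)))
cs
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13 

# The original vector still holds the raw values
x
#  a  b  c  d  e  f  g  h 
#  1  2  0 NA  4 NA NA  6 
```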

Here's a function I came up with based on the answers to this question. I thought I'd share it, since it seems to work well so far. It calculates the cumulative FUNC of x while ignoring NA values. FUNC can be any one of sum(), prod(), min(), or max(), and x is a numeric vector.

cumSkipNA <- function(x, FUNC)
{
    d <- deparse(substitute(FUNC))
    funs <- c("max", "min", "prod", "sum")
    stopifnot(is.vector(x), is.numeric(x), d %in% funs)
    FUNC <- match.fun(paste0("cum", d))
    x[!is.na(x)] <- FUNC(x[!is.na(x)])
    x
}

set.seed(1)
x <- sample(15, 10, TRUE)
x[c(2,7,5)] <- NA
x
# [1]  4 NA  9 14 NA 14 NA 10 10  1
cumSkipNA(x, sum)
# [1]  4 NA 13 27 NA 41 NA 51 61 62
cumSkipNA(x, prod)
# [1]      4     NA     36    504     NA   7056     NA
# [8]  70560 705600 705600
cumSkipNA(x, min)
# [1]  4 NA  4  4 NA  4 NA  4  4  1
cumSkipNA(x, max)
# [1]  4 NA  9 14 NA 14 NA 14 14 14 

Definitely nothing new, but maybe useful to someone.

Rich Scriven

Another option is the fcumsum() function from the collapse package, which here skips the NA values and leaves them in place:

( x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8]) )
#>  a  b  c  d  e  f  g  h 
#>  1  2  0 NA  4 NA NA  6
library(collapse)
fcumsum(x)
#>  a  b  c  d  e  f  g  h 
#>  1  3  3 NA  7 NA NA 13

Created on 2022-08-24 with reprex v2.0.2

Quinten