
I have the following data set of Velocity and Stroke (Stk). Each new stroke is represented by a 1 in the Stk column:

Data <- structure(c(2.1, 2.4, 2.5, 2.2, 2.6, 2.8, 2.2, 2.9, 2, 7, 1, 
NA, NA, NA, NA, 1, NA, NA, NA, NA), .Dim = c(10L, 2L), .Dimnames = list(
    NULL, c("Velocity", "Stk")))

Data <- as.data.frame(Data)

I would like to add a new column, StrokeNumber, that counts the strokes. For example, rows from the first occurrence of the number 1 onward get StrokeNumber = 1, and rows from the second occurrence onward get StrokeNumber = 2. This is so I can calculate the average velocity for each stroke.

Desired output is below:

# A tibble: 10 x 4
# Groups:   StrokeNumber [2]
   Velocity   Stk StrokeNumber Velocity_mean
      <dbl> <dbl>        <dbl>         <dbl>
 1      2.1     1            1          2.36
 2      2.4    NA            1          2.36
 3      2.5    NA            1          2.36
 4      2.2    NA            1          2.36
 5      2.6    NA            1          2.36
 6      2.8     1            2          2.54
 7      2.2    NA            2          2.54
 8      2.9    NA            2          2.54
 9      2.7    NA            2          2.54
10      2.1    NA            2          2.54

My actual data is much longer and has values such as 2 and 3 in the Stk column as well. It seems like it should be simple but I can't figure out how to do it.

zx8754
wattss
  • Try `cumsum(!is.na(Data[,2]))`. FYI, the `Data` you provided is not a data frame – Sotos Jul 21 '20 at 07:56
  • What should happen with Strokenumber when Stk == 2 ? – Wimpel Jul 21 '20 at 08:00
  • I am only interested if `Stk` = 1 as this is the beginning of the stroke. Other numbers indicate different parts of the stroke. I could just convert the other numbers to `NA` but @RonakShah has dealt with this below. – wattss Jul 21 '20 at 08:05
  • Both of the answers below won't work since the example you shared is **NOT** a data frame – Sotos Jul 21 '20 at 08:06

1 Answer


Increment the stroke number whenever the value in the Stk column is not NA and equals 1. You can then take the mean of Velocity within each resulting group.

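library(dplyr)

Data %>%
  # cumsum() over the logical vector adds 1 at every row where a stroke starts
  group_by(StrokeNumber = cumsum(!is.na(Stk) & Stk == 1)) %>%
  # mutate() on grouped data repeats each group's mean on all of its rows
  mutate(velocity_mean = mean(Velocity))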

# A tibble: 10 x 4
# Groups:   StrokeNumber [2]
#   Velocity   Stk StrokeNumber velocity_mean
#      <dbl> <dbl>        <int>         <dbl>
# 1      2.1     1            1          2.36
# 2      2.4    NA            1          2.36
# 3      2.5    NA            1          2.36
# 4      2.2    NA            1          2.36
# 5      2.6    NA            1          2.36
# 6      2.8     1            2          3.38
# 7      2.2    NA            2          3.38
# 8      2.9    NA            2          3.38
# 9      2      NA            2          3.38
#10      7      NA            2          3.38
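
For anyone who prefers not to load dplyr, the same idea can be sketched in base R. This is only an illustration under stated assumptions: the data frame below is hypothetical (the question's Velocity values with made-up 2s and 3s added to Stk, as the asker describes for the real data), and `stroke_means` is a name introduced here, not something from the question.

```r
# Hypothetical data: the question's Velocity values, with Stk values 2 and 3
# added to mimic the "other parts of the stroke" the asker mentions.
Data <- data.frame(
  Velocity = c(2.1, 2.4, 2.5, 2.2, 2.6, 2.8, 2.2, 2.9, 2.0, 7.0),
  Stk      = c(1, NA, 2, NA, 3, 1, NA, 2, NA, 3)
)

# %in% returns FALSE for NA, so a separate is.na() check is not needed.
# Only rows where Stk is exactly 1 increment the running count.
Data$StrokeNumber <- cumsum(Data$Stk %in% 1)

# ave() computes mean() within each StrokeNumber group and returns a vector
# aligned with the original rows, like mutate() on grouped data.
Data$velocity_mean <- ave(Data$Velocity, Data$StrokeNumber, FUN = mean)

# One row per stroke, in case only the per-stroke averages are needed.
stroke_means <- aggregate(Velocity ~ StrokeNumber, data = Data, FUN = mean)

print(Data)
print(stroke_means)
```

Using `%in%` instead of `==` is a design choice rather than a requirement: it folds the NA check into the comparison, so rows with 2, 3 or NA in Stk all leave the counter unchanged.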

data

Data <- data.frame(Data)
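
The conversion matters because the `structure()` call in the question, taken on its own, builds a numeric matrix, which is what the comments under the question point out. A quick check, reusing the values exactly as posted, might look like this:

```r
# Rebuild the object as posted in the question: a 10 x 2 numeric matrix.
Data <- structure(c(2.1, 2.4, 2.5, 2.2, 2.6, 2.8, 2.2, 2.9, 2, 7, 1,
NA, NA, NA, NA, 1, NA, NA, NA, NA), .Dim = c(10L, 2L), .Dimnames = list(
    NULL, c("Velocity", "Stk")))

# Record the class before conversion.
was_matrix <- is.matrix(Data)
was_df     <- is.data.frame(Data)

# Convert so that column names such as Stk can be used in group_by()/mutate().
Data <- data.frame(Data)

print(c(was_matrix = was_matrix, was_df = was_df, is_df_now = is.data.frame(Data)))
str(Data)
```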
Ronak Shah