4

I've a dataframe as under

+----+-------+---------+
| ID | VALUE |  DATE   |
+----+-------+---------+
|  1 |    10 | 2019-08 |
|  2 |    12 | 2018-05 |
|  3 |    45 | 2019-03 |
|  3 |    33 | 2018-03 |
|  1 |     5 | 2018-08 |
|  2 |    98 | 2019-05 |
|  4 |    67 | 2019-10 |
|  4 |    34 | 2018-10 |
|  1 |    55 | 2018-07 |
|  2 |    76 | 2019-08 |
|  2 |    56 | 2018-12 |
+----+-------+---------+

What I'm trying to do here is to split the value and date into value1 and value2 and data1 and date2 based on the current year(year of systemdate) and the year before

But the condition here is if the date-month combination in DATE of the main table matched to that of current systemdate then donot consider last years date

Also disregard all the values dates that appear before the year of systemdate The resulting output would be as under

Over here in the result ID 1,2 and 3 had corresponding values for same month in this year and last year so we split them in 2 different columns Also we didn't consider last years result of ID 4 as its month this year matches with year-month combination of systemdate and we also disregard all the values from lat year that don't have a corresponding month match this year ( ID 1 for 2018-07 and 2 for 2018-12 in this example)

+----+---------+---------+--------+--------+
| ID |  DATE1  |  DATE2  | VALUE1 | VALUE2 |
+----+---------+---------+--------+--------+
|  1 | 2019-08 | 2018-08 |     10 | 5      |
|  2 | 2019-05 | 2018-05 |     98 | 12     |
|  3 | 2019-03 | 2018-03 |     45 | 33     |
|  4 | 2019-10 | NA      |     67 | NA     |
|  2 | 2019-08 | NA      |     76 | NA     |
+----+---------+---------+--------+--------+
user11845701
  • 157
  • 1
  • 9

1 Answers1

2

I think you could get everything in the right format first:

df <- data.frame(ID = c(1, 2, 3, 3, 1, 2, 4, 4, 1, 2, 2), 
 VALUE = c(10, 12, 45, 33, 5, 98, 67, 34, 55, 76, 56), 
 DATE = c("2019-08", "2018-05", "2019-03","2018-03", 
    "2018-08","2019-05", "2019-10", "2018-10", 
    "2018-07", "2019-08", "2018-12"))

library(tidyverse)
df <- df %>% mutate(
  year = str_split_fixed(DATE, "-", 2)[,1],  
  month = str_split_fixed(DATE, "-", 2)[,2]) %>% 
  pivot_wider(
    names_from = year,
    values_from = c(VALUE, DATE)) 

Then, you could filter and remove those values that you do not need according to your logic. I may not fully understand your system time here, but just assume it is the string "2019-10". It could be something like this:

df %>% 
  filter(!is.na(VALUE_2019)) %>% 
  mutate(
    VALUE_2018 = ifelse(DATE_2019 == "2019-10", NA, VALUE_2018), 
    DATE_2018 = ifelse(DATE_2019 == "2019-10", NA, as.character(DATE_2018)))

# A tibble: 5 x 6
     ID month VALUE_2019 VALUE_2018 DATE_2019 DATE_2018
  <dbl> <chr>      <dbl>      <dbl> <fct>     <chr>    
1     1 08            10          5 2019-08   2018-08  
2     2 05            98         12 2019-05   2018-05  
3     3 03            45         33 2019-03   2018-03  
4     4 10            67         NA 2019-10   NA       
5     2 08            76         NA 2019-08   NA   
Zhiqiang Wang
  • 6,206
  • 2
  • 13
  • 27