2

I got stuck at a fairly easy data munging task. I have a transactional data frame in R that resembles this one:

id<-c(11,11,22,22,22)
dates<-as.Date(c('2013-11-15','2013-11-16','2013-11-15','2013-11-16','2013-11-17'), "%Y-%m-%d")
example<-data.frame(id=id,dates=dates)

  id      dates
1 11 2013-11-15
2 11 2013-11-16
3 22 2013-11-15
4 22 2013-11-16
5 22 2013-11-17

I'm looking for a way to retain the date of the previous transaction. The resulting table would look like this:

previous_dates<-as.Date(c('','2013-11-15','','2013-11-15','2013-11-16'), "%Y-%m-%d")
example2<-data.frame(id=id,dates=dates, previous_dates=previous_dates)

  id      dates previous_dates
1 11 2013-11-15           <NA>
2 11 2013-11-16     2013-11-15
3 22 2013-11-15           <NA>
4 22 2013-11-16     2013-11-15
5 22 2013-11-17     2013-11-16

I looked into other similar problems and one solution that is very close to what I want is:

library(data.table)
dt <- as.data.table(example)

prev_date <- function(x) c(x[1],x)

dt[,prev:=prev_date(dates), by=id]

The problem with this one is that if there is no previous date (like in the case of id=11 dates=2013-11-15) the function would output the same date resulting in:

  id      dates previous_dates
1 11 2013-11-15     2013-11-15
2 11 2013-11-16     2013-11-15

Can someone help, please?

Nikolay Nenov
  • 547
  • 1
  • 8
  • 27

3 Answers3

4
library(plyr)
example <- ddply(example, .(id), transform, 
                                  previous_dates=c(as.Date(NA), head(dates, -1)))
  id      dates previous_dates
1 11 2013-11-15           <NA>
2 11 2013-11-16     2013-11-15
3 22 2013-11-15           <NA>
4 22 2013-11-16     2013-11-15
5 22 2013-11-17     2013-11-16
Roland
  • 127,288
  • 10
  • 191
  • 288
3
example$previous_dates <- ave(example$dates, example$id,
                 FUN=  function(dt) c.Date( c(NA, dt[-length(dt)])
                                              ))
> example
  id      dates previous_dates
1 11 2013-11-15           <NA>
2 11 2013-11-16     2013-11-15
3 22 2013-11-15           <NA>
4 22 2013-11-16     2013-11-15
5 22 2013-11-17     2013-11-16

Playing around with the classes of Date objects .... this also works:

example$previous_dates <- ave(example$dates, example$id,
             FUN=  function(dt) structure( 
                                  c(NA, dt[-length(dt)]),
                                  class="Date" )  )
IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • Actually your solution works faster than Roland's when applied to a large dataset. – Nikolay Nenov Nov 28 '13 at 20:26
  • @NikolayNenov If your dataset is really large and there are many ids, you should use data.table. I only didn't show yesterday how to do that because I didn't find a solution to conserve the Date class and would have needed to do something like Dwin used in the first revision of his answer. – Roland Nov 29 '13 at 08:41
1

Just another approach:

transform(example, previous_dates = ave(dates, id, FUN = 
                                         function(x) x[c(NA, (seq_along(x)-1))]))

  id      dates previous_dates
1 11 2013-11-15           <NA>
2 11 2013-11-16     2013-11-15
3 22 2013-11-15           <NA>
4 22 2013-11-16     2013-11-15
5 22 2013-11-17     2013-11-16
Sven Hohenstein
  • 80,497
  • 17
  • 145
  • 168