
I have two columns containing data. While comparing these columns, I got a FALSE return on this row, which kind of stumped me:

dat[82,"UG_accept_avg_total.x"]
## [1] 1.842105
dat[82,"UG_accept_avg_total.y"]
## [1] 1.842105
dat[82,"UG_accept_avg_total.x"]==dat[82,"UG_accept_avg_total.y"]
## [1] FALSE

I read the answer to this question, which explained why my problem occurs, but it didn't help me much, because:

all.equal(dat[82,"UG_accept_avg_total.x"],dat[82,"UG_accept_avg_total.y"])
## "Mean relative difference: 1.427714e-07"
isTRUE(all.equal(dat[82,"UG_accept_avg_total.x"],dat[82,"UG_accept_avg_total.y"]))
## [1] FALSE

I could just shave off some digits after the decimal point, since 3 decimal places are probably enough, but checking all data fields (over 250,000) in my data set to do this would be a rather wasteful use of resources. Does anyone have a better suggestion? Is there a way to decrease the "sensitivity" of isTRUE(all.equal(x,y))?
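
To illustrate what I mean by shaving off digits, here is a minimal sketch with made-up values standing in for two of my fields:

x <- 1.8421052
y <- 1.8421049

x == y
## [1] FALSE
round(x, 3) == round(y, 3)
## [1] TRUE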

Xizam

1 Answer


Use the tolerance argument in all.equal.

This works on my machine:

x <- 0.0000001
y <- 0.0000002

isTRUE(all.equal(x, y))
## [1] FALSE

isTRUE(all.equal(x, y, tolerance=10^-7))
## [1] TRUE
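
For the case in the question, the same idea is to pass a looser tolerance than the default (about 1.5e-8). This is only a sketch assuming the dat columns from the question; with the reported mean relative difference of roughly 1.4e-7, a tolerance of 1e-6 should be loose enough:

# Tolerance well above the reported relative difference of ~1.4e-7,
# so this comparison should now come out TRUE
isTRUE(all.equal(dat[82, "UG_accept_avg_total.x"],
                 dat[82, "UG_accept_avg_total.y"],
                 tolerance = 1e-6))

# To flag mismatches across all 250,000+ fields at once, a vectorized
# absolute-difference check avoids calling all.equal cell by cell
# (note: this uses an absolute rather than a relative tolerance)
which(abs(dat$UG_accept_avg_total.x - dat$UG_accept_avg_total.y) > 1e-6)
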
sebastian-c