Below is a micro-version of my data set. I would like to loop through the three variables without using a `for` loop.

Reproducible Version of the Subject Data Set

cat1 <- rep(c("A","S"), each=40) # First variable
cat2 <- rep(c("C","M","Y","K"), each=10, times=2) # Second variable
num1 <- rep(c(0,10,20,30,40,45,65,80,90,100), 8) # Third variable
rspns <-structure(list(dataSet = c(0.2484375, 0.3959375, 0.4875, 0.57875, 0.6696875, 0.694375,
    0.816875, 0.879375, 0.9121875, 0.93125, 0.25125, 0.3796875, 0.4609375, 0.5396875, 0.6159375,
    0.6515625, 0.7696875, 0.8384375, 0.864375, 0.8865625, 0.271875, 0.39875, 0.4821875,
    0.5628125, 0.6284375, 0.650625, 0.7553125, 0.8003125, 0.8103125, 0.8125, 0.251875, 0.3775,
    0.4703125, 0.5725, 0.6996875, 0.7378125, 0.945, 1.055625, 1.1021875, 1.1140625, 0.25125,
    0.4203125, 0.5215625, 0.615625, 0.71, 0.74, 0.865625, 0.9246875, 0.9603125, 0.9734375,
    0.256875, 0.3953125, 0.4775, 0.5528125, 0.62875, 0.65875, 0.78375, 0.8384375, 0.8653125,
    0.8740625, 0.2790625, 0.4215625, 0.515, 0.6009375, 0.6665625, 0.693125, 0.7959375, 0.83875,
    0.8490625, 0.8575, 0.2571875, 0.3759375, 0.4665625, 0.56375, 0.68875, 0.725, 0.9259375,
    1.0328125, 1.085625, 1.1096875)), .Names = "rspns",
  class = "data.frame", row.names = c(NA, -80L))
gain <- data.frame(cat1,cat2,num1, rspns=rspns)

tint <-  function(x,y,z) gain[cat1 == x & cat2 == y & num1 == z, 4]
dgain <- function(x,y,z){(100* (1-10^-(tint(x,y,z) - tint(x,y,0))) / (1-10^-(tint(x,y,100)- tint(x,y,0)))) - z}

How can I get the dgain function, which takes three parameters, to loop through cat1, cat2 and num1 variables without using for loops?

Your help and technical expertise are greatly appreciated.

CoffeeRain
Ragy Isaac

1 Answer

Here's the basic idea of what I was getting at with my comments:

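# With a basic for loop for reference
out1 <- rep(NA_real_, length(cat1))
for (i in seq_along(cat1)) {
    out1[i] <- dgain(cat1[i], cat2[i], num1[i])
}

# Same computation with mapply: dgain is called once per (x, y, z) triple
out2 <- mapply(dgain, x = cat1, y = cat2, z = num1)

# Passing the whole vectors straight to dgain (this does not give the right answer)
out3 <- dgain(cat1, cat2, num1)

> all.equal(out1, unname(out2))
[1] TRUE
> all.equal(out1, out3)
[1] "Mean relative difference: 0.2579067"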

It's probably possible to refactor your code so that the final idiom, dgain(cat1,cat2,num1), returns the correct result, but it would take more tinkering than I have time for. The issue is that tint subsets a data frame from your global environment using ==, which won't return what you want if you pass it vector arguments. In the call above, for example, tint(x,y,0) returns every row where num1 is 0 (8 values) instead of one baseline per input element, and R silently recycles those 8 values across the 80 inputs.
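As a rough sketch of what such a refactor could look like, the example below swaps the == subsetting for a match() lookup on a combined key. It uses a small made-up data frame (toy) with the same layout as gain so that it runs on its own; the names toy, tint_toy, tint_vec and dgain_with are hypothetical helpers introduced only for this illustration, and the key-pasting approach is one option among several (merge would be another).

```r
# Toy data with the same layout as `gain`; the response values are invented.
toy <- expand.grid(num1 = c(0, 50, 100), cat2 = c("C", "M"), cat1 = c("A", "S"),
                   stringsAsFactors = FALSE)
toy$rspns <- 0.25 + 0.006 * toy$num1 + 0.01 * seq_len(nrow(toy))

# Scalar lookup in the style of the question's tint(): fine for single values.
tint_toy <- function(x, y, z) {
  toy$rspns[toy$cat1 == x & toy$cat2 == y & toy$num1 == z]
}

# With vector arguments the == version returns one value per matching row
# (here the 4 rows with num1 == 0), not one value per input element.
length(tint_toy(toy$cat1, toy$cat2, 0))

# Vectorised lookup: one key per table row, then match() each requested key.
tint_vec <- function(x, y, z) {
  key <- paste(toy$cat1, toy$cat2, toy$num1, sep = "|")
  toy$rspns[match(paste(x, y, z, sep = "|"), key)]
}

# Same formula as dgain in the question, parameterised by the lookup function.
dgain_with <- function(tint_fun) {
  function(x, y, z) {
    d0   <- tint_fun(x, y, z)   - tint_fun(x, y, 0)
    d100 <- tint_fun(x, y, 100) - tint_fun(x, y, 0)
    100 * (1 - 10^-d0) / (1 - 10^-d100) - z
  }
}

dgain_scalar <- dgain_with(tint_toy)
dgain_vec    <- dgain_with(tint_vec)

# Element-by-element result (mapply) versus a single vectorised call.
loop_like  <- unname(mapply(dgain_scalar, toy$cat1, toy$cat2, toy$num1))
vectorised <- dgain_vec(toy$cat1, toy$cat2, toy$num1)
all.equal(loop_like, vectorised)
```

The point of match() here is that it returns exactly one row index per requested key, so the baseline tint(x, y, 0) lines up with each input element instead of being recycled.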

Nothing special is going on with mapply. It's almost literally doing the same thing as the for loop, only perhaps slightly more efficiently and perhaps in a more aesthetically pleasing manner.

joran
  • joran, you are correct, the idiom (dgain(cat1,cat2,num1)) does not provide the correct answer. That is why I am interested in mapply or sapply. How should I rewrite the functions? Inputting the three parameters into the current functions manually provides the correct answer. Can you please outline what I should be doing? – Ragy Isaac Mar 12 '13 at 17:14
  • @RagyIsaac I showed you the `mapply` solution. I'm not really interested in re-writing your code for you. You'll probably have to use `match` instead of `==` inside `tint`, though. – joran Mar 12 '13 at 17:17
  • Thanks joran, this helped me quite a bit. – Ragy Isaac Mar 16 '13 at 15:50