0

I am new to R and Stack Overflow, so this will probably have a very simple solution.

I have a set of data from 20 different subjects. In the future I will have to perform a lot of different actions on this data and repeat each action for every individual data set, analyzing them separately and then recombining them. My question is how I can automate this process:

P4 <- read.delim("P4Rtest.txt")
P7 <- read.delim("P7Rtest.txt")
P13 <- read.delim("P13Rtest.txt")

and so on for the remaining subjects.

I have tried looping with a for loop but seem to get stuck on creating a new data.frame with a unique name every time.

Thank you for your help

J.Jansen
    There are many posts on this question on SO. See [here](http://stackoverflow.com/questions/11218498/reading-multiple-files-into-multiple-data-frames-in-r) and [here](http://stackoverflow.com/questions/36555020/reading-multiple-files-from-directory-in-r) for some ideas. It is regarded as best practice to read multiple files into a list object. You can see examples of this in the linked posts. There are also discussions of why this is a good idea on SO. See [here](http://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207) for example. – lmo May 19 '16 at 19:52

3 Answers

5

The R way to do this would be to keep all the data sets together in a named list. For that you can use the following, where n is the number of files. Note that this assumes the files are numbered consecutively from 1 to n.

nm <- paste0("P", 1:n)  ## create the names P1, P2, ..., Pn
dfList <- setNames(lapply(paste0(nm, "Rtest.txt"), read.delim), nm)

Now dfList will contain all the data sets. You can access them individually with dfList$P1 for P1, dfList$P2 for P2, and so on.
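
The file names in the question (P4, P7, P13) are not consecutive, so here is a minimal sketch of the same named-list idea that discovers the files with list.files instead of building the names from 1:n. The temporary directory, the sample columns (trial, score) and the variable names dataDir and files are assumptions made up for illustration, not part of the original answer.

```r
# Hypothetical setup: write three small tab-delimited files to a temp directory
dataDir <- tempfile("subjects")
dir.create(dataDir)
for (id in c(4, 7, 13)) {
  write.table(data.frame(trial = 1:3, score = id * (1:3)),
              file.path(dataDir, paste0("P", id, "Rtest.txt")),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

# Find every file that matches the P<number>Rtest.txt pattern
files <- list.files(dataDir, pattern = "^P[0-9]+Rtest\\.txt$", full.names = TRUE)

# Name each list element after its file, minus the "Rtest.txt" suffix
nm <- sub("Rtest\\.txt$", "", basename(files))
dfList <- setNames(lapply(files, read.delim), nm)

names(dfList)  # one name per file found
dfList$P7      # the data frame read from P7Rtest.txt
```

With real data you would point dataDir at the folder that holds the files and drop the setup loop.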

Rich Scriven
0

There are several different ways of doing this. You could combine all the data into one data frame using rbind. The first answer to this question has a good way of doing that: "Replace rbind in for-loop with lapply? (2nd circle of hell)".

If you combine everything into one data frame, you'll need to add a column that identifies the participant. So instead of

P4 <- read.delim("P4Rtest.txt")
...

You would have something like

# number.of.subjects must be set beforehand; files are assumed to be numbered 1..number.of.subjects
my.list <- vector("list", number.of.subjects)
for (participant.number in seq_len(number.of.subjects)) {
    # load individual participant data
    participant.filename <- paste("P", participant.number, "Rtest.txt", sep="")
    participant.df <- read.delim(participant.filename)
    # add a column identifying the participant:
    participant.df$participant.number <- participant.number
    my.list[[participant.number]] <- participant.df
}
# stack all participants into one data frame
solution <- do.call(rbind, my.list)

If you want to keep them separate data frames for some reason, you can keep them in a list (leave off the last rbind line) and use lapply(my.list, function(participant.df) { stuff you want to do }) whenever you want to do stuff to the data frames.
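
To make the "analyze separately, then recombine" step concrete, here is a rough sketch of the lapply pattern described above. The two small data frames stand in for my.list, and the score column and mean.score summary are hypothetical placeholders for whatever your real analysis is.

```r
# Hypothetical stand-in for my.list: two small participant data frames
my.list <- list(
  data.frame(participant.number = 1, score = c(2, 4, 6)),
  data.frame(participant.number = 2, score = c(10, 20, 30))
)

# Analyse each participant separately: one summary row per data frame
summaries <- lapply(my.list, function(participant.df) {
  data.frame(
    participant.number = participant.df$participant.number[1],
    mean.score = mean(participant.df$score)
  )
})

# Recombine the per-participant results into a single data frame
summary.df <- do.call(rbind, summaries)
summary.df
```

The same shape works for any per-participant function, as long as each call returns a data frame with the same columns.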

Erin
-2

You can use assign. Assuming all your files have a similar format as you have shown, this will work for you:

# Define how many files there are (with the numbers).
numFiles <- 10

# Run through that sequence.
for (i in 1:numFiles) {

  fileName <- paste0("P", i, "Rtest.txt") # Creating the name to pull from.
  file <- read.delim(fileName) # Reading in the file.
  dName <- paste0("P", i) # Creating the name to assign the file to in R.
  assign(dName, file) # Creating the file in R.

}

There are other methods that are faster and more compact, but I find this to be more readable, especially for someone who is new to R.

Additionally, if your numbers aren't a complete sequence like I've used here, you can define a vector of the numbers that are used and loop over it directly, with for (i in numFiles) instead of for (i in 1:numFiles):

numFiles <- c(1, 4, 10, 25)
giraffehere
  • Using the `assign` function in this manner is not a great solution. You can accomplish the task, but the data frames are not well organized versus putting them in a list. When you run a `for` loop, the expectation is that you are processing objects in some replicated manner. Isn't it better to have the output organized in a list rather than sticking them in the environment absent this organization? Sure you can use `get` with `paste` to reverse the operation, but if other objects have similar names, then you have to jump through hoops to filter the correct objects. – lmo May 21 '16 at 19:30
  • Personally, I like having the data frames exist as their own objects that can be seen in the top right corner of RStudio. I understand your reasoning, but in this case, it's a personal preference of mine. I did note that there are faster and more compact methods (as you say); I just find assign far more readable for someone new to R. Thanks for the comment. – giraffehere May 23 '16 at 14:11
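
For anyone who has already created separate objects with assign and later wants them in a list, one possible way back is base R's mget, which looks up several objects by name at once. This is only a sketch: the ids vector and the toy data frames are made up for illustration, and it assumes no unrelated objects share the P<number> names.

```r
# Hypothetical objects, as the assign() loop would create them
ids <- c(1, 4, 10, 25)
for (i in ids) {
  assign(paste0("P", i), data.frame(id = i, value = i * 2))
}

# Collect the separately named data frames back into one named list
dfList <- mget(paste0("P", ids))

names(dfList)
```

Building the names from the same ids vector avoids having to filter the environment by pattern, which is the hoop-jumping the first comment warns about.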