I have created a function which cleans up my data and plots it using ggplot. I want to name the cleaned data and the plot with a suffix so that they can be recalled easily.

For example:
data_frame
data_frame_cleaned
data_frame_plot

I haven't managed to find anything that might pull this off.

I read about using deparse(substitute(x)) to turn the variable into a string, so I gave it a shot together with paste().

First I import a new data frame and pass it to my function, which uses dplyr and ggplot:

my_data <- read.csv("my_data.csv")
analyze_data(my_data)

Then I want to store the cleaned data and the plot in the environment. Here is what I thought might work, but it doesn't:

analyze_data <- function(x){
    x_data <- x %>%
        filter()%>%
        group_by() %>%
        summarize() %>%
        mutate()
    x_plot <- ggplot(x_data)
    x_name <- deparse(substitute(x))
    assign(paste(x_name,"cleaned",sep="_"),x_data)
    assign(paste(x_name,"plot",sep="_"),x_plot)
}
I got a warning message instead:

Warning messages: 1: In assign(paste(x_name, "cost_plot", sep = "_"), campg_data) : only the first element is used as variable name

  • Where do you want to "store" them? Also why have an empty `filter`,`ggplot` and `group_by`? `deparse(substitute(.......)` is also incorrectly used. You need more arguments in your function. `assign` too. – NelsonGon May 07 '19 at 06:50
  • @NelsonGon Sorry, I guess I should have made a complete example. – Wai Yeong May 08 '19 at 02:06

1 Answer

Using assign to assign variables is not the best idea. You can litter your environment with lots of variables, which can become confusing, and makes it difficult to handle them programmatically. It's better to store your objects in something like a list, which allows you to extract data easily or modify it in sequence using the *apply or map_* functions. That said…
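As a rough sketch of the list-based approach, the function can return both objects in a named list instead of assigning them. The function name `analyze_data_list`, the element names `cleaned` and `plot`, the `datasets` and `results` lists, and the `cyl > 4` filter are all illustrative assumptions, not something taken from the question.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant: return the cleaned data and the plot together
# in a named list rather than assigning them as separate variables.
analyze_data_list <- function(x) {
    x_data <- x %>%
        filter(cyl > 4)   # placeholder cleaning step (assumption)
    x_plot <- ggplot(x_data, aes(mpg, disp)) + geom_point()
    list(cleaned = x_data, plot = x_plot)
}

# One named entry per input data frame; the names replace the suffixes.
datasets <- list(mtcars = mtcars)
results <- lapply(datasets, analyze_data_list)

head(results$mtcars$cleaned)   # the cleaned data
results$mtcars$plot            # the plot
```

With more than one data frame in `datasets`, every result stays reachable as `results$<name>$cleaned` and `results$<name>$plot`, so nothing has to be looked up by a pasted-together variable name.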

I cannot replicate the warning when I run your function more or less as it is above. Nevertheless, although the function seems to run just fine, it doesn't do what is desired, i.e. no new variables appear in .GlobalEnv. The issue is that you haven't specified the environment in which the variables should be assigned, so they are assigned within the function's own local environment and vanish when the function completes.
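The scoping behaviour can be seen in isolation with a small base R sketch. The helper `assign_locally` and the variable name `tmp_local` are made up for illustration, and the last line assumes no variable called `tmp_local` already exists in your session.

```r
# Hypothetical helper: assign() without `pos` or `envir` writes into the
# environment of the function that calls it, which is discarded on return.
assign_locally <- function() {
    assign("tmp_local", 1)
    # Check only the function's own environment, not its parents.
    exists("tmp_local", envir = environment(), inherits = FALSE)
}

found_inside <- assign_locally()    # TRUE: the variable existed inside the function
found_outside <- exists("tmp_local") # FALSE: nothing was created at the top level

found_inside
found_outside
```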

You can use pos = 1 to assign your variables within the .GlobalEnv. The following code creates the variables mtcars_cleaned and mtcars_plot in my .GlobalEnv:

library(dplyr)
library(ggplot2)

analyze_data <- function(x){
    x_data <- x %>%
        filter(cyl > 4)
    x_plot <- ggplot(x_data, aes(mpg, disp)) + geom_point()
    x_name <- deparse(substitute(x))
    assign(paste(x_name,"cleaned", sep="_"), x_data, pos = 1)
    assign(paste(x_name,"plot", sep="_"), x_plot, pos = 1)
}

analyze_data(mtcars)
  • Thanks! The `pos = 1` worked. I will have a look at `apply` and `map` functions, the examples you shared definitely made the process look simpler. – Wai Yeong May 08 '19 at 02:02