I have an R data frame with 20 columns, one for each model. The rows of the data frame hold the summary statistics for a boxplot. I want to draw one boxplot per model, using the rows of the data frame as the boxplot parameters.

Below is one example:

        Model 1    Model 2   ...  Model 20
min       1           5              15
q25       2           7              16
median    3           8              20
q75       4           9              21
max       5           10             22

As can be seen, the statistics are already calculated. I just need to pass them to the boxplot, but I have no idea how to do that.

Rods2292
  • Does this answer your question? [Draw bloxplots in R given 25,50,75 percentiles and min and max values](https://stackoverflow.com/questions/11129432/draw-bloxplots-in-r-given-25-50-75-percentiles-and-min-and-max-values) – jpsmith Sep 03 '22 at 23:04
  • Curious, how did you calculate those stats? If in R, run `boxplot` on that previous step. – Parfait Sep 03 '22 at 23:51
  • @jpsmith I am seeing weird behaviour. `bxp` expects a list. I pass a list and I get an error: `Error in is.finite(z$stats) : default method not implemented for type 'list'`. I am literally using the same code as in the question you linked and passing my df as a list to `bxp` – Rods2292 Sep 04 '22 at 12:46

2 Answers

If you are willing to use ggplot2, you could try something like this:

Set up a fake dataset. Apparently, you need that to run ggplot() + geom_boxplot():

library(ggplot2)
df <- data.frame("Model" = "Model 1")

Then you can control the single boxplot components like this:

ggplot(df, aes(x = Model,
               ymin = 5,      # min
               lower = 20,    # q25
               middle = 25,   # median
               upper = 50,    # q75
               ymax = 100)) + # max
  geom_boxplot(stat = "identity")

Analogous for multiple models:

df <- data.frame("Model" = c("Model 1", "Model 2"))

ggplot(df, aes(x = Model,
               ymin = c(5, 9),
               lower = c(20, 46),
               middle = c(25, 55),
               upper = c(50, 89),
               ymax = c(100, 111))) +
  geom_boxplot(stat = "identity")
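
If the statistics already sit in a data frame laid out like the one in the question (one column per model, one row per statistic), the values do not have to be typed into aes() by hand. A minimal sketch, assuming the row names are min, q25, median, q75 and max as in the question; the object names stats and long are made up for illustration:

```r
library(ggplot2)

# Statistics as in the question: one column per model, one row per statistic
stats <- data.frame(
  Model1  = c(1, 2, 3, 4, 5),
  Model2  = c(5, 7, 8, 9, 10),
  Model20 = c(15, 16, 20, 21, 22),
  row.names = c("min", "q25", "median", "q75", "max")
)

# Transpose so that each model becomes a row and each statistic a column
long <- as.data.frame(t(stats))

# Keep the models in their original column order on the x axis
long$Model <- factor(rownames(long), levels = rownames(long))

# Map the precomputed statistics straight onto the boxplot aesthetics
p <- ggplot(long, aes(x = Model, ymin = min, lower = q25, middle = median,
                      upper = q75, ymax = max)) +
  geom_boxplot(stat = "identity")
print(p)
```

With the real 20-column data frame, only the construction of stats would change; the transpose and the plotting call should stay the same.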
OliverHennhoefer

What has not been explained so far is that bxp needs a matrix, not a data frame. Data frames are actually lists, which is why the error mentions lists. I assume you also have the sample sizes somewhere; here I rbind them as a new row.

dat <- rbind(dat, n=c(20, 14, 60))

So all you need to do is coerce the statistics with as.matrix.

bxp(list(stats=as.matrix(dat[1:5, ]), n=dat[6, ]))
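
To make the list-versus-matrix point concrete, here is a small self-contained sketch. It rebuilds the data from the Data section of this answer and keeps the sample sizes in a separate vector instead of an extra row; the sample sizes are the assumed values used above, not something given in the question, and the name z is only illustrative:

```r
# Data as in the "Data" section of this answer
dat <- read.table(header = TRUE, text = "Model1 Model2 Model20
min    1  5 15
q25    2  7 16
median 3  8 20
q75    4  9 21
max    5 10 22")

is.list(dat)             # TRUE: a data frame is a list of columns
is.list(as.matrix(dat))  # FALSE: the coerced matrix is a plain numeric array

# Assumed sample sizes (same values as in the answer), one per model
n <- c(20, 14, 60)

# bxp() takes a list whose stats component is a 5-row matrix,
# one column per box, ordered from minimum to maximum
z <- list(stats = as.matrix(dat), n = n, names = colnames(dat))
bxp(z)
```

Passing the data frame itself as stats is what triggers the is.finite error quoted in the comments, because is.finite has no method for lists.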

Data:

dat <- read.table(header=TRUE, text='Model1    Model2   Model20
min       1           5              15
q25       2           7              16
median    3           8              20
q75       4           9              21
max       5           10             22')
jay.sf