I have data that is structured like this:

msr_type <- c('Clinic','Hospital','Lab','Office')
result <- c(10.5, 21.9, 30.5, 14.5)
sec_q <- c(9.5, 15.2, 25, 9)
third_q <- c(11, 17, 34, 20)
four_q <- c(17, 25, 29, 25)
df_check <- data.frame(msr_type, result, sec_q, third_q, four_q)

It's easy enough plotting the result as a dot:

ggplot() + geom_point(data=df_check, aes(x=msr_type, y=result), colour="blue")

[Image: mock-up of the desired plot]

But is there a way to use geom_rect to plot the quartiles for each msr_type as rectangles stacked on top of each other, given that msr_type is a discrete variable?

PinkyL
  • What's `df_practice2`? Your sample data only creates `df_check`, & the ggplot produced using that data source looks nothing like the picture. – Z.Lin Aug 26 '17 at 04:25
  • That's a typo and as my question states, the difficulty is not creating a point - it's the stacked quartiles rectangles/bars in the background. It would have been too difficult to explain what I need using strictly words so I made a mock-up to indicate the general idea. – PinkyL Aug 26 '17 at 04:33

1 Answer

Here are two possible approaches, depending on your needs. In either case, though, I think geom_col would be easier. (It's possible to use geom_rect when your x-axis data is discrete, but it's not the most straightforward.)

Sample data (I switched Q3 & Q4 values for Lab as Q3's value was larger, which didn't make sense):

msr_type <- c('Clinic','Hospital','Lab','Office')
result <- c(10.5, 21.9, 30.5, 14.5)
sec_q <- c(9.5, 15.2, 25, 9)
third_q <- c(11, 17, 29, 20)
four_q <- c(17, 25, 34, 25)
df_check <- data.frame(msr_type, result, sec_q, third_q, four_q)
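For comparison, here is a rough sketch of what the geom_rect route could look like. It is not taken from the linked example; it assumes you are willing to map each category to its integer position and relabel a continuous x-axis. The names `x_pos`, `half_width`, `msr_levels` and `p_rect` are made up for illustration.

```r
library(ggplot2)

# Same sample data as above (Lab's Q3 / Q4 values swapped so that Q3 <= Q4)
df_check <- data.frame(
  msr_type = c("Clinic", "Hospital", "Lab", "Office"),
  result   = c(10.5, 21.9, 30.5, 14.5),
  sec_q    = c(9.5, 15.2, 25, 9),
  third_q  = c(11, 17, 29, 20),
  four_q   = c(17, 25, 34, 25)
)

# geom_rect needs numeric xmin / xmax, so convert each category
# to its integer position (1, 2, 3, ...) along the x-axis
msr_levels <- levels(factor(df_check$msr_type))
df_check$x_pos <- as.numeric(factor(df_check$msr_type, levels = msr_levels))
half_width <- 0.45  # assumed half-width of each rectangle

p_rect <- ggplot(df_check,
                 aes(xmin = x_pos - half_width, xmax = x_pos + half_width)) +
  # one band per quartile, each starting where the previous one ends
  geom_rect(aes(ymin = 0, ymax = sec_q), fill = "slategray1") +
  geom_rect(aes(ymin = sec_q, ymax = third_q), fill = "slategray2") +
  geom_rect(aes(ymin = third_q, ymax = four_q), fill = "slategray3") +
  geom_point(aes(x = x_pos, y = result), inherit.aes = FALSE) +
  # put the category names back on the (now continuous) x-axis
  scale_x_continuous(breaks = seq_along(msr_levels), labels = msr_levels) +
  xlab("") + ylab("")

p_rect
```

The extra bookkeeping for the x positions and axis labels is the main reason I'd reach for geom_col instead.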

Approach 1 (keeping the wide format of the original dataset):

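library(ggplot2)

# Draw the tallest bars (Q4) first so the shorter quartiles are overlaid on top
ggplot(df_check,
       aes(x = msr_type)) +
  geom_col(aes(y = four_q), fill = "slategray3") +
  geom_col(aes(y = third_q), fill = "slategray2") +
  geom_col(aes(y = sec_q), fill = "slategray1") +
  # the result is plotted last so the point sits in front of the bars
  geom_point(aes(y = result)) +
  xlab("") + ylab("")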

Since Q2 <= Q3 <= Q4, you can simply create one set of bars for each quartile and overlay them, drawing the largest first so the smaller ones stay visible. But if you need a legend for Q2 / Q3 / Q4, it's not that straightforward...

[Figure: plot produced by Approach 1]
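If you do want a legend while staying in the wide format, one possible workaround (a sketch, not part of the approach as originally written) is to map `fill` to a constant label inside `aes()` for each layer and then assign the colours with `scale_fill_manual`. The labels "Q2" / "Q3" / "Q4", the legend title and the name `p_legend` are my own choices here.

```r
library(ggplot2)

# Same sample data as above (Lab's Q3 / Q4 values swapped so that Q3 <= Q4)
df_check <- data.frame(
  msr_type = c("Clinic", "Hospital", "Lab", "Office"),
  result   = c(10.5, 21.9, 30.5, 14.5),
  sec_q    = c(9.5, 15.2, 25, 9),
  third_q  = c(11, 17, 29, 20),
  four_q   = c(17, 25, 34, 25)
)

p_legend <- ggplot(df_check,
                   aes(x = msr_type)) +
  # a constant string inside aes() makes ggplot treat fill as a mapped
  # variable, which is what triggers the legend
  geom_col(aes(y = four_q, fill = "Q4")) +
  geom_col(aes(y = third_q, fill = "Q3")) +
  geom_col(aes(y = sec_q, fill = "Q2")) +
  geom_point(aes(y = result)) +
  # named vector so each label gets the intended colour
  scale_fill_manual(name = "Quartile",
                    values = c(Q2 = "slategray1",
                               Q3 = "slategray2",
                               Q4 = "slategray3")) +
  xlab("") + ylab("")

p_legend
```

It works, but every new quartile means another hand-written layer, which is why the long-format version scales better.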

Approach 2 (converting the dataset to long format so that all the quartile values are in the same variable):

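library(dplyr)    # for %>% and mutate
library(ggplot2)

# Reshape to long format: one row per msr_type / quartile combination
df_check2 <- df_check %>%
  tidyr::gather(quartile, quartile.value, -msr_type, -result) %>%
  # fix the factor order so the legend reads Q2, Q3, Q4
  mutate(quartile = factor(quartile, levels = c("sec_q", "third_q", "four_q")))

ggplot(df_check2,
       aes(x = msr_type)) +
  # a dodge width of 0 places the bars for all quartiles at the same x position
  geom_col(aes(y = quartile.value, fill = quartile),
           position = position_dodge(0), width = 2.5) +
  geom_point(aes(y = result)) +
  scale_fill_manual(values = c("slategray1", "slategray2", "slategray3")) +
  xlab("") + ylab("")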

A legend is created by default using this approach. It's also more flexible if you have other quartiles / deciles / percentiles / etc to plot.

[Figure: plot produced by Approach 2]
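A side note on the reshaping step: `tidyr::gather` still works, but newer tidyr releases (1.0.0 and later) offer `pivot_longer` for the same job. A minimal sketch of the equivalent reshape, assuming such a tidyr version is installed (`df_long` is just an illustrative name):

```r
library(dplyr)
library(tidyr)

# Same sample data as above (Lab's Q3 / Q4 values swapped so that Q3 <= Q4)
df_check <- data.frame(
  msr_type = c("Clinic", "Hospital", "Lab", "Office"),
  result   = c(10.5, 21.9, 30.5, 14.5),
  sec_q    = c(9.5, 15.2, 25, 9),
  third_q  = c(11, 17, 29, 20),
  four_q   = c(17, 25, 34, 25)
)

df_long <- df_check %>%
  # stack the three quartile columns into name / value pairs
  pivot_longer(cols = c(sec_q, third_q, four_q),
               names_to = "quartile",
               values_to = "quartile.value") %>%
  # same factor ordering as in Approach 2
  mutate(quartile = factor(quartile, levels = c("sec_q", "third_q", "four_q")))
```

The rows come out in a different order than with `gather` (grouped by msr_type rather than by quartile), but the content is the same.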

Z.Lin