
I'm creating geom_col() charts in ggplot faceted across grouping variables. When the groups are charted individually, the charts look like what I expect. But when shown at the same time, a lot of bars (columns) disappear and the rest are shown narrower than before.

By the way, the numbers I'm trying to plot are the output of dbplot::db_compute_bins, so I'm trying to put these columns together to look like a histogram.

Is this behavior by design?

My expected chart is the same chart shown side-by-side, scaled down to fit. How can I get my expected chart?

The data:

library(ggplot2)
library(dplyr)  # for %>% and filter()

test.dataframe = data.frame(
  group = rep(c('A', 'B'), each= 5),
  bins = c(-9000, -4400, 200, 4800, 9400,
             -2360, -1084.8, 190.4, 1465.6, 2740.8),
  counts = c(2, 6259, 2950, 8, 6, 
             22, 609, 543, 62, 5
             )
)

First group:

ggplot(test.dataframe %>%
         filter(group == 'A')) +
  geom_col(aes(x= bins, y= counts)) +
  scale_y_log10()

Chart:

Group A's Column Chart

Second Group:

ggplot(test.dataframe %>%
         filter(group == 'B')) +
  geom_col(aes(x= bins, y= counts)) +
  scale_y_log10()

Chart:

Group B's Column Chart

Now putting them together:

ggplot(test.dataframe) +
  geom_col(aes(x= bins, y= counts)) +
  scale_y_log10()+
  facet_wrap(vars(group),
             ncol = 2,
             scales = "free")

Chart:

Chart with both groups faceted by group, column-wise

Doing it row-wise results in a different chart, but still not what I expected:

ggplot(test.dataframe) +
  geom_col(aes(x= bins, y= counts)) +
  scale_y_log10()+
  facet_wrap(vars(group),
             nrow = 2,
             scales = "free")

Result:

Chart with both groups faceted by group, row-wise

pbahr
  • Possible duplicate of [Bars in geom_bar have unwanted different widths when using facet_wrap](https://stackoverflow.com/questions/30196143/bars-in-geom-bar-have-unwanted-different-widths-when-using-facet-wrap) – heds1 Oct 08 '19 at 19:31
  • That question was similar in a way. Thanks for pointing to it. But my main concern was the bars disappearing, which didn't happen over there. – pbahr Oct 08 '19 at 20:19

1 Answer


Edit: reasoning added below.

Here's one approach, where we manually figure out the bar widths and feed that to ggplot:

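library(dplyr)
library(ggplot2)

test.dataframe %>%
  group_by(group) %>%
  # per group: number of bins and the span between the first and last bin
  mutate(bin_count = n(), range = max(bins) - min(bins)) %>%
  # bins are evenly spaced, so spacing = range / (bin_count - 1); use 90% of it
  mutate(bin_width = 0.9 * range / (bin_count - 1)) %>%
  # not sure what to assume when there's only one bin (this gives 0 / 0 = NaN)
  ungroup() %>%
  ggplot() +
  geom_col(aes(x= bins, y= counts, width = bin_width)) +
  scale_y_log10()+
  facet_wrap(vars(group),
             ncol = 2,
             scales = "free")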

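For the single-bin case left open in the code comment, one option is to fall back to a fixed width. The following is only a sketch: add_bin_width, fallback_width and fill are hypothetical names, not part of dplyr or ggplot2, and it assumes the columns are called group and bins and that the bins within each group are evenly spaced.

```r
library(dplyr)

# Hypothetical helper: per-group bar width, with a fallback for single-bin groups.
# Assumes columns named `group` and `bins`, and evenly spaced bins within a group.
add_bin_width <- function(df, fallback_width = 1, fill = 0.9) {
  df %>%
    group_by(group) %>%
    mutate(
      bin_width = if (n() > 1) {
        # spacing between neighbouring bins, shrunk by `fill` to leave a gap
        fill * (max(bins) - min(bins)) / (n() - 1)
      } else {
        # a single bin has no spacing to measure, so use the assumed fallback
        fallback_width
      }
    ) %>%
    ungroup()
}
```

The result can be piped into the same ggplot() call as above; what a sensible fallback_width is depends on the units of the x-axis, so it is left to the caller.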

The reason this happens is that ggplot does some data preparation under the hood to show you the data with reasonable (or intended to be so) defaults. In this case, it calculates the implied resolution of your x-axis and uses that to determine bar width.

You'll note, for instance, that if the two groups had harmonious breaks, your faceting problem would go away:

test.dataframe = data.frame(
  group = rep(c('A', 'B'), each= 5),
  bins = c(-9000, -4400, 200, 4800, 9400,
           -13600, -9000, -4400, 200, 4800),
  counts = c(2, 6259, 2950, 8, 6, 
             22, 609, 543, 62, 5
  )
)

With these bins, your original faceted code draws bars of the expected width in both facets.

The problem arises because the "data resolution" calculation appears to be based on the entire data set rather than on each facet, so ggplot assumes your data is much more granular than it really is. For the time being, it's probably simplest to grab the wheel and specify the widths you really want, since the default heuristic doesn't work well in this case.
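To make this concrete, here is a small check on the question's original bins using ggplot2::resolution(), which returns the smallest gap between distinct values. The documented default bar width is 90% of that resolution. Whether your installed ggplot2 version computes it across all panels at once is an assumption here, based on the behaviour shown in the question.

```r
library(ggplot2)

test.dataframe <- data.frame(
  group = rep(c("A", "B"), each = 5),
  bins = c(-9000, -4400, 200, 4800, 9400,
           -2360, -1084.8, 190.4, 1465.6, 2740.8)
)

# Smallest gap between distinct x values with both groups pooled together
res_all <- resolution(test.dataframe$bins, zero = FALSE)

# The same calculation done separately within each group
res_by_group <- tapply(test.dataframe$bins, test.dataframe$group,
                       resolution, zero = FALSE)

res_all       # about 9.6: the gap between 190.4 (group B) and 200 (group A)
res_by_group  # about 4600 for A and 1275.2 for B
```

A bar roughly 8.6 units wide on an axis spanning thousands of units is barely visible, which would explain both the narrow bars and the ones that seem to disappear.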

Jon Spring
  • Thanks a lot for a clean solution. I'm still interested in why it happens in the first place. – pbahr Oct 08 '19 at 20:17