I have a dataset of earnings. I want to display a boxplot of earnings depending on race.

The race is split into numbers from 0 to 10. 0 to 3 is white, 4 to 5 is black, 6 to 10 is mixed.

How can I show a boxplot of earnings depending on race?

I tried splitting it into factors, and I have 3 factors now using:

white <- factor(Race < 4)
black <- factor(Race>4 & Race<6)
mixed <- factor(Race>6)

But the box plot doesn't work with that.

GRS

2 Answers

You can do this with cut, which bins a numeric vector into a single factor:

Race = 0:10
R2 = factor(cut(Race, breaks=c(0,3,5,10), include.lowest=TRUE), 
        labels=c("White", "Black", "Mixed"))
R2
 [1] White White White White Black Black Mixed Mixed Mixed Mixed Mixed
Levels: White Black Mixed
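
To get from that factor to the plot the question asks for, something like the following sketch may help. It assumes a data frame with columns named Race and earnings; the simulated values are purely illustrative and not from the original data.

```r
# Hypothetical data: the column names Race and earnings are assumptions,
# and the values are simulated only so the example runs on its own.
set.seed(1)
data <- data.frame(
  Race     = sample(0:10, 200, replace = TRUE),
  earnings = round(rlnorm(200, meanlog = 10, sdlog = 0.5))
)

# One factor with three levels, rather than three separate TRUE/FALSE factors
data$R2 <- cut(data$Race, breaks = c(0, 3, 5, 10),
               labels = c("White", "Black", "Mixed"),
               include.lowest = TRUE)

# The formula interface draws one box per level of R2
boxplot(earnings ~ R2, data = data, xlab = "Race", ylab = "Earnings")
```

The three separate factors in the question each only hold TRUE/FALSE, so boxplot has no single grouping variable to split on; one three-level factor gives it that.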
G5W

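Using dplyr and ggplot2:

library(dplyr)
library(ggplot2)

# Four break points define three intervals: [0,3], (3,5], (5,10]
levels <- c(0, 3, 5, 10)
labels <- c("White", "Black", "Mixed")
data %>%
  mutate(Race.factor = cut(Race, levels, labels = labels, include.lowest = TRUE)) %>%
  ggplot(aes(x = Race.factor, y = earnings)) +
  geom_boxplot()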

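You could also use data.table:

library(data.table)
# := adds the Race.factor column to data by reference
setDT(data)[, Race.factor := cut(Race, breaks = c(0, 3, 5, 10),
                                 labels = c("White", "Black", "Mixed"),
                                 include.lowest = TRUE)]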
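
Because cut uses right-closed intervals by default, it can be worth checking how the boundary codes are assigned before plotting. The sketch below is only an illustration on the codes 0 to 10; the names breaks, without_lowest and with_lowest are made up for this example.

```r
Race   <- 0:10
breaks <- c(0, 3, 5, 10)
labels <- c("White", "Black", "Mixed")

# Default: intervals are (0,3], (3,5], (5,10], so Race == 0 is left unassigned (NA)
without_lowest <- cut(Race, breaks, labels = labels)

# include.lowest = TRUE closes the first interval, giving [0,3]
with_lowest <- cut(Race, breaks, labels = labels, include.lowest = TRUE)

# Cross-tabulate codes against groups; useNA would reveal any unassigned codes
print(table(Race, with_lowest, useNA = "ifany"))

# Number of codes that fall outside every interval without include.lowest
print(sum(is.na(without_lowest)))
```

A quick table like this makes it easy to confirm that 4 and 6 land in the intended groups, which the strict inequalities in the question would have skipped.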
James Martherus