Post #14. Multiple color scales in one ggplot

2022

Using different color scales for multiple geom layers in one ggplot with ggnewscale and relayer

Gen-Chang Hsu
2022-04-24

The problem

Let’s kick things off with an example:

library(tidyverse)

ggplot(data = iris) + 
  geom_point(aes(x = Species, y = Sepal.Length, color = Sepal.Length), position = position_jitter(width = 0.1)) + 
  geom_boxplot(aes(x = Species, y = Sepal.Length), width = 0.5, outlier.color = NA, fill = NA) + 
  labs(y = "Sepal length (cm)") + 
  theme_classic(base_size = 14) + 
  scale_color_viridis_c(name = "Sepal length")


Here, we make a boxplot of the sepal length for the three species and add the original data points, which are mapped to a continuous color scale.

Now, what if we also want to use different colors for the boxes? Let’s try it out:

ggplot(data = iris) + 
  geom_point(aes(x = Species, y = Sepal.Length, color = Sepal.Length), position = position_jitter(width = 0.1)) + 
  geom_boxplot(aes(x = Species, y = Sepal.Length, color = Species), width = 0.5, outlier.color = NA, fill = NA) +  # map species to color
  labs(y = "Sepal length (cm)") + 
  theme_classic(base_size = 14) + 
  scale_color_viridis_c(name = "Sepal length") + 
  scale_color_brewer(name = "Species", palette = "Set1")  # color scale for species
Scale for 'colour' is already present. Adding another scale for
'colour', which will replace the existing scale.
Error: Continuous value supplied to discrete scale


Eek, it throws an error!!! The message says “Continuous value supplied to discrete scale”, meaning that we cannot use another color scale for the plot (the discrete one for species) when there is already one (the continuous one for sepal length). What can we do?

The solution

Thankfully, there are two extension packages designed to solve this problem: ggnewscale and relayer. I’ll walk through them in the following sections.

(1) ggnewscale

Looking for a quick fix? ggnewscale is the go-to: The main function new_scale_color() is super straightforward and easy to use. Simply insert the function before the second color scale (in this case scale_color_brewer()). This will “decouple” the first scale from the second one so that the two scales will not interfere with each other. Indeed, you can apply the principle to three or more color scales!

By the way, note that new_scale_color() only works for scale, not geom! So if you put both geom_point() and geom_boxplot() before the function, the error will still occur (you can try this out yourself!).

# install.packages("ggnewscale")  # install the package if you haven't
library(ggnewscale)

ggplot(iris) + 
  geom_point(aes(x = Species, y = Sepal.Length, color = Sepal.Length), position = position_jitter(width = 0.1)) + 
  scale_color_viridis_c(name = "Sepal length") + 
  new_scale_color() +
  geom_boxplot(aes(x = Species, y = Sepal.Length, color = Species), width = 0.5, outlier.color = NA, fill = NA) + 
  scale_color_brewer(palette = "Set1") + 
  labs(y = "Sepal length (cm)") + 
  theme_classic(base_size = 14)

(2) relayer

The second way to handle the problem is to use the function rename_geom_aes() in the package relayer. The basic idea of the function is to assign different “names” to the same aesthetic (e.g., color) as if they were different aesthetics so that these “pseudo-aesthetics” will be treated independently and not interfere with each other.

Confused? Let’s break it down in the following example!

# devtools::install_github("clauswilke/relayer")  # install the package if you haven't
library(relayer)

ggplot(iris) +
  
  # the first color aesthetic mapping 
  # note that you need to use "colour" instead of "color" for the function to work!
  geom_point(aes(x = Species, y = Sepal.Length, colour = Sepal.Length), position = position_jitter(width = 0.1)) +
  
  # use the new name "colour_for_species" for the second color aesthetic mapping
  # specify that "colour_for_species" corresponds to the color aesthetic in rename_geom_aes()
  (geom_boxplot(aes(x = Species, y = Sepal.Length, colour_for_species = Species), width = 0.5, outlier.color = NA, fill = NA) %>% 
     rename_geom_aes(new_aes = c("colour" = "colour_for_species"))) + 
  
  # the color scale for the first color aesthetic mapping ("colour")
  scale_color_viridis_c(aesthetics = "colour", name = "Sepal length") + 
  
  # the color scale for the second color aesthetic mapping ("colour_for_species")
  scale_color_brewer(aesthetics = "colour_for_species", palette = "Set1", name = "Species") +
  
  labs(y = "Sepal length (cm)") + 
  theme_classic(base_size = 14)


There are a few more cool things we can do with the relayer package. For example, we can color the points by both sepal length and species (i.e., three sets of color palettes for the points).

# install.packages("RColorBrewer")  # install the package if you haven't
library(RColorBrewer)  
pal_species <- brewer.pal(3, "Set1")  # the point colors for the three species

ggplot(iris) +
  
  # specify "group = species" to use different colors for each species
  # use the new names (col1, col2, and col3) for the color aesthetic mappings
  # specify the color mappings for the corresponding species in aes() 
  (geom_point(aes(x = Species, y = Sepal.Length, col1 = Sepal.Length, col2 = Sepal.Length, col3 = Sepal.Length, group = Species), position = position_jitter(width = 0.1)) %>% 
  rename_geom_aes(new_aes = c("colour" = "col1", "colour" = "col2", "colour" = "col3"), aes(colour = case_when(group == 1 ~ col1, group == 2 ~ col2, group == 3 ~ col3)))) +
  
  # geom layer for the boxplot
  (geom_boxplot(aes(x = Species, y = Sepal.Length, col4 = Species), width = 0.5, outlier.color = NA, fill = NA) %>% 
     rename_geom_aes(new_aes = c("colour" = "col4"))) + 
  
  # color scales for the points (three scales, each for one species)
  scale_color_gradient(aesthetics = "col1", low = "white", high = pal_species[1], name = "setosa", guide = guide_legend(order = 1, direction = "horizontal", title.position = "top", keywidth = unit(0.05, "in"))) +
  scale_color_gradient(aesthetics = "col2", low = "white", high = pal_species[2], name = "versicolor", guide = guide_legend(order = 2, direction = "horizontal", title.position = "top", keywidth = unit(0.05, "in"))) +
  scale_color_gradient(aesthetics = "col3", low = "white", high = pal_species[3], name = "virginica", guide = guide_legend(order = 3, direction = "horizontal", title.position = "top", keywidth = unit(0.05, "in"))) +
  
  # color scale for the boxplot
  scale_color_brewer(aesthetics = "col4", palette = "Set1", name = "Species", guide = guide_legend(order = 4, direction = "vertical")) + 

  labs(y = "Sepal length (cm)") + 
  theme_classic(base_size = 14) + 
  theme(legend.title.align = 0.5,
        legend.spacing.y = unit(0.05, "in"))


In the example above, the three point legends are jointly determined by the sepal length of the three species, and so they all have the same range. This works fine here, but may not work if the three species had quite different sepal lengths (imagine one species has its sepal ten times longer than that of the other two species, then in that case the legend range will be greatly distorted!)

So is it possible to have individual range for each legend (i.e., the range of each legend is determined only by the sepal length of the corresponding species)? Of course!

ggplot() +
  
  # draw three separate geom layers for the points so that their color aesthetic mappings are independent of each other
  (geom_point(data = filter(iris, Species == "setosa"), aes(x = Species, y = Sepal.Length, col1 = Sepal.Width), position = position_jitter(width = 0.1)) %>% 
     rename_geom_aes(new_aes = c("colour" = "col1"))) +
  (geom_point(data = filter(iris, Species == "versicolor"), aes(x = Species, y = Sepal.Length, col2 = Sepal.Width), position = position_jitter(width = 0.1)) %>% 
     rename_geom_aes(new_aes = c("colour" = "col2"))) +
  (geom_point(data = filter(iris, Species == "virginica"), aes(x = Species, y = Sepal.Length, col3 = Sepal.Width), position = position_jitter(width = 0.1)) %>% 
     rename_geom_aes(new_aes = c("colour" = "col3"))) +
  
  # geom layer for the boxplot
  (geom_boxplot(data = iris, aes(x = Species, y = Sepal.Length, col4 = Species), width = 0.5, outlier.color = NA, fill = NA) %>%
     rename_geom_aes(new_aes = c("colour" = "col4"))) +
  
  # color scales for the points (three scales, each for one species)
  scale_color_gradient(aesthetics = "col1", low = "white", high = pal_species[1], name = "setosa", guide = guide_legend(order = 1, direction = "horizontal", title.position = "top", keywidth = unit(0.05, "in"))) +
  scale_color_gradient(aesthetics = "col2", low = "white", high = pal_species[2], name = "versicolor", guide = guide_legend(order = 2, direction = "horizontal", title.position = "top", keywidth = unit(0.05, "in"))) +
  scale_color_gradient(aesthetics = "col3", low = "white", high = pal_species[3], name = "virginica", guide = guide_legend(order = 3, direction = "horizontal", title.position = "top", keywidth = unit(0.05, "in"))) +
  
  # color scale for the boxplot
  scale_color_brewer(aesthetics = "col4", palette = "Set1", name = "Species", guide = guide_legend(order = 4, direction = "vertical")) +
  
  labs(y = "Sepal length (cm)") + 
  theme_classic(base_size = 14) + 
  theme(legend.title.align = 0.5,
        legend.spacing.y = unit(0.05, "in"))


The main difference from the previous example is that, instead of drawing all the points and specifying all the color mappings in one geom layer, now we do them separately in three geom layers, each for one species so that the three legends will be determined independently for each layer.


*NOTE*

It seems that currently the relayer package only works for discrete legends. Therefore, if the mapped variable is continuous (like Sepal.Length), you have to force a continuous color scale (like scale_color_gradient()) to be a discrete one using the argument guide = guide_legend().

Summary

To recap, we learned two packages that allow us to use different color scales for multiple geom layers: ggnewscale is the most convenient for simple purposes; relayer is slightly more complicated yet also more versatile.

In this post, I use color aesthetic as an example (which I think is the most common situation you will be encountering). But the same principle applies to other aesthetics as well: fill, shape, linetype, etc. (although the authors of the packages do warm that “It’s very experimental, so use at your own risk!”).

That’s it for now and as always, don’t forget to leave your comments and suggestions below if you have any!

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.