Post #38. Creating ternary plots with ggtern

2024

Come and learn some useful tips for creating ternary plots in ggplot!

Gen-Chang Hsu
2024-12-24

Introduction

Welcome to the end-of-year post! Really stoked to conclude 2024 with ternary plots!

You may (or may not) have seen it, a ternary plot is a type of plot that displays the relative proportions of three variables in an equilateral triangle. It’s a more specialized plot type and certainly not the one you’ll encounter in everyday life, but it can be quite handy when it comes to visualizing compositional data, for example, the carbohydrates, protein, and fat composition of food items.

I remember I first came across a ternary plot in an Ecology textbook years ago when I was learning the classic Grime’s CSR Triangle for depicting three major plant life strategies (competitor, stress-tolerator, and ruderal) under different stress, resource, and disturbance regimes. I was really fascinated with this kind of plot, but I didn’t know how to make it back then. Now I know, and I think it would be fun (and useful too!) to write a post on creating ternary plots in ggplot. So here it is!

Ternary plots with ggtern

The extension package ggtern provides a suite of functions for creating ternary plots in ggplot. We’ll start from the very basics, gradually add in some variation and complexity, and finally finish by polishing the plot appearance.

1. Start with a point: basic ternary plot

First, let’s create a super simple ternary plot with just a single point.

ggtern() is the master function here: it takes a dataframe with three numeric columns, which are then mapped to the “x”, “y”, and “z” aesthetics. If the three columns are not proportions but instead raw values, the function will automatically rescale them so that each row sums to 1.

In the example below, I drew the lines connecting the point to the sides using the function geom_crosshair_tern(). I also colored the axes and gridlines as well as added the arrows and labels using the function theme_rgbw(). These will help us better understand what the plot is doing.

library(tidyverse)
library(ggtern)

### Generate the data
ternary_df_point <- data.frame(x = 1, y = 2, z = 3)

### Create a basic ternary plot
ggtern(data = ternary_df_point, aes(x = x, y = y, z = z)) +  # the values will be rescaled 
  geom_point() +
  geom_crosshair_tern() +  # draw the lines connecting the point to the sides
  theme_rgbw()  # color the axes and gridlines & add the arrows and labels

If you’re new to ternary plots, you may be scratching your head about how to read the plot. Well, it’s not that difficult once you understand the principle: as indicated by the lines, the point corresponds to an x-value of around 17% (1/6), a y-value of around 33% (2/6), and a z-value of 50% (3/6).

For plots that do not have the lines (which are most likely what you’ll see in the wild), you can draw mental lines from the point to the sides with the same angle as the numbers. So for example, the numbers on the “x” side have an angle of 120°, so you can draw a line with an angle of 120° from the point to that side and see what the corresponding value is.

2. From point to line: ternary line chart

Now that we have a better idea of how ternary plots work, we can take a step further. Instead of just showing individual points, we can connect them to create a “ternary line chart”. This will be useful for visualizing how composition changes over time.

### Generate the data
ternary_df_line <- data.frame(x = seq(2, 9, length.out = 10)^2, 
                              y = seq(9, 2, length.out = 10)^2) %>% 
  mutate(z = 100 - x - y) %>% 
  rowid_to_column(var = "time")

### Create a ternary line chart
ggtern(data = ternary_df_line, aes(x = x, y = y, z = z)) +
  geom_point(aes(color = time)) +
  geom_line(aes(color = time)) +
  scale_color_viridis_c(breaks = c(1, 10)) + 
  guides(color = guide_colorbar(reverse = T)) + 
  theme_arrownormal()  # add the arrows and labels without colors

3. From line to area: ternary contour plot

Next, we’ll go from a line to an area by adding density contours to the ternary plot.

As mentioned in the beginning, one application of ternary plots is visualizing the nutrient composition of food, and this is what we’re going to do now. We’ll use the starbucks dataset from the package openintro to visualize the carbs, protein, and fat content of different food items sold at Starbucks.

We can add density contours to the plot using the function stat_density_tern() and map the relative density “nlevel” (computed via the 2D kernel density estimation) to the “fill” and “alpha” aesthetics. We can adjust how close the contour lines are (i.e., the “steepness” of the slope) using the argument “binwidth”: a larger binwidth produces sparser, smoother contour lines, whereas a smaller binwidth produces denser, more detailed contour lines.

library(openintro)  # for the dataset "starbucks"
data("starbucks")

### Create a ternary contour plot
ggtern(data = starbucks, aes(x = carb, y = protein, z = fat)) +
  stat_density_tern(geom = "polygon", 
                    aes(fill = after_stat(nlevel), alpha = after_stat(nlevel)),
                    binwidth = 0.5) +   # binwidth controls how close the contour lines are
  stat_density_tern(geom = "polygon",
                    color = "black", 
                    fill = NA,
                    binwidth = 5,
                    linewidth = 0.1) + 
  geom_point(size = 1, alpha = 0.5) + 
  scale_fill_distiller(name = "Relative density", 
                       palette = "Reds", 
                       direction = 1,
                       breaks = c(0.1, 0.3, 0.5, 0.7, 0.9)) + 
  guides(alpha = "none") + 
  theme_rgbw() + 
  theme(legend.position = c(1.05, 0.5),
        legend.title = element_text(hjust = 0.5),
        plot.margin = margin(r = 70))

Seems like most Starbucks food is high in carbs and low in protein (not surprising?!).

4. From area to bin: ternary tribin and hexbin plot

When there are a ton of data points to show, we can run into the issue of overplotting: the overlapping points will make the plot unreadable. A solution is to divide the plot into triangular or hexagonal bins, count the number of points in each bin, and color the bins accordingly. These are easily done with geom_tri_tern() and geom_hex_tern(). We can use the argument “bins” to specify the number of bins along the axes.

### Generate the data
set.seed(1)
ternary_df_tri_hex <- data.frame(x = runif(5000), 
                                 y = runif(5000),
                                 z = runif(5000))

### Create a ternary tribin plot
ggtern(data = ternary_df_tri_hex, aes(x = x, y = y, z = z)) +
  geom_tri_tern(aes(fill = after_stat(count)),
                bins = 10) +  # number of bins along the axes
  scale_fill_viridis_c()

### Create a ternary hexbin plot
ggtern(data = ternary_df_tri_hex, aes(x = x, y = y, z = z)) +
  geom_hex_tern(aes(fill = after_stat(count)),
                bins = 20) + 
  scale_fill_viridis_c()

5. Ternary interpolation plot

One more cool thing we can do is creating a ternary interpolation plot. We’ll first create a grid of points evenly distributed across the plot area, each associated with a value (“height”). We can then use geom_interpolate_tern() to perform interpolation on these data points and plot the interpolated values as contour lines. The “breaks” argument allows us to specify the interpolated values at which the contour lines are drawn.

### Generate the data
ternary_df_interpolation <- expand_grid(x = seq(0, 100, 10),
                                        y = seq(0, 100, 10)) %>% 
  mutate(z = 100 - x - y) %>% 
  filter(z >= 0) %>% 
  mutate(height = x * y - 0.05 * z)

### Create a ternary interpolation plot
library(tidyterra)  # for the function "scale_color_hypso_c()"

ggtern(data = ternary_df_interpolation, aes(x = x, y = y, z = z, value = height)) +
  geom_interpolate_tern(aes(color = after_stat(level)),
                        base = "identity",
                        breaks = seq(1, 3000, length.out = 30),  # the interpolated values to draw contour lines
                        size = 2) +
  geom_point(aes(color = height)) + 
  scale_color_hypso_c(name = "Height", palette = "dem_screen", limits = c(0, 3000)) +
  theme_bw() + 
  theme(legend.position = c(0.08, 0.7))

6. Polish the plot: axis labels and annotations

To wrap things up, we’ll give the above ternary interpolation plot a finishing touch by adjusting the axis labels and adding some annotations:

### The original ternary interpolation plot
p_ternary_interpolation_original <- ggtern(data = ternary_df_interpolation, aes(x = x, y = y, z = z, value = height)) +
  geom_interpolate_tern(aes(color = after_stat(level)),
                        base = "identity",
                        breaks = seq(1, 3000, length.out = 30),
                        size = 2) +
  geom_point(aes(color = height)) + 
  scale_color_hypso_c(name = "Height", palette = "dem_screen", limits = c(0, 3000)) +
  theme_bw() + 
  theme(legend.position = c(0.08, 0.7))

### The polished ternary interpolation plot
p_ternary_interpolation_polished <- p_ternary_interpolation_original + 
  
  # axis label text
  scale_L_continuous(name = "X-axis",  # L for "left" axis label (i.e., x-axis)
                     breaks = seq(0, 1, 0.2),  
                     labels = seq(0, 1, 0.2)) +  
  scale_T_continuous(name = "Y-axis",  # T for "top" axis label (i.e., y-axis)
                     breaks = seq(0, 1, 0.2),  
                     labels = seq(0, 1, 0.2)) +  
  scale_R_continuous(name = "Z-axis",  # R for "right" axis label (i.e., z-axis)
                     breaks = seq(0, 1, 0.2),
                     labels = seq(0, 1, 0.2)) + 
  
  # axis label appearance
  theme(tern.axis.title.L = element_text(vjust = -2.25, hjust = -3.5, angle = 60),
        tern.axis.text.L = element_text(color = "black"),
        tern.axis.title.T = element_text(vjust = -1.5, hjust = -3.5, angle = 300),
        tern.axis.text.T = element_text(color = "black"),
        tern.axis.title.R = element_text(vjust = 2.5, hjust = 4.75, angle = 0),
        tern.axis.text.R = element_text(color = "black")) + 
  
  # annotations
  annotate(geom = "point", x = 0.5, y = 0.5, z = 0, shape = 17, size = 5, color = "brown") + 
  annotate(geom = "curve", x = 0.5, y = 0.45, z = 0.05, xend = 0.45, yend = 0.35, zend = 0.2,
           arrow = arrow(length = unit(0.03, "npc")), color = "brown") + 
  annotate(geom = "text", x = 0.45, y = 0.35, z = 0.2, 
           label = "Peak", size = 4, color = "brown", hjust = -0.1)

p_ternary_interpolation_polished

Summary

In this post, we created a wide variety of ternary plots, including a basic ternary plot, a ternary line chart, a ternary contour plot, a ternary tribin/hexbin plot, and a ternary interpolation plot. In the end, we also learned how to customize the appearance of axis labels and add annotations.

To be perfectly honest with you, I’ve never used ternary plots in my research and presentation in the past, not because I didn’t want to use them, but because I didn’t really get the chance to use them. That said, I believe I’ll encounter compositional data in the future, and by that time I’ll certainly make some nice-looking ternary plots. I think you should keep these plots in your data viz toolbox too!

Hope you learn something useful from this post and don’t forget to leave your comments and suggestions below if you have any!

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.