Post #37. ggplot Map Series No.6: Cartograms in ggplot

2024 Map Series

Come and learn how to create cartograms in ggplot!

Gen-Chang Hsu
2024-10-08

Introduction

In a previous post in this Map Series, I wrote about how to create choropleth maps in ggplot: they are thematic maps that visualize the quantity of a focal variable across a geographical area (e.g., the population size of each country in the world) using different colors. Choropleth maps provide a quick straightforward glance at the spatial patterns of the focal variable.

Cartograms are similar to choropleth maps; they also visualize the quantity of a variable across a geographical area. However, the geographic sizes of the regions are altered to be proportional to the numerical values of the focal variable. This enhances the visual impacts of the maps. And in fact, it’s easy to switch between choropleths and cartograms as they are both created from data with the same underlying structure!

In this post, I’ll show you how to create cartograms in ggplot. We’ll then make a variant of cartogram by adding pie charts to the map. Read on!

A map of the world forest area

We’re going to make thematic maps of the global forest area by country. The dataset can be downloaded from the website Our World in Data: it contains the forest area in each country from 1990 to 2020.

Let’s first get an idea of what the data look like by creating a choropleth map of the forest area in each country in 2020. We retrieved the world base map from the Natural Earth database using the package rnaturalearth and joined the forest data to the world map data to draw the choropleth map:

library(tidyverse)

### Read the forest area data
country_forest_area <- read_csv("Country_Forest_Area.csv")

country_forest_area_2020 <- country_forest_area %>% 
  filter(Year == 2020)

### Get a world base map from Natural Earth
library(rnaturalearth)
library(sf)  # for the function "st_transform()"

world_map <- ne_countries(type = "countries", returnclass = "sf", scale = 110) %>%
  st_transform(crs = "ESRI:54009")  # Mollweide projection

### Join the forest data and the world map data based on country codes
world_map_forest_area_2020 <- world_map %>% 
  mutate(Code = toupper(adm0_a3)) %>% 
  left_join(country_forest_area_2020, by = join_by("adm0_a3" == "Code"))

### Create a choropleth map of the global forest area by country
library(ggtext)  # for the function "element_markdown()"
library(colorspace)  # for the function "scale_fill_continuous_sequential()"

ggplot(world_map_forest_area_2020) + 
  geom_sf(aes(fill = `Forest area`)) + 
  scale_fill_continuous_sequential(palette = "Greens",
                                   breaks = seq(1e8, 8e8, by = 2e8), 
                                   labels = str_glue("<span>{c(1, 3, 5, 7)}×10^8^</span>"), 
                                   name = "<span>Forest area (hectares)</span>") + 
  coord_sf(expand = F) + 
  theme_bw() + 
  theme(panel.border = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        legend.title = element_markdown(hjust = 0.5),
        legend.text = element_markdown(),
        legend.direction = "horizontal",
        legend.position = "top",
        legend.title.position = "top")

Create the cartograms

Time to create the cartograms! We’ll use the handy extension package cartograms to make the maps. This package offers functions for two types of cartograms: a continuous one where the geographical sizes of the regions are altered to be proportional to the values of the focal variable (via cartogram_cont()), and a discrete one (also called a “Dorling cartogram”) where the regions are redrawn as circles with the sizes proportional to the focal variable values (via cartogram_dorling()).

The process is indeed quite simple; what we need to do is to pass the “sf” object to the functions and specify the focal variable in the argument “weight =”. The argument “itermax =” is used for controlling the number of iterations for the cartogram algorithm (100 is pretty much sufficient for the estimation to complete). The functions will return a new “sf” object, which can directly be passed to geom_sf() for visualization.

Let’s take a look at how it works:

# install.packages("cartogram")  # install the package if you haven't
library(cartogram)

### Drop NA values for the forest area (otherwise the cartogram functions will complain!)
world_map_forest_area_2020 <- drop_na(world_map_forest_area_2020, "Forest area")

### Construct the continuous cartogram
### This might take a while to run
cartogram_continuous <-  cartogram_cont(world_map_forest_area_2020, 
                                        weight = "Forest area",  # specify the focal variable 
                                        itermax = 100)

### Contruct the Dorling cartogram
cartogram_dorling <- cartogram_dorling(world_map_forest_area_2020, 
                                       weight = "Forest area",  # specify the focal variable
                                       itermax = 100)

### Visualize the cartograms
p_continuous_cartogram <- ggplot() + 
  geom_sf(data = cartogram_continuous, fill = "#2ca25f", color = alpha("#b2e2e2", 0.75)) +
  labs(title = "Continuous cartogram") + 
  theme_void() + 
  theme(plot.title = element_text(hjust = 0.5, vjust = 3, size = 14, color = "grey"),
        plot.background = element_rect(fill = "#045a8d", color = NA),
        plot.margin = margin(t = 15))

p_dorling_cartogram <- ggplot() + 
  geom_sf(data = world_map_forest_area_2020, fill = "#2ca25f", color = alpha("#b2e2e2", 0.5)) + 
  geom_sf(data = cartogram_dorling, fill = alpha("#dfc27d", 0.8), color = alpha("#a6611a", 0.8)) +   labs(title = "Dorling cartogram") + 
  theme_void() + 
  theme(plot.title = element_text(hjust = 0.5, vjust = 3, size = 14, color = "grey"),
        plot.background = element_rect(fill = "#045a8d", color = NA),
        plot.margin = margin(t = 15))

library(patchwork)

p_continuous_cartogram / plot_spacer() / p_dorling_cartogram + 
  plot_layout(heights = c(1, 0.1, 1))

Here we go! As you can see, Russia has the largest share of the global forest area, followed by Brazil, Canada, the United States, and China (the big five!).

A variant of Dorling cartogram

In this final section, we’ll create a variant of Dorling cartogram by adding pie charts of the percent forest cover in the countries. This takes four steps:

  1. First, we got the areas of the countries using st_area() to calculate the percent forest area.
### Step 1. Calculate the percent forest area for each country
percent_forest_area_2020 <- world_map_forest_area_2020 %>% 
  # use st_area() to get the country areas in square meters and convert them to hectares
  mutate(country_area_hectares = st_area(world_map_forest_area_2020)/10000) %>%  
  mutate(percent_forest_area = as.numeric(`Forest area`/country_area_hectares*100)) %>% 
  select(Code, percent_forest_area) %>% 
  st_drop_geometry()  # convert the "sf" object into a dataframe
  1. Second, we made pie charts of the percent forest areas for the countries. The plots were converted to “grob” objects via ggplotGrob() so that they can be added to the cartogram later.
### Step 2. Create pie charts of the percent forest area for the countries
percent_forest_area_2020_pie_charts <- percent_forest_area_2020 %>% 
  mutate(pie_chart = map(percent_forest_area, function(percent_forest_area){
    p <- ggplot() +  
      geom_col(aes(x = 1, y = percent_forest_area), fill = alpha("#a6611a", 0.8), width = 1) + 
      scale_y_continuous(limits = c(0, 100), expand = c(0, 0)) + 
      scale_x_continuous(limits = c(0.5, 1.5), expand = c(0, 0)) + 
      coord_polar(theta = "y") + 
      theme_void() +
      theme(plot.margin = unit(c(0, 0, 0, 0),"cm"))
    
    p_grob <- ggplotGrob(p)  # convert the ggplot objects into grobs
  }))
  1. Third, we got the centroids and areas of the circles in the Dorling cartogram using st_centroid() and st_area() and calculated the radii of the circles. These would be used for drawing the pie charts.
### Step 3. Get the centroids and the radii of the country circles in the Dorling cartogram
country_centroids <- cartogram_dorling %>% 
  mutate(country_centroid_lon = st_coordinates(st_centroid(cartogram_dorling)$geometry)[, 1],
         country_centroid_lat = st_coordinates(st_centroid(cartogram_dorling)$geometry)[, 2]) %>%
  select(Code, country_centroid_lon, country_centroid_lat) %>% 
  st_drop_geometry()

country_circle_radii <- cartogram_dorling %>%
  mutate(circle_area = as.numeric(st_area(cartogram_dorling)),
         circle_radius = as.numeric(sqrt(circle_area/pi))) %>% 
  select(Code, circle_radius) %>% 
  st_drop_geometry()

# put the pie charts, centroids, and radii into a single dataframe
pie_chart_df <- country_centroids %>% 
  left_join(country_circle_radii, by = join_by("Code")) %>% 
  inner_join(., percent_forest_area_2020_pie_charts, by = join_by("Code"))
  1. Finally, we created a dorling cartogram and added the pie charts to it using the function annotation_custom(). The radii of the pies were multiplied by a scalar (in this example 1.2) so that the pies fit to the circles. The function reduce() allowed us to iterate over the rows to add the pie chart of each country based on its centroid and radius.
### Step 4. Add the pie charts to the Dorling cartogram
# create a Dorling cartogram base map
p_dorling_cartogram_base_map <- ggplot() + 
  geom_sf(data = world_map_forest_area_2020, fill = "#2ca25f", color = alpha("#b2e2e2", 0.5)) + 
  geom_sf(data = cartogram_dorling, fill = alpha("#dfc27d", 0.8), color = alpha("#a6611a", 0.8)) +
  theme_void() + 
  theme(plot.background = element_rect(fill = "#045a8d", color = NA),
        plot.margin = margin(t = 15, b = 15))

# iterate over the rows to add the pie chart for each country
reduce(1:nrow(pie_chart_df), 
       .f = function(x, y){
         x + annotation_custom(grob = pie_chart_df$pie_chart[[y]],
                               xmin = pie_chart_df$country_centroid_lon[y] - pie_chart_df$circle_radius[y] * 1.2,
                               xmax = pie_chart_df$country_centroid_lon[y] + pie_chart_df$circle_radius[y] * 1.2,
                               ymin = pie_chart_df$country_centroid_lat[y] - pie_chart_df$circle_radius[y] * 1.2,
                               ymax = pie_chart_df$country_centroid_lat[y] + pie_chart_df$circle_radius[y] * 1.2)},
       .init = p_dorling_cartogram_base_map)

VOILÀ!

Summary

To recap what we did in this post, we created cartograms of the global forest area by country in ggplot using the package cartogram. We first made a choropleth map to get a sense of what the forest area data looked like. We then made two versions of cartograms: a continuous cartogram with altered country areas proportional to their forest areas and a Dorling cartogram with circles representing the forest areas of the countries. Finally, we made a variant of the Dorling cartogram by adding pie charts of the percent forest area in the countries to the map.

Cartograms are easy to read and can generate strong visual effects. Honestly, I haven’t really used this type of maps in the past, but now I know how to make them in ggplot (and how easy it is to make them), I’ll definitely integrate this into my data visualization toolbox, and I think you should too!

Hope you learn something useful from this post and don’t forget to leave your comments and suggestions below if you have any!

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.