Daymetr raster manipulation

Summarizing raster data

We got a question recently on how to manipulate spatial Daymet data, as downloaded using the BlueGreen Labs daymetr package. In particular, the question focused on how to calculate temporal averages. Below you find an example on how to calculate yearly and monthly summaries for your datasets.

Setting up the environment

In this example I’ll use the daymetr package to download the Daymet data, and raster, tidyverse and patchwork to cover data manipulation and plotting.

# read libraries
library(tidyverse)
library(patchwork)
library(raster)
library(daymetr)

Downloading the data

In this example I will use snow water equivalent data (SWE, kg/m2), around Mount Washington in the North East of the US, to plot the seasonal and yearly maximum snow accumulation values. However, the technique can be scaled to all other daymet variables and/or summary functions, or temporal subsets. To start this example, we first define the region of interest centered on Mount Washington.

# mount Washington location
mount_washington <- data.frame(
  lat = 44.2705,
  lon = -71.30325)

# pad by 0.5 degrees
location <- c(
  mount_washington$lat + 0.5,
  mount_washington$lon - 0.5,
  mount_washington$lat - 0.5,
  mount_washington$lon + 0.5
  )

For this region of interest we download SWE Daymet data for the year 2010.

# download daymetr data
download_daymet_ncss(
  location = location,
  start = 2010,
  end = 2010,
  frequency = "daily",
  param = "swe",
  path = tempdir(),
  silent = FALSE)

Calculating temporal summary statistics

The data downloaded can be easily read using the raster library. This library also includes functions to calculate summary statistics across pixels based upon an index, which specifies what layers to include in the calculation of these summary statistics. Given that this functionality is included in the raster package and well documented no such function is covered in daymetr itself.

First we load in the downloaded Daymet data, and set and reproject the raster. In addition we extract the dates from the names of the raster stack. Finally we convert these dates to the month which they cover. These months will server as indices for the latter summary statistics.

# load raster data
r <- raster::stack(file.path(tempdir(),"swe_daily_2010_ncss.nc"))

# set correct projection data, embedded values are wrong
projection(r) <- "+proj=lcc +lat_1=25 +lat_2=60 +lat_0=42.5 +lon_0=-100 +x_0=0 +y_0=0 +ellps=WGS84 +units=km +no_defs"

# reproject to lat/lon for easier visuals
r <- raster::projectRaster(r, crs = "+init=epsg:4326")

# get the names of the layers
# which include the dates
dates <- data.frame(layer = names(r)) %>% 
  mutate(
   days = as.Date(layer,"X%Y.%m.%d"),
   months = months(days, abbreviate = TRUE)
  )

With the indices defined (i.e. months) we can calculate monthly and yearly statistics. For yearly statistics, in this example, we can just apply any function to the whole raster stack. When we want to know the maximum value we can call the max() function. This works for a select number of functions out of the box, for custom functions use calc().

Monthly summaries are made using the stackApply() function which uses an index to summarize the selected layers within the index (i.e. a vector with locations having the same value, in this case abbreviated months). The fun parameter specifies the function to use. Running the below commands calculates the maximum value for a year or a month, for the loaded raster stack.

# total swe across a year (kg/m2)
yearly_total_swe <- max(r)

# calculate total swe by month
monthly_total_swe <- stackApply(r, dates$months, fun = max)

Plotting the results

In order to plot everything using ggplot data are converted to data frames. The final plot is below together with the instructions on how to format the data. Although this doesn’t really cover the question at hand it might help people to visualize their data. In our case we see the yearly progression of maximum snow volumes (as SWE) throughout the year (seasons). For this region snow volumes tend to peak in late winter, early spring.

# convert to data frames for plotting
yearly_total_swe <- yearly_total_swe %>%
  rasterToPoints %>%
  as.data.frame() %>%
  rename('lon' = 'x',
         'lat' = 'y',
         'swe' = 'layer')  %>%
  filter(swe > 0)

monthly_total_swe <- monthly_total_swe %>%
  rasterToPoints %>%
  as.data.frame() %>%
  pivot_longer(values_to = "swe",
               cols = starts_with("index_"),
               names_to = "months") %>%
  mutate(
    months = factor(gsub("index_","",months),levels = month.abb)
    ) %>%
  rename('lon' = 'x',
         'lat' = 'y') %>%
  filter(swe > 0)

A theme is defined.

# set theme
theme_map <- function(...) {
  theme_minimal() +
    theme(
      axis.line = element_blank(),
      panel.grid.major = element_line(linetype = 2, size = 0.2),
      panel.grid.minor = element_blank(),
      plot.background = element_rect(fill = "#ffffff", color = NA),
      panel.background = element_rect(fill = "#ffffff", color = NA),
      legend.background = element_rect(fill = "#ffffff", color = NA),
      text = element_text(family = "Montserrat", color = "grey30", size=12),
      panel.border = element_blank(),
      legend.key.width = unit(2, "cm"),
      legend.position = "bottom",
      plot.margin = margin(0, 0, 0, 0),
      panel.spacing = unit(1, "lines"),
      axis.title.y = element_blank(),
      axis.title.x = element_blank(),
      ...
    )
}

Plots are generated

# setup yearly plot
p1 <- ggplot(yearly_total_swe) +
  geom_raster(aes(
    lon,
    lat,
    fill = swe)) +
  geom_point(
    data = mount_washington,
    aes(
      lon,
      lat
    ),
    pch = 17,
    colour = "white"
  ) +
  scale_fill_viridis_c(
    na.value = NA,
    guide = "none"
  ) +
  labs(title = "yearly SWE maximum")

# setup monthly plot
p2 <- ggplot(monthly_total_swe) +
  geom_raster(aes(
    lon,
    lat,
    fill = swe)) +
  geom_point(
    data = mount_washington,
    aes(
      lon,
      lat
    ),
    pch = 17,
    colour = "white"
  ) +
  scale_fill_viridis_c(
    na.value = NA,
    name = expression("SWE (kg/m" ** 2 * ")")
  ) +
  labs(title = "monthly SWE maximum") +
  facet_wrap(~months)

And combined using patchwork

# format output using patchwork
p <- (p1 + p2 + plot_layout(ncol = 2,
                      widths = c(2, 2.5),
                      guides = "keep") +
  plot_annotation(title = "",
                  subtitle = "",
                  caption = "",
                  tag_levels = "a",
                  tag_suffix = ")",
                  theme = theme(plot.title = element_text(
                    family = "Merriweather",
                    size = 30, color="grey10")
                  )) &
  theme_map())

# plot data (see header of blog post)
plot(p)
Avatar
Koen Hufkens, PhD
Partner, Researcher

As an earth system scientist and ecologist I model ecosystem processes.

Related

Previous