CHELSA dynamic BIOCLIM subsets

COG Geotiff operations to speed up processing

May 28, 2025 - API, spatial, software, COG

The CHELSA dataset provides high-resolution climatologies for the Earth’s land surface. Among these are the global climate-related predictors at kilometre resolution for the past and future (Brun et al. 2022), generally known as bio-climatic variables. These data are distributed through the main CHELSA website, which links to a file download page geared towards bulk downloads.

However, it seems that the geotiffs provided are cloud optimized geotiffs (COG files). COG files are regular geotiff files with an internal tiled layout, which allows software to read only part of a file, even remotely. This feature seems undocumented on the website and is either a happy coincidence of sensible defaults or an oversight. Below I’ll provide an example of how this saves you time (and money), and spares the planet.

When querying the BIOCLIM variables through the normal file downloader you are provided with a list of files, each covering the globe for one bio-climatic variable. These files need to be downloaded in bulk. All 19 bio-climatic variables result in roughly 5 GB of data on disk. An example URL is provided below.

https://os.zhdk.cloud.switch.ch/chelsav2/GLOBAL/climatologies/1981-2010/bio/CHELSA_bio1_1981-2010_V.2.1.tif

In many cases you only need a small subset of this data. The old-fashioned way of doing this would be to download all files (5 GB) and subset this larger dataset, while either discarding the unneeded data or storing the main dataset (both are wasteful of time and/or disk space). Even for a single bio-climatic variable (BIO1) we would download a file of ~110 MB.

However, since these files are COG files we can leverage their tiled nature to spatially subset the data and only retrieve what we need. We only need to change the URL slightly before executing standard operations in the R {terra} package for raster data processing.

library(terra)

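# alter the URL, adding the /vsicurl/ prefix so that GDAL
# reads the remote file over HTTP instead of a local path
url <- paste0("/vsicurl/", "https://os.zhdk.cloud.switch.ch/chelsav2/GLOBAL/climatologies/1981-2010/bio/CHELSA_bio1_1981-2010_V.2.1.tif")

# open the remote raster; only the metadata is read at this point
r <- terra::rast(url)
print(r)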

class       : SpatRaster 
dimensions  : 20880, 43200, 1  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -180.0001, 179.9999, -90.00014, 83.99986  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
source      : CHELSA_bio1_1981-2010_V.2.1.tif 
name        : CHELSA_bio1_1981-2010_V.2.1
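
The prefix works the same way for the other bio-climatic layers. As a minimal sketch, the hypothetical helper chelsa_bio_url() below builds the remote path for a given variable number. It assumes that all 19 variables follow the file naming pattern of the BIO1 example above, which is not verified here.

```r
# Hypothetical helper (not part of {terra} or CHELSA): build a
# /vsicurl/ path for one or more bio-climatic variables.
# Assumption: all variables follow the BIO1 file naming pattern.
chelsa_bio_url <- function(variable) {
  stopifnot(all(variable %in% 1:19))
  base_url <- "https://os.zhdk.cloud.switch.ch/chelsav2/GLOBAL/climatologies/1981-2010/bio/"
  file_name <- sprintf("CHELSA_bio%d_1981-2010_V.2.1.tif", as.integer(variable))
  # the /vsicurl/ prefix makes GDAL read the file over HTTP
  paste0("/vsicurl/", base_url, file_name)
}

# path for BIO1, identical to the manually constructed one
chelsa_bio_url(1)
```

The result can be passed directly to terra::rast(), as in the example above.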

Spatial subsetting

We can now call crop() to spatially restrict the data on which to operate. We will crop the data to the bounding box of Switzerland, and then mask the cells that fall outside the country outline.

# load the country outlines
library(rnaturalearth)
ch <- ne_countries(country = "Switzerland", scale = 50)

# crop the raster to the extent of Switzerland and
# mask the data outside the country outline
r_cropped <- r |> crop(ch) |> mask(ch)

Writing this data to disk results in a file of ~134 KB instead of the full 110 MB download. In short, you save orders of magnitude in both storage space and download time for your analysis. Note that you can also pass a vector of COG-based URLs to rast() to create a raster stack (or a time series, depending on the dataset) and subset all layers at once.
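
As a sketch of the stacking mentioned above, the code below prepares two URLs and a small helper that opens them as one multi-layer raster before clipping. The BIO12 URL is an assumption based on the naming pattern of the BIO1 file, and read_cog_subset() and the output file name are hypothetical.

```r
# Two layers; the BIO12 URL is assumed to follow the BIO1 naming pattern
base_url <- "https://os.zhdk.cloud.switch.ch/chelsav2/GLOBAL/climatologies/1981-2010/bio/"
urls <- paste0(
  "/vsicurl/", base_url,
  c("CHELSA_bio1_1981-2010_V.2.1.tif", "CHELSA_bio12_1981-2010_V.2.1.tif")
)

# Hypothetical helper: open all URLs as one multi-layer SpatRaster,
# crop to the extent of the region and mask the cells outside it
read_cog_subset <- function(urls, region) {
  terra::rast(urls) |>
    terra::crop(region) |>
    terra::mask(region)
}

# Usage (requires network access):
# ch <- rnaturalearth::ne_countries(country = "Switzerland", scale = 50)
# s <- read_cog_subset(urls, ch)
# terra::writeRaster(s, "chelsa_bio_ch.tif", overwrite = TRUE)
```

All layers in the vector need to share the same grid for rast() to combine them.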

Point queries

COG files also allow you to make point queries for a single layer, or across all layers (e.g. a time series). To demonstrate this, draw 50 random points within the Swiss polygon and retrieve their values.

library(sf)

# sample random points within the Swiss polygon
# and convert to SpatVector
random_points <- st_sample(ch, 50) |> vect()

Instead of cropping to a bounding box you can use the {terra} extract() function to retrieve the point values. Note that the original raster is stacked with itself here to show a query across multiple layers.

# extract the values at the random points from the
# original COG raster, stacked with itself to show
# a query across multiple layers
point_data <- terra::extract(c(r,r), random_points, xy = TRUE, ID = FALSE)

head(point_data)

  CHELSA_bio1_1981-2010_V.2.1 CHELSA_bio1_1981-2010_V.2.1.1        x        y
1                        9.55                          9.55 6.970694 46.78736
2                        4.65                          4.65 7.304027 46.51236
3                        5.15                          5.15 7.162360 46.60403
4                       -1.05                         -1.05 8.504027 46.38736
5                       -3.45                         -3.45 7.845694 46.14569
6                        7.45                          7.45 6.962360 46.59569

Sampling these 50 points takes mere seconds and does not require downloading the full underlying data first. The speed and storage gains are even larger than in the spatial subset example.
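
The same approach works for locations you already know. Below is a minimal sketch, using approximate city coordinates as illustrative query locations and a hypothetical helper query_cog_points(). It relies on extract() also accepting a two-column data frame of x, y coordinates, so no SpatVector is needed.

```r
# Query locations in lon/lat (approximate city centres, for illustration)
sites <- data.frame(
  name = c("Bern", "Zurich", "Geneva"),
  lon = c(7.447, 8.542, 6.143),
  lat = c(46.948, 47.377, 46.204)
)

# Hypothetical helper: read the values at the given coordinates
# straight from a remote COG and attach them to the site table
query_cog_points <- function(url, sites) {
  r <- terra::rast(url)
  values <- terra::extract(r, sites[, c("lon", "lat")])
  cbind(sites, values)
}

# Usage (requires network access):
# url <- paste0("/vsicurl/", "https://os.zhdk.cloud.switch.ch/chelsav2/GLOBAL/climatologies/1981-2010/bio/CHELSA_bio1_1981-2010_V.2.1.tif")
# query_cog_points(url, sites)
```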

In closing, note that not all datasets within the larger framework of the CHELSA project are provided as COG files. Depending on the product you might not have access to this geotiff-specific functionality. I hope that these small examples of COG file operations within the context of the CHELSA dataset will speed up your analysis.

References

Brun, P., Zimmermann, N.E., Hari, C., Pellissier, L., Karger, D.N. (preprint). Global climate-related predictors at kilometre resolution for the past and future. Earth System Science Data. https://doi.org/10.5194/essd-2022-212


Koen Hufkens, PhD

Founder, Researcher

As an earth system scientist and ecologist I model ecosystem processes.