ECMWF API use

migrating to new API services - the missing manual

Oct 14, 2024 5 min read API, package, software

Below you will find a centralized compilation of key documentation on the new ECMWF APIs. I only use the Climate Data Store (CDS) as a reference, but the same applies to the ADS and EWDS. The first section covers general API use and will be familiar to most, with some gotchas highlighted (e.g. switching services, accepting licenses). Further sections cover breaking changes and advanced API implementations.

General API use

For those landing here with old, non-functioning scripts: all API endpoints have been migrated to a new server. This means that you will need to update your login credentials and Python packages. To get started, follow these instructions:

update your login by registering a new ECMWF account

you need to validate the login for each service
the API key is the same across all services

update the python cdsapi package

pip install 'cdsapi>=0.7.2'

“official” support is provided for this package only
for other implementations, such as {ecmwfr}, use the forum

set your API key and API URL (the URL depends on the service, more on this below in switching services)
accept the license agreement for the products you want to use

you will find the license agreement in the right-hand bar of the dataset search panel (see below)
all accepted licenses are listed at the end of your profile page
downloads will fail if the license has not been accepted

The instructions assume that you will generate new scripts, based upon the download pages of any service.
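As a quick sanity check on the setup above, the sketch below reads a .cdsapirc-style file and reports which service URL it points to, without printing the key. It assumes the usual two-line "url:" / "key:" layout and the default location in your home directory; the helper name read_cdsapirc is hypothetical and not part of the cdsapi package.

```python
from pathlib import Path


def read_cdsapirc(text):
    """Parse 'name: value' lines from .cdsapirc-style text into a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and lines without a 'name: value' separator
        if not line or ":" not in line:
            continue
        name, value = line.split(":", 1)  # split once, URLs contain ':'
        config[name.strip()] = value.strip()
    return config


if __name__ == "__main__":
    rc_file = Path.home() / ".cdsapirc"  # default location assumed here
    if rc_file.exists():
        config = read_cdsapirc(rc_file.read_text())
        # Report the URL only, never the key itself
        print("url:", config.get("url"))
        print("key set:", "key" in config)
```

If the reported URL does not match the service you want to download from, that mismatch is a likely cause of a failed request.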

Data download pages for all services

Switching services

The cdsapi Python package runs into a conflict when you use it to download from multiple data services. In short, the URL used for any given service is set by a value in your .cdsapirc file. If there is a mismatch between the product requested and the URL provided, your download will fail. Downloading from both the CDS and ADS would therefore require editing the .cdsapirc file between requests. The workaround is to set an environment variable.

FIX: Set the URL environment variable before every switch of API (data service) in your Python script using:
import os

# use the line that matches the service you are about to query
# CDS
os.environ['CDSAPI_URL'] = 'https://cds.climate.copernicus.eu/api'
# ADS
os.environ['CDSAPI_URL'] = 'https://ads.atmosphere.copernicus.eu/api'
# EWDS
os.environ['CDSAPI_URL'] = 'https://ewds.climate.copernicus.eu/api'

WARNING:

Never include your API key in any script. You can configure the .cdsapirc file for a single service and set the URL environment variable before any change in service, without exposing your key.
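The service switch can be wrapped in a small helper so the URLs live in one place. This is a minimal sketch: the names SERVICE_URLS and use_service are hypothetical, and only the CDSAPI_URL variable and the three URLs come from the text above.

```python
import os

# Base URLs as listed above, keyed by a short service name
SERVICE_URLS = {
    "cds": "https://cds.climate.copernicus.eu/api",
    "ads": "https://ads.atmosphere.copernicus.eu/api",
    "ewds": "https://ewds.climate.copernicus.eu/api",
}


def use_service(name):
    """Point CDSAPI_URL at the given service and return the URL."""
    try:
        url = SERVICE_URLS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown service: {name!r}") from None
    os.environ["CDSAPI_URL"] = url
    return url
```

Call use_service("ads") before you create the client for an ADS download. I assume here that the client reads the variable when it is created, so create a fresh client after each switch.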

Breaking changes

Non functioning scripts

The request format of the Python API package has changed, so old scripts and requests will need to be reworked.

FIX: Rebuild your old query using the download page of the dataset (see above) and update the request part of your Python query:
client.retrieve(dataset, request, target) # <- new retrieval call
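To show the new call in context, here is a minimal sketch of a download wrapper. The function name download and the optional client argument are hypothetical; the only cdsapi calls used are cdsapi.Client() and client.retrieve(dataset, request, target).

```python
def download(dataset, request, target, client=None):
    """Submit a request with the new call signature and save it to target."""
    if client is None:
        # Imported lazily so the function can be tested without cdsapi;
        # Client() picks up the URL and key from .cdsapirc or the environment
        import cdsapi

        client = cdsapi.Client()
    return client.retrieve(dataset, request, target)
```

With a request dictionary copied from a dataset download page, a call would look like download('reanalysis-era5-pressure-levels', request, 'download.nc'), where the target file name is just an example.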

Non functioning netCDF files

The netCDF output of the new API (netCDF4) differs from that of the old API (netCDF3), especially in how time variables are formatted. This leaves the new output incompatible with old processing workflows.

FIX: Set the output format field (data_format) in your request to ‘netcdf_legacy’ to get the old output format, e.g.

dataset = 'reanalysis-era5-pressure-levels'
request = {
    'product_type': ['reanalysis'],
    'variable': ['geopotential'],
    'year': ['2024'],
    'month': ['03'],
    'day': ['01'],
    'time': ['13:00'],
    'pressure_level': ['1000'],
    'data_format': 'netcdf_legacy'
}
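If you build requests in code, the format switch can be wrapped in a small helper. This is a minimal sketch: the helper name with_legacy_netcdf is hypothetical, and the 'netcdf' value for the new format is my assumption of what the download pages generate.

```python
def with_legacy_netcdf(request):
    """Return a copy of the request that asks for the old output format."""
    legacy = dict(request)  # shallow copy, the original stays untouched
    legacy["data_format"] = "netcdf_legacy"
    return legacy


request = {
    "product_type": ["reanalysis"],
    "variable": ["geopotential"],
    "year": ["2024"],
    "month": ["03"],
    "day": ["01"],
    "time": ["13:00"],
    "pressure_level": ["1000"],
    "data_format": "netcdf",  # assumed value for the new format
}
legacy_request = with_legacy_netcdf(request)
```

Working on a copy keeps the original request available, which helps when you want to compare both outputs while porting a workflow.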

Basic API documentation (custom API implementations)

For those implementing their own scripts using curl, C, R or other languages, note that the endpoint structure and workflow have been altered. I’ll use the CDS endpoint as an example, but the structure remains the same across the other data services.

A complete download requires two API endpoints (+ validation). The base URL of all APIs has the form:

https://cds.climate.copernicus.eu/api/retrieve/v1/

where the first part of the URL changes depending on the service used.

A query to the service uses this base URL + modifiers to submit a valid POST call. The POST call depends not only on the arguments you pass along but also on the URL itself, which changes with the dataset, e.g.:

https://cds.climate.copernicus.eu/api/retrieve/v1/processes/{dataset}/execute/

where {dataset} is the dataset you would find in the python request (see ‘reanalysis-era5-pressure-levels’ in the above demo request).

HTTP status codes will give you an indication of the success of the call itself. If successful, the call will return a job ID. You can use this job ID to check on the progress of the download using the following structure:

https://cds.climate.copernicus.eu/api/retrieve/v1/jobs/{ID}

If the processing was successful, this call will return a URL for the location of the data, from which you can download it. This URL has the following structure:

https://cds.climate.copernicus.eu/api/retrieve/v1/jobs/{ID}/results
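The three URL patterns above, and the polling step between them, can be sketched as follows. All function names are hypothetical, and the status strings "successful" and "failed" are assumptions on my part, so check them against what the API actually returns. The HTTP call itself is left to a function you supply, which keeps the sketch independent of any HTTP library.

```python
import time

BASE_URL = "https://cds.climate.copernicus.eu/api/retrieve/v1"


def execute_url(dataset, base=BASE_URL):
    """URL that receives the POST call for a dataset."""
    return f"{base}/processes/{dataset}/execute/"


def job_url(job_id, base=BASE_URL):
    """URL used to check on the progress of a job."""
    return f"{base}/jobs/{job_id}"


def results_url(job_id, base=BASE_URL):
    """Results URL for a finished job."""
    return f"{base}/jobs/{job_id}/results"


def wait_for_job(get_status, job_id, interval=5.0, max_tries=120):
    """Poll until the job finishes, then return its results URL.

    get_status is a caller-supplied function that takes a job URL and
    returns the status string reported by the API.
    """
    for _ in range(max_tries):
        status = get_status(job_url(job_id))
        if status == "successful":  # assumed status value
            return results_url(job_id)
        if status == "failed":  # assumed status value
            raise RuntimeError(f"job {job_id} failed")
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_tries} checks")
```

For another service, pass a different base, e.g. the ADS equivalent of the base URL, to each of the URL helpers.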

Authentication

The API uses a custom header field (not the standard Authorization header) to validate your transactions. To successfully query the API(s) you need to add the following field to your request header (including your private key). In the examples below, use the URLs (URL) as highlighted above and your private key (KEY).

In curl this would read:

curl -i -H "PRIVATE-TOKEN: KEY" -H "Content-Type: application/json" URL

In Python this would read:

import requests

# URL is one of the endpoints above, KEY stands in for your private key
requests.get(
    URL,
    headers={"PRIVATE-TOKEN": "KEY"}
)

In R this reads:

  library(httr)
  GET(
   URL,
   add_headers("PRIVATE-TOKEN" = KEY)
  )
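The examples above cover GET calls. For the POST call that submits a request, a sketch using only the Python standard library could look like the one below. The function name build_submit_request is hypothetical, and wrapping the request in an "inputs" field is an assumption on my part, not something documented above. The function only prepares the call, so nothing is sent until you pass the result to urllib.request.urlopen.

```python
import json
import urllib.request


def build_submit_request(url, key, request):
    """Prepare (but do not send) an authenticated POST call."""
    # Wrapping the request in an "inputs" field is an assumption
    body = json.dumps({"inputs": request}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "PRIVATE-TOKEN": key,
            "Content-Type": "application/json",
        },
    )
```

In line with the warning on API keys, read the key from your environment or the .cdsapirc file at run time rather than writing it into the script.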

Acknowledgements

Thanks go out to all who first highlighted these issues in the forum and those providing solutions. This resource will be updated when new information becomes available.


Koen Hufkens, PhD

Founder, Researcher

As an earth system scientist and ecologist I model ecosystem processes.