cmemsarco provides cloud-native access to Copernicus Marine Service (CMEMS) Analysis-Ready Cloud-Optimized (ARCO) Zarr datasets. The package builds a catalog of GDAL-ready data source names (DSNs), letting you go straight from URL to pixels without downloading files, listing directories, or wrangling formats and tools.
library(cmemsarco)
# The bundled catalog
cmems_catalog_data
#> # A tibble: 1,731 × 13
#> product_id dataset_version_id timeChunked_url geoChunked_url native_url
#> <chr> <chr> <chr> <chr> <chr>
#> 1 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 2 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 3 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 4 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 5 NWSHELF_ANALYSI… cmems_mod_nws_bgc… NA NA https://s…
#> 6 NWSHELF_ANALYSI… cmems_mod_nws_bgc… NA NA https://s…
#> 7 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 8 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 9 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 10 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> # ℹ 1,721 more rows
#> # ℹ 8 more variables: dataset_id <chr>, version <chr>, timeChunked_gdal <chr>,
#> # geoChunked_gdal <chr>, timeChunked_gdals3 <chr>, geoChunked_gdals3 <chr>,
#> # timeChunked_s3 <chr>, geoChunked_s3 <chr>
The catalog is built by walking the CMEMS STAC API. Each row represents a versioned dataset with URLs to Zarr stores in different formats:
| Column | Description |
|---|---|
| `product_id` | CMEMS product identifier |
| `dataset_id` | Dataset identifier (without version) |
| `version` | 6-digit version (YYYYMM) |
| `timeChunked_url` | HTTPS URL to `timeChunked.zarr` |
| `geoChunked_url` | HTTPS URL to `geoChunked.zarr` |
| `*_gdal` | GDAL DSN using `/vsicurl/` |
| `*_gdals3` | GDAL DSN using `/vsis3/` |
| `*_s3` | S3 URI (`s3://bucket/path`) |
Use cmems_latest() to keep only the most recent version
of each dataset, and cmems_arco_only() to drop datasets
without Zarr URLs (static/native-only).
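As a rough mental model of these two filters, the sketch below imitates them in base R on a toy table. It is not the package's implementation: the rule for `cmems_arco_only()` (keep rows with at least one Zarr URL) and the rule for `cmems_latest()` (keep the highest YYYYMM version per `dataset_id`) are assumptions drawn from the description above, and every value in `toy` is invented.

```r
# Toy stand-in for the catalog; every value here is invented for illustration
toy <- data.frame(
  dataset_id      = c("toy_a", "toy_a", "toy_b"),
  version         = c("202311", "202411", "202411"),
  timeChunked_url = c("https://example.org/a1", "https://example.org/a2", NA),
  geoChunked_url  = c("https://example.org/a1g", "https://example.org/a2g", NA),
  stringsAsFactors = FALSE
)

# Assumed meaning of cmems_arco_only(): keep rows that have a Zarr URL
arco <- toy[!is.na(toy$timeChunked_url) | !is.na(toy$geoChunked_url), ]

# Assumed meaning of cmems_latest(): keep the highest version per dataset.
# Zero-padded YYYYMM strings sort chronologically, so max() on characters works.
newest <- ave(arco$version, arco$dataset_id, FUN = max)
latest <- arco[arco$version == newest, ]
latest
```

Only the `202411` row of `toy_a` survives: `toy_b` is dropped for having no Zarr URLs, and the older `toy_a` version is superseded.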
cmems_catalog_data |>
  cmems_arco_only() |>
  cmems_latest()
#> # A tibble: 1,056 × 13
#> product_id dataset_version_id timeChunked_url geoChunked_url native_url
#> <chr> <chr> <chr> <chr> <chr>
#> 1 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 2 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 3 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 4 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 5 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 6 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 7 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 8 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 9 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> 10 NWSHELF_ANALYSI… cmems_mod_nws_bgc… https://s3.waw… https://s3.wa… https://s…
#> # ℹ 1,046 more rows
#> # ℹ 8 more variables: dataset_id <chr>, version <chr>, timeChunked_gdal <chr>,
#> # geoChunked_gdal <chr>, timeChunked_gdals3 <chr>, geoChunked_gdals3 <chr>,
#> # timeChunked_s3 <chr>, geoChunked_s3 <chr>
CMEMS provides two Zarr stores for each dataset, optimised for different access patterns:
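timeChunked (chunks: 1 × 720 × 512 in time × lat × lon): each chunk holds a single time step over a large spatial tile, which suits reading maps at a given time.
geoChunked (chunks: 138 × 32 × 64 in time × lat × lon): each chunk holds many time steps over a small spatial tile, which suits reading time series at a location.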
Choosing the wrong chunking strategy means many more HTTP requests and slower performance.
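To put numbers on that, the sketch below counts how many chunks a request touches under each layout, using the chunk shapes quoted above. The two request shapes (a year of daily values at one grid cell, and one 720 × 512 map at a single time step) are hypothetical, as is the helper `chunks_touched()`, and the count assumes requests aligned to chunk boundaries, which is the best case.

```r
# Chunk shapes quoted above, in (time, lat, lon) order
chunks <- list(
  timeChunked = c(time = 1,   lat = 720, lon = 512),
  geoChunked  = c(time = 138, lat = 32,  lon = 64)
)

# Chunks touched by a request of a given extent, assuming the request
# starts on a chunk boundary (hypothetical helper, best-case count)
chunks_touched <- function(extent, chunk) prod(ceiling(extent / chunk))

# Hypothetical requests
point_series <- c(time = 365, lat = 1,   lon = 1)    # one cell, 365 steps
single_map   <- c(time = 1,   lat = 720, lon = 512)  # one step, full tile

counts <- sapply(chunks, function(ch) c(
  point_series = chunks_touched(point_series, ch),
  single_map   = chunks_touched(single_map, ch)
))
counts
```

Under these assumptions the point time series touches 365 chunks in timeChunked but only 3 in geoChunked, while the single map touches 1 chunk in timeChunked and 184 in geoChunked. Each chunk is at least one HTTP request, so the mismatch translates directly into request count.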
Each Zarr store can be addressed in four ways. Use whichever suits your tooling:
*_gdal — zero configuration (recommended)
Uses GDAL’s /vsicurl/ handler, which works without any
environment setup:
dsn <- cmems_catalog_data$timeChunked_gdal[1]
#> 'ZARR:"/vsicurl/https://s3.waw3-1.cloudferro.com/mdl-arco-time-045/..."'
# Works immediately with any GDAL-based tool
#vapour::vapour_raster_info(dsn)
#terra::rast(dsn)
*_gdals3 — S3 protocol
Uses GDAL’s /vsis3/ handler, which requires calling
cmems_setup() first to configure the AWS endpoint:
cmems_setup() # Sets AWS_NO_SIGN_REQUEST=YES, AWS_S3_ENDPOINT=...
dsn <- cmems_catalog_data$timeChunked_gdals3[1L]
dsn
#> [1] "ZARR:\"/vsis3/mdl-arco-time-041/arco/NWSHELF_ANALYSISFORECAST_BGC_004_002/cmems_mod_nws_bgc_anfc_0.027deg-3D_P1D-m_202411/timeChunked.zarr\""
This may offer better performance in some cases due to S3-specific optimisations in GDAL.
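For readers who want to see what such a setup step amounts to, here is a hand-rolled sketch using `Sys.setenv()`. It is not the body of `cmems_setup()`: the two option names come from the comment above, but the endpoint value is an assumption, taken from the host part of the HTTPS URLs in the catalog, and the real function may set additional options.

```r
# One HTTPS URL as it appears in the catalog
url <- "https://s3.waw3-1.cloudferro.com/mdl-arco-time-041/arco/NWSHELF_ANALYSISFORECAST_BGC_004_002/cmems_mod_nws_bgc_anfc_0.027deg-3D_P1D-m_202411/timeChunked.zarr"

# ASSUMPTION: the S3 endpoint is the host of the HTTPS URL
endpoint <- sub("^https://([^/]+)/.*$", "\\1", url)

# GDAL reads these configuration options from the environment
Sys.setenv(
  AWS_NO_SIGN_REQUEST = "YES",    # anonymous access, no credentials
  AWS_S3_ENDPOINT     = endpoint  # non-AWS S3-compatible host (assumed value)
)

Sys.getenv(c("AWS_NO_SIGN_REQUEST", "AWS_S3_ENDPOINT"))
```

In normal use, prefer `cmems_setup()` itself, since it is what the package tests against.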
*_s3 — S3 URI
Standard s3:// URIs for use with S3-aware tools:
uri <- cmems_catalog_data$timeChunked_s3[1]
uri
#> [1] "s3://mdl-arco-time-041/arco/NWSHELF_ANALYSISFORECAST_BGC_004_002/cmems_mod_nws_bgc_anfc_0.027deg-3D_P1D-m_202411/timeChunked.zarr"
*_url — raw HTTPS
The underlying HTTPS URLs, useful if you need to construct your own access pattern:
url <- cmems_catalog_data$timeChunked_url[1]
url
#> [1] "https://s3.waw3-1.cloudferro.com/mdl-arco-time-041/arco/NWSHELF_ANALYSISFORECAST_BGC_004_002/cmems_mod_nws_bgc_anfc_0.027deg-3D_P1D-m_202411/timeChunked.zarr"
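Judging from the printed values above, the four forms differ only by simple string rewrites of the HTTPS URL. The sketch below derives the other three from `url` in base R. It is an illustration of that relationship, not the package's own construction code, and the variable names are hypothetical.

```r
# HTTPS URL as printed above for the first catalog row
url <- "https://s3.waw3-1.cloudferro.com/mdl-arco-time-041/arco/NWSHELF_ANALYSISFORECAST_BGC_004_002/cmems_mod_nws_bgc_anfc_0.027deg-3D_P1D-m_202411/timeChunked.zarr"

# Drop scheme and host, leaving "bucket/key"
bucket_key <- sub("^https://[^/]+/", "", url)

# GDAL DSNs wrap the path in ZARR:"..." with a virtual file system prefix
gdal_dsn   <- sprintf('ZARR:"/vsicurl/%s"', url)
gdals3_dsn <- sprintf('ZARR:"/vsis3/%s"', bucket_key)

# Plain S3 URI
s3_uri <- paste0("s3://", bucket_key)

cat(gdal_dsn, gdals3_dsn, s3_uri, sep = "\n")
```

This can be handy for turning a URL found elsewhere into a DSN, with the caveat that the pattern is inferred from the catalog output rather than documented.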
library(cmemsarco)
# Find your dataset
sla <- cmems_catalog_data |>
  dplyr::filter(grepl("SEALEVEL.*NRT", product_id)) |>
  cmems_latest()
# Grab the DSN (no setup needed)
dsn <- sla$timeChunked_gdal[1]
dsn
#> [1] "ZARR:\"/vsicurl/https://s3.waw3-1.cloudferro.com/mdl-arco-time-053/arco/SEALEVEL_EUR_PHY_L3_NRT_008_059/cmems_obs-sl_eur_phy-ssh_nrt_al-l3-duacs_PT1S_202311/timeChunked\""