Changelog
[0.4.0] - 2026-06-16
Added
- gdalxarray.warp(): builds lazy warp-VRT recipes for reprojection, regridding, GCP/RPC/geolocation-array transformation, cutline clipping, and resampling. Returns a VRT XML string that composes with the gdalxarray engine, so only the bytes the consumer reads are materialised. Accepts common args (crs, bbox, shape, resolution, resampling, nodata) plus a full escape hatch for any gdal.WarpOptions keyword.
Changed
- Classic-raster open (multidim=False) on a file with no bands but subdatasets now raises a ValueError listing the available subdataset paths and pointing at multidim=True. Previously this produced an empty 512x512 stub Dataset.
0.3.0 - 2026-06-15
Added
- band_as_dim parameter on open_dataset (classic raster mode, default True). Bands now become an xarray band dimension on a single band_data DataArray. Pass band_as_dim=False for per-band-variable layout.
- New GDALMultiBandArray BackendArray reading via dataset.ReadAsArray(band_list=...), letting GDAL handle BIP/BIL/BSQ interleaving internally.
- Lazy-by-default for classic raster mode: chunks=None now returns a LazilyIndexedArray-wrapped Dataset rather than eagerly reading.
- ds.encoding["source"] and ds.encoding["gdal_driver"] provenance strings (serialization-safe replacements for the live GDAL objects previously stashed in encoding).
- [tool.ruff] config + .pre-commit-config.yaml for lint and format.
Changed
- gdal.UseExceptions() moved from module import to GDALBackendEntrypoint.__init__, guarded with gdal.GetUseExceptions(). Scopes the side effect to actual backend use.
- GDALMultiDimArray now holds optional _parent_dataset and _parent_group references internally to keep the underlying mdarray valid for the dataset's lifetime. These were previously stashed in ds.encoding, which broke to_netcdf() and other serialization paths.
- Per-band metadata (description, nodata, scale, offset) in band_as_dim mode is attached as scalar attrs when uniform across bands, or as band-dim coordinates when it varies.
- Multidim group navigation accepts nested paths ("/a/b/c") with tolerant leading/trailing slash handling.
- Unsupported codecs on individual MDArrays (e.g. numcodecs.pcodec) are now logged at warning level and the array skipped, instead of aborting the open. This lets you open stores like Earthmover's public ERA5 Icechunk store and access the readable coordinate/mask variables.
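The tolerant slash handling for nested group paths can be pictured with a small sketch. The helper name split_group_path is hypothetical and not part of gdalxarray; it only illustrates one way of treating "/a/b/c", "a/b/c/" and "/a/b/c/" as the same path.

```python
def split_group_path(path: str) -> list[str]:
    """Split a nested group path into its component names.

    Hypothetical helper (not gdalxarray API). Empty segments are dropped,
    so leading, trailing, and doubled slashes are all tolerated.
    """
    return [part for part in path.split("/") if part]


for candidate in ("/a/b/c", "a/b/c/", "/a/b/c/", "/"):
    print(repr(candidate), "->", split_group_path(candidate))
```

A backend could then descend one level at a time through the resulting names, for example with GDAL's Group.OpenGroup; an empty list would mean the root group.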
Fixed
- Dataset.ReadAsArray call in GDALMultiBandArray now uses xsize/ysize (the Dataset API) rather than win_xsize/win_ysize (which are the Band API).
- Reverse-slice canonicalisation across all axes in all three BackendArray classes. Fixes failures when xarray sends slice(stop, start, 1) for selections on decreasing coordinates (very common in atmospheric data with latitude 90 -> -90).
- GDALBackendArray.__dask_tokenize__ no longer references a non-existent self.dataset attribute. The bug was latent (it only fired on certain Dask graph hashing paths) but would have raised AttributeError when it did.
- AdviseRead now skips both tiny reads (no benefit) and huge reads (which blow up on sharded stores reporting shard-sized blocks); CACHE_SIZE is bounded between 4 MB and 512 MB.
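The changelog does not show how the canonicalisation is implemented. As an illustration only, one common approach for a reversed (negative-step) slice is to convert it into an equivalent forward read window plus a flag saying the result must be flipped. The function name canonicalise is hypothetical, and the sketch handles a single axis.

```python
def canonicalise(key: slice, size: int) -> tuple[slice, bool]:
    """Return (forward_slice, flip) selecting the same elements as `key`.

    Hypothetical helper, not gdalxarray API. `flip` is True when the
    forward read must be reversed to match the requested order.
    """
    start, stop, step = key.indices(size)
    if step > 0:
        return slice(start, stop, step), False
    n = len(range(start, stop, step))
    if n == 0:
        # Nothing selected: an empty forward slice, no flip needed.
        return slice(0, 0, 1), False
    lowest = start + (n - 1) * step  # smallest index the request touches
    return slice(lowest, start + 1, -step), True


data = list(range(10))
forward, flip = canonicalise(slice(8, 2, -2), len(data))
window = data[forward]                      # read in storage order
result = window[::-1] if flip else window   # restore the requested order
print(forward, flip, result)
```

Reading forward and flipping afterwards keeps the request in the increasing-offset form that window-based readers expect.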
Removed
- Live GDAL objects no longer stored in ds.encoding. Use ds.encoding["source"] and ds.encoding["gdal_driver"] (strings) for introspection. GDAL refs are now held inside the BackendArrays themselves.
0.2.0 - 2026-05-12
Changed
- Renamed package from gdx to gdalxarray for PyPI publication. The original gdx name is taken on PyPI by an unrelated GAMS Data Exchange project.
- Repository moved from mdsumner/gdx to hypertidy/gdalxarray.
- Python import path is now from gdalxarray import GDALBackendEntrypoint.
- Build backend switched from setuptools to hatchling; setup.py removed.
- Packaging modernised to PEP 621 / PEP 639 standards. requires-python bumped to >=3.10 to match xarray>=2025.6.
Added
- CF datetime decoding for time coordinates using units from MDArray.GetUnit() and calendar attribute.
- Backend arrays accessible via ds['var'].encoding['gdal_backend'] for debugging and introspection.
- GDAL dataset and group objects retained in ds.encoding['gdal_dataset'] and ds.encoding['gdal_group'] to keep MDArray methods functional.
- Entry point registration so xr.open_dataset(..., engine="gdal") works.
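For readers unfamiliar with CF time encoding, the sketch below shows what decoding a units string such as "hours since 1900-01-01 00:00:00" involves. It is not the package's implementation: decode_cf_time is a hypothetical helper, it assumes the standard (proleptic Gregorian) calendar and ignores the calendar attribute, and it supports only the four units listed.

```python
from datetime import datetime, timedelta

# Seconds per supported CF time unit (assumption: fixed-length units only).
_SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_cf_time(values, units):
    """Decode numeric offsets using a CF-style '<unit> since <origin>' string.

    Hypothetical helper for illustration; standard calendar only.
    """
    unit, _, origin = units.partition(" since ")
    base = datetime.fromisoformat(origin.strip())
    scale = _SECONDS_PER_UNIT[unit.strip().lower()]
    return [base + timedelta(seconds=v * scale) for v in values]


for stamp in decode_cf_time([0, 6, 24], "hours since 1900-01-01 00:00:00"):
    print(stamp.isoformat())
```

Non-standard calendars (for example 360_day or noleap) need dedicated handling that this sketch deliberately omits.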
Fixed
- Slice index parsing where 0 was incorrectly treated as None due to Python's falsy evaluation (k.start or 0 -> k.start if k.start is not None else 0).
- Re-enabled AdviseRead for chunk-aligned prefetching on remote datasets.
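The slice-parsing fix is an instance of a general Python pitfall: `or` falls through on any falsy value, not just None. The sketch below applies the same pattern to a stop bound, where the default is non-zero and the difference becomes visible. The function names and the choice of the stop bound are assumptions for illustration, not code from the package.

```python
def stop_buggy(k: slice, size: int) -> int:
    # `or` treats an explicit stop of 0 like a missing bound.
    return k.stop or size


def stop_fixed(k: slice, size: int) -> int:
    # Only a genuinely missing bound (None) picks up the default.
    return k.stop if k.stop is not None else size


empty = slice(0, 0)  # a zero-length request
print(stop_buggy(empty, 10))  # 10: the whole axis would be selected
print(stop_fixed(empty, 10))  # 0: the request stays empty
```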
0.1.0 - 2026-01-20
Initial release as gdx.
Added
- GDAL backend for xarray, supporting both Classic and Multidimensional APIs.
- chunks={} uses native block sizes from GDAL's GetBlockSize(), aligning Dask chunks with storage chunks for efficient reads.
- multidim=True is the default for open_dataset().
Fixed
- Dask lazy loading for remote Zarr datasets. Zero-sized slice requests (used by Dask for _meta inference) no longer hang or attempt full array allocation.
- Slice start/stop of 0 now parsed correctly.
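A zero-sized request can be answered without touching the data source at all. The following sketch shows one way to short-circuit such requests, assuming the key is a tuple of slices; requested_shape, read_region, and fake_reader are hypothetical names and do not reflect the package's internals.

```python
def requested_shape(key, shape):
    """Length of each axis selected by a tuple of slices."""
    return tuple(len(range(*k.indices(n))) for k, n in zip(key, shape))


def read_region(key, shape, reader):
    """Return (out_shape, data); skip the reader when nothing is selected."""
    out_shape = requested_shape(key, shape)
    if 0 in out_shape:
        # Empty request: report the shape but never call the reader.
        return out_shape, None
    return out_shape, reader(key)


calls = []


def fake_reader(key):
    # Stand-in for an expensive remote read; records that it was invoked.
    calls.append(key)
    return "data"


print(read_region((slice(0, 0), slice(0, 0)), (1000, 2000), fake_reader))
print(calls)  # stays empty: the reader was never called
```

A real backend would return an empty array of the right dtype and shape instead of None; the point here is only that the reader is bypassed.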