Data Cubes

The latest software technologies, such as the SpatioTemporal Asset Catalog (STAC), xarray, Dask, and Open Data Cube, are used to serve terrabyte data as multi-dimensional data cubes. Other large EO platforms, such as Euro Data Cube, OpenEO Platform, and Microsoft Planetary Computer, have adopted these technologies as well. Our data cube approach is based on the Pangeo framework, which is already used for scalable Earth system science (Hamman et al., 2018). Dask (Rocklin, 2015) makes the approach scalable by distributing the xarray processing graph across all computing resources provided for the execution. In addition, recent versions of GDAL support STAC-based data cubes out of the box (see the GDAL STACIT driver).
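To illustrate how xarray and Dask work together, here is a minimal sketch that uses synthetic random data in place of real satellite scenes. The band names, array sizes, and chunking are assumptions chosen for illustration and do not describe terrabyte's actual configuration. It shows that xarray operations on Dask-backed arrays only build a task graph, which Dask then executes chunk by chunk when the result is requested.

```python
import dask.array as da
import numpy as np
import xarray as xr

# Synthetic stand-in for a satellite time series (assumption: 12 time steps
# of 256 x 256 pixels). Each Dask chunk holds one time step, so the chunks
# can be processed independently of each other.
shape, chunks = (12, 256, 256), (1, 256, 256)
red = da.random.random(shape, chunks=chunks)
nir = da.random.random(shape, chunks=chunks)

cube = xr.Dataset(
    {
        "red": (("time", "y", "x"), red),
        "nir": (("time", "y", "x"), nir),
    },
    coords={"time": np.arange(shape[0])},
)

# These expressions only extend the task graph; no pixel is computed yet.
ndvi = (cube.nir - cube.red) / (cube.nir + cube.red)
ndvi_mean = ndvi.mean(dim="time")

# compute() hands the graph to the active Dask scheduler, which runs the
# per-chunk tasks in parallel on whatever resources it has been given.
result = ndvi_mean.compute()
print(result.shape)
```

By default the graph runs on a local scheduler. If a `dask.distributed` cluster is connected before `compute()` is called, the same code should run on that cluster's workers without changes.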

Usage in terrabyte

Our data cube approach can be used with a large variety of input data. We provide some examples in the Tutorials section.

Interoperability

Interoperable standards such as STAC make the data cube approach transferable to other EO platforms without copying the data cube itself. Only the input data needs to be available on each platform.
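The following sketch illustrates this transferability, using only the Python standard library. It builds one STAC API item-search request body and pairs it with two endpoints. The endpoint URLs, the collection ID, the bounding box, and the helper name `build_search_body` are hypothetical placeholders. Real platforms may use different collection IDs for the same data.

```python
import json


def build_search_body(collection, bbox, start, end, limit=100):
    """Return a STAC API item-search request body as a plain dict."""
    return {
        "collections": [collection],
        "bbox": list(bbox),            # [west, south, east, north]
        "datetime": f"{start}/{end}",  # closed RFC 3339 interval
        "limit": limit,
    }


# Hypothetical query: collection ID and area are placeholders.
body = build_search_body(
    "sentinel-2-l2a",
    (11.0, 47.5, 12.0, 48.5),
    "2023-06-01T00:00:00Z",
    "2023-06-30T23:59:59Z",
)

# Hypothetical endpoints; only the URL differs between platforms.
endpoints = {
    "platform_a": "https://stac.platform-a.example/search",
    "platform_b": "https://stac.platform-b.example/search",
}

# The same serialized body would be POSTed to each platform's /search URL.
requests_to_send = {
    name: (url, json.dumps(body, sort_keys=True))
    for name, url in endpoints.items()
}

for name, (url, payload) in requests_to_send.items():
    print(name, url, payload)
```

Because the query is expressed in the standard rather than in a platform-specific API, the code that builds the data cube from the returned items can stay the same across platforms.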

References

Rocklin, M. (2015): "Dask: Parallel Computation with Blocked Algorithms and Task Scheduling", In: "Proceedings of the 14th Python in Science Conference", pp. 130-136

Hamman, J., M. Rocklin, R. Abernathey (2018): "Pangeo: A Big-data Ecosystem for Scalable Earth System Science", In: "20th EGU General Assembly, EGU2018, Proceedings from the conference held 4-13 April, 2018 in Vienna, Austria", p. 12146