Data Systems at Scale

Presentation: pdf (0.5MB)

This was a presentation given as part of the ESiWACE mid-term review. It describes some of the progress made in one of the work packages, WP4, which is about data systems at scale. Here “at scale” means “for weather and climate simulation on exascale (and pre-exascale) supercomputing”, and also covers downstream analysis systems.

I have an older post linking to a talk describing the work package objectives.

In this presentation I highlighted:

  • our work on ensemble handling: amongst other activities, we have demonstrated “in-flight ensemble data processing” with a large high-resolution ensemble (10 members at global 25 km resolution, using 50K cores);
  • progress with the Earth System Data Middleware (ESDM), which includes a NetCDF interface and some new backends;
  • work by our industry colleagues, Seagate and DDN, both on ESDM backends and on new active storage systems which will be able to do simple manipulations “in storage”; and
  • our work on S3NetCDF, a drop-in extension for netcdf-python suitable for use with object stores.