An invited talk to the UK Turbulence Consortium which aimed to describe some of the activities the climate science community has undertaken, and is undertaking, to deal with the data deluge.

Presentation: pdf (13 MB)

This is a talk in three parts, covering some motivation for investing in data management, some new developments underway to deal with high volume data, and a reminder of the importance of the FAIR principles (Findable, Accessible, Interoperable, Reusable).

Part one aims to introduce other scientific communities to why and how climate science uses data management to deliver model intercomparison. Using examples of how the IPCC process uses CMIP data, it walks through some of the technology underpinning CMIP6 (the CF conventions, data analysis stacks, ESGF, etc.) and describes some of the (especially European) data delivery infrastructure. There is a good deal of motivation for why we do model intercomparison, and how it both delivers and supports projections using scenarios. Examples are shown of how high-level software stacks remove labour from scientists.

Part two describes some of the work being done in our ExCALIBUR cross-cutting data projects, targeting those pieces most relevant to the meeting's attendees: standards-based aggregation, active storage, and support for in-flight ensemble analysis.

Part three concludes the talk with a short discussion of how the FAIR principles apply to simulation data.