The Poseidon Project is built on the following premises regarding cyberinfrastructure for the geosciences, and computational ocean/atmosphere/climate dynamics in particular:
- Computational Oceanography has come of age. Computational Oceanography, the study of ocean processes by numerical simulation, is a new branch of marine science. Numerical simulations of the ocean circulation have grown increasingly realistic as compute power has grown exponentially over the last few decades. In comparison, the observational database has grown more slowly. Thus, oceanographers are approaching the point at which ocean circulation models can exactly match the data in an infinite number of ways. Such ocean circulation simulations must be treated as seriously as real observations.
- Interactive analyses of these petascale (and, in the future, exascale) ocean simulations must be widely available. Seamless access to ocean simulations has not kept pace with the increase in model resolution. Although the simulation output is in principle available to anyone, severe barriers exist to using the data in practice. To exploit simulation output, it’s imperative that the model data are accessed and disseminated as widely as possible, including to casual, non-expert, and non-professional users. The simulation output must be “democratized” by making it at least as easy to use as oceanographic observations.
- Exascale Computational Science is on the horizon, and ocean/atmosphere/climate dynamics constitute a vanguard use-case for computational innovation. Ocean model simulations will migrate to exascale compute resources in the foreseeable future. Moreover, this use-case is built on a generic vision for developing cyberinfrastructure that applies to many other fields in science and engineering. A useful analogy is the Large Hadron Collider (LHC), the world’s most sophisticated experimental facility. The LHC provides a single source of data on subatomic particle collisions, with several experiments tapping the stream. Within each experiment, customized hardware triggers filter the data, retaining only about one event in ten million for storage and detailed analysis. The analogous idea in exascale oceanography is the automatic identification of simulated circulation events, for example at fixed points in space and along streamlines following the flow, together with their origin and fate.
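
To make the trigger analogy concrete, here is a minimal sketch of event identification at a single fixed point. It is illustrative only and not taken from the Poseidon software: the names `Event` and `detect_events`, the threshold rule, and the sample values are all assumptions. A real trigger would run on streaming model output and use a physically motivated criterion.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Event:
    """A contiguous run of samples above the threshold (hypothetical type)."""
    start: int    # index of the first sample above the threshold
    end: int      # index of the last sample above the threshold (inclusive)
    peak: float   # largest value observed during the run


def detect_events(series: Iterable[float], threshold: float,
                  min_length: int = 1) -> List[Event]:
    """Keep only runs where the signal exceeds `threshold` for at least
    `min_length` consecutive samples; everything else is discarded."""
    values = list(series)
    events: List[Event] = []
    start = None
    peak = float("-inf")
    for i, value in enumerate(values):
        if value > threshold:
            if start is None:
                # A new run begins at this sample.
                start, peak = i, value
            else:
                peak = max(peak, value)
        elif start is not None:
            # The run ended at the previous sample.
            if i - start >= min_length:
                events.append(Event(start, i - 1, peak))
            start = None
    # Close a run that is still open at the end of the series.
    if start is not None and len(values) - start >= min_length:
        events.append(Event(start, len(values) - 1, peak))
    return events


# Made-up current speeds (m/s) at one fixed location, for illustration only.
speeds = [0.1, 0.2, 0.9, 1.1, 0.3, 0.2, 0.8, 0.1]
print(detect_events(speeds, threshold=0.5, min_length=2))
# → [Event(start=2, end=3, peak=1.1)]
```

The single-sample excursion at index 6 is dropped because it is shorter than `min_length`. Retaining only the flagged intervals, rather than the full series, is the sense in which such a filter resembles a trigger.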
The Poseidon Project team aims to deliver:
- A benchmark, accurate global ocean circulation solution at O(1) km horizontal resolution.
- Open-source software tools enabling efficient storage, indexing, and analysis of petabyte-scale ocean/atmosphere/climate datasets.
- A Data Access Portal that deploys these tools, together with custom-built high-performance storage and computing resources, to provide scalable interactive analyses and visualizations of the benchmark solution to the climate and computer science communities.
- Explorations of machine learning frameworks for automated identification of important events and for data compression, with working prototypes for use in coupled climate models.
- A foundation and a path to migrate computational oceanography to exascale.
- A fully functioning instance of the sort of cyberinfrastructure that will increasingly be needed by next-generation simulation software in the geosciences and beyond.
The Poseidon Project makes extensive use of the Pangeo big-data tools.