Coherent File Format Accelerates Time-to-Solution with OpenFOAM

Visualization of fluid flow in a cavity, driven by the movement of the lid. Scientists at HLRS and WIKKI GmbH used this benchmark simulation problem to test a new approach that will simplify data management in OpenFOAM. Image: HLRS

In a large-scale simulation using more than 500,000 cores on HLRS’s supercomputer Hawk, scientists successfully tested a new approach that will slash workflow times for users of the popular CFD software.

Scientists at the High-Performance Computing Center Stuttgart (HLRS), working in collaboration with software developers at WIKKI GmbH in the project exaFOAM, have successfully tested a new method that dramatically improves the usability of OpenFOAM applications on high-performance computing (HPC) systems. Based on a concept called the coherent file format, the approach simplifies data management for HPC file systems during simulations, and eliminates time-intensive steps such as data pre- and post-processing that slow down ordinary OpenFOAM simulation workflows.

Using HLRS’s supercomputer Hawk, members of the exaFOAM team successfully completed a benchmark computational fluid dynamics (CFD) simulation in which OpenFOAM scaled efficiently with respect to input/output (I/O) performance to 4,096 nodes (524,288 CPU cores). This number doubles the size of a test run that HLRS completed in August 2023, and is more than four times the size of the previous scaling record for OpenFOAM.

For users of OpenFOAM — a free, open source software framework for CFD simulations that is widely used in fields such as automotive engineering, manufacturing, and the energy industry — the coherent file format promises to make simulation workflows run much faster, including on pre-exascale and exascale systems. By reducing the programming complexity required for typical OpenFOAM workflows, it will also make simulation more accessible to users in academia and industry who rely on more limited computing resources. OpenFOAM users can find the code at https://code.hlrs.de/exaFOAM.

Coherent file format eliminates bottlenecks in OpenFOAM workflows

As supercomputers grow toward the exascale, scientists would like to use OpenFOAM to carry out ever larger simulations. However, increases in computing power have not necessarily translated into quicker results. Having access to more processors might make it possible to run higher numbers of computational processes in parallel, but data pre-processing, post-processing, and I/O remain rate-limiting steps in typical simulation workflows. Even today, the time it takes for data pre- and post-processing in large simulations in OpenFOAM can be multiple times longer than the time needed to run the actual calculations, and the problem is getting worse as supercomputers grow in size.

Ordinarily, a user of OpenFOAM must first create a computational mesh representing the physical system. The mesh is then subdivided into domains, each of which is assigned to a processor core. Although the calculations run in parallel, they produce a large number of files that must be written, read, and processed serially. Because OpenFOAM normally uses a fragmented data layout, orchestrating this process is extremely time-consuming, especially in the pre- and post-processing steps, which are not parallelized. Once a simulation is complete, the fragmented data can also make it difficult to find and revisit specific results, as the data is spread across a number of files that grows with the level of parallelism. Within a high-performance computing center, this affects not only individual users but can also drag down the overall performance of an HPC file system, reducing its productivity.
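A rough estimate shows why a fragmented layout scales poorly. The sketch below assumes that every process writes one file per field at every stored timestep; the function name, the field count, and the timestep count are hypothetical illustration values, not figures from the benchmark.

```python
def fragmented_file_count(n_ranks: int, n_fields: int, n_timesteps: int) -> int:
    """Files written if every rank stores every field in its own file per timestep."""
    return n_ranks * n_fields * n_timesteps


# Hypothetical workload: 5 fields stored at 10 timesteps.
for n_ranks in (1_024, 131_072, 524_288):
    print(n_ranks, fragmented_file_count(n_ranks, n_fields=5, n_timesteps=10))
```

Under these assumptions the file count grows linearly with the number of processes, which is the growth the paragraph above describes.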

Initially conceived by Dr. Henrik Rusche, CEO of WIKKI GmbH and one of OpenFOAM’s lead developers, the coherent file format takes a different approach to simplify this process. As in other OpenFOAM formats, the cells of a simulation grid are still organized using an owner–neighbor logic, but a strict sorting of the mesh layout yields a global representation of the mesh that can be sliced.
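The idea of a sorted, sliceable owner–neighbor layout can be illustrated with a toy example. This is a minimal sketch and not the exaFOAM implementation: it assumes only that the global list of face owners is sorted in ascending order, so that the faces belonging to a contiguous range of cells also form a contiguous range. The function face_slice and the six-cell mesh are hypothetical.

```python
from bisect import bisect_left


def face_slice(owner: list[int], cell_start: int, cell_stop: int) -> slice:
    """Contiguous range of faces whose owner cell lies in [cell_start, cell_stop).

    Relies on `owner` being sorted in ascending order.
    """
    lo = bisect_left(owner, cell_start)
    hi = bisect_left(owner, cell_stop)
    return slice(lo, hi)


# Toy mesh: 6 cells in a row; internal face i joins cell i (owner) and cell i + 1 (neighbor).
owner = [0, 1, 2, 3, 4]
neighbor = [1, 2, 3, 4, 5]

# Two hypothetical processes own the cell ranges [0, 3) and [3, 6).
for start, stop in [(0, 3), (3, 6)]:
    s = face_slice(owner, start, stop)
    print(f"cells [{start}, {stop}):", list(zip(owner[s], neighbor[s])))
```

Because the owner list is sorted, each process can locate its part of the global mesh with two binary searches instead of scanning or rebuilding per-process files.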

Additionally, the processes involved in a large-scale simulation are subdivided into processor groups. Each group contains a single data-aggregating process that has a fixed relationship to the data of the other processes within the group and is responsible for their input/output operations. In this way, the coherent file format transforms a large, fragmented collection of data distributed across processors into a smaller, more contiguous, and more meaningful set of “chunks” that are easier for the file system to manage.
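The grouping described above can be sketched as a simple mapping from process ranks to aggregators. The article does not say how groups are formed, so the rule used here (fixed group size, first rank of each group acts as aggregator) and the names aggregator_of and groups are assumptions made for illustration.

```python
def aggregator_of(rank: int, group_size: int) -> int:
    """Rank of the aggregating process for `rank` (assumed: first rank of its group)."""
    return (rank // group_size) * group_size


def groups(n_ranks: int, group_size: int) -> dict[int, list[int]]:
    """Map each aggregator rank to the ranks whose I/O it handles."""
    result: dict[int, list[int]] = {}
    for rank in range(n_ranks):
        result.setdefault(aggregator_of(rank, group_size), []).append(rank)
    return result


# Hypothetical run with 8 processes in groups of 4.
print(groups(n_ranks=8, group_size=4))
```

With this rule only one process in every group touches the file system, so the number of writers is the number of groups rather than the number of processes.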

Organizing data in this way drastically reduces the number of files that a file system must manage. The coherent file format writes a small set of ASCII data files and a single binary file for each timestep. The ASCII files serve as metadata for managing data structure and boundary conditions. This approach also avoids the time-consuming step of reconstructing the results of individual domains after the simulation is complete, as the structure of the mesh is already reflected in the coherent data structure in a global file layout. In practice, this means that whereas data pre-processing might take as much as a week in a typical large-scale simulation, researchers using the new method could complete it in just a couple of hours.
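One common way to store many chunks in a single binary file is to record where each chunk starts. The sketch below illustrates that general technique under stated assumptions; the article does not specify the actual on-disk layout of the coherent file format. An in-memory bytes object stands in for the binary file, values are stored as little-endian 64-bit floats, and the names write_coherent and read_chunk are hypothetical.

```python
import struct
from itertools import accumulate


def write_coherent(chunks: list[list[float]]) -> tuple[bytes, list[int]]:
    """Concatenate per-group chunks into one buffer and return it with element offsets."""
    sizes = [len(chunk) for chunk in chunks]
    offsets = [0, *accumulate(sizes)]  # start of each chunk, plus the total at the end
    flat = [value for chunk in chunks for value in chunk]
    return struct.pack(f"<{len(flat)}d", *flat), offsets


def read_chunk(buffer: bytes, offsets: list[int], i: int) -> list[float]:
    """Read chunk i directly from its byte offset, without parsing the other chunks."""
    start, stop = offsets[i], offsets[i + 1]
    return list(struct.unpack_from(f"<{stop - start}d", buffer, start * 8))


# Three hypothetical groups contribute chunks of different sizes.
chunks = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
buffer, offsets = write_coherent(chunks)
print(offsets)
print(read_chunk(buffer, offsets, 2))
```

Here the offset list plays the role of metadata: any chunk can be read back from the single buffer without a separate reconstruction step.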

This achievement could also offer other benefits. At HLRS, for example, HPC users receive fixed quotas for the size of the simulations they can run and the number of files they can store. Reducing the number of files their algorithms produce could therefore make it possible to get more results out of these allotments, perhaps even including running larger simulations. At the level of system management, this approach will also help HLRS to optimize the allocation of its limited computing resources, providing the greatest possible scientific benefit for its entire user community.

For OpenFOAM programmers who only have access to smaller file management systems — in industry, for example — this approach could also make larger simulations more feasible using their companies’ internal computing resources.

A new world record in OpenFOAM scalability

Within the exaFOAM project, Dr. Gregor Weiß of HLRS’s Department of Numerical Methods and Libraries has been collaborating with Dr. Sergey Lesnik of WIKKI GmbH to develop the coherent file format for HPC systems. In September 2023 they tested the approach on a benchmark simulation of the well-known lid-driven cavity case. Although the test application is not especially interesting from a scientific perspective, the researchers’ I/O implementation is easily adaptable for other applications.

The simulation was prepared with 512 million grid cells, which took less than two hours to preprocess in the coherent file format. Extrapolations showed that the same workflow would have taken more than seven days using one of OpenFOAM’s conventional file formats. The previous record was set by scientists at the RIKEN Center for Computational Science in Kobe, Japan, who ran OpenFOAM at high performance on approximately 110,000 processors.
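The figures quoted in this article imply some rough ratios, worked out below. The variable names are illustrative. The speedup is only a lower bound, because the article gives "less than two hours" and "more than seven days" rather than exact timings, and the comparison with the previous record uses the approximate figure of 110,000.

```python
coherent_hours_max = 2            # "less than two hours"
conventional_hours_min = 7 * 24   # "more than seven days"
speedup_lower_bound = conventional_hours_min / coherent_hours_max

nodes, cores = 4_096, 524_288     # size of the Hawk benchmark run
cores_per_node = cores // nodes
ratio_to_previous_record = cores / 110_000  # previous record: approximately 110,000

print(f"pre-processing speedup: at least {speedup_lower_bound:.0f}x")
print(f"cores per node: {cores_per_node}")
print(f"size relative to previous record: about {ratio_to_previous_record:.1f}x")
```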

The developers of the coherent data method have made the code available on a free, open source basis to the global OpenFOAM community at the following link: https://code.hlrs.de/exaFOAM. Integration of the method into future releases of production branches of OpenFOAM is ongoing and is being supported by the developers of the coherent file format. They are also planning scientific publications describing the logic behind their methods, technical details about the benchmarking process, and best practices for users.

In addition, the results will support ongoing research within the exaFOAM project. exaFOAM’s goal is to improve OpenFOAM algorithms so that they are capable of running at high performance on upcoming massively parallel computing systems. Using the coherent data approach, the project team aims to address a grand challenge involving a combustion simulation with 500 million grid points, a size that is similar to that of the most recent benchmark run at HLRS. Because of the massive data management involved, pre-processing such a large simulation on Hawk can easily take more than 24 hours. The coherent file format should enable major speedup gains, demonstrating the utility of the method for improving time-to-solution in large-scale, real-world challenges in science and engineering.

— Christopher Williams

This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 956416. The JU receives support from the European Union’s Horizon 2020 research and innovation program and from France, Germany, Spain, Italy, Croatia, Greece, and Portugal.

Funding for Hawk was provided by the Baden-Württemberg Ministry for Science, Research, and the Arts and the German Federal Ministry of Education and Research through the Gauss Centre for Supercomputing (GCS).

Related publication

Weiß RG, Rusche H, Lesnik S, Galeazzo FCC, Ruopp A. Preprint. Coherent mesh representation for parallel I/O of unstructured polyhedral meshes. J Supercomputing.