HLRS's extended user support program goes above and beyond our basic support and mentorship offerings to facilitate close cooperation between our scientific users and HLRS experts in high-performance computing.
While system users bring a deep understanding of the scientific problems they investigate and the codes needed to study them, HLRS user support staff understand exactly how our HPC systems are constructed and configured. Bringing both together ensures that our users' software is written to utilize the systems' parallelized architectures more efficiently.
This multidisciplinary, collaborative approach can both enhance our users' scientific capabilities and enable more efficient and sustainable operation of HLRS's computing resources. HLRS's extended user support is primarily driven by the SiVeGCS project and focuses on these key activities:
HLRS staff review the most computationally intensive sections of our system users' code with the help of profiling methods that assess key performance features, including POP metrics for parallel efficiency as well as input/output (I/O) performance and node-level performance.
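To illustrate what POP metrics for parallel efficiency measure, the following is a minimal sketch based on the commonly published POP definitions, in which parallel efficiency is the product of load balance and communication efficiency. It is not a description of HLRS's actual tooling: the function and class names are hypothetical, and the timings in the usage note are made-up illustrative values that would normally come from a profiling tool.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PopEfficiency:
    """Top-level POP-style efficiency factors, each in the range (0, 1]."""
    load_balance: float
    communication_efficiency: float
    parallel_efficiency: float


def pop_efficiency(useful_seconds: Sequence[float],
                   runtime_seconds: float) -> PopEfficiency:
    """Compute POP-style efficiencies from per-process timings.

    useful_seconds: time each process spent in useful computation
                    (outside communication and waiting); assumed to be
                    measured by a profiler.
    runtime_seconds: total wall-clock runtime of the parallel region.
    """
    if not useful_seconds:
        raise ValueError("need at least one per-process timing")
    slowest = max(useful_seconds)
    if slowest <= 0 or runtime_seconds <= 0:
        raise ValueError("timings must be positive")

    average = sum(useful_seconds) / len(useful_seconds)

    # Load balance: how evenly useful work is spread across processes.
    load_balance = average / slowest
    # Communication efficiency: share of the runtime that the busiest
    # process spends computing rather than communicating or waiting.
    communication_efficiency = slowest / runtime_seconds
    # Parallel efficiency is the product of the two factors above.
    parallel_efficiency = load_balance * communication_efficiency

    return PopEfficiency(load_balance, communication_efficiency,
                         parallel_efficiency)
```

With illustrative timings of 8, 6, 6, and 4 seconds of useful computation on four processes and a 10-second runtime, `pop_efficiency([8.0, 6.0, 6.0, 4.0], 10.0)` gives a load balance of 0.75, a communication efficiency of 0.80, and a parallel efficiency of 0.60. In this made-up case, uneven work distribution would be the first thing to investigate.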
Twice each year, HLRS holds performance optimization and scaling workshops. In these meetings, users spend a full week working side-by-side with optimization specialists at HLRS to develop and implement code changes that lead to performance improvements. For some users, the performance workshops serve as a gateway to ongoing Level 2 and Level 3 Support.
In Level 2 Support, HLRS user support staff work with system users over the span of weeks to a few months to improve existing codes with known methods. Here, the emphasis is on improving code performance and hence efficiency while the algorithm remains basically the same.
In Level 3 Support, HLRS experts collaborate with system users over several months to develop entirely new methods. This can include making modifications to the algorithm to allow for more efficient usage of compute resources and to optimize node-level, I/O, and parallel performance.
In Level 4 Support, HLRS staff work closely together with users on dedicated, longer-term projects that address new kinds of scientific and technological challenges. This can include, for example, developing new computational tools (including entire application codes), programming models, workflows, or databases to support specific user communities. In this way, we enable advances in our users' respective application domains, addressing their evolving interests and requirements.
In some cases, collaboration that occurs during our extended user support activities can lead to scientific publications authored by system users together with HLRS extended support staff members. Such publications can focus, for example, on the development of new methods or the deployment of known methods to improve efficiency, including an assessment of their respective pros and cons.
HLRS's extended user support program is an essential part of our efforts to ensure that our supercomputing infrastructure provides the greatest possible benefit for science and society. Considering the energy and resource requirements involved in building and operating a world-class Tier 0/1 system, optimizing the performance of users' computationally intensive codes is also necessary to ensure that HLRS uses these resources responsibly and can continue providing state-of-the-art HPC capabilities in the future. Specifically, performance optimization resulting from our extended user support program has the following benefits:
When algorithms and codes are optimized to take full advantage of the speed offered by our computing systems' architectures, our users' research can proceed more quickly. This can provide competitive advantages in rapidly developing scientific or technological fields.
As scientists adapt and scale their algorithms to run efficiently on new and ever larger computing systems, they gain the ability to study more complex problems in ways that previously would have been computationally impractical.
Ensuring that individual user jobs use as little computing time and as few resources as possible means that HLRS can better address growing demand for high-performance computing, making its limited resources available to the greatest possible number of user research projects. For individual users, more efficient system usage also has the benefit of producing more scientific results from computing time allocations.
When user codes utilize parallel computing architectures more efficiently, they require less computing hardware and use less energy. For this reason, HLRS's extended user support program is an essential component of the center's sustainability strategy.