Hackathon: Porting and Optimization for Hunter

Participation in English or German is possible.

The Hackathon consists of two parts:

  • a preparation phase with an introduction to the programming models, architecture and topology, and a discussion of your project and goals for the hackathon, taking place January 13-15 (online),
  • the actual coding phase, where you will work towards the defined goal (see below) and port your code to Hunter, taking place January 27-31 (hybrid).

HLRS's next supercomputer system, “Hunter”, becomes available in 2025. As with every new system, users have to invest some effort in porting their code and workflow to the new environment. Furthermore, the vast majority of Hunter’s compute power will be delivered by the GPU part of the APU, which requires users to adapt their hot loops in order to offload them to the GPUs. In this hackathon, we will therefore support our users in both tasks.

In order to offload to GPUs, multiple programming models (HIP, OpenMP device offloading, PSTL/do concurrent) are available, depending on the programming language used. In the preparation phase, we will also discuss the pros and cons of those models based on your situation and provide you with information on how to use them. Further, you and the support staff will decide together which goals are suitable for the coding phase and how to proceed.

In the coding phase, you will work towards your defined goals, supported by HLRS user support staff as well as HPE and AMD specialists. We expect that you will be porting your code to the new system, so the main tasks will be building with the respective compilers, linking against the highly optimized numerical libraries provided by the Cray Programming Environment (CPE), and using CPE’s performance analysis tools.

Because many groups use HLRS’ systems while support staff is limited, the number of participants unfortunately has to be limited. In order to use the system as efficiently as possible, we have to focus on groups holding medium and large compute time budgets. We therefore reserve the right to select attendees!

To allow for easy attendance, we provide this workshop in a hybrid fashion. Besides meeting in person at HLRS, we will also set up breakout rooms in a Zoom session, which enable remote participants to communicate, share screens, and remote-control applications together with support staff, thus providing the same options for interaction as meeting in person.

Target audience: Groups holding a compute time budget to be used on Hunter.

Location

This hybrid event will take place online and at
HLRS, University of Stuttgart
Nobelstraße 19
70569 Stuttgart, Germany
Location and nearby accommodations

Start date

Jan 13, 2025
13:00

End date

Jan 31, 2025
15:00

Language

English

Entry level

Advanced

Course subject areas

Performance Optimization & Debugging

Hardware Accelerators

Topics

Code Optimization

MPI

MPI+OpenMP

OpenMP

Prerequisites and content levels

Prerequisites

To participate in the Hackathon, you should already have an account on Hawk/Hunter.

  • Furthermore, please send the following details by email to training@hlrs.de, with Björn Dick (dick(at)hlrs.de) in CC, by the workshop registration deadline:
    • scientific domain you are working in,
    • name of your code including URL (if available),
    • numerical approaches (keep this brief; it should not make up the majority of the details provided),
    • name of the compute time budget used,
    • programming language,
    • programming models used (MPI, OpenMP, etc.),
    • list of dependencies/libraries,
    • known issues with specific compilers,
    • information on porting efforts done so far,
    • I/O requirements (what is written to disk, how often, how many files are generated per I/O event, and the volume of those files),
    • size of your test case in terms of the number of Hawk nodes used as well as the expected runtime.
  • If you run AI codes, please also provide information about:
    • your AI software stack,
    • I/O requirements (e.g., dataset size, number of files, model size, number of checkpoints),
    • the workflow along with its computational requirements (e.g., number of GPUs needed for model training/fine-tuning and hyperparameter tuning).
  • Additionally, we require that you bring your own code including a test case which is set up according to the following rules:
  • use case selection:

    • When processing the test case, your code's behavior and profile should be as close as possible to those of current and future production runs.

    • If possible, the test case should be representative of those production runs of your group which consume the largest part of your compute time budget.

  • number of cores:

    • In order to be representative, the test case should be comparable in size to the respective current and future production runs.

    • In order to save valuable resources and to allow for a productive workflow, it should, however, be as small as possible.

    • So please be prepared to reduce the size of your test case during the workshop! This can often be achieved by reducing the simulated domain or resolution while keeping the computational load per core constant ("weak down scaling").

  • wall time:

    • In order to allow for a productive workflow, the wall time should be a few minutes only.

    • At the same time, it should cover all important parts of the code, i.e. computation, communication and I/O.

    • So consider reducing the number of simulated time steps, and be prepared to increase the I/O frequency so that I/O can still be investigated within such a small number of time steps.
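
As a rough illustration of the sizing rules above, the following Python sketch shrinks a test case while keeping the load per core constant ("weak down scaling") and then shortens the run while raising the I/O frequency. It is only a planning aid under stated assumptions: the `CaseSpec` fields, the function names, and all numbers are hypothetical placeholders, not values from any real code and not recommendations by HLRS.

```python
from dataclasses import dataclass


@dataclass
class CaseSpec:
    """Hypothetical description of a run; adapt the fields to your own code."""
    cells: int        # total number of grid cells (stand-in for problem size)
    cores: int        # number of cores used
    steps: int        # number of simulated time steps
    io_interval: int  # output is written every io_interval steps


def weak_down_scale(case: CaseSpec, target_cores: int) -> CaseSpec:
    """Shrink the domain so that the number of cells per core stays constant."""
    if not 1 <= target_cores <= case.cores:
        raise ValueError("target_cores must be between 1 and case.cores")
    cells_per_core = case.cells / case.cores
    return CaseSpec(
        cells=round(cells_per_core * target_cores),
        cores=target_cores,
        steps=case.steps,
        io_interval=case.io_interval,
    )


def shorten_run(case: CaseSpec, target_steps: int, min_io_events: int = 2) -> CaseSpec:
    """Reduce the number of time steps, but keep at least min_io_events writes."""
    if target_steps < min_io_events:
        raise ValueError("target_steps must be >= min_io_events")
    # Write more often if the old interval would yield too few I/O events.
    io_interval = min(case.io_interval, max(1, target_steps // min_io_events))
    return CaseSpec(case.cells, case.cores, target_steps, io_interval)


if __name__ == "__main__":
    # Hypothetical production run, for illustration only.
    production = CaseSpec(cells=512_000_000, cores=8192, steps=100_000, io_interval=5_000)
    small = shorten_run(weak_down_scale(production, target_cores=512), target_steps=200)
    print(small)
    print("cells per core:", small.cells // small.cores)
```

Whether a few hundred time steps actually fit into a wall time of a few minutes depends on your code; the sketch only keeps the ratios consistent and does not predict runtime.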

If you are unsure about how to set up your test case, please contact Björn Dick (see contact data below).
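
To help answer the I/O questions in the list of details above (number of files and volume written), one option is to run the test case once and inspect its output directory. The sketch below assumes that all output of a run ends up below a single directory; the function name `io_footprint` and the example path are hypothetical.

```python
import os
from typing import Tuple


def io_footprint(output_dir: str) -> Tuple[int, int]:
    """Return (number of files, total size in bytes) found below output_dir."""
    n_files = 0
    n_bytes = 0
    for root, _dirs, files in os.walk(output_dir):
        for name in files:
            path = os.path.join(root, name)
            # Count regular files only; skip symbolic links to avoid double counting.
            if os.path.isfile(path) and not os.path.islink(path):
                n_files += 1
                n_bytes += os.path.getsize(path)
    return n_files, n_bytes


if __name__ == "__main__":
    # "./output" is a placeholder; point this at your own run directory.
    count, size = io_footprint("./output")
    print(f"{count} files, {size / 1e9:.3f} GB")
```

Dividing both numbers by the number of I/O events in the run gives an estimate of the files and volume per event.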

In general, the language of instruction is German, but it can be changed to English if required.

Content levels
  • Advanced level: 30 hours

Learn more about course curricula and content levels.

Instructors

HLRS, HPE and AMD user support staff

Learning outcomes

  • Knowledge of and practical experience with deploying your code in CPE
  • Overview of the programming models available for AMD APUs and how to use them
  • Knowledge of how to optimize your code for the APU and which performance metrics are important

Handout

Handouts will be available to participants as PDFs.

Hybrid course

This course will be hybrid, i.e. it will take place on-site at HLRS, but it will also be possible to attend online. Participants, online as well as on-site, have to be aware and agree that they might appear in the live video stream taken by a camera in the back of the lecture room or by a webcam on laptops. The live stream will not be saved. We strongly recommend attending this course on-site, since on-site attendance is much more effective and efficient in our experience. Therefore, we might give priority to on-site over online participants during registration.

Dates

The Hackathon consists of two parts:

  • a preparation phase with an introduction to the programming models, architecture and topology, and a discussion of your project and goals for the hackathon, taking place January 13-15,
  • the actual coding phase, where you will work towards your goal and optimize your code for Hunter, taking place January 27-31.

The first phase will be a virtual event. The second phase will be offered as a hybrid event. For the hybrid event, please select the days on which you plan to be on-site at HLRS.

We expect you to participate in both phases, in the first at least to discuss your project and goals.

Agenda

- preliminary -

Local times: Central European Time Zone (Berlin).
Communication format: Face-to-face, via Slack, Zoom, and email.

Preparation Phase

January 13, 13-17 h
  • Theory on GPU

  • Programming Models

  • CPE Basics

  • OpenMP Offloading – Part 1: Basics

January 14, 9-17 h
  • MI300A Architecture

  • OpenMP Offloading – Part 2: More OpenMP Topics

  • Memory Allocation and Memory Pools

  • MPI(CH) for GPUs

  • Profiling Tools (basics)

January 15, 9-13 h
  • Tutor assignment

  • Development of agenda for your code

  • Develop to-dos (Compile & Profile)

  • Setup of test case together with tutors

  • Agree on weekly checkpoints with tutors until the hackathon starts

Coding Phase

The ON-SITE/ONLINE workshop starts on Monday with

  • 8:30 Local registration/drop in to Zoom

The workshop ends on Friday, at 17:00 at the latest.

Daily agenda:

  • 9:00 - 17:30 Workshop ON-SITE/ONLINE
  • Working on your code and workflow together with HLRS/HPE/AMD support staff

Registration information

Register via the button at the top of this page.

Please submit information about your code (project name, code name, URL, and a short description) by the workshop registration deadline via email to training@hlrs.de. You can book individual days according to your project needs; attending the full workshop is not necessary in this case.

Registration closes on Friday, January 10, 2025. Late registrations after this date are still possible, subject to course capacity.

Fees

  • Students without master’s degree or equivalent at a German university: 0 Euro
  • PhD students or employees at a German university or public research institute: 0 Euro

Further fee categories can be found on the registration page.

Our course fees include coffee breaks (in classroom courses only).

Contact

Björn Dick phone 0711 - 685 87189, bjoern.dick(at)hlrs.de
Tobias Haas phone 0711 685 87223, training(at)hlrs.de

HLRS Training Collaborations in HPC

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.

Further courses

See the training overview and the Supercomputing Academy pages.
