HLRS High Performance Computing Center Stuttgart: Introduction to oneAPI, SYCL2020 and OpenMP offloading

Prerequisites and content levels

Good knowledge of any of C, C++ or Fortran and familiarity with standard OpenMP programming are sufficient for the OpenMP part. For Data Parallel C++/SYCL, knowledge of C++11 or later is recommended (C++17 greatly facilitates SYCL 2020 programming).

Content levels

Basic: 6:35 hours
Intermediate: 4:25 hours

Learn more about course curricula and content levels.

Instructors

Intel staff.

Learning outcomes

After this course, participants will:

be familiar with the oneAPI programming model,
have an overview of DPC++/SYCL programming,
have gained knowledge of fundamental OpenMP offloading,
have an overview of the oneAPI libraries (oneMKL, ...),
have basic knowledge of profiling and performance analysis,
have basic knowledge of (dynamic) debugging of programs that use the oneAPI programming model,
be aware of how Intel's compatibility tool can help to migrate CUDA code to SYCL.

Agenda

The preliminary agenda is as follows. All times are CEST.

Day 1
Start	End	Topic
08:45	09:00	Drop in to Zoom
09:00	09:10	Welcome and introduction to Day 1
09:10	09:30	oneAPI, an introduction to a mixed-architecture development environment: motivation and oneAPI standardization; Intel's oneAPI toolkits portfolio and components; Intel oneAPI plug-ins for NVIDIA and AMD hardware (CPUs and GPUs)
09:30	10:20	Direct programming with oneAPI compilers (Part 1), with demos: introduction to the heterogeneous programming model with SYCL 2020; SYCL features and examples ("Hello World" example, device selection, execution model)
10:20	10:25	Break
10:25	11:15	Direct programming with oneAPI compilers (Part 2), with demos: compilation and execution flow; memory model (buffers, Unified Shared Memory (USM)); performance optimizations with SYCL features
11:15	11:20	Break
11:20	12:20	oneAPI case study: GROMACS
12:20	12:25	Break
12:25	12:45	Introduction to the DevCloud/IDC: a sandbox for software development and benchmarking; purpose (demoing, testing and porting applications); hardware and software offerings; how to onboard and how to get a DevCloud/IDC account
12:45	13:05	Instructions for the lab exercises (direct programming with SYCL using Intel oneAPI compilers)

Day 2
Start	End	Topic
08:45	09:00	Drop in to Zoom
09:00	09:05	Welcome and introduction to Day 2
09:05	09:55	Intel OpenMP offloading for Fortran, with demos: parallelizing heterogeneous applications with OpenMP 5.2
09:55	10:00	Break
10:00	10:45	Intel oneAPI libraries (oneMKL) for HPC, with demos: performance-optimized libraries for numerical simulations and other purposes
10:45	10:50	Break
10:50	11:30	Targeting NVIDIA and AMD hardware with oneAPI and SYCL: using the SYCL-based NVIDIA and AMD plug-ins, with demos
11:30	12:00	Open-source compatibility tool for porting (SYCLomatic), with demo: migrating CUDA-based GPU applications to SYCL
12:00	12:05	Break
12:05	12:30	Intel debugging tools for heterogeneous programming (CPU, GPU), with demos
12:30	13:00	Programming for distributed HPC systems using Intel MPI

Day 3
Start	End	Topic
08:45	09:00	Drop in to Zoom
09:00	09:05	Welcome and introduction to Day 3
09:05	10:05	Application profiling for CPU and/or mixed hardware with Intel VTune Profiler, with demos: VTune main functionality (hotspot analysis, ...) starting with CPU; Profiling Tools Interfaces for GPU; profiling heterogeneous SYCL/OpenMP workloads with Intel VTune Profiler
10:05	10:10	Break
10:10	11:20	Application profiling for CPU and/or mixed hardware with Intel VTune Profiler, with demos: VTune main functionality (hotspot analysis, ...) starting with CPU; Profiling Tools Interfaces for GPU; profiling heterogeneous SYCL/OpenMP workloads with Intel VTune Profiler
11:20	11:25	Break
11:25	12:35	Application profiling for CPU and mixed hardware with Intel Advisor, with demos: Advisor's main functionality (vectorization and roofline) starting with CPU; estimating potential performance gains with Offload Advisor (CPU -> hardware accelerator); analysing heterogeneous SYCL/OpenMP workloads with Intel Advisor and roofline analysis
12:35	12:45	Questions and answers, wrap-up

Exercises

The morning lectures consist of demonstrations only. However, we will also show how to access Intel's DevCloud, where participants can explore and work through the examples on their own in the afternoon. Additionally, Intel will offer support to a limited number of participants.

Registration information

Register via the button at the top of this page.
We encourage you to register for the waiting list if the course is full. Places might become available.

Please be aware that the Zoom session will be recorded. By registering, you declare that you are aware of and consent to the recording.

Registration closes on August 30, 2023 (extended registration phase).

Late registrations after that date are still possible, subject to course capacity.

Fees

This course is free of charge.

Our course fee includes coffee breaks (in classroom courses only).

Contact

Tobias Haas phone 0711 685 87223, tobias.haas(at)hlrs.de
Maksym Deliyergiyev phone 0711 685 87261, maksym.deliyergiyev(at)hlrs.de

HLRS training collaborations in HPC and AI

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC. Since 2025, HLRS has coordinated HammerHAI.

This course is provided within the framework of EuroCC2.

Further courses and training team

See the training overview and the Supercomputing Academy pages.
See also information about the HLRS training department and staff.
