Introduction to GPU Programming using CUDA

CUDA, the native programming model of NVIDIA GPUs, allows much finer-grained control over parallel execution than higher-level programming models such as OpenMP offloading, which helps to optimize performance.

This course provides an introduction to the programming language CUDA, which is used to write fast numerical algorithms for NVIDIA GPUs. The focus is on the basic usage of the language, the exploitation of the most important features of the device (massively parallel computation, shared memory), and efficient use of the hardware to maximize performance. An overview of the available development tools and some advanced features of the language is also given.

Location

Online course
Organizer: HLRS, University of Stuttgart, Germany

Start date

Nov 04, 2024
08:45

End date

Nov 08, 2024
12:30

Language

English

Entry level

Basic

Course subject areas

Hardware Accelerators

Parallel Programming

Topics

GPU Programming

Prerequisites and content levels

Programming experience in any of C, C++, or Fortran is required. The exercises will use a Linux cluster, so you should have some basic knowledge of how to work with a Linux shell and a text editor in a shell. Possible resources are https://ubuntu.com/tutorials/command-line-for-beginners for the shell and https://opensource.com/article/19/3/getting-started-vim for an editor. Some knowledge of parallel programming is a plus.

Content levels
  • Basic: 9 hours
  • Intermediate: 4 hours
  • Advanced: 2 hours

Learn more about course curricula and content levels.

Instructors

Tobias Haas (HLRS)

Learning outcomes

After this course, participants will:

  • be familiar with the CUDA programming model,
  • have basic knowledge of performance optimization and profiling of CUDA code,
  • have an overview of important CUDA libraries.

Agenda

All times are local times in the Central European time zone (Berlin).

Drop in to the video conference (8:45 - 9:00)

The course will take place from 9:00 to 12:30 each day. More details will be published soon.

Day 1

  • Basics about CUDA
  • Kernel, kernel launch, host/device functions
  • Memory management: device, host, pinned and managed memory
  • Synchronization
  • Error handling

Day 2

  • Profiling and NVTX annotations
  • GPU architecture
  • Performance optimization: memory access patterns and cache
  • Coalesced memory access

Day 3

  • Shared and constant memory
  • Bank conflicts
  • Atomic operations

Day 4

  • Overview of CUDA libraries
  • CUDA streams
  • Introduction to Multi-GPU

Day 5

  • Kernel-level profiling and correctness checking
  • Other programming methods (using base language constructs, pragmas and libraries)
  • Interoperability with OpenMP

Handouts

Each participant will get access to all slides (PDF).

Exercises

Although this is an online course, the exercises will be highly interactive and make use of breakout rooms. Participants will work on HLRS's systems.

Registration information

Register via the button at the top of this page.
We encourage you to register for the waiting list if the course is full. Places might become available.

Fees

  • Students without a master’s degree or equivalent: 27.50 Euro
  • PhD students or employees at a German university or public research institute: 52.50 Euro
  • PhD students or employees at a university or public research institute in an EU, EU-associated or PRACE country other than Germany: 105 Euro
  • PhD students or employees at a university or public research institute outside of EU, EU-associated or PRACE countries: 210 Euro
  • Other participants, e.g., from industry, other public service providers, or government: 510 Euro

Our course fee includes coffee breaks (in classroom courses only).

For lists of EU and EU-associated countries and of PRACE countries, have a look at the Horizon Europe and PRACE websites.

Contact

Lucienne Dettki phone 0711 685 63894, lucienne.dettki(at)hlrs.de
Tobias Haas phone 0711 685 87223, tobias.haas(at)hlrs.de

HLRS Training Collaborations in HPC

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.

This course is provided within the framework of the bwHPC training program.

Further courses

See the training overview and the Supercomputing Academy pages.