From Machine Learning to Deep Learning: A concise introduction


This HLRS course addresses students, data scientists, and researchers who would like an introduction to Machine Learning and Deep Learning methods for solving challenging and future-oriented problems. The course presents Machine Learning and Deep Learning methods and examples, as well as a method for data compression. The examples are worked through in hands-on sessions on an HLRS cluster. Please be aware, however, that this course is not a beginner-to-advanced lecture series on the theoretical aspects of AI.

The first part will be an introduction to basic methods in Machine Learning, including pre-processing and supervised learning using Apache Spark. The course will then move on to elements of supervised Deep Learning on real data to classify annotated images of waste in the wild. Given the deluge of information needed to power machine and deep learning methods, it is imperative to think about effective data processing strategies. Therefore, the course will conclude with an introduction to data compression using the BigWhoop library (developed within EXCELLERAT P2). As an efficient data reduction tool, BigWhoop can be applied to generic numerical datasets to minimize I/O bottlenecks and optimize data storage. The lectures are interleaved with many hands-on sessions using Jupyter Notebooks and scripts on HLRS systems.

In addition, a guest lecture from the IAG will show how Deep Learning can be applied to problems in computational fluid dynamics.

New in 2024, the course will contain a lecture from the philosophy department at HLRS.


Location

HLRS, University of Stuttgart
Nobelstraße 19
70569 Stuttgart, Germany
Room 0.439 / Rühle Saal
Location and nearby accommodations

Start date

Mar 26, 2024
08:30

End date

Mar 28, 2024
16:45

Language

English

Entry level

Basic

Course subject areas

Data in HPC / Deep Learning / Machine Learning

Topics

Artificial Intelligence

Big Data

Deep Learning

Machine Learning

Scientific Machine Learning


Prerequisites and content levels

Prerequisites
  • Familiarity with Linux operating systems, including Linux shell (some parts of the training will use a cluster).
  • Technical background and basic understanding of machine learning concepts will be helpful.
  • Some prior experience with Python is required. The following tutorial can be used to learn the syntax.
  • For the second day, familiarity with TensorFlow will be a plus, as all hands-on sessions will use TensorFlow. For those who would like to try TensorFlow in advance, the TensorFlow tutorial will be a great help.
Content levels
  • Basic: 10:30 hours
  • Intermediate: 2:30 hours
  • Advanced: 1 hour
  • Community: 6:30 hours

Learn more about course curricula and content levels.

Learning outcomes

After this course, participants will

  • have a basic understanding of classical Machine Learning and Deep Learning (DL) concepts and methods,
  • have gained practical experience in applying these methods, and
  • know how to use HLRS's systems for certain ML or DL tasks.

Instructors

Nico Formanek, Dr. Khatuna Kakhiani, Patrick Vogler and Dr.-Ing. Lorenzo Zanon (HLRS), and Anna Schwarz (IAG).

Agenda

(preliminary)

08:30 - 09:00  Local registration

Day 1: Focus on Pre-processing, Feature Engineering and Machine Learning (9:00 - 17:30, Dr.-Ing. Lorenzo Zanon)

The first day will be based on the “Stuttgart S-Bahn Example” (originally developed by Dennis Hoppe, HLRS) to provide an introduction to Machine Learning. The focus is on data preparation and on classification and regression algorithms in supervised learning: Can these tools help to improve the travel experience in the Stuttgart S-Bahn, and what are their limits? Apache Spark will be employed for the hands-on sessions, both in Jupyter Notebooks and via scripts run in interactive jobs. Finally, we will also touch upon the visualisation of results.

Day 2: Focus on supervised Deep Learning to classify images of waste in the wild (9:00 - 17:30, Dr. Khatuna Kakhiani)

During this day, participants will explore how Deep Learning can be used to classify waste in the wild. After a brief introduction to Deep Learning and to the basic concepts and building blocks of Deep Neural Networks, participants will learn how to:

  • implement a common deep learning workflow for a classification problem
  • preprocess unstructured data, understand image data, manipulate image instances, visualize results, and work with a small, unbalanced dataset
  • define the neural network architecture (Sigmoid Neural Model, Multilayer Perceptron (MLP), Convolutional Neural Network (CNN)), then compile, train, and evaluate the model
  • classify images with Convolutional Neural Networks (CNNs)
  • visualize results

Upon completion, participants should be able to solve classification problems with CNNs on other custom datasets. TensorFlow will be employed for the hands-on sessions in Jupyter Notebooks.
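As a small illustration of one topic from the list above, working with an unbalanced dataset, the following sketch computes inverse-frequency class weights in plain Python. It is not taken from the course material: the class names and counts are hypothetical, and the weighting formula (number of samples divided by the number of classes times the class count) is one common heuristic among several.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels for a small, unbalanced waste dataset (made up for illustration).
labels = ["plastic"] * 6 + ["glass"] * 3 + ["metal"] * 1

weights = balanced_class_weights(labels)
for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.2f}")
```

Weights of this kind could, for example, be handed to a training routine that supports per-class weighting, such as the `class_weight` argument of Keras `Model.fit`, which expects integer class indices as keys, so the class names would first have to be mapped to indices.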

Day 3:

  • Guest Lecture: Towards Data-Driven Computational Fluid Dynamics (9:00 - 11:30, Anna Schwarz, IAG)
  • Generalization and the problem of leakage (11:45-13:00, Nico Formanek, HLRS)
  • Lunch break: 13:00-14:15
  • Data Compression of numerical data sets with the BigWhoop library (14:15-16:45, Patrick Vogler, HLRS)

On the third day we start with the guest lecture "Towards Data-Driven Computational Fluid Dynamics". It will be given by Anna Schwarz, Institute of Aerodynamics and Gas Dynamics, University of Stuttgart.

We will conclude the day with an introduction to data compression, focusing on the various methods available to us for the efficient size reduction of our training data. Special attention will be paid to which approaches are best suited for different data types and what impact the different approaches and compression rates have on the quality of the datasets. The compression library BigWhoop and its accompanying command line tool will be made available for the hands-on sessions.
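The trade-off between compression rate and data quality mentioned above can be sketched with a toy example. The code below does not use BigWhoop and does not reflect its algorithm or API. As a stand-in, it quantizes a synthetic signal with different step sizes (the lossy part) and then applies zlib from the Python standard library (the lossless part). The signal, the step sizes, and all function names are assumptions made for illustration only.

```python
import math
import struct
import zlib


def compress(values, step):
    """Quantize to multiples of `step`, pack as 32-bit ints, deflate with zlib."""
    ints = [round(v / step) for v in values]      # lossy: rounding to the grid
    packed = struct.pack(f"<{len(ints)}i", *ints)
    return zlib.compress(packed, 9)               # lossless entropy coding


def decompress(blob, step):
    """Inverse of `compress`, up to the quantization error."""
    packed = zlib.decompress(blob)
    ints = struct.unpack(f"<{len(packed) // 4}i", packed)
    return [i * step for i in ints]


# Synthetic smooth signal standing in for a numerical dataset.
signal = [math.sin(0.01 * i) for i in range(10_000)]
raw_size = len(signal) * 8  # size if stored as 64-bit floats

results = {}
for step in (1e-6, 1e-4, 1e-2):
    blob = compress(signal, step)
    restored = decompress(blob, step)
    max_err = max(abs(a - b) for a, b in zip(signal, restored))
    results[step] = (len(blob), max_err)
    print(f"step={step:g}  ratio={raw_size / len(blob):.1f}  max_error={max_err:.2e}")
```

A coarser step shrinks the compressed size but raises the maximum reconstruction error, which is bounded by half the step size. Whether a given error is acceptable depends on the data type and on how the data is used afterwards.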

Exercises

The exercises will be carried out on HLRS's systems using Jupyter Notebooks.

HLRS concept for on-site courses

Besides the content of the training itself, an important aspect of this event is the scientific exchange among the participants. We try to facilitate such communication by

  • a social event on the evening of the first course day,
  • offering common coffee and lunch breaks and
  • working together in groups of two during the exercises.

Registration information

Register via the button at the top of this page.
We encourage you to join the waiting list if the course is full. Places might become available.

Registration closes on March 9, 2024.

Late registrations after that date might still be possible, depending on course capacity.

Fees

  • Students without master’s degree or equivalent: 30 EUR
  • PhD students or employees at a German university or public research institute: 60 EUR
  • PhD students or employees at a university or public research institute in an EU, EU-associated or PRACE country other than Germany: 120 EUR
  • PhD students or employees at a university or public research institute outside of EU, EU-associated or PRACE countries: 240 EUR
  • Other participants, e.g., from industry, other public service providers, or government: 600 EUR

Link to the EU and EU-associated (Horizon Europe), and PRACE countries.

Our course fee includes coffee breaks (in classroom courses only).

HLRS Training Collaborations in HPC

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.

This course is provided within the framework of the bwHPC training program.

EXCELLERAT P2

This course is partly realised in cooperation with the Centre of Excellence EXCELLERAT P2. Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and Germany, Italy, Slovenia, Spain, Sweden, and France under grant agreement No 101092621. See also the EXCELLERAT Service Portal for more information.

CEEC CoE

This course is partly realised in cooperation with the Centre of Excellence CEEC. Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and Sweden, Germany, Spain, Greece, and Denmark under grant agreement No 101093393.

Contact

Tobias Haas, phone 0711 685 87223, tobias.haas(at)hlrs.de

Further courses

See the training overview and the Supercomputing Academy pages.