From Machine Learning to Deep Learning: a concise introduction

28 April: The course was postponed from May 17-19 to June 28-30, 2021.

This HLRS course addresses students, data scientists, and researchers who would like to have an introduction to Machine and Deep Learning methods to solve challenging and future-oriented problems. Both Machine and Deep Learning methods and examples will be presented, together with their implementation on HLRS systems. The first part will be an introduction to basic methods in Machine Learning, including pre-processing and supervised learning using Apache Spark. The course will then move on to elements of supervised Deep Learning on real data to classify annotated images of waste in the wild. Given the deluge of information needed to power machine and deep learning methods, it is imperative to think about effective data processing strategies. Therefore, the course will conclude with an introduction to data compression using the BigWhoop library (part of the EXCELLERAT Data Exchange and Workflow Portal). As an efficient data reduction tool, BigWhoop can be applied to generic numerical datasets to minimize I/O bottlenecks and optimize data storage. The lectures are interleaved with many hands-on sessions using Jupyter Notebooks and scripts on HLRS systems.

Location

Online course
Organizer: HLRS, University of Stuttgart, Germany

Start date

Jun 28, 2021
08:30

End date

Jun 30, 2021
13:00

Language

English

Entry level

Basic

Topics

Big Data

Deep Learning

Machine Learning

Program

(preliminary)

08:45 - 09:00 every day: drop in to Zoom

Day 1: Focus on Pre-processing, Feature Engineering and Machine Learning (9:00 - 17:30, Dr.-Ing. Lorenzo Zanon)

The first day will be based on the “Stuttgart S-Bahn Example” (originally developed by Dennis Hoppe, HLRS) to provide an introduction to Machine Learning. The focus is on data preparation and on classification and regression algorithms in supervised learning: Can these tools help to improve the travel experience on the Stuttgart S-Bahn, and what are their limits? Apache Spark will be employed for the hands-on sessions, both in Jupyter Notebooks and in interactive jobs using scripts. Finally, we will also touch upon the visualisation of results.

Day 2: Focus on data processing, Model of ANN and supervised Deep Learning to classify images of waste in the wild (9:00 - 17:00, Dr. Khatuna Kakhiani)

On the second day, participants will explore how Deep Learning can be used to classify waste in the wild. After a brief introduction to Deep Learning and to the basic concepts and building blocks of Deep Neural Networks, participants will learn how to:

  • Implement a common deep learning workflow for image classification
  • Process data and experiment with network structure and training parameters
  • Deploy a neural network to classify images
  • Visualize results

Upon completion, participants will be able to solve classification problems with CNNs on other custom datasets. The hands-on training uses Jupyter Notebooks, interactive jobs using scripts, and TensorFlow.
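As a rough illustration of the workflow listed above, the following minimal sketch builds, trains, and applies a small CNN with the TensorFlow Keras API. It is not the course material: the image size, the number of classes, the layer sizes, and the random stand-in data are all assumptions made for illustration, and the helper names (`make_synthetic_data`, `build_model`, `train_and_predict`) are hypothetical.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 3          # assumption: number of waste categories
IMG_SHAPE = (32, 32, 3)  # assumption: image height, width, RGB channels


def make_synthetic_data(n, seed=0):
    """Random stand-in for an annotated image dataset (not real course data)."""
    rng = np.random.default_rng(seed)
    images = rng.integers(0, 256, size=(n, *IMG_SHAPE)).astype("float32")
    labels = rng.integers(0, NUM_CLASSES, size=n)
    return images / 255.0, labels  # scale pixel values to [0, 1]


def build_model():
    """Small CNN: two convolution/pooling stages and a softmax classifier."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=IMG_SHAPE),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


def train_and_predict(n_train=16, n_new=4, epochs=1):
    """Train on synthetic data, then return class probabilities for new images."""
    x_train, y_train = make_synthetic_data(n_train, seed=0)
    x_new, _ = make_synthetic_data(n_new, seed=1)
    model = build_model()
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    model.fit(x_train, y_train, epochs=epochs, batch_size=8, verbose=0)
    return model.predict(x_new, verbose=0)


if __name__ == "__main__":
    probs = train_and_predict()
    print("predicted classes:", probs.argmax(axis=1))
```

With random inputs the predictions carry no meaning; the point is only the sequence of steps: prepare data, define the network, choose training parameters, train, and classify unseen images.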


Day 3:

  • Guest Lecture: Deep Neural Networks for Data-Driven Turbulence Models (9:00 - 10:30, Dr.-Ing. Andrea Beck, IAG)
  • Data Compression of numerical data sets with the BigWhoop library (11:00 - 13:00, Patrick Vogler, HLRS in cooperation with EXCELLERAT)

On the 3rd day we start with a guest lecture about Deep Neural Networks for Data-Driven Turbulence Models, using Deep Learning in Computational Fluid Dynamics. It will be given by Dr.-Ing. Andrea Beck, Institute of Aerodynamics and Gas Dynamics, University of Stuttgart. The preliminary abstract from 2020 can be found here.

We will conclude the day with an introduction to data compression, focusing on the various methods available to us for the efficient size reduction of our training data. Special attention will be paid to which approaches are best suited for different data types and what impact the different approaches and compression rates have on the quality of the datasets. The compression library BigWhoop and its accompanying command line tool will be made available for the hands-on sessions.
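The trade-off between compression rate and data quality mentioned above can be sketched with the Python standard library alone. The example below does not use BigWhoop or its API; as a stand-in it applies uniform quantization followed by lossless `zlib` compression to a synthetic signal, then reports the compression ratio and the maximum reconstruction error. The signal, the step sizes, and all function names are assumptions chosen for illustration.

```python
import math
import struct
import zlib


def make_field(n=4096):
    """Synthetic smooth 1-D signal standing in for numerical simulation output."""
    return [
        math.sin(2 * math.pi * i / 256) + 0.1 * math.cos(2 * math.pi * i / 32)
        for i in range(n)
    ]


def compress(values, step):
    """Lossy step: round to integer multiples of `step`; lossless step: deflate."""
    codes = [round(v / step) for v in values]
    raw = struct.pack(f"<{len(codes)}i", *codes)  # 32-bit signed integers
    return zlib.compress(raw, 9)


def decompress(blob, step):
    """Invert the lossless step and map integer codes back to floats."""
    raw = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(raw) // 4}i", raw)
    return [c * step for c in codes]


def evaluate(values, step):
    """Return (compression ratio vs. float64 storage, maximum absolute error)."""
    original_size = 8 * len(values)  # bytes if stored as 64-bit floats
    blob = compress(values, step)
    restored = decompress(blob, step)
    max_err = max(abs(a - b) for a, b in zip(values, restored))
    return original_size / len(blob), max_err


if __name__ == "__main__":
    field = make_field()
    for step in (1e-4, 1e-2):
        ratio, err = evaluate(field, step)
        print(f"step={step:g}  ratio={ratio:.1f}  max_error={err:.2e}")
```

In this scheme the quantization step bounds the reconstruction error at half the step size, so the step acts as a direct quality control. How much the ratio changes with the step depends on the data, which is why the choice has to be evaluated per dataset.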

Lunch breaks on days 1 and 2 from 12:30 to 14:00.

Prerequisites and content levels

Prerequisites
  • Familiarity with Linux operating systems, including Linux shell (some parts of the training will use a cluster).
  • Access to an SSH client to enable remote access for interactive portions of the training.
  • Technical background and basic understanding of machine learning concepts will be helpful.
  • Preliminary experience with Python is required. Since Python is used, the following tutorial can be used to learn the syntax.
  • For the 2nd day, familiarity with TensorFlow will be a plus, as all the hands-on sessions use TensorFlow. Those who do not yet program in TensorFlow should go over the TensorFlow tutorial (especially the "Learn and use ML" section).
Content levels

Community: 18 hours 15 minutes

Learn more about course curricula and content levels.

Exercises

The exercises will be carried out on HLRS systems, both in Jupyter Notebooks and as interactive jobs using scripts.

Teachers

Dr. Khatuna Kakhiani, Patrick Vogler and Dr.-Ing. Lorenzo Zanon (HLRS), and Dr.-Ing. Andrea Beck (IAG).

Language

The course language is English.

Registration

Registration is closed.

Deadline

The registration deadline is June 13, 2021 (extended deadline).

Fee

Students without Diploma/Master: 30 EUR
Students with Diploma/Master (PhD students) at German universities: 60 EUR
Members of German universities and public research institutes: 60 EUR
Members of universities and public research institutes within the EU or PRACE member countries: 120 EUR
Members of other universities and public research institutes: 240 EUR
Others: 300 EUR

PRACE PATC and bwHPC-C5

HLRS is part of the Gauss Centre for Supercomputing (GCS), which is one of the six PRACE Advanced Training Centres (PATCs) that started in Feb. 2012.
HLRS is also a member of the Baden-Württemberg initiative bwHPC-C5.
This course is provided within the framework of the bwHPC-C5 user support.
This course is not part of the PATC curriculum and is not sponsored by the PATC program.

EXCELLERAT

This workshop is supported by the Horizon-2020 Centre of Excellence EXCELLERAT. See also the EXCELLERAT Service Portal for more information.

Contact

Rolf Rabenseifner phone 0711 685 65530, rabenseifner(at)hlrs.de
Tobias Haas phone 0711 685 87223, tobias.haas(at)hlrs.de

Shortcut-URL & Course number

https://www.hlrs.de/training/2021/DL3