Resource-aware Machine Learning - 6th International Summer School 2022
TU Dortmund University
Key Information
Campus location
Dortmund, Germany
Languages
English
Study format
Blended
Duration
5 days
Pace
Full time
Tuition fees
EUR 520
Application deadline
Request info
Earliest start date
Request info
Introduction
The 6th International Summer School on Resource-Aware Machine Learning (REAML 2022) will be held this year from Sept. 12-16, 2022. The school provides lectures on the latest research in machine learning, typically with a focus on resource consumption and how it can be reduced.
The Summer School will be offered as a hybrid event. Due to the ongoing COVID-19 pandemic, it is not guaranteed that every international participant or lecturer can travel to Dortmund, so the event will be a mixture of local and (possibly some) remote lectures. All lectures will be streamed via Zoom and YouTube for participants who cannot travel to Germany, and will be available on demand on YouTube during the week of the Summer School. Each lecture will be accompanied by a Q&A session. There will also be a dedicated space for presenting Ph.D./PostDoc research and a hackathon featuring real-world ML tasks.
Hackathon - Predicting Virus-Like Particles in Liquid Samples
Fitting the context of the COVID-19 pandemic, the summer school is accompanied by a challenge on the detection of nanoparticles such as viruses. Using a plasmon-assisted microscopy sensor that can make nanometer-sized particles visible, we provide real-world images containing virus-like signals. The participants are challenged to test their knowledge of Machine Learning and cyber-physical systems in this real-world scenario. In this hackathon, they aim to achieve the most reliable and rapid detection possible with limited resources. They will receive training datasets with particles of defined sizes for training and validating their approaches. All submitted approaches are evaluated against a previously unknown dataset and ranked using a metric that considers both the predictive quality and resource efficiency of the model.
Students’ Corner - Share and discuss your work
The summer school will be accompanied by an exchange platform for participants, the Students' Corner, which allows them to network and share their research. During registration, you may express your interest in presenting at the Students' Corner, and we will keep you updated.
The summer school is organized by the collaborative research center SFB 876 and the artificial intelligence group at TU Dortmund University.
Curriculum
Highlights:
- Efficient federated learning
- Matrix factorizations with binary constraints - from k-means to deep learning
- The generalization mystery in deep learning
- Deep learning on FPGAs
- Understanding inverse problems
- Counterfactual Evaluation and Learning for Interactive Systems
- Machine Learning without batteries: the next frontier in wireless sensors
- Randomized Bayesian inference
- A Painless Introduction to Coresets
Lectures
Opening
Katharina Morik
Artificial Intelligence Unit, TU Dortmund University, Dortmund, Germany
The warm welcome to the summer school comes with an introduction of the collaborative research center SFB 876, which organises REAML 2022. What are the hot topics of resource-aware machine learning? Why and how should we save energy and communication when learning or applying the learned models? We conclude with practical hints.
Efficient Federated Learning
Michael Kamp
Institute for Artificial Intelligence in Medicine (IKIM), Ruhr-University Bochum
Data science and machine learning are taking the world by storm. Almost all theory and methods, however, rest on an assumption so basic that it prevents them from being used in practice: that all data can be gathered in one place. In many applications (e.g., autonomous driving, industrial machines, or healthcare), centralizing the data is impossible or hugely impractical, not only due to privacy concerns but also because of its sheer size. Federated learning offers a solution: models are trained only locally and combined to create a well-performing joint model, without sharing data. Applying such techniques in practice, however, requires a high level of trust, and guaranteeing model quality, training and resource efficiency, bounded communication, and data privacy is a huge undertaking. In this talk, I will present theoretically sound and practically useful methods for efficient federated machine learning, and identify important and exciting open problems.
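To make the core idea concrete, here is a minimal sketch of federated averaging on a toy linear model, assuming a hypothetical setup of three clients that each run a few local gradient steps before a server averages the weights. This illustrates the general scheme only, not the specific methods presented in the lecture.

```python
# Minimal federated-averaging sketch: clients train locally on squared
# loss, the server averages the resulting weights. No raw data is shared.
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, steps=5):
    """A few local gradient steps; the data stays on the client."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Each client's data is drawn around the same ground-truth weights.
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ w_true + 0.1 * rng.normal(size=50)
    clients.append((X, y))

w_global = np.zeros(2)
for _ in range(20):                                # communication rounds
    local = [local_sgd(w_global.copy(), X, y) for X, y in clients]
    w_global = np.mean(local, axis=0)              # server-side averaging
```

After a few rounds, the averaged model is close to the model a centralized learner would have found, even though each client only ever saw its own data.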
Understanding inverse problems and their solutions
Michael Schmelling
Max-Planck-Institut fuer Kernphysik, Heidelberg
The lecture starts with information-theoretic considerations, which show why inverse problems are hard when a measurement is distorted by finite-resolution effects. In most cases this implies that the problem can only be solved by biasing the result, for example by regularisation methods. Different ways to implement this approach and to control the resulting bias are discussed, with a special focus on the proper interpretation of the results.
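A small numerical sketch of this trade-off, on an assumed toy problem (a Gaussian blur as the finite-resolution distortion, Tikhonov regularisation as the biasing method; neither is taken from the lecture itself):

```python
# Toy ill-posed inverse problem: recover x from y = A x + noise, where A
# is a smoothing (badly conditioned) operator. The naive inversion
# amplifies the noise; regularisation trades a small bias for stability.
import numpy as np

rng = np.random.default_rng(1)
n = 40
t = np.arange(n)
# Gaussian blur matrix: strongly smoothing, hence badly conditioned.
A = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 2.0) ** 2)
A /= A.sum(axis=1, keepdims=True)

x_true = np.sin(2 * np.pi * t / n)
y = A @ x_true + 1e-3 * rng.normal(size=n)

x_naive = np.linalg.solve(A, y)                    # noise blows up
lam = 1e-2                                         # regularisation strength
x_reg = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)
```

The regularised estimate `x_reg` is close to `x_true` but systematically shrunk (the bias the lecture discusses), while `x_naive` is dominated by amplified measurement noise.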
Matrix factorizations with binary constraints - from k-means to deep learning
Sibylle Hess
Eindhoven University of Technology, Eindhoven, NL
In the field of data mining, there is one big open problem: optimization subject to binary constraints. Binary constraints make data mining results interpretable and definite. Is this picture showing a cat? Should this movie be recommended to this user? Should the next chess move be this one? Binary results give a definite answer to questions like these. Many methods can solve binary-constrained problems, but most work under one condition: exclusivity. That is, if a picture shows a cat, then it cannot show a dog; if a movie is recommended to one user, then it can't be recommended to another; and there should be only one optimal next chess move. Depending on the application, this assumption is more or less justified. Clustering is an area in which optimization subject to binary constraints is explicitly studied. In this talk, we will discuss the broad spectrum of tasks where a matrix factorization approximation error in Frobenius norm is minimized subject to binary constraints. We will unveil under which circumstances this optimization task defines the clustering objectives of k-means, spectral clustering, and subspace clustering, but we will also make connections to methods such as deep learning. We will also see how bridging those disciplines under the umbrella of matrix factorization establishes novel research ideas and insights, providing inspiration to tackle pending research questions in adversarial learning, computing meaningful embeddings, and learning sensible similarity metrics.
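The k-means case of this view can be sketched in a few lines: minimise ||X - WH||_F^2 where each row of W is a binary one-hot cluster indicator and H holds the centroids. Lloyd's algorithm then alternates the two sub-problems (this is the standard textbook formulation, not material specific to the lecture).

```python
# k-means as binary-constrained matrix factorisation: alternate a binary
# assignment step (rows of W are one-hot) with a least-squares step for H.
import numpy as np

rng = np.random.default_rng(2)
# Two well-separated blobs of 30 points each.
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(5, 0.3, (30, 2))])
k = 2

H = X[[0, 30]].copy()                             # one seed point per blob
for _ in range(10):
    # W-step: the binary constraint -- each point picks one centroid.
    d = ((X[:, None, :] - H[None, :, :]) ** 2).sum(-1)
    W = np.eye(k)[d.argmin(axis=1)]
    # H-step: unconstrained least squares -> centroids are cluster means.
    H = (W.T @ X) / W.sum(axis=0)[:, None]

err = np.linalg.norm(X - W @ H) ** 2              # the k-means objective
```

Relaxing the one-hot constraint on W (e.g., to orthogonal or stochastic matrices) is exactly where the connections to spectral methods and deep learning mentioned above come in.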
Transparent, adaptable, and reproducible data analysis with Snakemake
Johannes Köster
Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen
Data analyses usually entail the application of many scripts, notebooks, and command line tools to transform, filter, aggregate or plot data and results. With ever increasing amounts of data being collected in science, reproducible and scalable automatic workflow management becomes increasingly important. Snakemake is a workflow management system, consisting of a text-based workflow specification language and a scalable execution environment, that allows the parallelized execution of workflows on workstations, compute servers and clusters without modification of the workflow definition. Snakemake thereby puts a particular focus on transparency and human readability, as well as adaptability and modularization of data analyses. With over 380,000 downloads and on average more than 7 new citations per week in 2021 (>1300 in total), Snakemake is one of the most widely used systems for reproducible data analysis. This tutorial will introduce the Snakemake workflow definition language and describe how to use the execution environment. Further, it will be shown how Snakemake helps to create reproducible and transparent analyses that can be adapted to new data with little effort.
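A hypothetical three-rule Snakefile illustrating the text-based workflow language described above (file names and scripts are illustrative, not from the tutorial):

```
# Snakemake infers the execution order from input/output file dependencies.
rule all:
    input:
        "results/plot.pdf"

rule filter:
    input:
        "data/raw.csv"
    output:
        "results/filtered.csv"
    shell:
        "awk -F, '$3 > 0' {input} > {output}"

rule aggregate:
    input:
        "results/filtered.csv"
    output:
        "results/summary.csv"
    script:
        "scripts/aggregate.py"

rule plot:
    input:
        "results/summary.csv"
    output:
        "results/plot.pdf"
    script:
        "scripts/plot.py"
```

Running `snakemake --cores 4` would build `results/plot.pdf`, executing only the rules whose outputs are missing or outdated, in parallel where the dependency graph allows.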
On the Generalization Mystery in Deep Learning
Satrajit (Sat) Chatterjee
Palo Alto, CA, USA
The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size? Furthermore, from among all solutions that fit the training data, how does GD find one that generalizes well (when such a well-generalizing solution exists)? We argue that the answer to both questions lies in the interaction of the gradients of different examples during training. Intuitively, if the per-example gradients are well-aligned, that is, if they are coherent, then one may expect GD to be (algorithmically) stable, and hence generalize well. We formalize this argument with an easy-to-compute and interpretable metric for coherence, and show that the metric takes on very different values on real and random datasets for several common vision networks. The theory also explains a number of other phenomena in deep learning, such as why some examples are reliably learned earlier than others, why early stopping works, and why it is possible to learn from noisy labels. Moreover, since the theory provides a causal explanation of how GD finds a well-generalizing solution when one exists, it motivates a class of simple modifications to GD that attenuate memorization and improve generalization. Generalization in deep learning is an extremely broad phenomenon, and therefore, it requires an equally general explanation. We conclude with a survey of alternative lines of attack on this problem, and argue that, on this basis, the proposed approach is the most viable one.
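A simplified stand-in for the coherence idea (not the lecture's actual metric): measure how well per-example gradients align with their average, here for a linear model with squared loss, comparing structured labels against random ones.

```python
# Per-example gradient alignment on real vs. random labels: gradients of
# examples that share structure point in similar directions.
import numpy as np

rng = np.random.default_rng(3)

def per_example_grads(w, X, y):
    # Gradient of (x.w - y)^2 for each example separately, shape (n, d).
    return 2 * (X @ w - y)[:, None] * X

def alignment(G):
    """Mean cosine similarity between each per-example gradient and the
    average gradient -- one simple way to quantify coherence."""
    g_mean = G.mean(axis=0)
    num = G @ g_mean
    den = np.linalg.norm(G, axis=1) * np.linalg.norm(g_mean) + 1e-12
    return float(np.mean(num / den))

w = np.zeros(5)
X = rng.normal(size=(200, 5))
y_real = X @ rng.normal(size=5)            # labels carry shared structure
y_rand = rng.normal(size=200)              # labels are pure noise

c_real = alignment(per_example_grads(w, X, y_real))
c_rand = alignment(per_example_grads(w, X, y_rand))
```

On the structured labels the per-example gradients reinforce each other (`c_real` is clearly positive), while on random labels they largely cancel, mirroring the real-vs-random contrast described in the abstract.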
Deep learning on FPGAs: trends and examples
Wayne Luk
Department of Computing, Imperial College London, UK
Ce Guo
Departments of Computing and of Physics, Imperial College London, UK
Hardware implementation is critical to reducing execution time and energy consumption for the training and deployment of deep learning models. The use of field-programmable gate arrays (FPGAs) is a promising approach to achieve a good trade-off between the design cycle and performance for deep learning systems. This lecture on FPGA-based deep learning consists of two parts. The first part gives an overview of state-of-the-art FPGA design for training and inference of deep learning models. Specifically, this part covers potential benefits, application scenarios, main challenges, and design optimisation techniques for FPGA-based deep learning, with examples. The second part discusses a basic FPGA design for feed-forward networks (FFNs). The design accelerates the back-propagation process for FFN training and can be extended to support more complicated network architectures.
The sky is not the limit -- Machine Learning Applications in Astroparticle Physics
Tim Ruhe
Astro-particle Physics Group, TU Dortmund University, Dortmund, Germany
Although cosmic rays were discovered more than 100 years ago, their exact origin, as well as the physical mechanisms involved in their acceleration, remain largely unknown. To resolve this mystery, large-scale facilities targeting different messenger particles have been set up around the globe. For many of these facilities, the use of machine learning algorithms has become a standard analysis technique. Algorithms and their application differ between individual analyses, but Boosting, Random Forests, and Deep Neural Networks in particular are not only popular but also highly successful choices. This lecture will provide an overview of the challenges associated with the detection and analysis of different messenger particles and how these challenges can be addressed via the use of machine learning and deep learning algorithms.
Hackathon Lecture: Predicting Virus-Like Particles in Liquid Samples
Frank Weichert
Computer graphics lab, TU Dortmund University, Dortmund, Germany
Roland Hergenröder
Leibniz-Institut für Analytische Wissenschaften-ISAS, Dortmund, Germany
The COVID-19 pandemic has shown the importance of medical testing for an early detection of regional disease hot spots and for monitoring the course of the pandemic. In particular, the coupling of medical biosensors with concepts of machine learning has the potential to meet the requirements for efficient and robust detection of current and future pathogens. The lecture illustrates this with the example of the plasmon-assisted microscopy sensor that can make nanometer-sized particles (e.g., viruses) visible. The principle of operation of the sensor and the concept for the detection of nanometer-sized particles are explained. The challenge is that the analysis is carried out on data-intensive and very noisy or artefact-afflicted image sequences, and that the processing should be done in (soft) real time while minimising resource consumption, e.g., of energy and memory. The lecture thus also serves as an introduction to the hackathon.
Hackathon: Predicting Virus-Like Particles in Liquid Samples
Frank Weichert
Computer graphics lab, TU Dortmund University, Dortmund, Germany
Roland Hergenröder
Leibniz-Institut für Analytische Wissenschaften-ISAS, Dortmund, Germany
Konstantin Wüstefeld
Computer graphics lab, TU Dortmund University, Dortmund, Germany
Fitting the context of the COVID-19 pandemic, the summer school is accompanied by a challenge on the detection of nanoparticles such as viruses. Using a plasmon-assisted microscopy sensor that can make nanometer-sized particles visible, we provide real-world images containing virus-like signals. The participants are challenged to test their knowledge of Machine Learning and cyber-physical systems in this real-world scenario. In this hackathon, they aim to achieve the most reliable and rapid detection possible with limited resources. They will receive training datasets with particles of defined sizes for training and validating their approaches. All submitted approaches are evaluated against a previously unknown dataset and ranked using a metric that considers both the predictive quality and resource efficiency of the model.
Randomised Bayesian inference
Han Cheng Lie
Mathematics Institute, University of Potsdam
Bayesian methods are often used to solve inverse problems and machine learning tasks. In a Bayesian method, one represents one's state of knowledge about an unknown object of interest using a probability measure, and then iteratively updates this probability measure each time a new data point is obtained, by using a likelihood function and Bayes' formula. One challenge common to many Bayesian methods is that evaluating the likelihood function for an arbitrary input can be computationally expensive. This motivates the use of cheaper approximations of the likelihood function. Random approximations of the likelihood --- for example, using randomised linear algebra --- have become popular in recent years, because they are often parallelisable. However, since these approximations introduce errors into the probability measure, one must analyse the errors to ensure that they do not 'break' the Bayesian method. In this lecture, we will present the basic ideas of Bayesian inference, motivate the use of random approximations of the likelihood function using some powerful ideas from mathematics, and analyse the approximation errors of the corresponding randomised Bayesian method.
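A toy illustration of the idea (an assumed example, not taken from the lecture): a grid posterior for the mean of Gaussian data, computed once with the exact likelihood and once with a cheap randomised approximation, here a rescaled log-likelihood over a random subsample.

```python
# Exact vs. randomised-likelihood Bayesian inference for a Gaussian mean.
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(1.5, 1.0, size=2000)
grid = np.linspace(0, 3, 301)                 # candidate means, flat prior

def log_lik(theta, x):
    # Gaussian log-likelihood (up to constants) for each grid value.
    return -0.5 * ((x[:, None] - theta[None, :]) ** 2).sum(axis=0)

def posterior(ll):
    ll = ll - ll.max()                        # stabilise the exponent
    p = np.exp(ll)
    return p / p.sum()

post_exact = posterior(log_lik(grid, data))

# Randomised approximation: evaluate on a subsample, rescale to full size.
idx = rng.choice(len(data), size=200, replace=False)
post_rand = posterior(len(data) / 200 * log_lik(grid, data[idx]))

mean_exact = float(grid @ post_exact)
mean_rand = float(grid @ post_rand)
```

Both posteriors concentrate near the true mean; the randomised one is ten times cheaper per likelihood evaluation but carries an extra approximation error, which is exactly the kind of error the lecture's analysis controls.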
Counterfactual Evaluation and Learning for Interactive Systems
Thorsten Joachims
Department of Computer Science and Department of Information Science, Cornell University
This tutorial provides an introduction to a rapidly growing area of new methods for training and evaluating intelligent systems that act autonomously, including applications from recommendation and search engines to automation and robotics. At the core of these methods are counterfactual estimators that enable the use of existing log data to estimate how some new target policy would have performed, if it had been used instead of the policy that logged the data. We say that those estimators work "off-policy", since the policy that logged the data is different from the target policy. In this way, counterfactual estimators enable Off-policy Evaluation (OPE) akin to an unbiased offline A/B test, as well as learning new decision-making policies through Off-policy Learning (OPL). The goal of this tutorial is to summarize the foundations of OPE and OPL, and provide an overview of activity and future directions in this field.
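The simplest such counterfactual estimator, inverse propensity scoring (IPS), can be sketched in a few lines (a generic textbook example, with made-up policies and rewards):

```python
# Off-policy evaluation via inverse propensity scoring: logged data from
# one policy is reweighted to estimate the value of a different policy.
import numpy as np

rng = np.random.default_rng(5)
n_actions, n = 3, 100_000

p_log = np.array([0.5, 0.3, 0.2])            # logging policy
p_tgt = np.array([0.1, 0.1, 0.8])            # target policy to evaluate
true_reward = np.array([0.2, 0.5, 0.9])      # expected reward per action

# Simulate the logs: actions drawn from the logging policy, with rewards.
actions = rng.choice(n_actions, size=n, p=p_log)
rewards = rng.binomial(1, true_reward[actions])

# IPS estimate of the target policy's value from the logs alone.
weights = p_tgt[actions] / p_log[actions]
v_ips = float(np.mean(weights * rewards))

v_true = float(p_tgt @ true_reward)          # ground truth, for comparison
```

`v_ips` is an unbiased estimate of `v_true` even though the target policy was never run, which is precisely the "offline A/B test" property described above; its variance grows when the two policies diverge, motivating the more refined estimators covered in the tutorial.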
Machine Learning without batteries: the next frontier in wireless sensors
Andres Gomez
Interaction- and Communication-based Systems, University of St. Gallen, Switzerland
Over the last decade, energy harvesting has seen significant growth as different markets adopt green, sustainable ways to produce electrical energy. Even though costs have fallen, the embedded sensing and Internet of Things community have not yet widely adopted energy-harvesting-based solutions. This is partly due to a mismatch between power density in energy harvesters and electronic devices which, until recently, required a battery or super-capacitor to be functional. This mismatch is especially accentuated in indoor environments, where there is comparably less primary energy available than in outdoor environments. In this talk, I will present a design methodology that can optimize energy flow in dynamic environments without requiring batteries or super-capacitors. Furthermore, I will discuss the general applicability of this approach by presenting two light-powered batteryless sensing systems, smartcards and cameras, together with optimization techniques to maximize their performance and energy efficiency.
A Painless Introduction to Coresets
Chris Schwiegelshohn
MADALGO, Department of Computer Science, Aarhus University
Coresets are arguably the most important paradigm used in the design and analysis of big data algorithms. Succinctly, a coreset compresses the input such that for any candidate query, the query evaluation on the coreset and the query evaluation on the original data are approximately the same. For clustering, this means that a coreset is a small weighted sample of the points such that for any set of centers, the cost on the original point set and the cost on the coreset are equal up to some small multiplicative distortion. In this talk, we will give an in-depth and yet also very simple and basic introduction to coreset algorithms and their analysis.
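A tiny numerical illustration of the clustering property stated above: a uniform weighted sample approximately preserves the k-means cost of an arbitrary candidate center set. (Real coreset constructions use importance/sensitivity sampling to obtain guarantees; uniform sampling is only the simplest sketch of the idea, and the data here is made up.)

```python
# A weighted sample as a crude "coreset": for any query centers, the
# weighted cost on the sample approximates the cost on the full data.
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(10_000, 2))                 # full data set
m = 500
S = X[rng.choice(len(X), m, replace=False)]      # sampled points
w = len(X) / m                                   # uniform weight per point

def kmeans_cost(points, centers, weight=1.0):
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return weight * d2.min(axis=1).sum()

centers = rng.normal(size=(3, 2))                # arbitrary query centers
full = kmeans_cost(X, centers)
core = kmeans_cost(S, centers, weight=w)
ratio = core / full                              # close to 1
```

The 500-point sample answers the cost query within a few percent of the 10,000-point original; the lecture's analysis makes this "for every query" guarantee precise and shows how to sample smarter than uniformly.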