Advanced Machine Learning

WS 2016
V3 + Ü1 (6 ECTS credits)
Course Dates:
Lecture Mon 14:15-15:45 UMIC 025
Lecture/Exercise Thu 14:15-15:45 UMIC 025

Lecture Description

This lecture extends the scope of the "Machine Learning" lecture with additional and, in part, more advanced concepts. In particular, the lecture will cover the following areas:

  • Regression techniques (linear regression, ridge regression, lasso, support vector regression)
  • Gaussian Processes
  • Learning with latent variables
  • Dirichlet Processes
  • Structured output learning
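To give a concrete flavor of the regression techniques listed above, here is a minimal ridge-regression sketch. The course exercises use Matlab; this Python/NumPy version with hypothetical toy data is only an illustration of the closed-form estimator w = (XᵀX + λI)⁻¹ Xᵀy, not course material:

```python
import numpy as np

# Hypothetical toy data: noisy samples of a linear function.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(50)

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# With a small regularizer, the estimate stays close to w_true,
# but the lam*I term keeps the system well-conditioned.
w_hat = ridge_fit(X, y, lam=0.1)
```

Setting lam=0 recovers ordinary least squares; larger values shrink the weights toward zero, trading a little bias for lower variance (the bias-variance trade-off discussed in the lecture).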


Successful completion of the class "Machine Learning" is recommended, but not a hard prerequisite.


The class is accompanied by exercises that will allow you to collect hands-on experience with the algorithms introduced in the lecture. There will be both pen-and-paper exercises and practical programming exercises based on Matlab (roughly one exercise sheet every two weeks). Please submit your solutions by e-mail to the responsible TA by the evening before the exercise class.

We ask you to work in teams of 2-3 students.

Course Schedule
  • Introduction: Polynomial Fitting, Least-Squares Regression, Overfitting, Regularization, Ridge Regression
  • Exercise 0: Matlab Introduction
  • Linear Regression I: Probabilistic View of Regression, Maximum Likelihood, MAP, Bayesian Curve Fitting
  • Linear Regression II: Basis Functions, Sequential Learning, Multiple Outputs, Regularization, Lasso, Bias-Variance Decomposition
  • Gaussian Processes I: Kernels, Kernel Ridge Regression, Gaussian Processes, Predictions with Noisy Observations
  • Gaussian Processes II: Influence of Hyperparameters, Bayesian Model Selection
  • Approximate Inference I: Sampling Approaches, Monte Carlo Methods, Transformation Methods, Ancestral Sampling, Rejection Sampling, Importance Sampling
  • Approximate Inference II: Markov Chain Monte Carlo, Metropolis-Hastings Algorithm, Gibbs Sampling
  • Exercise 1: Regression, Least-Squares, Ridge, Kernel Ridge, Gaussian Processes
  • Linear Discriminants Revisited: Generalized Linear Discriminants, Gradient Descent, Logistic Regression, Softmax Regression, Error Function Analysis
  • Neural Networks: Single-Layer Perceptron, Multi-Layer Perceptron, Mapping to Linear Discriminants, Error Functions, Regularization
  • --no class--: Charlemagne Lecture by Yann LeCun (SuperC, 15:00-16:30h)
  • Exercise 2: Rejection Sampling, Importance Sampling, MCMC, Metropolis-Hastings
  • Backpropagation: Multi-Layer Networks, Chain Rule, Gradient Descent, Implementation Aspects
  • Tricks of the Trade I: Stochastic Gradient Descent, Minibatch Learning, Data Augmentation, Effects of Nonlinearities, Initialization (Glorot, He)
  • Tricks of the Trade II: Optimizers (Momentum, Nesterov Momentum, AdaGrad, RMSProp, AdaDelta, Adam), Dropout, Batch Normalization
  • Convolutional Neural Networks I: CNNs, Convolutional Layer, Pooling Layer, LeNet, ImageNet Challenge, AlexNet
  • Exercise 3: Hands-on Tutorial on Softmax, Backpropagation
  • Convolutional Neural Networks II: VGGNet, GoogLeNet, Visualizing CNNs
  • Exercise 4: Hands-on Tutorial on Theano and Torch7
  • CNN Architectures & Applications: Residual Networks, Siamese Networks, Triplet Loss, Applications of CNNs
  • Dealing with Discrete Data: Word Embeddings, word2vec, Motivation for RNNs
  • Recurrent Neural Networks I: Plain RNNs, Backpropagation Through Time, Practical Issues, Initialization
  • Recurrent Neural Networks II: LSTM, GRU, Success Stories
  • Exercise 5: TBD
  • Deep Reinforcement Learning I: Reinforcement Learning, TD Learning, Q-Learning, SARSA, Deep RL
  • Deep Reinforcement Learning II: Deep RL, Deep Q-Learning, Deep Policy Gradients, Case Studies
  • Repetition
  • Exercise 6: Weight Sharing, Autoencoders, RNNs
  • Exam 1
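As a small taste of the sampling methods covered in the Approximate Inference lectures, the following is a minimal random-walk Metropolis-Hastings sketch in Python. It is an illustrative assumption on our part, not course material; the target density (a standard normal, given only up to a constant via its log-density) and all parameter values are chosen purely for demonstration:

```python
import math
import random

random.seed(0)

def metropolis_hastings(log_p, x0, n_steps, step=1.0):
    """Random-walk Metropolis: propose x' = x + N(0, step^2),
    accept with probability min(1, p(x') / p(x))."""
    x, samples = x0, []
    for _ in range(n_steps):
        x_new = x + random.gauss(0.0, step)
        # Compare in log space to avoid underflow.
        if math.log(random.random()) < log_p(x_new) - log_p(x):
            x = x_new
        samples.append(x)  # rejected proposals repeat the current state
    return samples

# Target: standard normal, log-density known only up to a constant.
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
```

Note that the normalizing constant of the target cancels in the acceptance ratio, which is exactly why MCMC methods are useful for the intractable posteriors discussed in the lecture.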