Machine Learning for Bioinformatics & Systems Biology

A yearly course, part of the BioSB Research School

  • dr. ir. Perry Moerland (Amsterdam UMC, location Academic Medical Center)
  • prof. dr. ir. Marcel Reinders (Delft University of Technology)
  • prof. dr. Lodewyk Wessels (Netherlands Cancer Institute)

Course coordinator:

Learning objectives

After completing this course, students will have a good understanding of a wide range of machine learning techniques and will be able to recognize which method is most applicable to the data analysis problems they encounter in bioinformatics and systems biology applications.

Target audience

The course is aimed at PhD students with a background in bioinformatics, systems biology, computer science, the life sciences, or a related field. Participants from the private sector are also welcome. A working knowledge of basic statistics and linear algebra is assumed. Preparation material on statistics and linear algebra will be distributed before the course for students who lack the required background.


Modern biology is a data-rich science, driven by our ability to measure the detailed molecular characteristics of cells, organs, and individuals at many different levels. Interpretation of these large-scale biological data requires the detection of statistical dependencies and patterns in order to establish useful models of complex biological systems. Techniques from machine learning are key in this endeavour. Typical examples are the visualization of single-cell RNA-seq data using dimensionality reduction methods, base calling for nanopore sequencing data using hidden Markov models and (recurrent) neural networks, and classification of high-throughput microscopy image data using convolutional neural networks.

In this one-week course, the foundations of machine learning will be laid out and commonly used methods for unsupervised (clustering, dimensionality reduction, visualization) and supervised (mainly classification) learning will be explained in detail. Methods will be illustrated using recent examples from the fields of systems biology and bioinformatics. Methods discussed in the morning lectures will be put into practice during the afternoon computer lab sessions. To complete the course, participants write a 5-10 page report afterwards describing the analysis of a biological dataset using some of the methods taught in the course.


You can register for this course by filling out the BioSB enrolment form. The maximum number of participants is 25, so register soon to be sure of a course seat!

The course fee (to be determined) includes:

  • Course material: Lecture slides, a lab course manual and software required for the lab course (MATLAB toolboxes) will be made available online.
  • Catering: Coffee, tea, soft drinks and lunch will be provided.

Information about hotel and hostel accommodation in Amsterdam can be found here. Participants who need accommodation have to book and pay for it themselves; it is not included in the course fee.

Course material

The course material is available here and includes the slide handouts, a lab course manual, and the required data and MATLAB toolboxes.

In the meantime, you are advised to have a look at the following documents:
  • To prepare for the course: a self-evaluation test (PDF, 90 KB) on the prerequisite knowledge (probability theory and linear algebra). If you have a lot of trouble answering some of these exercises, consult the textbooks mentioned in the PDF, or a few primers (ZIP/PDF, 4.9 MB) on these topics.
  • The lab courses will make extensive use of MATLAB. You do not need to be a fluent programmer, but if you have never worked with MATLAB before, it may help to obtain a copy (your university may have a campus license) before the course and have a look at the Appendices of the lab course manual. An extensive MATLAB primer is also available. During the course, MATLAB and all software and data are available on the PCs in the lab, so there is no need to bring your laptop.


Participants requiring a certificate of successful completion must complete a final assignment. The student will analyse a biological dataset (preferably one from their own practice) using the tools provided in the course, and write a short report (5-10 pages) on the results. If the student has no dataset available, one will be provided. The report has to be mailed to p.d.moerland@amc.uva.nl no later than three weeks after the course has finished. We will strictly adhere to this deadline; if you require an extension, you should contact us well in advance. The report will be graded "fail" or "pass", with one possible resubmission. Those who choose not to do the final assignment will receive a certificate of participation.

Schedule (2019 edition)

The course will run October 7-11, 2019. After the course, expect to spend 2-3 days on the report to be handed in. Each course day will have the following layout:

  • 9.00 - 12.00 Lecture (HvA, B2-10 (Mo, Th, Fr), K01-222 (Tu, We, Fr))
  • 12.00 - 13.00 Lunch
  • 13.00 - 17.00 Computer lab (L01-243/245 (Mo, Th, Fr), L0-211 (Tu), L0-230 (We), L0-227 (Th), L0-227/L0-229 (Fr))

L01-211 (basement) and L0-211 (ground floor) are located in the main building of the Academic Medical Center (map), Meibergdreef 9, Amsterdam. Travel directions can be found here.

Monday (October 7) - Introduction
Lecturer: Marcel Reinders
Subjects: Introduction to pattern recognition: measurements, features, classification. Supervised vs. unsupervised learning, relation to regression. Bayesian framework: risk, cost; evaluation: ROC curves, cross-validation. Density estimation: histograms, nearest neighbour, Parzen, Gaussian Bayesian classification.

Tuesday (October 8) - Classifiers
Lecturer: Perry Moerland
Subjects: Parametric classifiers: (D)LDA, (D)QDA. Nonparametric classifiers: k-NN, Parzen. Discriminant analysis: LDA, logistic regression. Decision trees and random forests.

Wednesday (October 9) - Feature selection and extraction
Lecturer: Lodewyk Wessels
Subjects: Feature selection: criteria, search algorithms (forward, backward, branch & bound). Sparse classifiers: Ridge, LASSO. Feature extraction: PCA, Fisher. Embeddings: MDS.

Thursday (October 10) - Clustering and HMMs
Lecturer: Perry Moerland
Subjects: Hierarchical clustering. Agglomerative clustering. Model-based clustering: mixtures-of-Gaussians, Expectation-Maximization. Hidden Markov models.

Friday (October 11) - Selected advanced topics
Lecturer: Marcel Reinders
Subjects: Artificial neural networks. Support vector machines. Classifier ensembles. Complexity and regularisation. Deep learning.

For more information about the course programme, please contact Perry Moerland; for more information about registration or logistics, please contact Femke Francissen.

Topic revision: r1 - 11 Sep 2019, PerryMoerland
