
Machine Learning for Bioinformatics & Systems Biology
A yearly course, part of the BioSB Research School

  • dr. ir. Perry Moerland (Amsterdam UMC, location: Academic Medical Center)
  • prof. dr. ir. Marcel Reinders (Delft University of Technology)
  • prof. dr. Lodewyk Wessels (Netherlands Cancer Institute)

Course coordinator:

The fifth BioSB Pattern Recognition Course will be given September 25-29, 2017, at the Academic Medical Center, Amsterdam. General information on the course can be found here. This page only contains the material used during the course; you can download it to work at your own institution. Note that this material is free for academic use only and should not be redistributed.

Note that some of the course material is still likely to change before the course week.

  • To prepare for the course:
    • a self-evaluation test (PDF, 90 KB) on the prerequisite knowledge (probability theory and linear algebra). If you have a lot of trouble answering some of these exercises, consult the textbooks mentioned in the PDF, or:
    • a few primers (ZIP/PDF, 4.9 Mb) on these topics.

The lab courses make extensive use of Matlab. You do not need to be a fluent programmer, but if you have never worked with Matlab before, it may help to obtain a copy (your university may have a campus license) and to look at the appendices of the lab course manual (see below). An extensive Matlab primer is also available. During the course, Matlab and all software and data are available on the PCs in the lab, so there is no need to bring your laptop.

To use the code and data, download the ZIP file, unpack everything into a single directory, and run prstartup from the Matlab command prompt. A reasonably recent version of Matlab (R2006a or newer) is required.
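
The setup steps above can be sketched at the Matlab prompt roughly as follows. This is only an illustration: the archive name `course_material.zip` and the target directory `prcourse` are hypothetical placeholders, not the actual names used by the course, so substitute the name of the file you downloaded.

```matlab
% Minimal setup sketch. zipFile and courseDir are ASSUMED names for
% illustration; replace them with your own download and directory.
zipFile   = 'course_material.zip';        % hypothetical archive name
courseDir = fullfile(pwd, 'prcourse');    % hypothetical target directory

% Show the installed Matlab version; the course asks for R2006a or newer.
disp(version);

% Stop early with a readable message if the archive is not where we expect it.
if ~exist(zipFile, 'file')
    error('Archive %s not found in %s.', zipFile, pwd);
end

% Unpack everything into one directory and make it the current directory.
unzip(zipFile, courseDir);
cd(courseDir);

% Run the course start-up script named in the instructions above.
prstartup
```

If the archive itself contains a top-level folder, `prstartup` may end up one level deeper than `courseDir`; in that case, change into that folder before calling it.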

  • Additional tools (not required for the course, but perhaps interesting):
    • GenLab and PRLab (ZIP), a GUI for microarray data analysis, clustering and classification (poorly maintained, use at your own risk!)
    • BRB ArrayTools, an Excel-based microarray data analysis package using R in the background
    • WEKA, a Java-based collection of machine learning algorithms for data mining
    • Shogun, a machine learning toolbox with a Matlab interface, focusing on large-scale kernel methods
    • R is becoming ever more popular for solving data analysis problems. Here is a short reference that provides a mapping between Matlab and R commands.
    • R packages relevant to some of the topics treated in the course (the functionality is spread over many packages, and this list is far from complete):
      • First have a look at mlr, a general-purpose machine learning framework for R.
      • Then have a look at caret, which also provides a set of functions that streamline the process of creating predictive models.
      • e1071: support vector machines and a flexible framework for cross-validation/bootstrapping using the tune function
      • MCRestimate: flexible framework for feature selection and cross-validation providing a wrapper for several classifiers (PAM, SVM, random forests, ...). Easily extended with classifiers available in other packages
      • MASS: lda, qda
      • rpart: decision trees
      • stats (installed by default): hierarchical clustering, kmeans
      • glmnet: lasso, elastic net
      • See CRAN Task View: Machine Learning & Statistical Learning and CRAN Task View: Cluster Analysis & Finite Mixture Models for pointers to other packages

Some good material for further reading:
Topic revision: r20 - 27 May 2020, UnknownUser