Medical Bioinformatics

Pattern Recognition

Visit the 2011 course website for more detailed information and all course material.

Part of the NBIC PhD school

  • Dr. ir. Dick de Ridder
  • Dr. Lodewyk Wessels
    • NKI
  • Dr. ir. Perry Moerland
    • Bioinformatics Laboratory, Academic Medical Center

NBIC PhD School

NBIC PhD School: advanced courses for bioinformaticians
In the field of bioinformatics, there is a continuous flow of new insights, tools and applications. The NBIC PhD School addresses the need to stay up to date through an advanced programme developed and taught by experts with hands-on experience. The courses cover a variety of topics and technologies, so that you can put together a personalised education programme that fits your research and interests. The NBIC PhD School courses are open to PhD students and post-docs worldwide. To broaden the international scope of the NBIC PhD School, partnerships with other institutes, for example the Swiss Institute of Bioinformatics (SIB), are being developed.

In short, the NBIC PhD School aims to:
  • Offer a top-level education and training programme in bioinformatics
  • Create opportunities for PhD students to broaden their scientific scope
  • Provide an environment for international networking and exchange

Pattern recognition module

Goal of the pattern recognition module
After following this course, a student should have an overview of basic pattern recognition techniques and be able to recognize which method is most applicable to the classification problems (s)he encounters in bioinformatics applications.

Target audience
The course is aimed at PhD students with a background in bioinformatics, computer science or a related field; a working knowledge of basic statistics and linear algebra is assumed. Preparation material on statistics and linear algebra will be distributed before the course, to be studied by students missing the required background.

Many problems in bioinformatics require classification: predicting the class to which a certain object (e.g. a gene, protein, cell, patient, ...) belongs. This calls for algorithms that can assign the most likely label (a discrete output) to an object, given one or more measurements on that object. For most interesting problems, the underlying physics is too complex to formulate such an algorithm explicitly. In such cases, a machine learning approach is taken: an algorithm is constructed whose parameters are tuned on an available dataset of training examples. The algorithm should predict the labels of these examples as well as possible, yet still generalize, i.e. perform well on objects not seen before. Some examples of classification problems in bioinformatics are gene finding (sequence in, gene presence out), diagnostics (microarray data in, diagnosis out) and data integration (measurements in, probability of interaction out).
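As an illustration of the idea of tuning parameters on training examples and then checking generalization, here is a minimal sketch in Python. It is not taken from the course material: the synthetic two-class data, the choice of a nearest-mean classifier and all function names are assumptions made only for this example.

```python
import random


def make_data(n_per_class, rng):
    """Synthetic two-class data (assumed for illustration): two measurements per object."""
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            x = [rng.gauss(c, 1.0) for c in centre]
            data.append((x, label))
    rng.shuffle(data)
    return data


def train_nearest_mean(train):
    """Tune the parameters (one mean vector per class) on the training examples."""
    means = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict(means, x):
    """Assign the label whose class mean is closest to the measurement vector."""
    def sq_dist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda label: sq_dist(means[label]))


def accuracy(means, data):
    """Fraction of objects in `data` that receive their true label."""
    return sum(predict(means, x) == y for x, y in data) / len(data)


rng = random.Random(0)
data = make_data(100, rng)
train, test = data[:150], data[150:]  # hold out objects not seen during training
means = train_nearest_mean(train)
train_acc = accuracy(means, train)
test_acc = accuracy(means, test)
print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The accuracy on the held-out objects, not the accuracy on the training examples, is the estimate of how well the classifier generalizes.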

In this course, we will introduce basic techniques from the fields of pattern recognition and machine learning to solve such problems. We will introduce the pattern recognition pipeline: measuring, feature extraction and selection, classification and evaluation. The first two days will introduce the basic classification problem and a number of classic approaches to solving it. Next, methods for selecting or extracting informative features from a large set of measurements will be introduced. This will be followed by an introduction to a number of unsupervised techniques, which find natural groupings or probabilistic descriptions of (unlabeled) data. The course will end with a cursory introduction to a number of more intricate classifiers (artificial neural networks and support vector machines) and an overview of approaches to the generalization problem. For many of the methods discussed, we will turn to recent bioinformatics literature for examples.
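The pipeline stages named above (feature selection, classification, evaluation) can be sketched roughly as follows. This is an illustrative assumption, not the toolset used in the course: the data are synthetic, the feature score is a simple t-like statistic, the classifier is a nearest-mean classifier, and every function name is made up for this example.

```python
import random
import statistics


def make_data(n_per_class, n_features, n_informative, rng):
    """Synthetic data: only the first n_informative features differ between classes."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                 for j in range(n_features)]
            data.append((x, label))
    rng.shuffle(data)
    return data


def select_features(train, k):
    """Feature selection: rank features by a t-like score and keep the best k."""
    scores = []
    for j in range(len(train[0][0])):
        a = [x[j] for x, y in train if y == 0]
        b = [x[j] for x, y in train if y == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-12
        scores.append((abs(statistics.mean(a) - statistics.mean(b)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def fit(train, features):
    """Classification: nearest-mean classifier on the selected features only."""
    means = {}
    for label in (0, 1):
        rows = [[x[j] for j in features] for x, y in train if y == label]
        means[label] = [statistics.mean(col) for col in zip(*rows)]
    return means


def predict(means, features, x):
    """Assign the label of the closest class mean in the selected feature space."""
    z = [x[j] for j in features]
    return min(means, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, means[c])))


def cross_validate(data, n_folds, k):
    """Evaluation: n-fold cross-validation, redoing feature selection in each fold."""
    correct = 0
    for f in range(n_folds):
        test = data[f::n_folds]
        train = [d for i, d in enumerate(data) if i % n_folds != f]
        features = select_features(train, k)
        means = fit(train, features)
        correct += sum(predict(means, features, x) == y for x, y in test)
    return correct / len(data)


rng = random.Random(1)
data = make_data(n_per_class=40, n_features=50, n_informative=3, rng=rng)
cv_accuracy = cross_validate(data, n_folds=5, k=3)
selected = select_features(data, k=3)
print(f"cross-validated accuracy: {cv_accuracy:.2f}, selected features: {sorted(selected)}")
```

Feature selection is repeated inside every fold on purpose: selecting features on the full dataset first would let information from the test objects leak into training and give an optimistic estimate of generalization.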

The student will analyse a biological dataset (preferably one from his/her own practice) using the tools provided in the course, and write a small report (5-10 pages) on the results. If the student has no dataset available, one will be provided. The report will have to be handed in no later than three weeks after the course has finished.
Topic revision: r6 - 05 Oct 2012, PerryMoerland
