MATH 5610: Computational Biology
This course introduces a broad range of computational problems in molecular biology. Solution techniques draw from several branches of mathematics: combinatorics, probability, optimization, and dynamical systems.

Prerequisites: Fundamentals of Computing (CSC 1410)
Linear algebra, algorithm analysis, graph theory, and differential equations (MATH 5198)
Probability and statistics (MATH 3800)
Introductory knowledge of molecular biology (BIOL 5099)

This course is generally taught each spring semester.

Topics
  • Sequence alignment
      Evolutionary events change DNA sequences, and alignment is a way to understand how one genome relates to another. Alignment is done at the nucleotide level, aligning DNA or RNA segments, and at the amino acid (residue) level, aligning proteins. Two sequences are easy to align, using dynamic programming as a solution technique. Aligning multiple sequences is hard, and there are several approaches to it.
  • Phylogenetic trees
      A phylogenetic tree is a graphical representation of the evolutionary history of a set of species or of their parts. We want to compute a phylogenetic tree in order to understand how life works. Specifically, such trees can help with sequence alignment, predicting protein structure, predicting gene expression, designing enhanced organisms (like wheat, rice, ...), mapping pathogen strain diversity for vaccines, and the epidemiology of infectious diseases or genetic defects. Some computational methods are distance based; others are based on parsimony or maximum likelihood.
  • Gene expression arrays
      Modern biology uses high-throughput methods based on chips, called microarrays, that measure gene expression indirectly. The volume of data on one chip can be enormous, and one biologist might generate several chips per day. It is the goal of computational biology to obtain knowledge from this large volume of data that is riddled with error. Clustering techniques are presented with illustrations of what can happen if they are used improperly, particularly issues of data conditioning. Principal Component Analysis and data visualization are among other topics, using data from a variety of biological databases.
  • Markov models
      The elementary Markov model is described and applied to a variety of problems, such as CpG island recognition, coding region recognition, and gene finding. These models are then extended to Hidden Markov Models (HMMs), showing many modern applications in biology. (HMMs originally were motivated by problems in speech recognition.)
  • Protein structure I: Basics
      This topic begins with what proteins are and how they are made and categorized. It then moves to the Protein Data Bank: what information it contains, how to obtain it, and how it relates to other databases. Visualization of proteins is illustrated with RasMol and/or CHIME.
  • Protein structure II: Analysis
      Starting with molecular geometry, we relate coordinates, angles, and distances. With perfect information we can transform from any one of these to any other, but there are problems in which we do not have perfect information, such as when distances are given between only some of the pairs of atoms. The dynamics of folding is then considered, showing the mathematical equations in the context of the biology, chemistry, and physics. The fundamental protein folding problem is defined, and its complexity discussed in depth (viz., Levinthal's paradox).
  • Systems biology
      Networks describe complex biological systems, such as protein-protein interactions, gene regulation, cell signaling, metabolic processes, and more. The cutting edge of modern biology is treating these networks as part of a system with highly interacting parts.
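
As an illustration of the dynamic programming technique mentioned under Sequence alignment, the following is a minimal sketch of global pairwise alignment scoring in the style of the Needleman-Wunsch algorithm. The function name and the scoring values (match = 1, mismatch = -1, gap = -2) are assumptions chosen for illustration; they are not taken from the course materials, and the substitution matrices and gap models used in practice may differ.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions, not course-specified.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = score[i - 1][j] + gap
            # Align b[j-1] with a gap in a.
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)

    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

The table has (len(a) + 1) x (len(b) + 1) entries and each is filled in constant time, which is why aligning two sequences is considered easy. Extending the same table to k sequences makes its size grow exponentially in k, which is one way to see why multiple alignment calls for other approaches.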


Copyright © 1999 - 2004 The CU Center for Computational Biology