Probalign

Probalign is a sequence alignment tool that computes a maximum expected accuracy alignment from partition function posterior probabilities.[1] The base pair probabilities, i.e. the probabilities that two positions are aligned to each other, are estimated from a distribution over alignments similar to the Boltzmann distribution. The partition function is calculated by dynamic programming.

Algorithm

The following describes the algorithm used by Probalign to determine the base pair probabilities.[2]

Alignment score

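To score an alignment of two sequences $x$ and $y$, two things are needed:

  • a similarity function $\sigma(x_i, y_j)$ (e.g. PAM, BLOSUM, ...)
  • an affine gap penalty: $g(k) = \alpha + \beta k$, where $k$ is the length of the gap

The score of an alignment $a$ is defined as:

$$S(a) = \sum_{x_i - y_j \in a} \sigma(x_i, y_j) + \text{gap cost}$$

where the gap cost is the sum of $g(k)$ over all gaps in $a$.

Now the Boltzmann-weighted score of an alignment $a$ is:

$$e^{S(a)/T}$$

where $T$ is a scaling factor.

The probability of an alignment, assuming a Boltzmann distribution, is given by

$$\Pr[a \mid x, y] = \frac{e^{S(a)/T}}{Z}$$

where $Z$ is the partition function, i.e. the sum of the Boltzmann weights of all alignments.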
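As a small illustration of the Boltzmann weighting, the following Python sketch scores a hand-enumerated set of alignments and normalizes their weights by the partition function. The names `alignment_score`, `alignment_probabilities` and `toy_sigma`, the match/mismatch scores, and the values of the gap parameters and of the scaling factor are assumptions made for this example; they are not taken from Probalign itself. Gap penalties are entered as negative numbers so that they lower the score.

```python
import math

def alignment_score(columns, sigma, alpha, beta):
    """Score an alignment given as a list of columns (a, b); '-' marks a gap."""
    score, prev = 0.0, None
    for a, b in columns:
        if a != "-" and b != "-":
            kind = "M"
            score += sigma(a, b)
        else:
            kind = "I" if a == "-" else "D"
            # Extending a gap costs beta; opening one costs g(1) = alpha + beta.
            score += beta if kind == prev else alpha + beta
        prev = kind
    return score

def alignment_probabilities(alignments, sigma, alpha, beta, T):
    """Return the Boltzmann probability of each alignment in the given set."""
    weights = [math.exp(alignment_score(a, sigma, alpha, beta) / T)
               for a in alignments]
    Z = sum(weights)  # partition function over the enumerated alignments
    return [w / Z for w in weights]

def toy_sigma(a, b):
    """Hypothetical similarity function standing in for PAM or BLOSUM."""
    return 1.0 if a == b else -1.0

# All three alignments of the one-letter sequences "A" and "A".
ALIGNMENTS = [
    [("A", "A")],              # match
    [("A", "-"), ("-", "A")],  # deletion followed by insertion
    [("-", "A"), ("A", "-")],  # insertion followed by deletion
]
probs = alignment_probabilities(ALIGNMENTS, toy_sigma,
                                alpha=-2.0, beta=-0.5, T=1.0)
```

Enumerating alignments explicitly is only feasible for very short sequences, since their number grows exponentially with sequence length; this is why the partition function is computed by dynamic programming instead.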

Dynamic Programming

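Let $Z_{i,j}$ denote the partition function of the prefixes $x_1, \ldots, x_i$ and $y_1, \ldots, y_j$, where $i = 0$ or $j = 0$ denotes an empty prefix. Three different cases are considered:

  1. $Z^M_{i,j}$: the partition function of all alignments of the two prefixes that end in a match.
  2. $Z^I_{i,j}$: the partition function of all alignments of the two prefixes that end in an insertion $(-, y_j)$.
  3. $Z^D_{i,j}$: the partition function of all alignments of the two prefixes that end in a deletion $(x_i, -)$.

Then we have:

$$Z_{i,j} = Z^M_{i,j} + Z^D_{i,j} + Z^I_{i,j}$$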

Initialization

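The matrices are initialized as follows:

$$Z^M_{0,0} = 1, \qquad Z^M_{0,j} = Z^M_{i,0} = 0 \quad (i, j > 0)$$

$$Z^D_{0,j} = 0, \qquad Z^I_{i,0} = 0$$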

Recursion

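The partition function for the alignments of two sequences $x$ and $y$ is given by $Z_{|x|,|y|}$, which can be recursively computed:

  • $Z^M_{i,j} = Z_{i-1,j-1} \cdot e^{\sigma(x_i, y_j)/T}$
  • $Z^D_{i,j} = Z^D_{i-1,j} \cdot e^{\beta/T} + Z^M_{i-1,j} \cdot e^{g(1)/T} + Z^I_{i-1,j} \cdot e^{g(1)/T}$
  • $Z^I_{i,j}$ analogously, with the roles of $i$ and $j$ and of $Z^D$ and $Z^I$ exchanged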
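The recursion can be sketched in Python as follows. This is a minimal illustration and not Probalign's actual implementation. It assumes an affine gap penalty g(k) = alpha + beta * k with negative alpha and beta, so that opening a gap contributes a factor exp(g(1)/T) and extending it a factor exp(beta/T), and it assumes that the match matrix is 1 for two empty prefixes and that all other border entries not reachable by the recursion are 0. The function name `partition_function` and the scoring function `toy_sigma` are hypothetical.

```python
import math

def partition_function(x, y, sigma, alpha, beta, T):
    """Return the matrices (Z, ZM, ZD, ZI) over all prefix pairs of x and y."""
    n, m = len(x), len(y)
    open_w = math.exp((alpha + beta) / T)  # weight of opening a gap, e^{g(1)/T}
    ext_w = math.exp(beta / T)             # weight of extending a gap
    ZM = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZD = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZI = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZM[0][0] = 1.0  # the empty alignment of two empty prefixes
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                # Match of x_i with y_j after any alignment of the shorter prefixes.
                prev = ZM[i-1][j-1] + ZD[i-1][j-1] + ZI[i-1][j-1]
                ZM[i][j] = prev * math.exp(sigma(x[i-1], y[j-1]) / T)
            if i > 0:
                # Deletion (x_i, -): extend a deletion or open a new gap.
                ZD[i][j] = (ZD[i-1][j] * ext_w
                            + (ZM[i-1][j] + ZI[i-1][j]) * open_w)
            if j > 0:
                # Insertion (-, y_j): the mirror image of the deletion case.
                ZI[i][j] = (ZI[i][j-1] * ext_w
                            + (ZM[i][j-1] + ZD[i][j-1]) * open_w)
    Z = [[ZM[i][j] + ZD[i][j] + ZI[i][j] for j in range(m + 1)]
         for i in range(n + 1)]
    return Z, ZM, ZD, ZI

def toy_sigma(a, b):
    """Hypothetical similarity function standing in for PAM or BLOSUM."""
    return 1.0 if a == b else -1.0

Z, ZM, ZD, ZI = partition_function("ACGT", "AGT", toy_sigma,
                                   alpha=-2.0, beta=-0.5, T=1.0)
total = Z[4][3]  # partition function of the two full sequences
```

For long sequences the entries of these matrices can overflow or underflow in floating point; a practical implementation would likely work in log space or rescale the rows, which this sketch omits for brevity.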

Base pair probability

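Finally, the probability that positions $x_i$ and $y_j$ form a base pair, i.e. are aligned to each other, is given by:

$$P(x_i - y_j \mid x, y) = \frac{Z_{i-1,j-1} \cdot e^{\sigma(x_i, y_j)/T} \cdot Z'_{i',j'}}{Z_{|x|,|y|}}$$

$Z'$, $i'$ and $j'$ are the respective values for $Z$ recalculated with both sequences reversed, so that $Z'_{i',j'}$ with $i' = |x| - i$ and $j' = |y| - j$ is the partition function of the suffixes following $x_i$ and $y_j$.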
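The posterior computation can be sketched in Python by running the same dynamic program twice, once on the sequences and once on their reversals. This is an illustrative sketch under stated assumptions, not Probalign's code: the helper names `prefix_partition`, `posterior_matrix` and `toy_sigma` are hypothetical, the gap penalty is assumed to be g(k) = alpha + beta * k with negative parameters, and the reversed-sequence indices are taken to be |x| - i and |y| - j.

```python
import math

def prefix_partition(x, y, sigma, alpha, beta, T):
    """Return Z[i][j], the partition function of the prefixes x[:i] and y[:j]."""
    n, m = len(x), len(y)
    open_w, ext_w = math.exp((alpha + beta) / T), math.exp(beta / T)
    ZM = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZD = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZI = [[0.0] * (m + 1) for _ in range(n + 1)]
    Z = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZM[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                ZM[i][j] = Z[i-1][j-1] * math.exp(sigma(x[i-1], y[j-1]) / T)
            if i > 0:
                ZD[i][j] = ZD[i-1][j] * ext_w + (ZM[i-1][j] + ZI[i-1][j]) * open_w
            if j > 0:
                ZI[i][j] = ZI[i][j-1] * ext_w + (ZM[i][j-1] + ZD[i][j-1]) * open_w
            Z[i][j] = ZM[i][j] + ZD[i][j] + ZI[i][j]
    return Z

def posterior_matrix(x, y, sigma, alpha, beta, T):
    """Return P with P[i-1][j-1] = posterior probability that x_i aligns to y_j."""
    n, m = len(x), len(y)
    Zf = prefix_partition(x, y, sigma, alpha, beta, T)              # prefixes
    Zr = prefix_partition(x[::-1], y[::-1], sigma, alpha, beta, T)  # suffixes
    total = Zf[n][m]
    return [[Zf[i-1][j-1] * math.exp(sigma(x[i-1], y[j-1]) / T)
             * Zr[n-i][m-j] / total
             for j in range(1, m + 1)]
            for i in range(1, n + 1)]

def toy_sigma(a, b):
    """Hypothetical similarity function standing in for PAM or BLOSUM."""
    return 1.0 if a == b else -1.0

P = posterior_matrix("ACGT", "AGT", toy_sigma, alpha=-2.0, beta=-0.5, T=1.0)
```

Each row of `P` sums to at most 1, because a position of the first sequence is aligned to at most one position of the second in any alignment; the remainder is the probability that the position is aligned to a gap.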

See also

  • ProbCons
  • Multiple Sequence Alignment

References

  1. U. Roshan and D. R. Livesay, "Probalign: multiple sequence alignment using partition function posterior probabilities", Bioinformatics, 22(22):2715-2721, 2006.
  2. Lecture "Bioinformatics II" at University of Freiburg.

This article is licensed under CC BY-SA 3.0.
Original source: https://en.wikipedia.org/wiki/Probalign