Multiple alignment parameters


These parameters control the final multiple alignment. This is the core of the program and the details are complicated. To fully understand the use of the parameters and the scoring system, you will have to refer to the documentation.

Each step in the final multiple alignment consists of aligning two alignments or sequences. This is done progressively, following the branching order in the GUIDE TREE. The basic parameters to control this are two gap penalties and the scores for various identical/non-identical residues.

1) and 2) The GAP PENALTIES are set by menu items 1 and 2. These control the cost of opening each new gap and the cost of each position within a gap. Increasing the gap opening penalty will make gaps less frequent. Increasing the gap extension penalty will make gaps shorter. Terminal gaps are not penalised.
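
As a rough illustration of how the two penalties interact, the sketch below scores the gaps in one aligned row under a simple affine model (opening penalty once per gap, extension penalty per gap position). The function names and the penalty values are assumptions made for this example; they are not the program's defaults, and the program's real scoring is more involved.

```python
def gap_cost(length, gap_open, gap_extend):
    """Cost of a single gap of the given length under a simple affine model."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def total_gap_cost(row, gap_open=10.0, gap_extend=1.0, gap_char="-"):
    """Sum the cost of all internal gaps in one aligned row.

    Terminal gaps are stripped first, because they are not penalised.
    """
    core = row.strip(gap_char)
    total = 0.0
    run = 0  # length of the gap currently being read
    for ch in core:
        if ch == gap_char:
            run += 1
        else:
            total += gap_cost(run, gap_open, gap_extend)
            run = 0
    return total


if __name__ == "__main__":
    # Two internal gaps (lengths 2 and 1); the terminal gaps cost nothing.
    print(total_gap_cost("--AC--GT-A--"))
```

Raising gap_open makes every additional gap more expensive, while raising gap_extend penalises long gaps more than short ones.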

3) The DELAY DIVERGENT SEQUENCES switch delays the alignment of the most distantly related sequences until after the most closely related sequences have been aligned. The setting shows the percent identity level required to delay the addition of a sequence; sequences that are less identical than this level to any other sequences will be aligned later.
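
The selection rule can be sketched as follows. This is only an illustration, assuming that a sequence is delayed when its best percent identity to every other sequence falls below the cutoff; the function name, the matrix layout and the cutoff value are hypothetical.

```python
def split_by_divergence(identity, cutoff):
    """Split sequence indices into (aligned_first, delayed).

    identity[i][j] is the percent identity between sequences i and j.
    A sequence is delayed if its best identity to any other sequence
    is below the cutoff (an assumed reading of the help text).
    """
    aligned_first, delayed = [], []
    n = len(identity)
    for i in range(n):
        others = [identity[i][j] for j in range(n) if j != i]
        best = max(others) if others else 100.0
        if best < cutoff:
            delayed.append(i)
        else:
            aligned_first.append(i)
    return aligned_first, delayed


if __name__ == "__main__":
    pid = [
        [100.0, 90.0, 20.0],
        [90.0, 100.0, 25.0],
        [20.0, 25.0, 100.0],
    ]
    # Sequence 2 is at most 25% identical to anything else, so it is delayed.
    print(split_by_divergence(pid, cutoff=30.0))
```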

4) The TRANSITION WEIGHT gives transitions (A <--> G or C <--> T, i.e. purine-purine or pyrimidine-pyrimidine substitutions) a weight between 0 and 1; a weight of zero means that the transitions are scored as mismatches, while a weight of 1 gives the transitions the match score. For distantly related DNA sequences, the weight should be near zero; for closely related sequences it can be useful to assign a higher score.
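
A minimal sketch of this weighting is given below. The help text only fixes the two end points (weight 0 gives the mismatch score, weight 1 gives the match score); the linear interpolation in between, the function names and the match/mismatch values are assumptions made for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True for A<->G or C<->T substitutions."""
    pair = {a, b}
    return a != b and (pair <= PURINES or pair <= PYRIMIDINES)


def pair_score(a, b, transition_weight, match=1.0, mismatch=0.0):
    """Score one pair of DNA residues.

    Transitions are scored between the mismatch and match scores,
    in proportion to transition_weight (assumed to be linear).
    """
    if a == b:
        return match
    if is_transition(a, b):
        return mismatch + transition_weight * (match - mismatch)
    return mismatch  # transversion


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        print(w, pair_score("A", "G", w), pair_score("A", "C", w))
```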

5) PROTEIN WEIGHT MATRIX leads to a new menu where you are offered a choice of weight matrices. The default for proteins is the BLOSUM series of matrices by Jorja and Steven Henikoff. Note, a series is used! The actual matrix that is used depends on how similar the sequences are at the current alignment step, because different matrices suit different evolutionary distances.
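
The idea of choosing a matrix from a series can be sketched as below. The cutoff values and the function name are invented purely for illustration and are not taken from the program; only the principle (more similar sequences get a matrix built for closer relationships) comes from the text above.

```python
# Hypothetical cutoffs: (minimum percent identity, matrix name), highest first.
ILLUSTRATIVE_SERIES = [
    (80.0, "BLOSUM80"),
    (60.0, "BLOSUM62"),
    (30.0, "BLOSUM45"),
    (0.0, "BLOSUM30"),
]


def pick_matrix(percent_identity, series=ILLUSTRATIVE_SERIES):
    """Return the name of the first matrix whose cutoff is met."""
    for cutoff, name in series:
        if percent_identity >= cutoff:
            return name
    return series[-1][1]


if __name__ == "__main__":
    for pid in (95.0, 65.0, 40.0, 15.0):
        print(pid, pick_matrix(pid))
```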

6) DNA WEIGHT MATRIX leads to a new menu where a single matrix (not a series) can be selected. The default is the matrix used by BESTFIT for comparison of nucleic acid sequences.

Further help is offered in the weight matrix menu.

7) In the weight matrices, you can use negative as well as positive values if you wish, although the matrix will be automatically adjusted to all positive scores, unless the NEGATIVE MATRIX option is selected.
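
One simple way such an adjustment could work is to shift every score by the same amount so that the lowest score becomes zero. The sketch below assumes that approach; the function name is hypothetical and the program's actual adjustment may differ.

```python
def shift_to_non_negative(matrix):
    """Return a copy of the matrix with all scores shifted so the minimum is 0.

    A matrix that has no negative scores is returned unchanged (as a copy).
    """
    lowest = min(min(row) for row in matrix)
    if lowest >= 0:
        return [list(row) for row in matrix]
    return [[value - lowest for value in row] for row in matrix]


if __name__ == "__main__":
    scores = [
        [2, -1],
        [-1, 3],
    ]
    print(shift_to_non_negative(scores))
```

Because every entry is shifted by the same constant, the ranking of residue pairs is preserved, although the balance between substitution scores and gap penalties changes.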

8) PROTEIN GAP PARAMETERS displays a menu allowing you to set some Gap Penalty options which are only used in protein alignments.