These parameters control the final multiple alignment. This is the core of
the program and the details are complicated. To fully understand the use
of the parameters and the scoring system, you will have to refer to the
documentation.
Each step in the final multiple alignment consists of aligning two alignments
or sequences. This is done progressively, following the branching order in
the GUIDE TREE. The basic parameters to control this are two gap penalties and
the scores for various identical/non-identical residues.
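As a rough illustration of "following the branching order", the Python sketch
below walks a guide tree from the tips inward and lists the order in which
sequences or alignments would be joined. The nested-tuple tree format, the
function name merge_order and the example tree are assumptions made for this
illustration only; they are not the program's own data structures, and the
program's actual ordering may also take branch lengths into account.

```python
def merge_order(tree):
    """Return (steps, leaves) for a guide tree.

    A tree is either a sequence name (str) or a pair (left, right).
    Each step is (left_members, right_members), listed in the order
    in which the alignments would be performed.
    """
    if isinstance(tree, str):
        return [], [tree]
    left, right = tree
    left_steps, left_leaves = merge_order(left)
    right_steps, right_leaves = merge_order(right)
    # Both subtrees must be aligned before they can be joined.
    step = (tuple(left_leaves), tuple(right_leaves))
    return left_steps + right_steps + [step], left_leaves + right_leaves


# Hypothetical guide tree: A and B join first, then C, then the D/E pair.
guide_tree = ((("A", "B"), "C"), ("D", "E"))
steps, leaves = merge_order(guide_tree)
for left, right in steps:
    print("align", "+".join(left), "with", "+".join(right))
```

Every step joins exactly two groups, each of which is either a single sequence
or an alignment produced by an earlier step.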
1) and 2) The GAP PENALTIES are set by menu items 1 and 2. These control the
cost of opening up every new gap and the cost of every item in a gap.
Increasing the gap opening penalty will make gaps less frequent. Increasing
the gap extension penalty will make gaps shorter. Terminal gaps are not
penalised.
3) The DELAY DIVERGENT SEQUENCES switch delays the alignment of the most
distantly related sequences until after the most closely related sequences have
been aligned. The setting shows the percent identity level required to delay
the addition of a sequence; a sequence that is less identical than this level
to every other sequence will be aligned later.
4) The TRANSITION WEIGHT gives transitions (A <--> G or C <--> T,
i.e. purine-purine or pyrimidine-pyrimidine substitutions) a weight between 0
and 1; a weight of zero means that the transitions are scored as mismatches,
while a weight of 1 gives the transitions the match score. For distantly related
DNA sequences, the weight should be near to zero; for closely related sequences
it can be useful to assign a higher score.
5) PROTEIN WEIGHT MATRIX leads to a new menu where you are offered a
choice of weight matrices. The default for proteins is the BLOSUM series of
matrices by Jorja and Steven Henikoff. Note that a series is used: the actual
matrix chosen at each alignment step depends on how similar the sequences to
be aligned at that step are. Different matrices suit different evolutionary
distances.
6) DNA WEIGHT MATRIX leads to a new menu where a single matrix (not a series)
can be selected. The default is the matrix used by BESTFIT for comparison of
nucleic acid sequences.
Further help is offered in the weight matrix menu.
7) In the weight matrices, you can use negative as well as positive values if
you wish, although the matrix will be automatically adjusted to all positive
scores, unless the NEGATIVE MATRIX option is selected.
8) PROTEIN GAP PARAMETERS displays a menu allowing you to set some Gap Penalty
options which are only used in protein alignments.
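The effect of the gap penalties (items 1 and 2) and of the transition weight
(item 4) on a score can be sketched in Python. This is a minimal sketch under
stated assumptions, not the program's scoring code: the function names, the
numeric values, the linear interpolation between mismatch and match scores,
and the convention "opening cost plus extension cost for every gap position"
are all chosen for illustration.

```python
import re

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True for A<->G or C<->T substitutions (upper-case DNA assumed)."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def pair_score(a, b, match=1.0, mismatch=0.0, transition_weight=0.5):
    """Score one aligned residue pair.

    Weight 0 scores a transition as a mismatch, weight 1 as a match.
    Interpolating linearly in between is an assumption of this sketch.
    """
    if a == b:
        return match
    if is_transition(a, b):
        return mismatch + transition_weight * (match - mismatch)
    return mismatch


def gap_cost(length, gap_open, gap_extend):
    """Penalty for one gap: opening cost plus a cost for every position."""
    return gap_open + gap_extend * length


def total_gap_penalty(row, gap_open=10.0, gap_extend=0.5):
    """Sum the penalties of the internal gaps in one aligned row.

    Leading and trailing runs of '-' are terminal gaps and cost nothing.
    The default penalty values here are illustrative only.
    """
    core = row.strip("-")
    return sum(gap_cost(len(run), gap_open, gap_extend)
               for run in re.findall(r"-+", core))


# Two internal gaps (lengths 2 and 1); the terminal gaps are free.
penalty = total_gap_penalty("--AC--GT-A--")
print(penalty)
```

With these conventions, raising gap_open makes each additional gap more
expensive regardless of its length, while raising gap_extend makes long gaps
more expensive than short ones.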
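As an illustration of item 3, the Python sketch below separates sequences
that would be aligned first from those that would be delayed. It reads the
rule as "a sequence is delayed when even its best percent identity to another
sequence is below the chosen level". That reading, the function name, the
cutoff value and the identity figures are assumptions made for this example.

```python
def split_by_divergence(identity, cutoff=30.0):
    """Partition sequence names into (aligned_first, delayed).

    `identity` maps each sequence name to a dict holding its percent
    identity with every other sequence. A sequence is delayed when its
    highest identity to any other sequence is below `cutoff`.
    """
    first, delayed = [], []
    for name, others in identity.items():
        best = max(others.values())
        if best < cutoff:
            delayed.append(name)
        else:
            first.append(name)
    return first, delayed


# Hypothetical percent identities between four sequences.
identity = {
    "A": {"B": 85.0, "C": 60.0, "D": 22.0},
    "B": {"A": 85.0, "C": 58.0, "D": 25.0},
    "C": {"A": 60.0, "B": 58.0, "D": 20.0},
    "D": {"A": 22.0, "B": 25.0, "C": 20.0},
}
first, delayed = split_by_divergence(identity)
print("aligned first:", first)
print("delayed:", delayed)
```

Here only D is delayed, because no other sequence is at least 30 percent
identical to it; raising the cutoff to 70 would also delay C.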