profalign
Align two profiles using Needleman-Wunsch global alignment
Syntax
Prof
= profalign(Prof1
, Prof2
)
[Prof, H1, H2
] = profalign(Prof1
, Prof2
)
profalign(..., 'ScoringMatrix', ScoringMatrixValue
,
...)
profalign(..., 'GapOpen', {G1Value
, G2Value
},
...)
profalign(..., 'ExtendGap', {E1Value
, E2Value
},
...)
profalign(..., 'ExistingGapAdjust', ExistingGapAdjustValue
,
...)
profalign(..., 'TerminalGapAdjust', TerminalGapAdjustValue
,
...)
profalign(..., 'ShowScore', ShowScoreValue
,
...)
Description
returns
a new profile (Prof
= profalign(Prof1
, Prof2
)Prof
) for the optimal global
alignment of two profiles (Prof1
, Prof2
).
The profiles (Prof1
, Prof2
)
are numeric arrays of size [(4 or 5 or 20 or 21) x Profile
Length]
with counts or weighted profiles. Weighted profiles
are used to down-weight similar sequences and up-weight divergent
sequences. The output profile is a numeric matrix of size [(5
or 21) x New Profile Length]
where the last row represents
gaps. Original gaps in the input profiles are preserved. The output
profile is the result of adding the aligned columns of the input profiles.
[
returns
pointers that indicate how to rearrange the columns of the original
profiles into the new profile.Prof, H1, H2
] = profalign(Prof1
, Prof2
)
profalign(..., '
calls PropertyName
', PropertyValue
,
...)profalign
with optional properties
that use property name/property value pairs. You can specify one or
more properties in any order. Each PropertyName
must
be enclosed in single quotation marks and is case insensitive. These
property name/property value pairs are as follows:
profalign(..., 'ScoringMatrix',
defines the scoring matrix to be used for the alignment. ScoringMatrixValue
,
...)
ScoringMatrixValue
can be either
of the following:
Character vector or string specifying the scoring matrix to use for the alignment. Choices for amino acid sequences are:
'BLOSUM62'
'BLOSUM30'
increasing by5
up to'BLOSUM90'
'BLOSUM100'
'PAM10'
increasing by10
up to'PAM500'
'DAYHOFF'
'GONNET'
Default is:
'BLOSUM50'
— WhenAlphabetValue
equals'AA'
'NUC44'
— WhenAlphabetValue
equals'NT'
Note
The above scoring matrices, provided with the software, also include a structure containing a scale factor that converts the units of the output score to bits. You can also use the
'Scale'
property to specify an additional scale factor to convert the output score from bits to another unit.Matrix representing the scoring matrix to use for the alignment, such as returned by the
blosum
,pam
,dayhoff
,gonnet
, ornuc44
function.Note
If you use a scoring matrix that you created or was created by one of the above functions, the matrix does not include a scale factor. The output score will be returned in the same units as the scoring matrix.
Note
If you need to compile
profalign
into a stand-alone
application or software component using
MATLAB®
Compiler™, use a matrix instead of a character
vector or string for
ScoringMatrixValue
.
profalign(..., 'GapOpen', {
sets the penalties for opening a gap in the first
and second profiles respectively. G1Value
, G2Value
},
...)G1Value
and G2Value
can
be either scalars or vectors. When using a vector, the number of elements
is one more than the length of the input profile. Every element indicates
the position specific penalty for opening a gap between two consecutive
symbols in the sequence. The first and the last elements are the gap
penalties used at the ends of the sequence. The default gap open penalties
are {10,10}
.
profalign(..., 'ExtendGap', {
sets the penalties for extending a gap in the first
and second profile respectively. E1Value
, E2Value
},
...)E1Value
and E2Value
can
be either scalars or vectors. When using a vector, the number of elements
is one more than the length of the input profile. Every element indicates
the position specific penalty for extending a gap between two consecutive
symbols in the sequence. The first and the last elements are the gap
penalties used at the ends of the sequence. If ExtendGap
is
not specified, then extensions to gaps are scored with the same value
as GapOpen
.
profalign(..., 'ExistingGapAdjust',
, if ExistingGapAdjustValue
,
...)ExistingGapAdjustValue
is false
,
turns off the automatic adjustment based on existing gaps of the position-specific
penalties for opening a gap. When ExistingGapAdjustValue
is true
(default),
for every profile position, profalign
proportionally
lowers the penalty for opening a gap toward the penalty of extending
a gap based on the proportion of gaps found in the contiguous symbols
and on the weight of the input profile.
profalign(..., 'TerminalGapAdjust',
, when TerminalGapAdjustValue
,
...)TerminalGapAdjustValue
is true
,
adjusts the penalty for opening a gap at the ends of the sequence
to be equal to the penalty for extending a gap. Default is false
.
profalign(..., 'ShowScore',
, when ShowScoreValue
,
...)ShowScoreValue
is true
,
displays the scoring space and the winning path.
Examples
Read in sequences and create profiles.
ma1 = ['RGTANCDMQDA';'RGTAHCDMQDA';'RRRAPCDL-DA']; ma2 = ['RGTHCDLADAT';'RGTACDMADAA']; p1 = seqprofile(ma1,'gaps','all','counts',true); p2 = seqprofile(ma2,'counts',true);
Merge two profiles into a single one by aligning them.
p = profalign(p1,p2); seqlogo(p)
Use the output pointers to generate the multiple alignment.
[p, h1, h2] = profalign(p1,p2); ma = repmat('-',5,12); ma(1:3,h1) = ma1; ma(4:5,h2) = ma2; disp(ma)
Increase the gap penalty before cysteine in the second profile.
gapVec = 10 + [p2(aa2int('C'),:) 0] * 10 p3 = profalign(p1,p2,'gapopen',{10,gapVec}); seqlogo(p3)
Add a new sequence to a profile without inserting new gaps into the profile.
gapVec = [0 inf(1,11) 0]; p4 = profalign(p3,seqprofile('PLHFMSVLWDVQQWP'),... 'gapopen',{gapVec,10}); seqlogo(p4)
Version History
Introduced before R2006a