Bowtie2AlignOptions
Options to map reads to reference sequence
Description
A Bowtie2AlignOptions object contains options to run the bowtie2 function, which aligns reads to a reference sequence.
Creation
Syntax
alignOptions = Bowtie2AlignOptions
alignOptions = Bowtie2AlignOptions(Name,Value)
Description
alignOptions = Bowtie2AlignOptions creates a Bowtie2AlignOptions object with default property values.
Bowtie2AlignOptions requires the Bowtie 2 Support Package for Bioinformatics Toolbox™. If this support package is not installed, then the function provides a download link. For details, see Bioinformatics Toolbox Software Support Packages.
alignOptions = Bowtie2AlignOptions(Name,Value) sets properties using one or more name-value pair arguments. Enclose each property name in quotes. For example, alignOptions = Bowtie2AlignOptions('Trim5',10) specifies to trim 10 residues from the 5' end.
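The two creation syntaxes can be sketched as follows, assuming the Bowtie 2 support package is installed. The property names and values are taken from the Properties section; the variable names are illustrative only.

```matlab
% Create an options object with default property values.
defaultOpts = Bowtie2AlignOptions;

% Create an options object and set several properties at creation.
customOpts = Bowtie2AlignOptions('Trim5',10,'Mode','Local','NumThreads',4);

% Properties can also be changed after creation with dot notation.
customOpts.Trim3 = 4;
```

Setting properties at creation and setting them afterward with dot notation should be interchangeable; which one to use is a matter of style.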
Input Arguments
S
— Alignment parameters
character vector
Alignment parameters, specified as a character vector. S must be in the Bowtie 2 option syntax (prefixed by one or two dashes) [1].
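A hedged sketch of passing native Bowtie 2 options follows. It assumes that S is passed as the only input to Bowtie2AlignOptions, which this page implies but does not show. The options --trim5 and --local come from the Bowtie 2 manual; how they map onto object properties is not verified here.

```matlab
% Assumption: S is the sole input argument to the constructor.
% '--trim5 10' trims 10 bases from the 5' end of each read.
% '--local' selects the local alignment mode.
nativeOpts = Bowtie2AlignOptions('--trim5 10 --local');
```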
Properties
AllowDovetail
— Flag to allow dovetail configurations
false
(default) | true
Flag to allow dovetail configurations, specified as true or false. This property specifies whether the alignment of one mate can extend past the beginning of the alignment of the other mate and be considered concordant.
This property applies to paired-end reads only.
Example: 'AllowDovetail',true
Data Types: logical
AmbiguousPenalty
— Penalty for positions with ambiguous characters
1
(default) | nonnegative integer
Penalty for positions with ambiguous characters on the read sequence, reference sequence, or both, specified as a nonnegative integer.
Example: 'AmbiguousPenalty',2
Data Types: double
Encoding
— Encoding format of base quality
'Phred33'
(default) | 'Phred64'
| 'Solexa'
Encoding format of the base quality in the input files, specified as one of the following: 'Phred33', 'Phred64', or 'Solexa'.
Example: 'Encoding','Phred64'
Data Types: char | string
ExcludeContain
— Flag to allow one mate alignment to contain other mate
false
(default) | true
Flag to allow one mate alignment to contain the alignment of the other mate and to be considered concordant, specified as true or false.
This property applies to paired-end reads only.
Example: 'ExcludeContain',true
Data Types: logical
ExcludeDiscordant
— Flag to exclude discordant alignments
false
(default) | true
Flag to exclude discordant alignments, specified as true or false. A discordant alignment is an alignment where both mates align uniquely, but not in a way that satisfies the paired-end constraints.
Example: 'ExcludeDiscordant',true
Data Types: logical
ExcludeMixed
— Flag to exclude mixed alignments
false
(default) | true
Flag to exclude mixed alignments, specified as true or false. A mixed alignment consists of mate reads that are not concordant or discordant, but align individually.
This property applies to paired-end reads only.
Example: 'ExcludeMixed',true
Data Types: logical
ExcludeOverlap
— Flag to allow mate alignment overlap
false
(default) | true
Flag to allow the alignment of one mate to overlap with the alignment of the other mate and to be considered concordant, specified as true or false.
Example: 'ExcludeOverlap',true
Data Types: logical
ExcludeUnaligned
— Flag to exclude reads that failed to align
false
(default) | true
Flag to exclude reads that failed to align, specified as true or false.
Example: 'ExcludeUnaligned',true
Data Types: logical
ExtraBowtie2Command
— Additional options not included in object properties
''
(default) | character vector
Additional options not included in the object properties, specified as a character vector. The character vector must be in the Bowtie 2 option syntax (prefixed by one or two dashes). The default value is an empty character vector ''.
Example: 'ExtraBowtie2Command','--version'
Data Types: char | string
IgnoreQuality
— Flag to ignore read position quality
false
(default) | true
Flag to ignore the actual read position quality when a mismatch occurs, specified as true or false. If true, the function treats the quality value at the mismatched position as the highest possible value, regardless of the actual value.
Example: 'IgnoreQuality',true
Data Types: logical
MatchBonus
— Reward added to alignment score
2
(default) | nonnegative integer
Reward added to the alignment score when a position in the read matches a position in the reference, specified as a nonnegative integer.
Example: 'MatchBonus',5
Data Types: double
MaxAmbiguousFunction
— Function governing maximum number of ambiguous characters
'L,0,0.15'
(default) | character vector | string
Function governing the maximum number of ambiguous characters allowed in a read, specified as a character vector or string.
The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:
- 'C' – Constant
- 'L' – Linear
- 'S' – Square root
- 'G' – Natural log
The resulting function is H(x) = B + A * f(x), where x is the read length.
The default function is 'L,0,0.15', that is, H(x) = 0 + 0.15 * x.
Example: 'MaxAmbiguousFunction','L,-0.4,-0.6'
Data Types: char | string
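To make the 'f,B,A' format concrete, the following sketch parses a specification and evaluates H(x) = B + A * f(x) for one read length. It uses only base MATLAB and does not call the toolbox. The read length is a hypothetical value, and the constant type 'C' is left out because this page does not say what f(x) is in that case.

```matlab
% Evaluate a function specification of the form 'f,B,A' for one read length.
spec = 'L,0,0.15';      % default MaxAmbiguousFunction
readLength = 100;       % hypothetical read length

% Split the specification into function type, constant term, and coefficient.
parts = strsplit(spec, ',');
fType = parts{1};
B = str2double(parts{2});
A = str2double(parts{3});

% Apply the function type to the read length.
switch fType
    case 'L'            % linear
        fx = readLength;
    case 'S'            % square root
        fx = sqrt(readLength);
    case 'G'            % natural log
        fx = log(readLength);
    otherwise
        error('Function type %s is not handled in this sketch.', fType);
end

H = B + A * fx;         % 0 + 0.15 * 100
fprintf('H(%d) = %g\n', readLength, H);
```

With the default specification, a 100-residue read gives H = 15. How Bowtie 2 rounds a fractional result is not covered here.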
MemoryMappedIndex
— Flag to use memory mapping when loading index
false
(default) | true
Flag to use memory mapping (instead of file I/O) when loading the index, specified as true or false. Memory mapping allows many concurrent processes to share the memory image of the index, resulting in a more efficient parallelization of the task.
Example: 'MemoryMappedIndex',true
Data Types: logical
MinScoreFunction
— Function governing minimum score threshold of alignment
character vector | string
Function governing the minimum score threshold of an alignment, specified as a character vector or string.
The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:
- 'C' – Constant
- 'L' – Linear
- 'S' – Square root
- 'G' – Natural log
The resulting function is H(x) = B + A * f(x), where x is the read length.
For the 'EndToEnd' alignment mode, the default function is 'L,-0.6,-0.6'. For the 'Local' mode, the default function is 'G,20,8'.
Example: 'MinScoreFunction','L,-0.4,-0.6'
Data Types: char | string
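As a worked illustration of the two mode-specific defaults, the following base MATLAB sketch evaluates each default function for one hypothetical read length. It does not call the toolbox.

```matlab
readLength = 50;   % hypothetical read length

% Default for 'EndToEnd': 'L,-0.6,-0.6' gives H(x) = -0.6 - 0.6 * x
minScoreEndToEnd = @(x) -0.6 + (-0.6) * x;

% Default for 'Local': 'G,20,8' gives H(x) = 20 + 8 * ln(x)
minScoreLocal = @(x) 20 + 8 * log(x);

thresholdEndToEnd = minScoreEndToEnd(readLength);   % -30.6
thresholdLocal = minScoreLocal(readLength);         % about 51.3

fprintf('EndToEnd threshold: %.1f\n', thresholdEndToEnd);
fprintf('Local threshold:    %.1f\n', thresholdLocal);
```

The end-to-end threshold is negative and the local threshold is positive, so thresholds from the two modes are not directly comparable.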
MismatchPenalty
— Maximum and minimum values to compute mismatch penalty
[6 2]
(default) | two-element vector
Maximum and minimum values to compute the mismatch penalty during alignment, specified as a two-element vector. The first element is the maximum value and the second element is the minimum value.
For each position where a read character aligns to a reference character, the characters do not match, and neither is an N character, the function subtracts a number between the minimum and maximum values (inclusive) from the alignment score.
Example: 'MismatchPenalty',[5 3]
Data Types: double
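This page does not state how the value between the minimum and maximum is chosen. The Bowtie 2 manual gives the penalty as MN + floor((MX - MN) * min(Q, 40) / 40), where Q is the Phred quality at the mismatched position. The sketch below evaluates that formula in base MATLAB, assuming the toolbox passes the two values through to Bowtie 2 unchanged. The quality values are hypothetical.

```matlab
mismatchPenalty = [6 2];          % default [MX MN]
MX = mismatchPenalty(1);          % maximum penalty
MN = mismatchPenalty(2);          % minimum penalty

% Penalty as a function of the Phred quality Q (formula from the Bowtie 2 manual).
penalty = @(Q) MN + floor((MX - MN) .* (min(Q, 40) ./ 40));

lowQuality  = penalty(10);   % 2 + floor(4 * 0.25) = 3
midQuality  = penalty(20);   % 2 + floor(4 * 0.50) = 4
highQuality = penalty(40);   % 2 + floor(4 * 1.00) = 6
```

Per the same manual, ignoring qualities makes every mismatch cost the maximum value, which is consistent with the IgnoreQuality description on this page.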
Mode
— Alignment mode
'EndToEnd'
(default) | 'Local'
Alignment mode, specified as 'EndToEnd' or 'Local'.
In the 'Local' mode, only part of the read must align to the reference, and some residues can be omitted (soft-clipped) to achieve the best alignment score. In the 'EndToEnd' mode, the entire read must align without any soft-clipping.
Example: 'Mode','Local'
Data Types: char | string
Nondeterministic
— Flag to reinitialize pseudo-random generator
false
(default) | true
Flag to reinitialize the pseudo-random generator for each read using the current time, specified as true or false. If true, the alignments reported for two identical reads can be different. The default value is false, that is, the pseudo-random generator is reinitialized using a seed derived from read information and the seed number.
Example: 'Nondeterministic',true
Data Types: logical
NoGapPositions
— Number of positions where gaps are not allowed
4
(default) | nonnegative integer
Number of positions at the beginning or end of each read where gaps are not allowed, specified as a nonnegative integer.
Example: 'NoGapPositions',5
Data Types: double
NumAlignments
— Maximum number of valid alignments to report
'Best'
(default) | 'All'
| positive integer
Maximum number of valid alignments to report before terminating the search, specified as a positive integer, 'Best', or 'All'. If you specify a positive integer N, the function searches for up to N distinct, valid alignments for each read. 'Best' reports the best alignment for each read. 'All' reports all the valid alignments for each read sorted by alignment scores.
The alignment score for a paired-end alignment equals the sum of the alignment scores of individual mates.
Example: 'NumAlignments','All'
Data Types: double | char | string
NumReseedings
— Maximum number of reseeding attempts
2
(default) | nonnegative integer
Maximum number of reseeding attempts with repetitive seeds, specified as a nonnegative integer. During reseeding, the function chooses a new set of seeds at different offsets to find more alignments.
Example: 'NumReseedings',5
Data Types: double
NumSeedExtensions
— Maximum number of consecutive seed extension attempts
15
(default) | nonnegative integer
Maximum number of consecutive seed extension attempts before getting a new seed, specified as a nonnegative integer. A seed extension fails if it does not yield an alignment with the best (or second-best) score.
Example: 'NumSeedExtensions',10
Data Types: double
NumSeedMismatches
— Number of allowed mismatches in seed alignment
0
(default) | 1
Number of allowed mismatches in a seed alignment during the multiseed alignment, specified as 0 or 1.
Example: 'NumSeedMismatches',1
Data Types: double
NumThreads
— Number of parallel threads to perform alignment
1
(default) | positive integer
Number of parallel threads to perform the alignment, specified as a positive integer. Threads run on separate processors or cores. Increasing the number of threads provides a significant increase in speed (close to linear) but also increases the memory footprint.
Example: 'NumThreads',4
Data Types: double
Offrate
— Offrate to use when reading index
NaN
(default) | positive integer
Offrate to use when reading the index to reduce the memory footprint, specified as a positive integer. The offrate must be greater than the offrate used to build the index.
Example: 'Offrate',20
Data Types: double
PadPositions
— Position in reference sequence where alignment begins
15
(default) | nonnegative integer
Position in the reference sequence where the alignment for each sequence begins, specified as a nonnegative integer.
Example: 'PadPositions',10
Data Types: double
ReadGapCosts
— Gap costs for opening and extending gap
[5 3]
(default) | two-element vector of nonnegative integers
Gap costs for opening and extending a gap on the read, specified as a two-element vector of nonnegative integers. The first element is the cost of opening a gap, and the second element is the cost of extending a gap. Given the cost vector [GO GE], a read gap of length N is assigned a penalty of GO + N * GE.
Example: 'ReadGapCosts',[4 2]
Data Types: double
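The gap penalty formula GO + N * GE can be checked with a few lines of base MATLAB. The gap lengths below are hypothetical, and the sketch does not call the toolbox.

```matlab
readGapCosts = [5 3];     % default [GO GE]
GO = readGapCosts(1);     % cost of opening a gap
GE = readGapCosts(2);     % cost of extending a gap

gapLengths = 1:4;                        % hypothetical gap lengths N
gapPenalties = GO + gapLengths * GE;     % [8 11 14 17]

disp([gapLengths; gapPenalties]);
```

Because the extension cost applies to every position in the gap, including the first, a gap of length 1 already costs GO + GE.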
ReadGroupID
— Read group ID to add on @RG header line
''
(default) | character vector | string
Read group ID to add on the @RG header line in the output SAM report, specified as a character vector or string. If you specify any read group ID, the function prints the @RG header line with the tag ID: followed by the specified group ID.
Example: 'ReadGroupID','ID1'
Data Types: char | string
ReadGroup
— Read group information to add as field on @RG header line
''
(default) | character vector | string
Read group information to add as a field on the @RG header line in the output SAM report, specified as a character vector or string. This property applies only if you specify 'ReadGroupID'.
Example: 'ReadGroup','Control'
Data Types: char | string
RefGapCosts
— Gap costs for opening and extending gap
[5 3]
(default) | two-element vector of nonnegative integers
Gap costs for opening and extending a gap on the reference, specified as a two-element vector of nonnegative integers. The first element is the cost of opening a gap, and the second element is the cost of extending a gap. Given the cost vector [GO GE], a reference gap of length N is assigned a penalty of GO + N * GE.
Example: 'RefGapCosts',[4 2]
Data Types: double
Reorder
— Flag to reorder SAM records
false
(default) | true
Flag to reorder SAM records to maintain the same order as in the input files, specified as true or false. This property applies only when the number of parallel threads is greater than one. When you use one thread, the order of the records in the output is the same as the order of the input.
Example: 'Reorder',true
Data Types: logical
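Because Reorder only matters with more than one thread, it is typically set together with NumThreads. A minimal sketch, assuming the Bowtie 2 support package is installed (the thread count is an arbitrary choice):

```matlab
opts = Bowtie2AlignOptions;
opts.NumThreads = 4;    % align with four parallel threads
opts.Reorder = true;    % keep SAM records in the same order as the input
```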
Seed
— Number to set seed in pseudo-random number generator
0
(default) | nonnegative integer
Number to set the seed in the pseudo-random number generator, specified as a nonnegative integer.
Example: 'Seed',3
Data Types: double
SeedIntervalFunction
— Function governing distance between seed substrings
character vector | string
Function governing the distance between seed substrings during the multiseed alignment, specified as a character vector or string.
The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:
- 'C' – Constant
- 'L' – Linear
- 'S' – Square root
- 'G' – Natural log
The resulting function is H(x) = B + A * f(x), where x is the read length.
For the 'EndToEnd' alignment mode, the default function is 'S,1,1.15'. For the 'Local' mode, the default function is 'S,1,0.75'.
Example: 'SeedIntervalFunction','S,2,2.15'
Data Types: char | string
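The two default seed interval functions can be compared across read lengths with base MATLAB. The read lengths are hypothetical, and the sketch shows only the raw value of H(x), not any rounding Bowtie 2 may apply.

```matlab
readLengths = [50 100 150];   % hypothetical read lengths

intervalEndToEnd = 1 + 1.15 * sqrt(readLengths);   % default 'S,1,1.15'
intervalLocal    = 1 + 0.75 * sqrt(readLengths);   % default 'S,1,0.75'

% Show the two sets of intervals side by side.
disp(table(readLengths', intervalEndToEnd', intervalLocal', ...
    'VariableNames', {'ReadLength','EndToEnd','Local'}));
```

For a 100-residue read the values are 12.5 and 8.5, so the local default places seeds closer together than the end-to-end default.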
SeedLength
— Seed substring length to align during multiseed alignment
20
(default) | positive integer
Seed substring length to align during the multiseed alignment, specified as a positive integer.
Example: 'SeedLength',25
Data Types: double
Skip
— Number of reads to ignore
0
(default) | nonnegative integer
Number of reads to ignore from the beginning of the input files, specified as a nonnegative integer.
Example: 'Skip',5
Data Types: double
Trim3
— Number of residues to trim from 3' end
0
(default) | nonnegative integer
Number of residues to trim from the 3' end of each read before aligning, specified as a nonnegative integer.
Example: 'Trim3',5
Data Types: double
Trim5
— Number of residues to trim from 5' end
0
(default) | nonnegative integer
Number of residues to trim from the 5' end of each read before aligning, specified as a nonnegative integer.
Example: 'Trim5',5
Data Types: double
UpTo
— Number of reads to consider from beginning of input files
Inf
(default) | positive integer
Number of reads to consider from the beginning of input files, specified as a positive integer. The default value is Inf, that is, all reads are considered.
Example: 'UpTo',1000
Data Types: double
Object Functions
getBowtie2Command | Translate object properties to Bowtie 2 options |
getBowtie2Table | Retrieve table with object properties and equivalent Bowtie 2 options |
preset | Set combination of alignment options |
run | Map sequence reads to reference sequence using Bowtie 2 |
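A short sketch of the two inspection functions listed above. It assumes that each takes the options object as its only required input and that the Bowtie 2 support package is installed. The exact form of the returned values is not shown on this page, so the code only displays them.

```matlab
opts = Bowtie2AlignOptions('Trim3',4,'Mode','Local');

% Translate the object properties into Bowtie 2 option syntax.
cmd = getBowtie2Command(opts);
disp(cmd)

% List the properties alongside their equivalent Bowtie 2 options.
tbl = getBowtie2Table(opts);
disp(tbl)
```

Checking the translated options this way can help confirm that a property maps to the Bowtie 2 flag you expect before running a long alignment.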
Examples
Align Reads to Reference Sequence Using Bowtie 2
Build a set of index files for the Drosophila genome. An error message appears if you do not have the Bowtie 2 Support Package for Bioinformatics Toolbox installed when you run the function. Click the provided link to download the package from the Add-on menu.
For this example, the reference sequence Dmel_chr4.fa is already provided with the toolbox.
status = bowtie2build('Dmel_chr4.fa', 'Dmel_chr4');
If the index build is successful, the function returns 0 and creates the index files (*.bt2) in the current folder. The files have the prefix 'Dmel_chr4'.
Sometimes the index files exist, and you want to know the reference sequence used to build the index. In this case, use the bowtie2inspect function to get more information about the reference.
bowtie2inspect('Dmel_chr4', 'Dmel_chr4_retrieved.fa');
By default, the output file Dmel_chr4_retrieved.fa contains the sequence of the reference. You can also get summary information about the reference names and lengths instead of the actual sequence. For details on the available options, see Bowtie2InspectOptions.
Once the index is ready, map the read sequences to the reference using the bowtie2 function. The paired-end read files (SRR6008575_10k_1.fq and SRR6008575_10k_2.fq) are already provided with the toolbox.
bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4.sam');
The output is a SAM-formatted file that contains the mapping results.
You can specify different alignment options by passing in a Bowtie 2 syntax string or using a Bowtie2AlignOptions object.
Suppose you want to trim some residues from the 3' end before aligning. First, create a Bowtie2AlignOptions object.
alignOpt = Bowtie2AlignOptions;
Trim four residues from the 3' end before aligning.
alignOpt.Trim3 = 4;
Map reads to the reference using the specified alignment option.
flag = bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4_trimmed.sam',alignOpt);
References
[1] Langmead, B., and S. Salzberg. "Fast gapped-read alignment with Bowtie 2." Nature Methods. 9, 2012, 357–359.
Version History
Introduced in R2018a
See Also
bowtie2 | bowtie2inspect | bowtie2build | Bowtie2BuildOptions | Bowtie2InspectOptions