featureparse
Parse features from GenBank, GenPept, or EMBL data
Syntax
FeatStruct = featureparse(Features)
FeatStruct = featureparse(Features, ...'Feature', FeatureValue, ...)
FeatStruct = featureparse(Features, ...'Sequence', SequenceValue, ...)
Input Arguments
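Features | Any of the following: a character vector or string containing GenBank, GenPept, or EMBL features; a MATLAB character array including text describing GenBank, GenPept, or EMBL features; or a MATLAB structure with fields corresponding to GenBank, GenPept, or EMBL data, such as those returned by genbankread, genpeptread, emblread, getgenbank, getgenpept, or getembl. |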
FeatureValue | Name of a feature contained in Features.
When specified, featureparse returns only the substructure
that corresponds to this feature. If there are multiple features with
the same FeatureValue, then FeatStruct is
an array of structures. |
SequenceValue | Property that controls the extraction, when possible, of the sequence
corresponding to each feature. featureparse joins and complements pieces of the
source sequence as needed and stores the result in the Sequence field
of the returned structure, FeatStruct.
When extracting the sequence from an incomplete CDS feature, featureparse uses
the codon_start qualifier to adjust the frame of
the sequence. Choices are true or false (default). |
Output Arguments
FeatStruct | Output structure containing a field for every database feature.
Each field name in FeatStruct matches the
corresponding feature name in the GenBank, GenPept, or EMBL database,
with the exceptions listed in the table below. Fields in FeatStruct contain
substructures with feature qualifiers as fields. In the GenBank,
GenPept, and EMBL databases, for each feature, the only mandatory
qualifier is its location, which featureparse translates
to the field Location. When possible, featureparse also
translates this location to numeric indices, creating an Indices field. Note: If you use the Indices field to extract sequence information, you may need to complement the sequences. |
Description
FeatStruct = featureparse(Features) parses the features from Features, which contains GenBank, GenPept, or EMBL features. Features can be a:
Character vector or string containing GenBank, GenPept, or EMBL features
MATLAB character array including text describing GenBank, GenPept, or EMBL features
MATLAB structure with fields corresponding to GenBank, GenPept, or EMBL data, such as those returned by genbankread, genpeptread, emblread, getgenbank, getgenpept, or getembl
FeatStruct is the output structure containing a field for every database feature. Each field name in FeatStruct matches the corresponding feature name in the GenBank, GenPept, or EMBL database, with the following exceptions.
Feature Name in GenBank, GenPept, or EMBL Database | Field Name in MATLAB Structure |
---|---|
-10_signal | minus_10_signal |
-35_signal | minus_35_signal |
3'UTR | three_prime_UTR |
3'clip | three_prime_clip |
5'UTR | five_prime_UTR |
5'clip | five_prime_clip |
D-loop | D_loop |
Fields in FeatStruct contain substructures with feature qualifiers as fields. In the GenBank, GenPept, and EMBL databases, for each feature, the only mandatory qualifier is its location, which featureparse translates to the field Location. When possible, featureparse also translates this location to numeric indices, creating an Indices field.
Note
If you use the Indices field to extract sequence information, you may need to complement the sequences.
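The basic call described above can be sketched as follows. This is a minimal illustration, not an official example: it assumes the Bioinformatics Toolbox is installed and that a GenBank-formatted file named nm175642.txt is available on the MATLAB path (the file name is an assumption; substitute any GenBank file you have).

```matlab
% Assumption: nm175642.txt is a GenBank-formatted file on the MATLAB path.
gbkStruct = genbankread('nm175642.txt');

% Parse every feature in the record into a structure.
features = featureparse(gbkStruct);

% Each field of the result is named after a feature found in the record.
featureNames = fieldnames(features);
disp(featureNames)
```

The field names you see depend on the record. Names that are not valid MATLAB identifiers are renamed as listed in the table above.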
FeatStruct = featureparse(Features, ...'PropertyName', PropertyValue, ...) calls featureparse with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:
FeatStruct = featureparse(Features, ...'Feature', FeatureValue, ...) returns only the substructure that corresponds to FeatureValue, the name of a feature contained in Features. If there are multiple features with the same FeatureValue, then FeatStruct is an array of structures.
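As a hedged sketch of the 'Feature' property, the snippet below requests only the CDS features of a record. It assumes the same hypothetical input file as before (nm175642.txt on the MATLAB path) and assumes that the record contains at least one CDS feature.

```matlab
% Assumption: nm175642.txt is a GenBank-formatted file on the MATLAB path
% and contains at least one CDS feature.
gbkStruct = genbankread('nm175642.txt');

% Return only the substructure(s) for the CDS feature.
cds = featureparse(gbkStruct, 'Feature', 'CDS');

% cds is a structure array with one element per CDS feature found.
fprintf('Found %d CDS feature(s)\n', numel(cds));

% Location is the one qualifier every feature is required to have.
disp(cds(1).Location)
```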
FeatStruct = featureparse(Features, ...'Sequence', SequenceValue, ...) controls the extraction, when possible, of the sequence corresponding to each feature, joining and complementing pieces of the source sequence and storing them in the field Sequence. When extracting the sequence from an incomplete CDS feature, featureparse uses the codon_start qualifier to adjust the frame of the sequence. Choices are true or false (default).
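The two properties can be combined. The sketch below, under the same assumptions as the earlier snippets (hypothetical file nm175642.txt, at least one CDS feature, and a source sequence present in the record), asks featureparse to extract the sequence of each CDS feature.

```matlab
% Assumption: nm175642.txt is a GenBank-formatted file on the MATLAB path
% that includes the source sequence and at least one CDS feature.
gbkStruct = genbankread('nm175642.txt');

% Keep only CDS features and extract the sequence for each one.
codingSeqs = featureparse(gbkStruct, 'Feature', 'CDS', 'Sequence', true);

% The extracted sequence is stored in the Sequence field.
firstSeq = codingSeqs(1).Sequence;
fprintf('First CDS sequence length: %d\n', length(firstSeq));
```

Because the extracted sequence is already joined and complemented where needed, this is usually simpler than slicing the source sequence yourself with the Indices field.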
Examples
Version History
Introduced in R2006b
See Also
emblread | genbankread | genpeptread | getgenbank | getgenpept