seqinsertgaps

Insert gaps into nucleotide or amino acid sequence

Syntax

NewSeq = seqinsertgaps(Seq, Positions)
NewSeq = seqinsertgaps(Seq, GappedSeq)
NewSeq = seqinsertgaps(Seq, GappedSeq, Relationship)

Input Arguments

Seq - Either of the following:
  • String specifying a nucleotide or amino acid sequence

  • MATLAB® structure containing a Sequence field

Positions - Vector of integers specifying the positions in Seq before which to insert a gap.
GappedSeq - Either of the following:
  • String specifying a nucleotide or amino acid sequence

  • MATLAB structure containing a Sequence field

Relationship - Integer specifying the relationship between Seq and GappedSeq. Choices are:
  • 1 - Both sequences use the same alphabet, that is, both are nucleotide sequences or both are amino acid sequences.

  • 3 - Seq contains nucleotides representing codons and GappedSeq contains amino acids (default).

Output Arguments

NewSeq - Sequence with gaps inserted, represented by a string specifying a nucleotide or amino acid sequence.

Description

NewSeq = seqinsertgaps(Seq, Positions) inserts gaps in the sequence Seq before the positions specified by the integers in the vector Positions.
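
The Positions form can be pictured with a short base-MATLAB sketch. This is an illustration only, not the toolbox implementation: it assumes that each entry of Positions refers to a position in the original, ungapped Seq, and the variable names (seq, positions, shift, newSeq) are made up for the sketch. How seqinsertgaps itself handles duplicate or out-of-range positions is not stated here and is not claimed.

```matlab
% Illustration only: emulate "insert a gap before each listed position"
% in base MATLAB. Assumes positions index the original, ungapped sequence.
seq = 'ACGTACGT';
positions = [3 5];      % gap before the 3rd and before the 5th symbol

% For each original symbol, count the gaps that must appear before it.
shift = arrayfun(@(k) sum(positions <= k), 1:numel(seq));

% Start from an all-gap row, then drop each symbol into its shifted slot.
newSeq = repmat('-', 1, numel(seq) + numel(positions));
newSeq((1:numel(seq)) + shift) = seq;

disp(newSeq)            % AC-GT-ACGT
```

Under the same assumption, the corresponding toolbox call would be seqinsertgaps(seq, positions).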

NewSeq = seqinsertgaps(Seq, GappedSeq) finds the gap positions in the sequence GappedSeq, then inserts gaps in the corresponding positions in the sequence Seq.

NewSeq = seqinsertgaps(Seq, GappedSeq, Relationship) specifies the relationship between Seq and GappedSeq. Enter 1 for Relationship when both sequences use the same alphabet, that is, both are nucleotide sequences or both are amino acid sequences. Enter 3 for Relationship when Seq contains nucleotides representing codons and GappedSeq contains amino acids. Default is 3.
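
The difference between the two Relationship values can be sketched in base MATLAB. This is a hedged emulation on toy data, not the toolbox source: the sequences and variable names are hypothetical, and the sketch assumes that with Relationship 3 each gap in GappedSeq corresponds to three gap symbols in the nucleotide output.

```matlab
% Illustration only: base-MATLAB emulation, not the toolbox source.
gapped = 'M-AK';        % hypothetical gapped amino acid sequence
seqAA  = 'MGK';         % same alphabet: one symbol per column
seqNT  = 'ATGGCTAAA';   % codons ATG GCT AAA: three symbols per column

isGap = (gapped == '-');

% Relationship 1: each gap column becomes one gap symbol.
newAA = repmat('-', 1, numel(isGap));
newAA(~isGap) = seqAA;

% Relationship 3: repeat every column flag three times, so each
% gap column becomes a whole gapped codon.
mask3 = reshape(repmat(isGap, 3, 1), 1, []);
newNT = repmat('-', 1, numel(mask3));
newNT(~mask3) = seqNT;

disp(newAA)             % M-GK
disp(newNT)             % ATG---GCTAAA
```

With the toolbox, the corresponding calls would be seqinsertgaps(seqAA, gapped, 1) and seqinsertgaps(seqNT, gapped, 3).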

Examples

  1. Retrieve two nucleotide sequences from the GenBank® database for the neuraminidase (NA) protein of two strains of the Influenza A virus (H5N1).

     hk01 = getgenbank('AF509094');
     vt04 = getgenbank('DQ094287');
    
  2. Extract the coding region from the two nucleotide sequences.

    hk01_cds = featuresparse(hk01,'feature','CDS','Sequence',true);
    vt04_cds = featuresparse(vt04,'feature','CDS','Sequence',true);
    
  3. Align the amino acid sequences converted from the nucleotide sequences.

     [sc, al] = nwalign(nt2aa(hk01_cds), nt2aa(vt04_cds), 'extendgap', 1);
    
  4. Use the seqinsertgaps function to copy the gaps from the aligned amino acid sequences to their corresponding nucleotide sequences, thus codon-aligning them.

     hk01_aligned = seqinsertgaps(hk01_cds,al(1,:))
     vt04_aligned = seqinsertgaps(vt04_cds,al(3,:))
    
  5. Once you have codon-aligned the two sequences, you can use them as input to other functions such as dnds, which calculates the synonymous and nonsynonymous substitution rates of the codon-aligned nucleotide sequences. By setting Verbose to true, you can also display the codons considered in the computations and their amino acid translations.

    [dn,ds] = dnds(hk01_aligned,vt04_aligned,'verbose',true)
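
Before passing sequences on, it can help to confirm that they really are codon-aligned. The following is a minimal base-MATLAB sketch of such a check on toy strings. The names a, b, and isCodonAligned are hypothetical, and the three criteria used (equal length, length divisible by 3, gaps only as whole codons) are one reasonable reading of "codon-aligned", not a documented toolbox rule.

```matlab
% Illustration only: sanity check for codon alignment on toy data.
a = 'ATG---GCTAAA';
b = 'ATGCCCGCT---';

isCodonAligned = false;
if numel(a) == numel(b) && mod(numel(a), 3) == 0
    % Each column of the 3-by-N matrix is one codon (a's codons, then b's).
    codons = reshape([a b], 3, []);
    gapsPerCodon = sum(codons == '-', 1);
    % A codon must be either gap-free or entirely gaps.
    isCodonAligned = all(gapsPerCodon == 0 | gapsPerCodon == 3);
end

disp(isCodonAligned)
```

To apply the check to the example above, a and b would be replaced by hk01_aligned and vt04_aligned, assuming both are returned as character vectors.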
    
