
sffread

Read data from SFF file

Syntax

SFFStruct = sffread(File)

sffread(..., 'Blockread', BlockreadValue, ...)
sffread(..., 'Feature', FeatureValue, ...)

Description

SFFStruct = sffread(File) reads a Standard Flowgram Format (SFF) file and returns the data in a MATLAB® array of structures.

sffread(..., 'PropertyName', PropertyValue, ...) calls sffread with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Enclose each PropertyName in single quotation marks. Each PropertyName is case insensitive. These property name/property value pairs are as follows:


sffread(..., 'Blockread', BlockreadValue, ...) reads a single sequence entry or block of sequence entries from an SFF file containing multiple sequences.

sffread(..., 'Feature', FeatureValue, ...) specifies the information to include in the return structure.

Input Arguments

File

String specifying a file name or path and file name of an SFF file produced by version 1.0 of the Genome Sequencer System data analysis software from 454 Life Sciences®. If you specify only a file name, that file must be on the MATLAB search path or in the current folder.

BlockreadValue

Scalar or vector that controls the reading of a single sequence entry or block of sequence entries from an SFF file containing multiple sequences. Enter a scalar N to read the Nth entry in the file. Enter a 1-by-2 vector [M1, M2] to read a block of entries starting at entry M1 and ending at entry M2. To read all remaining entries in the file starting at entry M1, enter a positive value for M1 and Inf for M2.
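The three forms of BlockreadValue described above can be sketched as follows. This is a minimal illustration, assuming the sample file SRR013472.sff from the Examples section is available. The variable names are hypothetical, and the exist check is only a guard so the snippet does nothing when the file is missing.

```matlab
% Assumption: SRR013472.sff is in the current folder or on the MATLAB path.
sffFile = 'SRR013472.sff';

% The three documented forms of BlockreadValue (variable names are illustrative)
tenthOnly   = 10;         % scalar N: read only the 10th entry
firstFive   = [1 5];      % [M1, M2]: read entries 1 through 5
fromHundred = [100 Inf];  % [M1, Inf]: read entry 100 through the last entry

if exist(sffFile, 'file') == 2
    r10   = sffread(sffFile, 'Blockread', tenthOnly);
    r1to5 = sffread(sffFile, 'Blockread', firstFive);
    rTail = sffread(sffFile, 'Blockread', fromHundred);
    fprintf('Read %d, %d, and %d entries\n', ...
        numel(r10), numel(r1to5), numel(rTail));
end
```

Reading in blocks like this is mainly useful for large files, where loading every entry at once may not be practical.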

FeatureValue

String specifying the information to include in the output structure. The string contains one or more of the letters H, S, Q, C, F, and I, which select the fields Header, Sequence, Quality, Clipping, FlowgramValue, and FlowgramIndex, respectively.

Default: 'HSQ'
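As a rough sketch of how FeatureValue selects output fields, the snippet below requests the clipping information in addition to the default fields. It assumes the sample file SRR013472.sff from the Examples section is available. The variable names are hypothetical.

```matlab
% Assumption: SRR013472.sff is in the current folder or on the MATLAB path.
sffFile = 'SRR013472.sff';

% Letters drawn from H, S, Q, C, F, I; here the default 'HSQ' plus Clipping
featureValue = 'HSQC';

if exist(sffFile, 'file') == 2
    % Limit the read to the first five entries to keep the example small
    reads = sffread(sffFile, 'Blockread', [1 5], 'Feature', featureValue);
    % List the fields that were returned for this FeatureValue
    disp(fieldnames(reads));
end
```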

Output Arguments

SFFStruct

Array of structures containing information from an SFF file. There is one structure for each read or entry in the file. Each structure contains one or more of the following fields.

Field            Description
Header           Universal accession number.
Sequence         Numeric representation of nucleotide sequence.
Quality          Per-base quality scores.
Clipping         Clipping boundary positions.
FlowgramValue    Sequence of flowgram intensity values.
FlowgramIndex    Sequence of flowgram intensity indices.
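The fields listed above are accessed with ordinary structure-array indexing. The following is a minimal sketch, assuming the sample file SRR013472.sff from the Examples section is available. It also assumes that int2nt from the Bioinformatics Toolbox is an appropriate way to convert a numeric Sequence field to letters; the isnumeric check skips the conversion if the field is already character data.

```matlab
% Assumption: SRR013472.sff is in the current folder or on the MATLAB path.
sffFile = 'SRR013472.sff';
fieldsWanted = {'Header', 'Sequence', 'Quality'};  % fields returned by 'HSQ'

if exist(sffFile, 'file') == 2
    reads = sffread(sffFile, 'Blockread', [1 3], 'Feature', 'HSQ');
    for k = 1:numel(reads)
        seq = reads(k).Sequence;
        if isnumeric(seq)
            seq = int2nt(seq);  % convert integer codes to nucleotide letters
        end
        % Convert quality scores to double before averaging
        meanQ = mean(double(reads(k).Quality));
        fprintf('%s: %d bases, mean quality %.1f\n', ...
            reads(k).Header, numel(seq), meanQ);
    end
end
```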

Examples

The SFF file, SRR013472.sff, used in these examples is not provided with the Bioinformatics Toolbox™ software. You can download sample SFF files from:

http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?cmd=show&f=main&m=main&s=main

Read an entire SFF file:

% Read the contents of an entire SFF file into an
% array of structures
reads = sffread('SRR013472.sff')

reads = 

3546x1 struct array with fields:
    Header
    Sequence
    Quality

Read a block of entries from an SFF file:

% Read only the header and sequence information of the
% first five reads from an SFF file into an array of structures
reads5 = sffread('SRR013472.sff', 'block', [1 5], 'feature', 'hs')

reads5 = 

5x1 struct array with fields:
    Header
    Sequence