ilmnbslookup
Look up Illumina BeadStudio target (probe) sequence and annotation information
Syntax
AnnotStruct = ilmnbslookup(AnnotationFile, ID)
AnnotStruct = ilmnbslookup(AnnotationFile, ID, 'LookUpField', LookUpFieldValue)
Input Arguments
AnnotationFile | Character vector or string specifying a file name or a path and file name of an Illumina® annotation file (CSV, BGX, or TXT format). If you specify only a file name, that file must be on the MATLAB® search path or in the current folder. Tip You can download Illumina annotation files, such as
|
ID | Character vector, string, string vector, or cell array of character vectors representing a unique identifier(s) for one or more targets (probes) on an Illumina microarray. Tip By default, |
LookUpFieldValue | Field in Tip Set this property so that it corresponds to the |
Output Arguments
AnnotStruct | Structure containing the probe sequence and annotation information for one or more targets (probes) specified by |
Description
AnnotStruct = ilmnbslookup(AnnotationFile, ID) returns AnnotStruct,
a structure containing probe sequence and annotation information for
one or more targets (probes) specified by ID,
and by AnnotationFile, an Illumina annotation
file (CSV, BGX, or TXT format).
AnnotStruct contains the same fields
as AnnotationFile. The fields are described
in the following two tables.
Structure Created from Illumina CSV Annotation File
| Field | Description |
|---|---|
| Search_key | Internal identifier for the target, useful for custom design array |
| Target | Unique identifier for the target |
| ProbeId | Illumina probe identifier |
| Gid | GenBank® identifier for the gene |
| Transcript | Illumina internal transcript identifier |
| Accession | GenBank accession number for the gene |
| Symbol | Typically, the gene symbol |
| Type | Probe type |
| Start | Starting position of the probe sequence in the GenBank record |
| Probe_Sequence | Sequence of the probe |
| Definition | Definition field from the GenBank record |
| Ontology | Gene Ontology terms associated with the gene |
| Synonym | Synonyms for the gene (from the GenBank record) |
Structure Created from a BGX or TXT Annotation File
| Field | Description |
|---|---|
| Accession | GenBank accession number for the gene |
| Array_Address_Id | Decoder identifier |
| Chromosome | Chromosome on which the gene is located |
| Cytoband | Cytogenetic banding region of the chromosome on which the gene associated with the target is located |
| Definition | Definition field from the GenBank record |
| Entrez_Gene_ID | Entrez Gene database identifier for the gene |
| GI | GenBank identifier for the gene |
| ILMN_Gene | Illumina internal gene symbol |
| Obsolete_Probe_Id | Probe identifier before BGX annotation files |
| Ontology_Component | Gene Ontology cellular components associated with the gene |
| Ontology_Function | Gene Ontology molecular functions associated with the gene |
| Ontology_Process | Gene Ontology biological processes associated with the gene |
| Probe_Chr_Orientation | Orientation of the probe on the NCBI genome build |
| Probe_Coordinates | Genomic position of the probe on the NCBI genome build |
| Probe_Id | Illumina probe identifier |
| Probe_Sequence | Sequence of the probe |
| Probe_Start | Start position of the probe relative to the 5' end of the source transcript sequence |
| Probe_Type | Information about what the probe is targeting |
| Protein_Product | NCBI protein accession number |
| RefSeq_ID | Identifier from the NCBI RefSeq database |
| Reporter_Composite_map | Information associated with control probes |
| Reporter_Group_Name | Information associated with control probes |
| Reporter_Group_id | Information associated with control probes |
| Search_Key | Internal identifier for the target, useful for custom design array |
| Source | Source from which the transcript sequence was obtained |
| Source_Reference_ID | Source's identifier |
| Species | Species associated with the gene |
| Symbol | Typically, the gene symbol |
| Synonyms | Synonyms for the gene (from the GenBank record) |
| Transcript | Illumina internal transcript identifier |
| Unigene_ID | Identifier from the NCBI UniGene database |
AnnotStruct = ilmnbslookup(AnnotationFile, ID, 'LookUpField', LookUpFieldValue)
looks for ID in the annotation file in the field
specified by LookUpFieldValue. Default
is the Search_key field.
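As a minimal sketch of the 'LookUpField' syntax, the following matches an identifier against the Probe_Id field rather than the default field. The file name and the Probe_Id value are taken from the Examples section of this page. Because the annotation file is not provided with the toolbox, the sketch assumes it may be missing and skips the lookup in that case. The variable names annotFile, probeId, and probeAnnot are illustrative only.

```matlab
% File name taken from the examples on this page. The file is not shipped
% with the toolbox, so the lookup is skipped when it cannot be found.
annotFile = 'HumanRef-8_V3_0_R0_11282963_A.bgx';
probeId = 'ILMN_2136495';   % Probe_Id value shown in the Examples section

if exist(annotFile, 'file') == 2
    % Match probeId against the Probe_Id field instead of the default field.
    probeAnnot = ilmnbslookup(annotFile, probeId, 'LookUpField', 'Probe_Id');
    disp(probeAnnot.Symbol)
else
    fprintf('Annotation file %s not found; lookup skipped.\n', annotFile);
end
```

The value passed for 'LookUpField' should be one of the field names listed in the table that matches the annotation file format.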
Examples
Note
The gene expression file, TumorAdjacent-probe-raw.txt,
and the annotation file, HumanRef-8_V3_0_R0_11282963_A.bgx,
used in the following examples are not provided with the Bioinformatics Toolbox™ software.
Read the contents of a tab-delimited file exported from the Illumina BeadStudio™ software into a MATLAB structure.
ilmnStruct = ilmnbsread('TumorAdjacent-probe-raw.txt')
ilmnStruct =
Header: [1x1 struct]
TargetID: {22184x1 cell}
ColumnNames: {1x37 cell}
Data: [22184x37 double]
TextColumnNames: {1x23 cell}
TextData: {22184x23 cell}
Find the number of the Search_key column in the TextColumnNames cell array, which is returned in the ilmnStruct structure by the ilmnbsread function.
srchCol = find(strcmpi('Search_Key',ilmnStruct.TextColumnNames))
srchCol =
1
Look up the probe sequence and annotation information for the 10th entry in the annotation file, HumanRef-8_V3_0_R0_11282963_A.bgx.
annotation = ilmnbslookup('HumanRef-8_V3_0_R0_11282963_A.bgx',...
ilmnStruct.TextData{10,srchCol})
annotation =
Accession: 'NM_144670.2'
Array_Address_Id: '0004050154'
Chromosome: '12'
Cytoband: '12p13.31b'
Definition: 'Homo sapiens alpha-2-macroglobulin-like 1 (A2ML1), mRNA.'
Entrez_Gene_ID: '144568'
GI: '74271844'
ILMN_Gene: 'A2ML1'
Obsolete_Probe_Id: ''
Ontology_Component: ''
Ontology_Function: 'endopeptidase inhibitor activity [goid 4866] [evidence IEA]'
Ontology_Process: ''
Probe_Chr_Orientation: '+'
Probe_Coordinates: '8920412-8920461'
Probe_Id: 'ILMN_2136495'
Probe_Sequence: 'TGTAATCGCAGCCCCTTGGAAGGCCAAGGCAGGAGAATCGCCTCAACACT'
Probe_Start: '4889'
Probe_Type: 'S'
Protein_Product: 'NP_653271.2'
RefSeq_ID: 'NM_144670.2'
Reporter_Composite_map: ''
Reporter_Group_Name: ''
Reporter_Group_id: ''
Search_Key: 'ILMN_17375'
Source: 'RefSeq'
Source_Reference_ID: 'NM_144670.2'
Species: 'Homo sapiens'
Symbol: 'A2ML1'
Synonyms: [1x141 char]
Transcript: 'ILMN_17375'
Unigene_ID: ''
Use the ilmnbslookup function with the 'LookUpField' property
to look up the annotation information for all targets located on chromosome
12 in the annotation file, HumanRef-8_V3_0_R0_11282963_A.bgx.
chr12annotation = ilmnbslookup('HumanRef-8_V3_0_R0_11282963_A.bgx',...
'12','LookUpField','Chromosome')
chr12annotation =
Accession: {1x1186 cell}
Array_Address_Id: {1x1186 cell}
Chromosome: {1x1186 cell}
Cytoband: {1x1186 cell}
Definition: {1x1186 cell}
Entrez_Gene_ID: {1x1186 cell}
GI: {1x1186 cell}
ILMN_Gene: {1x1186 cell}
Obsolete_Probe_Id: {1x1186 cell}
Ontology_Component: {1x1186 cell}
Ontology_Function: {1x1186 cell}
Ontology_Process: {1x1186 cell}
Probe_Chr_Orientation: {1x1186 cell}
Probe_Coordinates: {1x1186 cell}
Probe_Id: {1x1186 cell}
Probe_Sequence: {1x1186 cell}
Probe_Start: {1x1186 cell}
Probe_Type: {1x1186 cell}
Protein_Product: {1x1186 cell}
RefSeq_ID: {1x1186 cell}
Reporter_Composite_map: ''
Reporter_Group_Name: ''
Reporter_Group_id: ''
Search_Key: {1x1186 cell}
Source: {1x1186 cell}
Source_Reference_ID: {1x1186 cell}
Species: {1x1186 cell}
Symbol: {1x1186 cell}
Synonyms: {1x1186 cell}
Transcript: {1x1186 cell}
Unigene_ID: {1x1186 cell}
The output structure indicates that there are 1,186 targets located on chromosome 12.
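When a lookup matches several targets, most fields of the returned structure are cell arrays with one element per target, so ordinary cell array operations can summarize the result. The following is a sketch only: it builds a small mock structure named annot with hypothetical placeholder values (not real annotation data) in place of chr12annotation, so that it runs without the annotation file.

```matlab
% Mock structure with the same layout as a multi-target lookup result.
% All values are hypothetical placeholders, not real annotation data.
annot = struct();
annot.Symbol = {'GENE_A', 'GENE_B', 'GENE_A', 'GENE_C'};
annot.Probe_Chr_Orientation = {'+', '-', '+', '-'};
annot.Probe_Id = {'PROBE_1', 'PROBE_2', 'PROBE_3', 'PROBE_4'};

% The number of matching targets is the length of a per-target cell array.
numTargets = numel(annot.Symbol);

% Distinct gene symbols among the matching targets.
uniqueSymbols = unique(annot.Symbol);

% Probe identifiers for the targets on the plus strand.
isPlus = strcmp(annot.Probe_Chr_Orientation, '+');
plusProbeIds = annot.Probe_Id(isPlus);
```

With a real result, replacing annot with chr12annotation should apply the same operations to the targets on chromosome 12.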
Version History
Introduced in R2008a