Error message: iMAT-CobraTools

Deborah on 7 Oct 2024
Commented: Walter Roberson on 8 Oct 2024
Hi everyone! I'm a MATLAB beginner and I'd like to run the iMAT plugin to perform a comparative proteomic analysis.
My input is an Excel file with data about 2 growth conditions: Wildtype and Fe-growth.
initCobraToolbox(false)
changeCobraSolver('gurobi');
% Load the expression data from an Excel file
expressionDataFile = 'C:\Users\usuario\cobratoolbox\160824_BP-Fe.xlsx'; % Replace with the correct path
[num, txt, raw] = xlsread(expressionDataFile);
% Load the model .mat file
modelFile = 'C:\Users\usuario\cobratoolbox\iBP1870.mat';
load(modelFile);
% Select the columns I want to analyze from the Excel file
geneNames = txt(2:end, 3);
expressionLevels = num(:, 17);
% Check if there are any cells with NaN values
nanIndices = isnan(expressionLevels);
if any(nanIndices)
warning('There are %d NaN values in the expression levels.', sum(nanIndices));
end
% Remove NaN values from the expression data
expressionLevels = expressionLevels(~any(isnan(expressionLevels), 2), :);
% Assuming `nanIndices` is the index of the elements that were NaN in expressionLevels
% and that you have already used it to remove the NaN values:
% Create indices that are NOT NaN
nanIndices = ~isnan(expressionLevels);
% Filter geneNames using nanIndices, i.e., removing gene names that returned NaN values
filteredGeneNames = geneNames(nanIndices);
% Filter genes and expression levels that are in model.genes
isMember = ismember(filteredGeneNames, model.genes);
if any(~isMember)
warning('Some genes in expressionDataadjusted are not in model.genes');
end
filteredGenes = filteredGeneNames(isMember);
% Run this:
expressionRxns = mapExpressionToReactions(model, filteredGenes, filteredExpressionLevels);
% Set the expression level thresholds
threshold_ub = 50.0; % Upper threshold for expression levels
threshold_lb = 5.0; % Lower threshold for expression levels
% This is where I run iMAT:
tissueModel = iMAT(model, expressionRxns, threshold_lb, threshold_ub);
Warning: There are 216 NaN values in the expression levels.
Warning: Some genes in expressionDataadjusted are not in model.genes
RHindex:
53
55
81
156
184
197
........
RLindex:
1
3
4
5
6
7
9
10
11
18
19
20
21
22
1652
1653
1655
1656
1658
1660
1662
1664
1665
1666
1667
1668
1669
1672
Checking lb and ub for NaN or Inf...
Error using iMAT (line 61)
Vector ub contains NaN or Inf values
However, I can't run this code because I keep getting error messages related to RHindex.
There are no NaN or Inf values; I checked.
Does anyone know how to solve this?
Thanks a lot
Deb

Accepted Answer

Deborah on 7 Oct 2024
OK, thank you. I have attached the file.

More Answers (1)

Walter Roberson on 7 Oct 2024
Moved: Walter Roberson on 7 Oct 2024
expressionDataFile = '160824_BP-Fe.xlsx';
T = readtable(expressionDataFile, 'VariableNamingRule', 'preserve');
Warning: Table variable names were truncated to the length namelengthmax. The original names are saved in the VariableDescriptions property.
summary(T)
T: 1309x25 table

Variables:

    Uniprot Accession: cell array of character vectors (Uniprot Accession)
    Uniprot ID: cell array of character vectors (Uniprot ID)
    locus: cell array of character vectors (locus)
    gene: cell array of character vectors (gene)
    description: cell array of character vectors (description)
    number total peptides [WT:WT Fe-:Hfq_Hfq Fe-, without OFF prote: double (number total peptides [WT:WT Fe-:Hfq_Hfq Fe-, without OFF protei...)
    number unique peptides [WT:WT Fe-:Hfq_Hfq Fe-, without OFF prot: double (number unique peptides [WT:WT Fe-:Hfq_Hfq Fe-, without OFF prote...)
    number of peptides for quantification [WT:WT Fe-:Hfq_Hfq Fe-, w: double (number of peptides for quantification [WT:WT Fe-:Hfq_Hfq Fe-, wi...)
    C1_WT_control_BR1: double (C1_WT_control_BR1)
    C2_WT_control_BR2: double (C2_WT_control_BR2)
    C3_WT_control_BR3: double (C3_WT_control_BR3)
    C4_WT_control_BR4: double (C4_WT_control_BR4)
    E1_WT_Fe-lim_BR1: double (E1_WT_Fe-lim_BR1)
    E2_WT_Fe-lim_BR2: double (E2_WT_Fe-lim_BR2)
    E3_WT_Fe-lim_BR3: double (E3_WT_Fe-lim_BR3)
    E4_WT_Fe-lim_BR4: double (E4_WT_Fe-lim_BR4)
    WT control average: double (WT control average)
    WT Fe- average: double (WT Fe- average)
    WT control standard deviation: double (WT control standard deviation)
    WT Fe- standard deviation: double (WT Fe- standard deviation)
    WT control CV [%]: double (WT control CV [%])
    WT Fe- CV [%]: double (WT Fe- CV [%])
    WT Fe-/WT control: double (WT Fe-/WT control)
    WT Fe-/WT control p-value: double (WT Fe-/WT control p-value)
    WT Fe-/WT control q-value: double (WT Fe-/WT control q-value)

Statistics for applicable variables:

                                                               NumMissing    Min           Median       Max           Mean          Std
    UniprotAccession                                           0
    UniprotID                                                  0
    locus                                                      0
    gene                                                       0
    description                                                0
    numberTotalPeptides_WT_WTFe__Hfq_HfqFe__WithoutOFFProte    121           0.0854        0.2550       19            0.9510        1.4072
    numberUniquePeptides_WT_WTFe__Hfq_HfqFe__WithoutOFFProt    119           0.0854        0.2549       19            0.9452        1.4000
    numberOfPeptidesForQuantification_WT_WTFe__Hfq_HfqFe__W    0             0.0854        3            3             2.0535        1.2091
    C1_WT_control_BR1                                          311           1.1642e+06    217696632    3.6296e+10    7.1089e+08    2.0728e+09
    C2_WT_control_BR2                                          320           2.2051e+05    217695616    4.6086e+10    6.8294e+08    2.2208e+09
    C3_WT_control_BR3                                          310           3.2960e+06    217695616    4.0850e+10    6.4484e+08    2.0203e+09
    C4_WT_control_BR4                                          332           1.5889e+05    217695616    4.3209e+10    8.0607e+08    2.4268e+09
    E1_WT_Fe_lim_BR1                                           220           7.8416e+04    217696032    3.8837e+10    7.6263e+08    2.1097e+09
    E2_WT_Fe_lim_BR2                                           216           9.6895e+04    217695616    2.7517e+10    6.9164e+08    1.7477e+09
    E3_WT_Fe_lim_BR3                                           221           1.0184e+05    217695904    4.1295e+10    7.5235e+08    2.2386e+09
    E4_WT_Fe_lim_BR4                                           217           7.0382e+05    217696088    3.2369e+10    7.4632e+08    2.0445e+09
    WTControlAverage                                           310           4.2700e+06    218899698    4.1610e+10    7.0495e+08    2.1508e+09
    WTFe_Average                                               214           2170451       222730012    3.5005e+10    7.3520e+08    2.0007e+09
    WTControlStandardDeviation                                 311           3.4850e+05    3.9544e+07   6.0446e+09    1.4589e+08    3.9954e+08
    WTFe_StandardDeviation                                     216           8.5928e+05    3.7720e+07   6.8503e+09    1.4324e+08    4.3356e+08
    WTControlCV___                                             311           2.6730        20.8692      141.2817      27.2149       20.2196
    WTFe_CV___                                                 216           1.1092        19.7966      151.4677      26.0117       20.3106
    WTFe__WTControl                                            404           0.0354        1.2057       317.7651      1.9056        10.6628
    WTFe__WTControlP_value                                     410           3.4392e-08    0.0539       0.9974        0.1940        0.2651
    WTFe__WTControlQ_value                                     410           3.0919e-05    0.1077       0.9974        0.2408        0.2812
The "num" output of xlsread() would skip leading character vector arrays, so would skip the first four columns.
expressionLevels = num(:, 17);
That would refer to column 17 of the num output, which would be column 21 of T.
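To check the column offset directly rather than counting by hand, one option is to count the leading non-numeric variables in the table. The sketch below uses a small made-up table (its variable names and values are hypothetical, not taken from the attached file) and assumes that xlsread leaves only the leading text columns out of num. Note that the summary output lists five leading text variables (Uniprot Accession, Uniprot ID, locus, gene, description), so for the real file the offset may be five rather than four. In that case num(:,17) would be column 22 of T, which reports 216 missing values, the same count as in the original warning. It is worth verifying on the actual file.

```matlab
% Hypothetical miniature of the spreadsheet: two text columns, then numeric ones.
T = table({'P1'; 'P2'}, {'geneA'; 'geneB'}, [1.5; 2.5], [10; 20], ...
    'VariableNames', {'Accession', 'gene', 'C1', 'C2'});

% Logical row vector marking which table variables are numeric.
isNum = varfun(@isnumeric, T, 'OutputFormat', 'uniform');

% Number of leading non-numeric columns (assumed to be absent from num).
offset = find(isNum, 1, 'first') - 1;

numCol = 2;                    % column index as used in num(:, numCol)
tableCol = numCol + offset;    % matching column index in T
fprintf('num(:,%d) corresponds to T(:,%d): %s\n', ...
    numCol, tableCol, T.Properties.VariableNames{tableCol});
```

Indexing the table by variable position or name, instead of going through num, avoids this offset altogether.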
summary(T(:,21))
1309x1 table

Variables:

    WT control CV [%]: double (WT control CV [%])

Statistics for applicable variables:

                      NumMissing    Min       Median     Max         Mean       Std
    WTControlCV___    311           2.6730    20.8692    141.2817    27.2149    20.2196
That column is missing 311 entries. Let's find them:
idxmissing = ismissing(T{:,21});
obj = detectImportOptions(expressionDataFile, 'VariableNamingRule', 'preserve');
obj = setvartype(obj, 21, 'char');
T2 = readtable(expressionDataFile, obj);
Warning: Table variable names were truncated to the length namelengthmax. The original names are saved in the VariableDescriptions property.
summary(T2(:,21))
1309x1 table

Variables:

    WT control CV [%]: cell array of character vectors (WT control CV [%])

Statistics for applicable variables:

                      NumMissing
    WTControlCV___    0
T2(idxmissing, 21)
ans = 311x1 table

    WT control CV [%]
    _________________
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
    {'na'}
So the entries might not be Inf or NaN in the spreadsheet, but they are the text 'na', which is not numeric.
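Since the 'na' cells end up as NaN once the column is read as numeric, one way forward is to drop those rows from the gene names and from the expression values with the same logical mask, so the two stay aligned. Below is a minimal sketch with made-up values (the gene names and numbers are hypothetical); the variable names follow the script in the question. The struct at the end is an assumption about the input format that mapExpressionToReactions expects, so please check the COBRA Toolbox documentation for your version before relying on it.

```matlab
% Hypothetical stand-ins for two spreadsheet columns.
geneNames = {'g1'; 'g2'; 'g3'; 'g4'};
rawValues = {'12.5'; 'na'; '48.1'; 'na'};    % cells as they appear in the file

% str2double returns NaN for text that is not a number, such as 'na'.
expressionLevels = str2double(rawValues);

% Build the mask once, before removing anything, and apply it to both arrays.
keep = ~isnan(expressionLevels);
filteredGeneNames = geneNames(keep);
filteredExpressionLevels = expressionLevels(keep);

% Sanity checks: nothing non-finite is left, and the arrays still line up.
assert(all(isfinite(filteredExpressionLevels)));
assert(numel(filteredGeneNames) == numel(filteredExpressionLevels));

% Assumed input format for mapExpressionToReactions (verify against the docs).
expressionData.gene = filteredGeneNames;
expressionData.value = filteredExpressionLevels;
```

In the question's script the mask is recomputed after the NaN rows have already been removed, so it no longer matches geneNames; computing it once avoids that mismatch.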
  2 Comments
Deborah on 8 Oct 2024
Thank you!
I already checked the Excel table. You were right, the numeric column is the 12th one. I ran it again and got the same error. I reviewed each of the outputs, and I have a question about the limits in the model and the scale of the expression data. The model bounds are 99999, while the expression values exceed 1*10^8. Could this be the cause of the error message about NaN and Inf?
Checking lb and ub for NaN or Inf...
Error using iMAT (line 61)
Vector ub contains NaN or Inf values
I looked for information, and it seems possible to normalize the expression data. Or is there another method I could use to adjust my numerical data?
Walter Roberson on 8 Oct 2024
No, the base problem is that the file contains 'na' entries that are being converted to nan or inf by xlsread().
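To see which vector actually carries the NaN or Inf before calling iMAT, a quick check along these lines may help. It is only a sketch on a made-up struct: the field names lb and ub match the error message, while the toy values and the expressionRxns vector here are hypothetical.

```matlab
% Hypothetical toy model and reaction-level expression vector.
model.rxns = {'R1'; 'R2'; 'R3'};
model.lb = [-1000; 0; 0];
model.ub = [1000; NaN; 1000];
expressionRxns = [12.5; 48.1; Inf];

% Indices of entries that are NaN or Inf in each vector.
badLb = find(~isfinite(model.lb));
badUb = find(~isfinite(model.ub));
badExpr = find(~isfinite(expressionRxns));

fprintf('Non-finite lb: %d, ub: %d, expression: %d\n', ...
    numel(badLb), numel(badUb), numel(badExpr));
disp(model.rxns(badUb));    % reactions whose upper bound is NaN or Inf
```

Running the same three find calls on the real model and expression vector shows whether the non-finite values come from the model bounds or from the expression data.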
