Clear Filters
Clear Filters

Stop xlsread converting hex values to double

4 views (last 30 days)
Reading in data using xlsread the raw values are not exactly raw. The cells contain hex data and whenever e is present (2e39) it is converting it to a double as 2.000e+39. Is there anyway to force it to read in as '2e39'? I found were it's being read in xlsreadCOM as rawData = DataRange.Value, I also see that VBA has xlRangeValueXMLSpreadsheet that is supposed to read in the exact value but I'm not sure how to apply that in Matlab. Using Matlab2012a

Accepted Answer

Walter Roberson
Walter Roberson on 4 Jun 2018
We have seen this difficulty for .xlsx files in the past. .xlsx files code all numbers as strings, so MATLAB normally converts to numeric any string that str2double() says looks numeric enough to convert to double. This causes various problems, including the one you indicated, and including losing leading 0's on strings that just happen to consist entirely of numbers.
My investigation indicates that there are two ways of storing strings in .xlsx files, one that uses a "shared strings" table, and the other that uses (for lack of better term) "immediate" strings. Strings that are in the "shared strings" table do not seem to be processed the same way that "immediate" strings are.
However.. even for .xls files, there is a challenge. The [num, txt, raw] processing that is done mostly happens by attempting to convert to numeric form, and if the numeric conversion works then the entry is filled into the num table and the corresponding raw entry has the converted value; entries that cannot be converted are put in the txt table and the corresponding raw entry is left as text. This means that if you have a string that looks like a number, it may get converted.
We find that sometimes using readtable() works better at getting the correct data type.
  9 Comments
Walter Roberson
Walter Roberson on 6 Jun 2018
NumHeaders = 17; %adjust as needed, can be 0
S = fileread('AppropriateFileName.csv');
filelines = regexp(S, '\r?\n', 'split');
if isempty(filelines{end})); filelines(end) = []; end %very common that file ends with \n leading to empty file field
filelines(1:NumHeaders) = [];
filefields = regexp(filelines, ',', 'split');
Now filefields is a cell array, and each entry in it is a cell array with as many entries as there were fields.
This structure can be less fun to deal with than some other structures, but if you need to process the lines with fewer or more fields differently than the other lines, then you need to maintain something similar to this (though possibly you might want to look for groups of lines with the same number of fields and merge them into blocks.)
It is practical to proceed from here to
empty_field = ''; %could also be numeric
numfields = cellfun(@length, filefields);
maxfields = max(numfields);
pad = repmat({empty_field}, 1, maxfields);
FirstFields = @(S) S(1:maxfields);
PadField = @(S) FirstFields( [S, pad] );
padded_fields = cellfun(PadField, filefields, 'Uniform', 0);
data = vertcat(padded_fields{:});
Now data would be a rectangular cell array in which all lines have been padded out to the maximum number of fields used in the file, with the empty fields using the content of empty_field as the placeholder
kevin
kevin on 7 Jun 2018
That looks like it. Thanks a lot this was really helpful.

Sign in to comment.

More Answers (0)

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!