Extract data from HTML Table
5 views (last 30 days)
Show older comments
I have a HTML file (attached at .txt) with a table of data I would like to extract. I found the FEX file htmltabletocell, but I don't thtink that will work for me as my table has several repetitions of simalar data blocks with the same initial string.
Ultimately, I would like to extract the data values from only a few of the selected fields below (for example Date/Time and Diagnostic Reason only)
Any help is much appreciated.
0 Comments
Accepted Answer
Mathieu NOE
on 5 Mar 2021
hello Marcus
here you are, simple code shipset :
Filename = 'html_table.txt';
[myDate, myDiagnostic] = extract_data(Filename)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [myDate, myDiagnostic] = extract_data(Filename)
fid = fopen(Filename);
tline = fgetl(fid);
% initialization
k = 0;
p = 0;
q = 0;
k_parameter = 0;
q_parameter = 0;
myDate = [];
myDiagnostic = [];
while ischar(tline)
k = k+1; % loop over line index
% retrieve line Date/Time
if contains(tline,'Date/Time')
k_parameter = k;
p = p+1;
end
if p>0 & k == k_parameter + 3
myDate = [myDate; cellstr(tline)];
end
% retrieve line Date/Time
if contains(tline,'Diagnostic')
q_parameter = k;
q = q+1;
end
if q>0 & k == q_parameter + 3
myDiagnostic = [myDiagnostic; cellstr(tline)];
end
tline = fgetl(fid);
end
fclose(fid);
end
gives following results :
myDate =
3×1 cell array
{'2021-01-20 18:17'}
{'2021-01-20 18:16'}
{'2020-12-10 22:44'}
myDiagnostic =
3×1 cell array
{'LARGE PT'}
{'LARGE PT'}
{'LG PT' }
0 Comments
More Answers (0)
See Also
Categories
Find more on Logical in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!