Reading data from a specific CSV file

Hi all, I need to write a program for reading csv files produced while recording the trajectories of reflective markers (points) with a 3D camera. The first 44 lines of the csv file do not contain important data about the tracked markers. All the following lines contain comma-separated data sampled at 120 Hz, so the 45th line is taken at time 0 and so on. These lines include the x, y and z coordinates of all the markers tracked at that time. The same markers are not necessarily tracked in every line, so markers with new id numbers appear (Marker-idNumber).
I would like to store the x, y, z data for each marker in a cell array with marker names, where each entry contains the marker's coordinates through time. To begin with, we could assume that the same markers appear in each line. I guess this is a much easier task, and I have almost done it (see below).
I found an existing program which performs a similar job. I adapted it to my needs, but due to my lack of programming knowledge I did not succeed. If anyone knows what I am doing wrong, I would appreciate your help.
Here is my code so far (example data file attached):
clear all
openedFile = fopen('ReducedData.csv','r');
currentLineNo = 0;
%we go through the first lines which dont contain data of interest
for lineIndex = 1:43
    lineContent = fgetl(openedFile);
    currentLineNo = currentLineNo +1;
end
commaSeperatedValues = regexp(lineContent, ',', 'split');
numberOfFrames = str2num(commaSeperatedValues{3});
%We go to frame 0
for lineIndex = 44 : 45
    lineContent = fgetl(openedFile);
    currentLineNo = currentLineNo + 1;
end
frameIndex = 0;
for lineIndex = 1:numberOfFrames
    currentLineNo = currentLineNo + 1;
    frameIndex = frameIndex+1;
    frameLine = fgetl(openedFile);
    commaSeperatedValues = regexp(frameLine, ',', 'split');
    frameIndexFromFile = str2num(commaSeperatedValues{2});
    timestamp = str2num(commaSeperatedValues{3});
    numberOFTrackedMarkersInThisFrame = str2num(commaSeperatedValues{5});
    % Get marker coordinates
    idx = find(~cellfun(@isempty, strfind(commaSeperatedValues,'Marker-'))); %searches for locations of word 'Marker'
    for c2 = 1 : length(idx)
        coor = cellfun(@str2double, commaSeperatedValues(idx(c2)-4:idx(c2)-2)); %reads the coordinates of each Marker
        nums = regexp(commaSeperatedValues{idx(c2)},'\d*','match'); %reads the identification numbers of markers
        MarkerCoordinates(str2double(nums{c2})).(sprintf('Marker%s',nums{c2}))(frameIndex,:) = coor; %stores data into a CellArray??
    end
end
The example data contains rows with different markers. But we can check whether the code works when every line has the same markers by changing the range of the second-to-last for loop ("for lineIndex = 1:numberOfFrames"), using 1 instead of numberOfFrames. When I do this I get the error: "Index exceeds matrix dimensions." So how can I make the program work when the same markers appear in each line, and how could I extend it to handle lines with varying markers? Thank you.
Regards,
Jurij Hladnik

 Accepted Answer

per isakson on 10 Nov 2015
Edited: per isakson on 19 Nov 2015
This code is based on your description of the problem rather than on your code. I chose to store the data in a structure rather than in a cell array. I'm not sure I understand the format exactly. Why is there an integer in the position before Marker in ...,15238,Marker-15238,...? I have skipped it. And why do the rows end with a marker id, e.g. 15251,Marker-15251? I skipped that too. However, try
>> M = OptiTrack( 'ReducedData.csv' );
>> M
M =
RowHead: [506x7 double]
m01003: [506x4 double]
m01004: [506x4 double]
m01005: [506x4 double]
m01006: [506x4 double]
m15232: [3x4 double]
m15229: [3x4 double]
....
>> plot( M.m15339(:,1), M.m15339(:,2:4) )
>> plot( M.m15330(:,1), M.m15330(:,2:4) )
(I failed to include an image. It displays in the Preview, but disappears when I Submit.)
where
function M = OptiTrack( filespec )
    fid = fopen( filespec );
    cac = textscan( fid, '%s', 'Headerlines',44, 'Delimiter','\n' );
    fclose( fid );
    cac = cac{1};
    M = struct( 'RowHead', zeros(0,7) );
    len = length( cac );
    for jj = 1 : len % loop over all rows
        row = regexp( cac{jj}, 'Marker-', 'split' );
        num = cell2mat( textscan( row{1}, '%*s%f%f%f%f%f%f%f%*f', 'Delimiter',',' ) );
        M.RowHead = cat( 1, M.RowHead, num );
        sec = num(2);
        for kk = 2 : length(row) - 1 % loop over all markers of one row
            mmm = textscan( row{kk}, '%d%f%f%f%*f', 'Delimiter',',', 'CollectOutput',true );
            str = sprintf( 'm%05d', mmm{1} );
            if ismember( str, fields(M) )
                M.(str) = cat( 1, M.(str), [ sec, mmm{2} ] );
            else
                M.(str) = [ sec, mmm{2} ];
            end
        end
    end
end
Yes, the function isn't fast, but I think it does the job.
Replacing
if ismember( str, fields(M) )
by
if any( strcmp( str, fields(M) ) )
improves the speed of the function by 25% (on R2013b and with the csv-file already in the cache).
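A further variation, offered only as a sketch: isfield tests for a field name directly, so the list of names returned by fields(M) does not have to be built on every pass. The structure M, the name m15238 and the row values below are made-up stand-ins, and no timing has been measured for this variant.

```matlab
% Toy stand-in for the structure built by OptiTrack (values are made up)
M       = struct( 'RowHead', zeros(0,7) );
str     = 'm15238';
new_row = [ 0.0083, 0.1, 0.2, 0.3 ];    % [ sec, x, y, z ], hypothetical

for n = 1 : 2                            % append the same row twice
    if isfield( M, str )                 % false on the first pass, true afterwards
        M.(str) = cat( 1, M.(str), new_row );
    else
        M.(str) = new_row;
    end
end
disp( size( M.m15238 ) )                 % two rows of four columns
```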
OptiTrack version 2.0
Assumptions
  • all data rows are trackable rows containing marker data (the value of the first column is 'frame')
  • the number embedded in the marker-name is equal to the marker-id number
function M = OptiTrack( filespec )
    fid = fopen( filespec );
    cac = textscan( fid, '%s', 'Headerlines',44, 'Delimiter','\n' );
    fclose( fid );
    cac = cac{1};
    M = struct( 'RowHead', zeros(0,4) );
    len = length( cac );
    for jj = 1 : len
        row = regexp( cac{jj}, ',Marker-\d+,', 'split' );
        buf = textscan( row{1},'%*s%f%f%f%f%[^\n]','Delimiter',',' ...
                      , 'CollectOutput',true );
        %
        M.RowHead = cat( 1, M.RowHead, buf{1} );
        sec = buf{1}(2);
        %
        row(1) = buf{2};
        %
        for kk = 1 : length(row)
            mmm = textscan( row{kk},'%f%f%f%d','Delimiter',',' ...
                          , 'CollectOutput',true );
            str = sprintf( 'm%05d', mmm{2} );
            if any( strcmp( str, fields(M) ) )
                M.(str) = cat( 1, M.(str), [ sec, mmm{1} ] );
            else
                M.(str) = [ sec, mmm{1} ];
            end
        end
    end
end
Here is a new version to test. The format of the data row is a bit tricky, since there is no explicit delimiter between the "row header" and the first marker data. It was easier to make and test the required changes than to explain :).
OptiTrack version 3.0
function M = OptiTrack( filespec )
    fid = fopen( filespec );
    cac = textscan( fid, '%s', 'Headerlines',44, 'Delimiter','\n' );
    fclose( fid );
    rows = cac{1};
    M = struct( 'RowHead', zeros(0,4) );
    for jj = 1 : length( rows ) % loop over all data rows
        %
        % Strip off the row header
        cac = textscan( rows{jj}, '%s%f%f%f%f%[^\n]' ...
                      , 'Delimiter',',', 'CollectOutput',true );
        M.RowHead = cat( 1, M.RowHead, cac{2} );
        sec = cac{2}(2);
        % split the rest of the row into cells of single marker
        marker_data = regexp( char( cac{3} ) ...
                            , '([\d\.\-]+,){4}Marker\-\d+,?', 'match' );
        for kk = 1 : length( marker_data ) % loop over all markers
            % split into numerical data and marker name
            marker = textscan( marker_data{kk} ,'%f%f%f%f%s' ...
                             , 'Delimiter' , ',' ...
                             , 'CollectOutput' , true );
            % make a short marker name, which is legal as a Matlab name
            mmm = textscan( char(marker{2}), '%*s%d', 'Delimiter','-' );
            str = sprintf( 'm%05d', mmm{1} );
            if any( strcmp( str, fields(M) ) )
                M.(str) = cat( 1, M.(str), [ sec, marker{1}(1:3) ] );
            else
                M.(str) = [ sec, marker{1}(1:3) ];
            end
        end
    end
end
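To see what the regular expression in version 3.0 picks out, here is a small sketch that runs the same pattern on a hand-made string. The string rest is a hypothetical row tail (the part after the row header) with invented values for two markers. The parsing inside the loop is simplified for the illustration: it uses sscanf and a regexp token instead of the textscan calls of the function.

```matlab
% Hypothetical row tail: x,y,z,id,name for two markers (values are made up)
rest = '0.10,0.20,0.30,1003,Marker-1003,-0.40,0.50,0.60,15238,Marker-15238';

% same pattern as in version 3.0: four numeric columns followed by the name
marker_data = regexp( rest, '([\d\.\-]+,){4}Marker\-\d+,?', 'match' );

names = cell( 1, length( marker_data ) );
xyz   = zeros( length( marker_data ), 3 );
for kk = 1 : length( marker_data )
    % simplified parsing: first three numbers are x, y, z
    xyz(kk,:) = sscanf( marker_data{kk}, '%f,', 3 ).';
    % the number embedded in the name defines the field name
    tok       = regexp( marker_data{kk}, 'Marker-(\d+)', 'tokens', 'once' );
    names{kk} = sprintf( 'm%05d', str2double( tok{1} ) );
end
disp( names )
disp( xyz )
```

Each element of marker_data holds the five columns of one marker, so the coordinates always come from the columns before the name.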

14 Comments

JH on 11 Nov 2015
Edited: per isakson on 12 Nov 2015
Dear Per Isakson,
thank you very much for your code. It almost does the job. The data of each trackable row contain comma-separated values in the following order:
  1. frame,
  2. frame number,
  3. time stamp,
  4. trackable rigid bodies (0 in our case),
  5. number of trackable markers,
  6. coordinates of the first marker (x,y,z),
  7. marker id number,
  8. marker name.
The last three items (5 columns) repeat for all the tracked markers.
Each line ends with a marker's name because its coordinates are given prior to its name (5th to 3rd column before its name).
Your code therefore reads the wrong coordinates for each marker (the coordinates after the marker's name instead of the coordinates given before the marker's name).
How should I change the code to work as I wish? Thank you very much for your help.
Jurij Hladnik
PS: I found that I need to split the lines 4 'columns' ahead of the word 'Marker-'. How can I specify this in the line row = regexp( cac{jj}, 'Marker-', 'split' ); ?
Great, now it works perfectly. Still, I'll need some time to understand the code. Thank you very much for your help.
JH
Hi, I have one more question. Now I get a structure containing markers with their time and location data (x,y,z):
RowHead: [506x4 double]
m01003: [506x4 double]
m01004: [506x4 double]
m01005: [506x4 double]
m01006: [506x4 double]
m15232: [3x4 double]
m15229: [3x4 double]
m15230: [506x4 double]
m15238: [143x4 double]
m15233: [268x4 double]
m15234: [382x4 double]
m15235: [60x4 double]
m15236: [143x4 double]
m15239: [3x4 double]
m15240: [3x4 double]
m15241: [164x4 double]
m15242: [2x4 double]
m15243: [4x4 double]
m15244: [2x4 double]
m15245: [6x4 double]
m15246: [111x4 double]
m15247: [29x4 double]
m15248: [43x4 double]...
How can I convert this data to a cell array while keeping the marker names (e.g. m15248, ...)?
If MarkerSets is a structure, MarkerSets_cell is a cell array, but with new names of their size. I want those names to be the same as those of the structure fields (structFields):
structFields=fieldnames(MarkerSets);
MarkerSets_cell = struct2cell(MarkerSets);
Thank you. Regards,
JH
per isakson on 12 Nov 2015
Edited: per isakson on 12 Nov 2015
  • "but with new names of their size" What do you mean?
  • Where do you want to store the names?
  • struct2cell puts the value of each field in a cell of a cell array.
  • Why do you want to have the data in a cell array? I cannot see any advantage over a structure.
One way to merge names and values is
filespec = 'H:\m\cssm\ReducedData.csv'
M = OptiTrack( filespec );
N = fields( M );
V = struct2cell( M );
MarkerSets_cell = cat( 2, N, V )
which outputs
MarkerSets_cell =
'RowHead' [506x4 double]
'm01003' [506x4 double]
'm01004' [506x4 double]
'm01005' [506x4 double]
'm01006' [506x4 double]
'm15232' [ 3x4 double]
...
However, the structure, M, will lead to better code, i.e. more robust and readable (IMO).
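For comparison, a sketch of what fetching one marker looks like in each representation. The structure M below is a small made-up stand-in for the output of OptiTrack, so the sizes and values are not from the csv-file.

```matlab
% Made-up stand-in for the structure returned by OptiTrack
M = struct( 'RowHead', zeros(2,4) ...
          , 'm01003',  ones(2,4) ...
          , 'm15232',  2*ones(1,4) );

N = fieldnames( M );
V = struct2cell( M );
MarkerSets_cell = cat( 2, N, V );        % one row per field: name, data

% cell array: search the name column, then index the data column
ix        = strcmp( MarkerSets_cell(:,1), 'm15232' );
from_cell = MarkerSets_cell{ ix, 2 };

% structure: the name itself is the index
from_struct = M.m15232;

disp( isequal( from_cell, from_struct ) )
```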
JH on 14 Nov 2015
Edited: JH on 14 Nov 2015
Dear Per Isakson,
I thought I could have the data stored in a cell array where the names of the cells are the same as the previous names of the fields. Your solution gives me two cells for each marker, one containing the name of the marker and the other its coordinates. I would like just one cell for each marker. I wanted such formulation because of another program, which uses the marker coordinates. However, I found out that it will be best to change that other program. Thank you for helping.
Regards, JH
"where the names of the cells are the same as the previous names of the fields" The cells of a cell array don't have any names. You can store a string, e.g. a name, in a cell. Character and numerical data cannot be stored in a single cell. However, a cell array can be stored in a cell of a cell array.
"a program written for ... such formulation of the cell array" To understand, I would need a sample of the input data for that program.
The minus, "-", introduces a little problem
>> genvarname( 'Marker-12345' )
ans =
Marker0x2D12345
thus I choose "m"
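A short sketch of the naming issue, assuming the marker name arrives as the string 'Marker-12345'. The variable names raw, short and alt are only for this illustration. The last line uses matlab.lang.makeValidName, which as far as I know requires R2014a or later, so it is an alternative rather than what the function above does.

```matlab
raw = 'Marker-12345';                    % name as it appears in the csv-file
disp( isvarname( raw ) )                 % false: the minus sign is not allowed

% approach used in OptiTrack: keep the number and prefix it with "m"
id    = sscanf( raw, 'Marker-%d' );
short = sprintf( 'm%05d', id );
disp( isvarname( short ) )               % true: usable as a field name

% alternative: let MATLAB replace the illegal character (default options)
alt = matlab.lang.makeValidName( raw );
disp( alt )
```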
There is just one more thing I would like to ask. It's a dumb one. I don't know how to perform a loop operation on structure elements: I would like to subtract the coordinates of the marker m01003 in the first row (MarkerSets.m01003(1,2:4)) from all the markers' coordinates (from all of their rows). Thank you.
JH
I missed your reply: "a program written for ... such formulation of the cell array" To understand, I would need a sample of the input data for that program. The program reads marker data stored in a three-level cell array instead of a two-level structure. Therefore I will need to change the program regardless of whether I use a cell array or a structure. Just for clarification, I am sending you sample data and the program for reading the data.
JH
No it's not dumb.
M = OptiTrack( 'ReducedData.csv' );
name_list = permute( fieldnames( M ), [2,1] );
% most matlabbers would write
% name_list = fieldnames(M)'; but not me
for name = name_list
    disp( name{:} )
end
outputs
RowHead
m01003
m01004
m01005
m01006
....
"subtract from all ... m01003"
Assumption: The coordinates of m01003 are constant, i.e. they don't vary with time. (If they vary it becomes a bit more complicated.)
M = OptiTrack( 'ReducedData.csv' );
name_list = permute( fieldnames( M ), [2,1] );
N = M;
r01003 = [ 0, M.m01003(1,2:4) ];
for name = name_list( 1, 3:end ) % exclude RowHead and m01003
    N.(name{:}) = bsxfun( @minus, M.(name{:}), r01003 );
end
Needs to be tested!
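One way to test it without the csv-file is to run the same loop on a small hand-made structure. The field values below are invented for the check, and the expected result was worked out by hand: the time column stays as it is and (1,2,3) is subtracted from the coordinates.

```matlab
% Invented data: column 1 is time, columns 2:4 are x, y, z
M = struct( 'RowHead', zeros(2,4) ...
          , 'm01003',  [ 0, 1, 2, 3 ; 1, 1, 2, 3 ] ...
          , 'm15232',  [ 0, 4, 6, 8 ; 1, 5, 7, 9 ] );

name_list = permute( fieldnames( M ), [2,1] );
N = M;
r01003 = [ 0, M.m01003(1,2:4) ];         % zero in front leaves the time column alone
for name = name_list( 1, 3:end )         % exclude RowHead and m01003
    N.(name{:}) = bsxfun( @minus, M.(name{:}), r01003 );
end

expected = [ 0, 3, 4, 5 ; 1, 4, 5, 6 ];
disp( isequal( N.m15232, expected ) )
```

Note that the loop relies on RowHead and m01003 being the first two fields, and that N.m01003 itself is left unshifted.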
Great, it works. I thought it was less complicated :) Thank you a lot!
Dear Per Isakson,
there is still one thing that's bothering me regarding your code for collecting marker coordinates. The name of the marker is defined by the number before the word 'Marker-#'. Instead of this I would like the number # in the word 'Marker-#' to define its name. How can I do that? Regards,
JH
per isakson on 19 Nov 2015
Edited: per isakson on 19 Nov 2015
  • Who came up with this file format in the first place?
  • "Instead of this I would like the number # in the word 'Marker-#'" You could have said that in your response to my first answer.
Thank you for the upgrade of the code. It works perfectly. I am sorry for not mentioning this wish at the beginning; unfortunately I only noticed the need later on. Regards,
JH

Asked:

JH
on 10 Nov 2015

Commented:

JH
on 23 Nov 2015