Why are some csv files imported incorrectly into my cell array?

Hi,
I have a cell array called alldata which contains the contents of 24 csv files. However, when importing these files I can see that the last five (for example, 5422_task.csv) have been imported incorrectly: the first column includes two values (separated by a comma) with an apostrophe in front.
alldata{1, 24}
ans =
1216×3 cell array
{'media_open; media_play; medi…'} {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.Ut…'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value,-100' } {1×1 missing } {1×1 missing }
{'Maximum Value,100' } {1×1 missing } {1×1 missing }
{'Number of Steps,9' } {1×1 missing } {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{'10.5,96.09' } {1×1 missing } {1×1 missing }
{'10.75,96.09' } {1×1 missing } {1×1 missing }
{'11,96.09' } {1×1 missing } {1×1 missing }
{'11.25,96.16375' } {1×1 missing } {1×1 missing }
{'11.5,96.45875' } {1×1 missing } {1×1 missing }
On the other hand, all the other csv files have been imported correctly, so that the two values that were separated by a comma end up in the first two columns (for example, 1311_task.csv).
alldata{1, 1}
ans =
682×3 cell array
{'media_open; media_play; medi…'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Q…'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
Any idea why this might be the case?
Thank you!

Accepted Answer

Voss on 13 Dec 2022
"Any idea why this might be the case?"
It's because the different files have commas and semicolons in different places, e.g. line 10 of 1311_task.csv looks like this:
1;0.78;
but line 10 of 5422_task.csv looks like this:
10.5,96.09;;
So in one file you've got a semicolon after each number, and in the other file a comma in between the numbers and two semicolons at the end of the line.
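A quick way to see this difference is to count the delimiters on each line. The sketch below hard-codes the two lines quoted above so it runs without the files; the variable names are only illustrative.

```matlab
% The two example lines quoted above, hard-coded for illustration.
sampleLines = ["1;0.78;"; "10.5,96.09;;"];

% Count how many commas and semicolons each line contains.
nCommas = count(sampleLines, ",");
nSemis  = count(sampleLines, ";");

% Show the counts side by side.
disp(table(sampleLines, nCommas, nSemis))
```

For the real files, something like readlines('5422_task.csv') (available in recent MATLAB releases) should give a string array that can be passed to count in the same way.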
I don't know what function(s) you're using to import the files, but here's an attempt to handle both of those situations with one piece of code:
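files = {'1311_task.csv' '5422_task.csv'};
C = cell(1,numel(files));
for ii = 1:numel(files)
    % Treat both ',' and ';' as delimiters, and join consecutive
    % delimiters (e.g. ';;') so they count as a single one.
    C{ii} = readcell(files{ii},'Delimiter',{',' ';'},'ConsecutiveDelimitersRule','join');
end
% Display the contents imported from each file.
C{:}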
ans = 682×3 cell array
{'media_open; media_play; media_end'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Quest-20220919-135434.mkv'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
ans = 1216×3 cell array
{'media_open; media_play; media_end,"2022/09/23 15:06:11:215' } {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.UtrechtUniversity.XRPS_Quest-20220923-142855.mkv"'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{[ 10.5000]} {[ 96.0900]} {1×1 missing }
{[ 10.7500]} {[ 96.0900]} {1×1 missing }
{[ 11]} {[ 96.0900]} {1×1 missing }
{[ 11.2500]} {[ 96.1637]} {1×1 missing }
{[ 11.5000]} {[ 96.4587]} {1×1 missing }
{[ 11.7500]} {[ 96.0900]} {1×1 missing }
{[ 12]} {[ 96.0900]} {1×1 missing }
As you can see there, the header info (lines 1-9) is not parsed the same between the two files, but the data section (lines 10-end) is, so maybe that's good enough?
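If only the data section matters, one possible way to pull it out of such a cell array is to keep the rows whose first two cells are numeric. This is just a sketch: the small cell array c below is a stand-in built from a few rows of the output above, not the result of reading the actual file, and the variable names are made up for illustration.

```matlab
% Stand-in for one element of C, using a few rows from the output above.
c = {'Second,"Rating"', missing, missing; ...
     '%%%%%%,"%%%%%%"', missing, missing; ...
     10.5,  96.09, missing; ...
     10.75, 96.09, missing; ...
     11,    96.09, missing};

% A row is treated as data if both of its first two cells are numeric.
isNum  = cellfun(@isnumeric, c(:,1:2));   % logical matrix, one row per row of c
isData = all(isNum, 2);

% Convert the data rows to a numeric matrix with columns [Second, Rating].
data = cell2mat(c(isData, 1:2));
```

The header rows are skipped automatically because their first cell is text, regardless of how the rest of the header line was split.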
lil brain on 14 Dec 2022
It seems that this error appears no matter what files I select. It is always the first file in the list though.
