MATLAB Answers

How to import some large data please

Mate 2u on 24 Dec 2013
Hi all, I have a file called DJ.csv which has 5 columns: 1) dates (01/02/2007), 2) times (30.42.0), 3) prices (12553, 12442), 4) codes (DJ123) and 5) trade size.
I want to import columns 3 and 5 (price and trade size) into MATLAB. I am having some trouble because the CSV is quite big.
I tried this:
fileID = fopen('K:\test\test\DJ.csv');
A = fread(fileID,'double');
fclose(fileID);
But it only gives me a vector of values which are not the same as my data. Any help would be very much appreciated.
Thanks.

  1 Comment

Mate 2u on 24 Dec 2013
As a note, importdata works, but it is not suitable for very large files.
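For a file too large to read comfortably in one call, one option is to let textscan read a fixed number of rows at a time. The following is only a sketch under assumptions: the temporary file and its rows stand in for DJ.csv (rows copied from the sample posted later in this thread), and chunkRows and the variable names are made up for illustration.

```matlab
% Stand-in for the real file: a few rows written to a temporary CSV.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ...
    '01/02/2007,00:15:00.000,12540,DJH07,1', ...
    '01/02/2007,00:21:58.000,12541,DJH07,1', ...
    '01/02/2007,00:22:50.000,12541,DJH07,1', ...
    '01/02/2007,00:30:42.000,12545,DJH07,1', ...
    '01/02/2007,01:11:31.000,12553,DJH07,2');
fclose(fid);

chunkRows = 2;      % tiny on purpose; something like 1e6 for a real file
price = [];
tradeSize = [];

fid = fopen(fname, 'r');
while ~feof(fid)
    % Apply the format at most chunkRows times, i.e. read that many rows.
    c = textscan(fid, '%*s%*s%f%*s%f', chunkRows, 'Delimiter', ',');
    if isempty(c{1})
        break;      % nothing more could be parsed; avoid looping forever
    end
    price = [price; c{1}];          %#ok<AGROW>
    tradeSize = [tradeSize; c{2}];  %#ok<AGROW>
end
fclose(fid);
delete(fname);
```

Here the chunks are simply concatenated; if even two numeric columns are too much to hold at once, each chunk could be processed and discarded inside the loop instead.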


Accepted Answer

dpb on 25 Dec 2013
Edited: dpb on 26 Dec 2013
fread is for unformatted (binary) stream files; you have a formatted, delimited text file. See
doc textscan % and friends
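To illustrate why fread returned values unrelated to the data, here is a small sketch. It assumes a one-line temporary file as a stand-in for DJ.csv (the row is taken from the sample posted in the comments); the variable names are illustrative only.

```matlab
% Write one CSV row to a temporary file.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '01/02/2007,00:15:00.000,12540,DJH07,1');
fclose(fid);

fid = fopen(fname, 'r');
% Each group of 8 raw bytes of text is reinterpreted as one binary double,
% so the resulting numbers have nothing to do with the prices in the file.
raw = fread(fid, 'double');
frewind(fid);
% The same bytes read as characters give back the original text.
txt = fread(fid, '*char')';
fclose(fid);
delete(fname);

disp(raw);
disp(txt);
```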
If you really only want/need the two columns, something like the following (air code, untested)
[p,s]=textread('K:\test\test\DJ.csv','%*s%*s%f%*s%f','delimiter',',');
ought to do, unless the third column really does use a comma as the decimal point in a file that is also comma-delimited. In that case you've got a problem: you'll have to read three values instead of just two, preprocess the file, or otherwise handle the decimal separator, as MATLAB can't (and you can't expect it to) know the difference between a comma used as a delimiter and a comma used as a decimal separator.
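A minimal sketch of the "read three values instead of two" idea, purely hypothetical: it assumes every price is written with a decimal comma (so each row splits into six fields rather than five), and the two sample rows are invented for illustration, not taken from the real file.

```matlab
% Invented sample rows where the price uses a decimal comma (12553,42).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ...
    '01/02/2007,00:15:00.000,12553,42,DJH07,1', ...
    '01/02/2007,00:21:58.000,12442,07,DJH07,2');
fclose(fid);

fid = fopen(fname, 'r');
% Fields: date, time, integer part, fractional digits, code, trade size.
% The fractional digits are read as text so a leading zero is not lost.
c = textscan(fid, '%*s%*s%f%s%*s%f', 'Delimiter', ',');
fclose(fid);
delete(fname);

fracPart  = str2double(strcat('0.', c{2}));  % '42' -> 0.42, '07' -> 0.07
price     = c{1} + fracPart;
tradeSize = c{3};
```

This only works if the decimal comma is present on every row; a file mixing both forms would need preprocessing instead.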

  7 Comments

Mate 2u on 2 Jan 2014
Hi there. There is no header row, just data. Opened in Notepad, the data looks like this:
01/02/2007,00:15:00.000,12540,DJH07,1
01/02/2007,00:21:58.000,12541,DJH07,1
01/02/2007,00:22:50.000,12541,DJH07,1
01/02/2007,00:30:42.000,12545,DJH07,1
01/02/2007,01:11:31.000,12553,DJH07,2
01/02/2007,01:51:48.000,12554,DJH07,2
01/02/2007,02:13:30.000,12554,DJH07,1
01/02/2007,02:16:14.000,12554,DJH07,3
01/02/2007,02:21:40.000,12554,DJH07,1
01/02/2007,02:26:48.000,12558,DJH07,1
01/02/2007,02:50:44.000,12555,DJH07,1
01/02/2007,03:14:57.000,12557,DJH07,1
01/02/2007,03:22:41.000,12559,DJH07,1
Each data entry is on its own line in the Notepad file. Thanks so much for your help.
Walter Roberson on 3 Jan 2014
datacell = textscan(fid,'%*s%*s%f%*s%f','delimiter',',');
The previous version had a stray comma in the format.
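For completeness, a self-contained sketch of that call with the file handling around it. The temporary file is an assumption standing in for DJ.csv (rows copied from the sample above), and the variable names price and tradeSize are illustrative.

```matlab
% Stand-in for DJ.csv: three rows from the sample data, no header row.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ...
    '01/02/2007,00:15:00.000,12540,DJH07,1', ...
    '01/02/2007,00:21:58.000,12541,DJH07,1', ...
    '01/02/2007,01:11:31.000,12553,DJH07,2');
fclose(fid);

fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
% %*s skips a text field (date, time, code); %f reads a numeric field.
datacell = textscan(fid, '%*s%*s%f%*s%f', 'Delimiter', ',');
fclose(fid);
delete(fname);

price     = datacell{1};   % column 3 of the file
tradeSize = datacell{2};   % column 5 of the file
```

For the real data, fname would be replaced by the path from the question, 'K:\test\test\DJ.csv', and the block that writes the temporary file (and the delete call) dropped.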
Mate 2u on 5 Jan 2014
Thank you, worked well.
