I have a large CSV file that I want to plot

I have a large CSV file, and when I try to open it in MATLAB to plot it I run out of memory. I tried 'tabularTextDatastore', but I don't know how to plot the data after selecting the variable names and formats; it seems like a datastore is only meant for reading and displaying data, because I can't find anything on modifying the data, let alone plotting it. The problem is that I have 6 columns of seemingly endless data and I'm plotting a 4D graph. The original idea was to interpolate, but since the data is very dense I won't need to interpolate; I can just create a mesh grid and call 'griddedInterpolant'. How can I plot each point from my CSV file without running out of memory?
Update: I looked into tall arrays and that seems doable, but I don't think griddedInterpolant supports tall arrays.
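Would reading the datastore in chunks and plotting each chunk as it is read be an option? A rough, untested sketch of what I mean (the 6-numeric-column layout and file name are from my data; the chunk size is just a guess):
ds = tabularTextDatastore('data_632.0_43ddd.csv', 'Delimiter', ',');
ds.SelectedFormats = repmat({'%f'}, 1, numel(ds.SelectedVariableNames));  % treat all 6 columns as numeric
ds.ReadSize = 100000;                     % rows per chunk; tune to the available RAM
figure; hold on
while hasdata(ds)
    chunk = read(ds);                     % table holding one chunk of rows
    plot3(chunk{:,1}, chunk{:,2}, chunk{:,3}, '.')   % pick whichever columns are needed
end
hold off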
  6 Comments
Walter Roberson on 26 Jul 2018
Which MATLAB release are you using? And are you using 32 bit or 64 bit? How much RAM do you have?
fid = fopen('data_632.0_43ddd.csv','rt');
% read the six numeric columns into one N-by-6 double matrix
data = cell2mat(textscan(fid, '%f%f%f%f%f%f', ...
    'HeaderLines', 1, 'Delimiter', ',', 'CollectOutput', true));
fclose(fid);
For your 50 megabyte file, the result would be on the order of 45 megabytes of data storage.
You cannot use single precision for your first column without losing some of the information you have stored. For example, in single precision, 0.994271503 would become approximately 0.994271517.
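A quick way to see this kind of rounding for yourself (illustrative only; the exact digits depend on the value):
format long
x  = 0.994271503;      % double-precision value from the first column
xs = single(x);        % nearest representable single-precision value
double(xs) - x         % rounding error introduced (on the order of 1e-8 here)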
Leen Almadani on 26 Jul 2018
I'm using the latest version (R2018a); I have access to other versions if needed. My laptop is 64-bit and has 15.9 GB of RAM.


Answers (2)

Sushant Mahajan on 26 Jul 2018
Edited: Sushant Mahajan on 26 Jul 2018
The fact that your data file is only 9 MB after zipping tells me there is significant RAM wastage here. I can suggest ways to reduce your RAM usage so that more of it is available for plotting:
Read the .csv file into MATLAB. You can read it line by line using fopen() and fgets() if you run out of memory while reading the whole file at once. Then store all the variables you need in binary format on your hard drive (see fwrite()).
This binary file should be several times smaller than your .csv. Restart MATLAB to reduce its memory usage ("clear all" does not actually release all the RAM MATLAB is using), load the variables from the binary file, and try plotting again.
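A minimal sketch of that CSV-to-binary conversion (assuming the 6 comma-separated numeric columns and the file name mentioned elsewhere in this thread; adjust the format and precision to your data):
fin  = fopen('data_632.0_43ddd.csv', 'rt');
fout = fopen('data_632.0_43ddd.bin', 'w');
fgetl(fin);                                        % skip the header line
while true
    line = fgetl(fin);
    if ~ischar(line), break; end                   % fgetl returns -1 at end of file
    fwrite(fout, sscanf(line, '%f,').', 'single'); % single precision halves the file size
end
fclose(fin); fclose(fout);
% Later (e.g. after restarting MATLAB) read it back as an N-by-6 matrix:
fbin = fopen('data_632.0_43ddd.bin', 'r');
data = fread(fbin, [6, Inf], 'single=>single').';
fclose(fbin);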
Other considerations for reducing memory usage:
1. When you create the binary file, consider storing the data in single precision instead of double precision (the default). This alone will cut your memory usage in half. Use it only if that extra precision does not matter in your plots/results.
2. Open your Task Manager (Windows) or System Monitor (Linux) and check which apps are using the most RAM; most likely it is your web browser. Free as much RAM as you can by closing unnecessary apps, then restart MATLAB and try plotting again.
  3 Comments
Walter Roberson on 26 Jul 2018
I calculate that the bin file would be about 89% of the size of the csv for your sample data. I don't think it would be worth going that route.
Sushant Mahajan on 1 Aug 2018
OK, can you let us know the size of your .csv file and how many data points you are trying to plot?
Am I correct in understanding that you can load the complete data file into memory, and only run out of memory when you try to interpolate? Can you also post the error message you get?



KSSV on 26 Jul 2018
Are you looking for something like this?
T = readtable('data_632.0_43ddd.csv') ;
idx = ~isnan(T.(1)) ;          % keep only rows where the first column is not NaN
t = T.(1)(idx) ;               % first column  (taken as one location coordinate)
p = T.(2)(idx) ;               % second column (taken as the other location coordinate)
uoph = T.(3)(idx) ;            % third column  (value plotted at each location)
%
dt = delaunayTriangulation(t,p) ;
tri = dt.ConnectivityList ;
trisurf(tri,t,p,uoph) ; view(2) ; shading interp
I have taken the first and second columns as the location.
  1 Comment
Walter Roberson on 26 Jul 2018
Note that they indicated they were running out of memory. It turns out that the table version requires about 3 1/3 times as much storage as keeping the data purely numeric, as with the textscan call I posted.
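For anyone who wants to check this on their own file, the two representations can be compared directly (illustrative only; byte counts depend on the data, and you may want to try it on a subset if memory is tight):
T = readtable('data_632.0_43ddd.csv');                      % table, one variable per column
fid = fopen('data_632.0_43ddd.csv', 'rt');
A = cell2mat(textscan(fid, '%f%f%f%f%f%f', ...
    'HeaderLines', 1, 'Delimiter', ',', 'CollectOutput', true));  % plain N-by-6 double matrix
fclose(fid);
whos T A                                                    % compare the Bytes column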

