textscan and CPU usage

Ari on 7 Oct 2017
Commented: per isakson on 7 Oct 2017
I used textscan to read a 0.5 GB CSV file. Sometimes it took less than a minute to complete, but at other times it took almost 10 minutes! When I compared the CPU usage in those cases, I noticed that it was high in the fast case (25% on a quad-core machine, so one full core) and low in the slow case (less than 5%). Has anybody else experienced this?
  3 Comments
Ari on 7 Oct 2017
I have 8 GB of RAM, and no, I didn't run anything else at the same time.
When it reads fast, the memory usage (in Task Manager, under Processes, for MATLAB) increases rapidly, and so does the CPU usage. When it reads slowly, the memory usage stays constant and the CPU usage stays low. What is strange is that if I first read the file with fileread, even just as a dummy read, and then run textscan (on the file, not on the string), it reads fast every time. I came across this 'trick' by looking at what importdata does. importdata also uses textscan, but instead of running textscan on the file directly, it first reads the file into a string (using fileread) and then runs textscan on the string.
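The two read patterns described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the original post: the temporary sample file, the two-column format string `'%f%f'`, and the variable names are all assumptions made so that the snippet runs on its own.

```matlab
% Hypothetical sample file; replace csvFile with the path to the real CSV.
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, '%d,%f\n', [1:1000; rand(1, 1000)]);
fclose(fid);

fmt = '%f%f';   % assumed layout: two numeric columns

% Variant A: dummy fileread first, then textscan on the file itself.
dummy = fileread(csvFile);   %#ok<NASGU> result is not used
fid = fopen(csvFile, 'r');
fromFile = textscan(fid, fmt, 'Delimiter', ',');
fclose(fid);

% Variant B: read the whole file into a char vector with fileread,
% then run textscan on that text (the pattern attributed to importdata above).
txt = fileread(csvFile);
fromText = textscan(txt, fmt, 'Delimiter', ',');

% Both variants should parse to the same columns.
assert(isequal(fromFile, fromText));
delete(csvFile);
```

Variant B holds the raw text and the parsed result in memory at the same time, which may matter for a 0.5 GB file on an 8 GB machine.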
per isakson on 7 Oct 2017
  • This large difference in speed makes me think of swapping, but that isn't likely with 8 GB of RAM. Did you try using the Resource Monitor to see what's going on?
  • I once tested I/O speed with some large files and had trouble reproducing the results. End of story: I concluded that my test "messed up" the system cache, which in turn increased the execution times, but certainly not by an order of magnitude.
  • I often use fileread in combination with textscan when the string needs some fixing before parsing. I was initially surprised that it is nearly as fast as reading the file with textscan.
  • Which versions of MATLAB and Windows do you use?
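One way to check how reproducible the timings are is to repeat both read patterns several times and compare the spread. The sketch below is an assumption-laden illustration: the small temporary file, the format string, and the run count are placeholders, and a file this small will not reproduce the minute-scale differences reported in the question.

```matlab
% Hypothetical sample file; point csvFile at the real CSV to test it.
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, '%d,%f\n', [1:20000; rand(1, 20000)]);
fclose(fid);

fmt   = '%f%f';   % assumed layout: two numeric columns
nRuns = 5;        % assumed number of repetitions
tFile = zeros(1, nRuns);
tText = zeros(1, nRuns);

for k = 1:nRuns
    % textscan directly on the file identifier
    t0 = tic;
    fid = fopen(csvFile, 'r');
    c1 = textscan(fid, fmt, 'Delimiter', ',');
    fclose(fid);
    tFile(k) = toc(t0);

    % fileread into memory, then textscan on the text
    t0 = tic;
    c2 = textscan(fileread(csvFile), fmt, 'Delimiter', ',');
    tText(k) = toc(t0);
end

fprintf('textscan on file:    min %.4f s, max %.4f s\n', min(tFile), max(tFile));
fprintf('fileread + textscan: min %.4f s, max %.4f s\n', min(tText), max(tText));
delete(csvFile);
```

Repeated runs on the same file are likely to be served from the operating system's file cache, so the first run after a reboot or after reading other large files may behave differently from later runs.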


Answers (1)

Kian Azami on 7 Oct 2017
Edited: per isakson on 7 Oct 2017
I recently heard that computation on a CPU is a highly nonlinear process, and that this is why you see different behavior every time. There are some publications that study the behavior of CPU computations.
Here is a YouTube link in which one of the prominent scientists talks about this issue. Worth a listen! https://www.youtube.com/watch?time_continue=17&v=iW2QJRDEBMw
