Memory Usage and Speed
Sorry for the double question, but I think the two are somewhat related. I am working on an application that loops through a set of files in a single folder. There are typically anywhere from 16,000 to upwards of 55,000 files in the folder, and they are always DICOM files. The loop retrieves a small bit of information from each file and saves it to a structure, which is eventually saved to a .mat file.
The app is running fairly slowly, in my opinion. It requires approximately 35 seconds per 100 files, which for 55,000 files comes out to about 320 minutes, or a bit over 5 hours, to run through just this section of the program. I think this is a bit too long, but maybe not, so I'd like your opinion on this speed. Does this sound right? I'm working on a Windows 7 64-bit machine with 16 GB of RAM.
Secondly, I looked at the Task Manager and noticed that MATLAB is using about 300,000 K (roughly 300 MB) of memory during the looping process. Is this normal?
3 Comments
dpb
on 10 Mar 2014
Well, given Sean's timings, it seems likely that there's something to look at. Even an outline of the basics could let folks see if there's anything obvious being done that could be the killer.
Again, did you get the list from dir first and have you run the profiler yet? The question about local/network is a very good one I hadn't thought about...
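A minimal sketch of that check, assuming the files match `*.dcm` in the current folder and using `dicominfo` as a stand-in for whatever the real loop does per file (neither detail is confirmed in the thread):

```matlab
% Sketch only: profiles a short run over the first files in the current folder.
files = dir('*.dcm');            % build the file list once, outside the loop
nTest = min(100, numel(files));  % a small sample is enough to find the hot spot
hdrs  = cell(nTest, 1);          % preallocate one cell per file

profile on
for k = 1:nTest
    hdrs{k} = dicominfo(files(k).name);   % stand-in for the real per-file work
end
profile off

stats = profile('info');   % the collected timing data as a struct
profile viewer             % opens the profile report
```

If most of the time lands in a single call, that call is the place to look first.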
Answers (5)
Sean de Wolski
on 10 Mar 2014
Are you actually reading in the files using dicomread()? If you are, perhaps the quicker dicominfo() could provide what you're looking for (meta info) without having to load all of the data.
Without knowing what you're doing, it's a shot in the dark whether the timings are reasonable.
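A minimal sketch of the header-only approach, assuming the files match `*.dcm` in the current folder. The fields `PatientID` and `SeriesDescription` and the output file name are placeholders, since the question does not say which information is needed:

```matlab
% Sketch only: folder, file pattern, field names and output name are assumptions.
folder = pwd;                              % assumed location of the DICOM files
files  = dir(fullfile(folder, '*.dcm'));   % get the file list once, up front
n      = numel(files);

% Preallocate the result so it does not grow inside the loop.
result = repmat(struct('name', '', 'PatientID', '', 'SeriesDescription', ''), n, 1);

for k = 1:n
    hdr = dicominfo(fullfile(folder, files(k).name));  % returns the metadata struct
    result(k).name = files(k).name;
    % Not every file carries every attribute, so check before copying.
    if isfield(hdr, 'PatientID')
        result(k).PatientID = hdr.PatientID;
    end
    if isfield(hdr, 'SeriesDescription')
        result(k).SeriesDescription = hdr.SeriesDescription;
    end
end

save(fullfile(tempdir, 'dicom_summary.mat'), 'result');  % placeholder output file
```

Keeping only the needed fields, rather than every full header, also keeps the saved structure small.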
Jeff
on 10 Mar 2014
2 Comments
Sean de Wolski
on 10 Mar 2014
I just ran dicominfo over 60 files and it took 1.3 s, on a wimpy Windows 7 x64 laptop.
Jeff
on 10 Mar 2014
2 Comments
Sean de Wolski
on 10 Mar 2014
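d = dir('*.dcm');
nfiles = numel(d);            % 60 files in this test folder
niter = 100;                  % repeat the pass to get 6000 reads in total
c = cell(niter*nfiles, 1);    % preallocate one cell per read
tic;
for kk = 1:niter
    for ii = 1:nfiles
        % give each read its own cell instead of overwriting cells 1:nfiles
        c{(kk-1)*nfiles + ii} = dicominfo(d(ii).name);
    end
end
toc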
6000 iterations in:
Elapsed time is 56.090968 seconds.
Jeff
on 11 Mar 2014
2 Comments
Sean de Wolski
on 11 Mar 2014
Do you run out of RAM?
How big is the array that you're storing this information in?
Cells and structs have some overhead, so perhaps you're hitting a memory limit that forces MATLAB to use swap space, which will be much slower.
Run your loop for a smaller number of iterations and report back with the output from:
whos
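As a rough sketch of how that output could be turned into a projection, assuming the storage grows linearly with the number of files. The variable `result`, its dummy contents and the run length of 500 are made up for illustration:

```matlab
% Sketch only: 'result' stands in for whatever variable holds the collected data.
nDone  = 500;      % files processed in the short run (assumed)
nTotal = 55000;    % upper end of the folder sizes mentioned in the question
result = repmat(struct('name', blanks(40), 'value', zeros(1, 8)), nDone, 1);  % dummy data

s = whos('result');                  % size, class and bytes for this one variable
mbNow  = s.bytes / 2^20;             % current footprint in MB
mbFull = mbNow * nTotal / nDone;     % linear extrapolation, a rough estimate only
fprintf('%d entries: %.2f MB, projected %.1f MB for %d files\n', ...
        nDone, mbNow, mbFull, nTotal);
```

A projection far below the installed RAM would suggest that swapping is not the cause of the slowdown.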
dpb
on 11 Mar 2014
...What I have seen is that once the folder gets above a certain number of files...
Which Windows version? Not sure if the Change Notification Handle thingie still hangs around with later versions or not--I've not upgraded so no experience.
Also, check whether your antivirus software is interfering.
MS has moved MSDN around so much I can't find anything any more, but at least in years gone by there definitely were issues with large subdirectories--how well they've been solved/resolved in later versions I don't know.
Jeff
on 11 Mar 2014
2 Comments
dpb
on 12 Mar 2014
What did the profiler show?
You've still not even shown an outline of what you are actually doing for a 2nd-set-of-eyes sanity check, much less the segment of actual code.
Is it all local storage or is it network access?
Did you check on the AV software interference issue?
As a last resort, what about "embarrassingly parallel" offloading to multiple processors?
I still have to think there's something unexplained going on that's a factor.
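A minimal sketch of the parallel option, assuming Parallel Computing Toolbox is available, the files match `*.dcm` in the current folder, and `PatientID` is the field wanted (a placeholder, not something stated in the thread). Each file is handled independently, which is what makes `parfor` applicable:

```matlab
% Sketch only: file pattern and field name are assumptions.
files = dir('*.dcm');
n     = numel(files);
names = {files.name};          % plain cell array of file names for the workers
ids   = cell(n, 1);            % sliced output: each iteration writes its own cell

parfor k = 1:n
    hdr = dicominfo(names{k});
    if isfield(hdr, 'PatientID')   % placeholder field
        ids{k} = hdr.PatientID;
    else
        ids{k} = '';
    end
end
```

If the bottleneck is disk or network access rather than CPU, extra workers may not help much, so profiling first is still worthwhile.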