Evaluate GPU reshape performance
Hello all,
I am considering using GPU processing to rearrange my data, which is acquired in real time via PCI Express cards. I would like the community's help in evaluating whether a new GPU would really help, or whether this kind of operation is simply not suited to GPU processing.
Below is code that sends data of various sizes to the GPU, calls reshape on it, and compares the speed with that of the CPU. I can choose whatever buffer chunk size I need to optimize the ratio of GPU transfer time to calculation time, which is why the code tests various sizes. I am looking for people whose CPU is roughly equivalent to mine but whose GPU is much better.
The code just clears the command window and dumps the results there, so copying and pasting the whole output would be sufficient.
My system: Core i7-2700K CPU @ 3.4 GHz (4 cores, 8 threads). GPU: Nvidia 560 2 GB 256-bit GDDR5, 336 CUDA cores, 850 MHz core clock. Ideally, I am hoping someone with a similar CPU and a GPU with many more CUDA cores can send me their results. I am using MATLAB R2012a.
Thanks!
Shane,
clc
for a = 0:4:128
    b = uint16(1:(512*4096*2*a));
    CPU_Time = 0; % accumulated over the repeats for this data size
    GPU_Time = 0;
    for repeat = 1:10 % Average over 10 runs.
        tStart = tic;
        RawData = reshape(b, [2,4,512,512,2,a]);
        CPU_Time = CPU_Time + toc(tStart);
        tStartB = tic;
        RawDataGPU = gpuArray(reshape(b, [2,4,512,512,2,a]));
        GPU_Time = GPU_Time + toc(tStartB);
    end % end repeat loop
    a
    GPU_CPU_Ratio = ( (GPU_Time/10) / (CPU_Time/10))
end % end main loop selecting data size
4 Comments
Jill Reese
on 5 Jul 2012
You are not performing any work on the GPU at all; is that your intent? Both calls to reshape occur on the CPU. RawDataGPU is just transferring the reshaped array to the GPU; the reshape was performed on the CPU.
Furthermore, execution on the GPU happens asynchronously in R2012a. In order to get accurate timings you need to use the wait command like so:
g = gpuDevice();
tic
%do stuff on the gpu
wait(g)
toc
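As a rough sketch of what that could look like for this particular test (assuming reshape is supported for gpuArray inputs in your release, and using a = 4 purely as an example size), the data is transferred first and only the reshape on the GPU is timed:

```matlab
g = gpuDevice();                      % currently selected GPU device
a = 4;                                % example chunk size, chosen arbitrarily
b = uint16(1:(512*4096*2*a));         % same test data as in the question

bGPU = gpuArray(b);                   % transfer to the GPU outside the timed region
wait(g);                              % make sure the transfer has finished

tic
RawDataGPU = reshape(bGPU, [2,4,512,512,2,a]);  % reshape applied to the gpuArray
wait(g)                               % block until the GPU is idle before stopping the timer
GPU_Time = toc;

fprintf('GPU reshape: %g s\n', GPU_Time);
```

Timing the gpuArray(b) call separately in the same way would show how much of the original measurement was transfer time.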
Answers (1)
Jan
on 6 Jul 2012
Edited: Jan
on 7 Jul 2012
RESHAPE does not perform any operations on the data. You can check this by timing the reshape of a 10x10 and a 1000x1000 matrix to, e.g., row vectors on the CPU: both take the same time, because in the first case MATLAB only writes the dimensions [1, 100] as a UINT64 vector to the header of the variable, and in the second case [1, 1e6]. Consequently, it does not matter whether this is done on the CPU or the GPU, because writing 16 bytes takes negligible time.
I'm not sure whether the header of the variable is copied to the GPU at all, or whether only the data are sent.
In contrast to RESHAPE, the commands TRANSPOSE and PERMUTE do touch the data.
BTW: uint16(1:(512*4096*2*a)) creates a DOUBLE vector first and converts it to UINT16 afterwards. Note that UINT16 saturates at 65535, so every element beyond the 65535th has the value 65535. For the same reason uint16(1):uint16(512*4096*2*a) is not a drop-in replacement: it avoids the DOUBLE temporary, but it stops after 65535 elements. If only the size of the test data matters, zeros(1, 512*4096*2*a, 'uint16') allocates a UINT16 vector directly.
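A minimal sketch of that check, assuming nothing beyond base MATLAB; the variable names and the repeat count are arbitrary, and the absolute times will vary between machines:

```matlab
A = rand(10);                 % 10x10 matrix
B = rand(1000);               % 1000x1000 matrix
nRuns = 1000;                 % repeat to get a measurable time

tic
for k = 1:nRuns
    a = reshape(A, 1, []);    % 1x100 row vector
end
tSmall = toc / nRuns;

tic
for k = 1:nRuns
    b = reshape(B, 1, []);    % 1x1e6 row vector
end
tLarge = toc / nRuns;

fprintf('10x10: %g s, 1000x1000: %g s per reshape\n', tSmall, tLarge);
```

If RESHAPE copied the data, the second loop should take on the order of 10,000 times longer than the first.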
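A small example of the difference: RESHAPE keeps the elements in the same linear (column-major) order, while PERMUTE rearranges them in memory.

```matlab
A = reshape(1:6, 2, 3);          % A = [1 3 5; 2 4 6]
R = reshape(A, 3, 2);            % same memory order, new dimensions
P = permute(A, [2 1]);           % transposed: elements are moved

sameOrderR = isequal(R(:), A(:)); % true: linear order unchanged
sameOrderP = isequal(P(:), A(:)); % false: P(:) is [1;3;5;2;4;6]
```

So a GPU benchmark that is meant to measure data movement would need PERMUTE or TRANSPOSE rather than RESHAPE.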
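One caveat that is easy to verify: UINT16 saturates at 65535, so the two expressions do not produce the same vector once the upper limit exceeds that value. A short check, with N = 100000 chosen only as an example:

```matlab
N = 100000;                        % example length above intmax('uint16')

v1 = uint16(1:N);                  % DOUBLE vector 1..N, then converted (values saturate)
v2 = uint16(1):uint16(N);          % uint16(N) is already 65535, so the range ends there
v3 = zeros(1, N, 'uint16');        % allocates UINT16 directly, all zeros

n1 = numel(v1);                    % N elements, the last ones all equal to 65535
n2 = numel(v2);                    % only 65535 elements
n3 = numel(v3);                    % N elements
```

For a pure timing test the contents do not matter, so allocating with zeros and the 'uint16' class argument avoids the DOUBLE temporary while keeping the intended size.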
[EDITED, minor changes, Jan]
0 Comments