MATLAB crashes during MEX file generation with GPU Coder: Access violation detected
I attempted to generate CUDA code for the ALNS_insert function using GPU Coder. This function is invoked by TestScript and selects between two subfunctions based on the value of insert_operator:
- If insert_operator == 1, it calls MinCost_greedy_insert.
- If insert_operator == 2, it calls MinCost_regretK_insert.
function newS = ALNS_insert(PartialS, orders_removed, insert_operator, Order_inf, S_B, q_B, t_ib, B, C_UNIT) %#codegen
% ALNS_insert: Applies a specified insert operator to generate a new solution.
%
% Inputs:
%   PartialS        - Partial solution after removal.
%   orders_removed  - Orders that were removed.
%   insert_operator - Integer indicating the insert operator to use:
%                       1: MinCost_greedy_insert
%                       2: MinCost_regretK_insert
%
% Outputs:
%   newS - Updated solution with orders re-inserted.
if insert_operator == 1
    newS = MinCost_greedy_insert(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, C_UNIT);
    % newS = MinCost_greedy_insert_mex(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, C_UNIT);
else
    % Any value other than 1 falls through to the regret-K operator (K = 2).
    newS = MinCost_regretK_insert(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, 2, C_UNIT);
    % newS = MinCost_regretK_insert_mex(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, 2, C_UNIT);
end
end
The code executes successfully on the CPU. However, during GPU testing, MATLAB crashes during the MEX file generation phase, reporting an "Access violation detected" error in the crash logs.
Interestingly, when I use GPU Coder to generate MEX files for the two subfunctions individually and invoke the mex files with the same input data, they run without any issue. The crash only occurs when generating a MEX file with ALNS_insert as the entry-point function.
I have attached the system crash logs and the relevant source code files for your reference. I would greatly appreciate any suggestions you might have to help resolve this problem.
7 Comments
Gary Liu
on 27 Apr 2025
Hi Shi, I have confirmed that the issue is caused by a bug in GPU Coder. As a workaround, could you try regenerating the code with the attached updated regretK_values_update.m? Let me know if this resolves the issue. Thanks.
Shi
on 27 Apr 2025
Hi Shi, regarding your questions:
1. When and which functions should (or shouldn't) be inlined for GPU codegen?
By default, GPU Coder inlines every function into the entry-point function (in your case, ALNS_insert). GPU Coder then tries to map the vectorized operations and loops in "ALNS_insert" to GPU kernels.
When a function is not inlined into the entry-point function, GPU Coder generates kernels for that function only if it is tagged with "coder.gpu.kernelfun". Without the pragma, all operations within the function are executed on the CPU. See https://www.mathworks.com/help/gpucoder/ref/coder.gpu.kernelfun.html for details. Preventing inlining is generally discouraged because it risks missing parallelization opportunities.
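To illustrate the two pragmas mentioned above, here is a minimal sketch. The function scale_and_offset and its arguments are hypothetical and not part of your code; the point is only to show where coder.inline('never') and coder.gpu.kernelfun would go if you wanted a non-inlined function to still be considered for GPU kernels.

```matlab
function out = scale_and_offset(in, gain, offset) %#codegen
% Hypothetical helper, for illustration only (not from the original post).
coder.inline('never');   % keep this function separate from the entry-point function
coder.gpu.kernelfun;     % ask GPU Coder to map the operations below to GPU kernels
out = gain .* in + offset;
end
```

If the coder.gpu.kernelfun line were removed while coder.inline('never') stayed, the description above implies that the body would be generated as CPU code. Both pragmas only matter during code generation; when the function runs in MATLAB they have no effect.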
On the other hand, consider a function with a lightweight workload, e.g. sum(sorted_costs) in regretK_values_update.m, which operates on an array with no more than 2 elements. Applying coder.inline('never') so that CPU code is generated for sum avoids the performance overhead of invoking the GPU and, in this special case, also avoids triggering a bug in the GPU code generated for sum.
Do you need to update the other call sites of sum? Generally no, only if the bug is triggered again or if you are convinced that GPU Coder is generating a kernel for a lightweight sum.
The takeaway is that "coder.inline('never')" can be used to instruct GPU Coder to generate CPU-only code for certain operations. Although using "coder.inline('never')" is generally not recommended, it works in cases where GPU Coder is generating a trivial kernel for a lightweight operation.
Ideally, GPU Coder should automatically select which operations run on the CPU and which on the GPU for maximum performance. In cases where it fails to do so, "coder.inline('never')" can be used to correct the mistake.
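As a rough sketch of that pattern (this is not the actual content of the attached regretK_values_update.m, and the wrapper name small_sum_cpu is made up for illustration), a lightweight sum could be isolated like this:

```matlab
function s = small_sum_cpu(x) %#codegen
% Hypothetical wrapper, for illustration only.
% Not inlined and not tagged with coder.gpu.kernelfun, so per the
% explanation above the sum is expected to be generated as CPU code.
coder.inline('never');
s = sum(x);
end
```

A call such as sum(sorted_costs) would then become small_sum_cpu(sorted_costs). Whether a wrapper function or placing the pragma directly in an existing function is the better fit depends on how the surrounding code is organized.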
2. Which functions are safe (or recommended) to inline for maximum performance?
In general, functions with small input sizes and lightweight workloads. It's recommended to profile the generated code using gpuPerformanceAnalyzer to find the trivial GPU kernels and decide whether execution on the CPU is preferred. We provide tracing tools in gpuPerformanceAnalyzer that allow you to find the MATLAB code corresponding to the trivial kernels in the generated code.
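A profiling run might look roughly like the sketch below. It assumes that the workaround is in place so that code generation for ALNS_insert succeeds, and that the nine input variables already exist in your workspace with the same values TestScript passes in. Please check the gpuPerformanceAnalyzer reference page for the options available in your release.

```matlab
% Sketch only: the workspace variables below are assumed to be loaded already.
cfg = coder.gpuConfig('mex');   % GPU Coder configuration for a MEX target

% Inputs in the same order as the ALNS_insert signature.
inputs = {PartialS, orders_removed, insert_operator, Order_inf, ...
          S_B, q_B, t_ib, B, C_UNIT};

% Generates code, runs it, and opens the performance report.
gpuPerformanceAnalyzer('ALNS_insert', inputs, 'Config', cfg);
```

In the report, very short kernels that are launched many times are the usual candidates for moving back to the CPU.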
3. Where can I find official MathWorks documentation on controlling inlining and GPU vs. CPU code placement?
Controlling CPU/GPU code generation with inlining is a by-product of the effect of coder.gpu.kernelfun, which is documented at https://www.mathworks.com/help/gpucoder/ref/coder.gpu.kernelfun.html. GPU Coder only maps computation to the GPU if the function has coder.gpu.kernelfun, the only exception being the entry-point function.
Shi
on 28 Apr 2025
Gary Liu
on 28 Apr 2025
Glad to help. I wasn't aware of any locking on this post. I've now added my response as an official answer. Could you try accepting it again? If the issue persists, it might be a temporary glitch. Refreshing the page or trying a different browser could help. Let me know if you still encounter problems.
Accepted Answer