MATLAB crashes during MEX file generation with GPU Coder: Access violation detected

I attempted to generate CUDA code for the ALNS_insert function using GPU Coder. This function is invoked by TestScript and selects between two subfunctions based on the value of insert_operator:
  • If insert_operator == 1, it calls MinCost_greedy_insert.
  • If insert_operator == 2, it calls MinCost_regretK_insert.
function newS = ALNS_insert(PartialS, orders_removed, insert_operator, Order_inf, S_B, q_B, t_ib, B, C_UNIT) %#codegen
% ALNS_insert: Applies a specified insert operator to generate a new solution.
%
% Inputs:
%   PartialS        - Partial solution after removal.
%   orders_removed  - Orders that were removed.
%   insert_operator - Integer indicating the insert operator to use:
%                       1: MinCost_greedy_insert
%                       2: MinCost_regretK_insert
%
% Outputs:
%   newS - Updated solution with orders re-inserted.
if insert_operator == 1
    newS = MinCost_greedy_insert(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, C_UNIT);
    % newS = MinCost_greedy_insert_mex(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, C_UNIT);
else
    % Any value other than 1 (documented as 2) selects the regret-K operator.
    newS = MinCost_regretK_insert(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, 2, C_UNIT);
    % newS = MinCost_regretK_insert_mex(PartialS, orders_removed, Order_inf, S_B, q_B, t_ib, B, 2, C_UNIT);
end
end
The code executes successfully on the CPU. However, during GPU testing, MATLAB crashes during the MEX file generation phase, reporting an "Access violation detected" error in the crash logs.
Interestingly, when I use GPU Coder to generate MEX files for the two subfunctions individually and invoke the mex files with the same input data, they run without any issue. The crash only occurs when generating a MEX file with ALNS_insert as the entry-point function.
I have attached the system crash logs and the relevant source code files for your reference. I would greatly appreciate any suggestions you might have to help resolve this problem.

7 Comments

Hi Shi, thanks for sharing. I’ve reproduced the issue in R2024b and am investigating it. I’ll keep you updated.
Hi Liu, thanks for your help, looking forward to your reply.
Hi Shi, I have confirmed that the issue is caused by a bug in GPU Coder. As a workaround, could you try regenerating the code with the attached updated regretK_values_update.m? Let me know if this resolves the issue. Thanks.
Hi Liu,
Awesome! After replacing my source files with the ones you provided, GPU Coder successfully generated the MEX file.
I looked through your revised code and saw that you made the following change:
% Original
regret_cost = sum(sorted_costs) - K_curr * sorted_costs(1); % Regret value
% Modified
regret_cost = sumWrapper(sorted_costs) - K_curr * sorted_costs(1); % Regret value
……
function s = sumWrapper(arr)
coder.inline('never');
s = sum(arr);
end
However, I’m not clear why this specific change fixed the crash. My understanding is that coder.inline('never') prevents the code generator from inlining sumWrapper into the generated code. Does this mean that the root cause is that GPU Coder inlines the built-in function sum during code generation? Yet I’ve used sum in other parts of my code without issue. Could you explain when and which functions should (or shouldn’t) be inlined during GPU code generation?
Although I can now build the MEX file successfully, its execution on the GPU is much slower than running in MATLAB. I suspect that I haven’t explicitly controlled which functions get generated for the GPU versus which remain on the CPU. Since coder.inline('never') appears to influence that, I would greatly appreciate any guidance on:
  1. Which functions are safe (or recommended) to inline for maximum performance?
  2. Where I can find official MathWorks documentation on controlling inlining and GPU vs. CPU code placement.
Thanks again for your help!
Hi Shi, regarding your questions:
1. When and which functions should (or shouldn't) be inlined for GPU codegen?
By default, GPU Coder inlines every function into the entry-point function (in your case, ALNS_insert). GPU Coder then tries to map the vectorized operations and loops in ALNS_insert to GPU kernels.
When a function is not inlined into the entry-point function, GPU Coder generates kernels for it only if it is tagged with coder.gpu.kernelfun. Without the pragma, all operations within the function are executed on the CPU. See https://www.mathworks.com/help/gpucoder/ref/coder.gpu.kernelfun.html for details. Preventing inlining is generally discouraged because of the risk of missing parallelization opportunities.
On the other hand, some functions have a lightweight workload, e.g. sum(sorted_costs) in regretK_values_update.m, which operates on an array with no more than 2 elements. Applying coder.inline('never') so that CPU code is generated for this sum avoids the performance overhead of invoking the GPU and, in this special case, also avoids triggering a bug in the GPU code generated for sum.
Do you need to update the other call sites of sum? Generally no, only if the bug is triggered again or if you are convinced that GPU Coder is generating a kernel for a lightweight sum.
The takeaway is that coder.inline('never') can be used to instruct GPU Coder to generate CPU-only code for certain operations. Although using coder.inline('never') is generally not recommended, it works in cases where GPU Coder is generating a trivial kernel for a lightweight operation.
Ideally, GPU Coder should automatically select which operations run on the CPU and which on the GPU for maximum performance. In cases where it fails to do so, coder.inline('never') can be used to correct the mistake.
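The placement rules above can be sketched as follows. This is a minimal illustration, not the original code: demoEntryPoint, cpuSum, and gpuScale are hypothetical names, and the sketch assumes GPU Coder is installed. When run directly in MATLAB, both pragmas have no effect, so the function behaves like ordinary MATLAB code.

```matlab
function out = demoEntryPoint(costs, bigVec) %#codegen
% Hypothetical entry point: operations here (and in inlined callees)
% are candidates for GPU kernels.
small = cpuSum(costs);     % tiny workload, kept on the CPU
big   = gpuScale(bigVec);  % larger element-wise workload
out   = small + sum(big);
end

function s = cpuSum(arr)
% Not inlined and no kernelfun pragma: per the explanation above,
% the generated code for this function is CPU-only.
coder.inline('never');
s = sum(arr);
end

function y = gpuScale(x)
% Not inlined, but tagged with coder.gpu.kernelfun, so GPU Coder
% may still map this function's operations to kernels.
coder.inline('never');
coder.gpu.kernelfun;
y = 2 .* x + 1;
end
```

Whether each helper actually ends up on the CPU or the GPU is best confirmed in the code generation report or with gpuPerformanceAnalyzer rather than assumed.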
2. Which functions are safe (or recommended) to inline for maximum performance?
In general, the candidates for CPU execution (that is, for coder.inline('never')) are functions with small input sizes and lightweight workloads; other functions are best left inlined. It is recommended to profile the generated code using gpuPerformanceAnalyzer to find the trivial GPU kernels and decide whether execution on the CPU is preferable. gpuPerformanceAnalyzer provides tracing tools that let you find the MATLAB code corresponding to the trivial kernels in the generated code.
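For illustration, a profiling call might look like the sketch below. It assumes the example inputs from TestScript are already in the workspace under the names used in the ALNS_insert signature, and it uses only the basic two-argument form of gpuPerformanceAnalyzer; the available options are listed on its documentation page.

```matlab
% Sketch: profile the generated GPU code for the entry-point function.
% Assumption: the nine example inputs already exist in the workspace.
inputs = {PartialS, orders_removed, insert_operator, Order_inf, ...
          S_B, q_B, t_ib, B, C_UNIT};

% Generates code for ALNS_insert, runs it, and collects CPU/GPU
% activity data for the analyzer report.
gpuPerformanceAnalyzer('ALNS_insert', inputs);
```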
3. Where I can find official MathWorks documentation on controlling inlining and GPU vs. CPU code placement.
Controlling CPU/GPU code generation with inlining is a by-product of the effect of coder.gpu.kernelfun, which is documented at https://www.mathworks.com/help/gpucoder/ref/coder.gpu.kernelfun.html. GPU Coder maps computation to the GPU only if the function has coder.gpu.kernelfun, the only exception being the entry-point function.
Thank you very much for your prompt and detailed reply. I would like to accept your answer to help others who encounter similar problems, but I found that this question is locked. How can I unlock it?
Glad to help. I wasn't aware of any locking on this post. I've now added my response as an official answer. Could you try accepting it again? If the issue persists, it might be a temporary glitch. Refreshing the page or trying a different browser could help. Let me know if you still encounter problems.

 Accepted Answer

The crash is due to a bug in GPU Coder. To work around it, apply coder.inline('never') to prevent GPU code generation for sum(sorted_costs). For more details, refer to the comment section.

More Answers (0)

Release: R2024b

Asked: Shi on 23 Apr 2025

Commented: on 28 Apr 2025
