About the DeLong test
Takeharu Kiso
on 30 Sep 2024
Edited: Sandeep Mishra
on 1 Oct 2024
Dear Professors,
I apologize for bothering you during your busy schedules. The program below reads three files and performs ROC analysis with machine learning to calculate three AUCs. We have tried to write a program that tests whether the differences among these AUCs are statistically significant, but despite multiple attempts we keep running into errors that prevent progress. In particular, I am having a lot of trouble implementing the DeLong test. The program is attached below; I would greatly appreciate it if you could add the necessary modifications to it. Thank you very much for your assistance.
Best regards,
% Specify the paths to the CSV files
filePaths = {'C:\Users\rms56\Desktop\1-491B.csv', ...
    'C:\Users\rms56\Desktop\1-491C.csv', ...
    'C:\Users\rms56\Desktop\1-491D.csv'};
% Create cell arrays to hold X and y
X_all = cell(3, 1); % cell array for the features X
y_all = cell(3, 1); % cell array for the labels y
% Read the three files in order and assign X and y
for i = 1:3
    % Read the CSV file
    data = readmatrix(filePaths{i});
    % Select the X and y columns for each file
    if i == 1 % '1-491B.csv': columns 1-3 are X, column 4 is y
        X_all{i} = data(:, 1:3); % columns 1-3 as X
        y_all{i} = data(:, 4); % column 4 as y
    elseif i == 2 % '1-491C.csv': columns 1-6 are X, column 7 is y
        X_all{i} = data(:, 1:6); % columns 1-6 as X
        y_all{i} = data(:, 7); % column 7 as y
    elseif i == 3 % '1-491D.csv': columns 1-3 are X, column 4 is y
        X_all{i} = data(:, 1:3); % columns 1-3 as X
        y_all{i} = data(:, 4); % column 4 as y
    end
end
% Run the analysis for each file in a loop
for fileIndex = 1:3
    % Get the data for this file
    X = X_all{fileIndex}; % features
    y = y_all{fileIndex}; % labels
    % Set up cross-validation
    k = 5; % number of folds
    cv = cvpartition(y, 'KFold', k); % cross-validation partition
    accuracy = zeros(k, 1); % accuracy of each fold
    % Train and test on each fold
    for i = 1:k
        trainIdx = training(cv, i);
        testIdx = test(cv, i);
        % Split the data
        XTrain = X(trainIdx, :);
        yTrain = y(trainIdx, :);
        XTest = X(testIdx, :);
        yTest = y(testIdx, :);
        % Train the SVM model
        model = fitcsvm(XTrain, yTrain, ...
            'KernelFunction', 'polynomial', ...
            'PolynomialOrder', 2, ...
            'KernelScale', 'auto', ...
            'BoxConstraint', 1, ...
            'Standardize', true);
        % Predict the test set with the model
        [predictions, score] = predict(model, XTest);
        % Compute the accuracy of the current fold
        accuracy(i) = sum(predictions == yTest) / length(yTest);
        fprintf('File %d - Fold %d Accuracy: %.2f%%\n', fileIndex, i, accuracy(i) * 100);
    end
    % Compute the mean accuracy over all folds
    averageAccuracy = mean(accuracy);
    fprintf('File %d - Average Accuracy: %.2f%%\n', fileIndex, averageAccuracy * 100);
    % Compute the ROC curve and AUC
    % Note: yTest, score, and predictions hold only the last fold at this point,
    % so everything below describes that fold, not all rows of the file.
    % Xroc and Yroc are kept because the ROC plot further down needs them.
    [Xroc, Yroc, ~, AUC_final] = perfcurve(yTest, score(:, 2), 1);
    % Compute the confidence interval by bootstrapping
    nBoot = 1000; % number of bootstrap iterations
    [AUC_bootstrap, CI_final] = bootstrapAUC(yTest, score(:, 2), nBoot);
    % Compute the confusion matrix (rows: true class, columns: predicted class)
    confusionMatrix_final = confusionmat(yTest, predictions);
    tn = confusionMatrix_final(1, 1);
    fp = confusionMatrix_final(1, 2);
    fn = confusionMatrix_final(2, 1);
    tp = confusionMatrix_final(2, 2);
    % Compute the metrics
    sensitivity_final = tp / (tp + fn);
    specificity_final = tn / (tn + fp);
    ppv_final = tp / (tp + fp);
    npv_final = tn / (tn + fn);
    accuracy_final = (tp + tn) / sum(confusionMatrix_final(:));
    % Display the results
    fprintf('File %d - Final AUC: %.2f\n', fileIndex, AUC_final);
    fprintf('File %d - Bootstrap 95%% confidence interval: [%.2f, %.2f]\n', fileIndex, CI_final(1), CI_final(2));
    fprintf('Sensitivity: %.2f\n', sensitivity_final);
    fprintf('Specificity: %.2f\n', specificity_final);
    fprintf('Positive predictive value: %.2f\n', ppv_final);
    fprintf('Negative predictive value: %.2f\n', npv_final);
    fprintf('Diagnostic accuracy: %.2f\n', accuracy_final);
    % Plot the ROC curve (perfcurve returns the false positive rate on the x-axis)
    figure;
    plot(Xroc, Yroc, 'b-', 'LineWidth', 2);
    xlabel('1 - Specificity (false positive rate)');
    ylabel('Sensitivity');
    title(sprintf('File %d - ROC curve', fileIndex));
    grid on;
    % Plot the confusion matrix
    figure;
    confusionchart(confusionMatrix_final, {'Negative', 'Positive'}, 'RowSummary', 'row-normalized', ...
        'ColumnSummary', 'column-normalized');
    title(sprintf('File %d - Confusion matrix', fileIndex));
end
% Function that computes bootstrap AUC values and their confidence interval
function [AUC, CI] = bootstrapAUC(yTrue, scores, nBoot)
    % Initialize
    AUC = zeros(nBoot, 1);
    for i = 1:nBoot
        idx = randi(length(yTrue), [length(yTrue), 1]); % resample with replacement
        yBoot = yTrue(idx);
        scoresBoot = scores(idx);
        [~, ~, ~, AUC(i)] = perfcurve(yBoot, scoresBoot, 1); % compute the AUC
    end
    % Compute the confidence interval
    CI = prctile(AUC, [2.5 97.5]); % 95% confidence interval
end
5 Comments
Sandeep Mishra
on 1 Oct 2024
Edited: Sandeep Mishra
on 1 Oct 2024
I will post the same as an answer so that the question can be marked as answered.
Please feel free to reach out if you have any further questions or encounter any issues.
Accepted Answer
Sandeep Mishra
on 1 Oct 2024
Edited: Sandeep Mishra
on 1 Oct 2024
Hi Takeharu,
I executed the code snippet in MATLAB R2023a and encountered the same error you mentioned.
Debugging the code shows a size mismatch inside the 'delongTest' function. Specifically, 'scores_all{i}' has a size of 98x1, while 'y_all{i}' has a size of 491x1, for each index 'i'. This discrepancy causes an error when 'v1(posIdx)' is evaluated inside the 'delongTest' function.
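For reference, here is a minimal sketch of a paired DeLong test for two AUCs that checks the input lengths up front, so a mismatch like the one above is reported immediately. The function name 'delongPaired' and the synthetic data are hypothetical and are not the 'delongTest' from your program. The sketch assumes labels coded as 0/1 and two score vectors that refer to the same observations in the same order.

```matlab
% Usage on synthetic data (hypothetical; replace with your labels and scores)
rng(1);
y  = [ones(40, 1); zeros(60, 1)];        % 1 = positive, 0 = negative
s1 = randn(100, 1) + 1.0 * y;            % scores of model 1
s2 = s1 + 0.5 * randn(100, 1);           % correlated scores of model 2
[z, p, auc] = delongPaired(y, s1, s2);
fprintf('AUC1 = %.3f, AUC2 = %.3f, z = %.3f, p = %.4f\n', auc(1), auc(2), z, p);

function [z, p, auc] = delongPaired(y, s1, s2)
% Paired DeLong test for the difference between two correlated AUCs.
% y: labels (1 = positive, 0 = negative); s1, s2: scores for the same rows.
    y = y(:); s1 = s1(:); s2 = s2(:);
    assert(numel(s1) == numel(y) && numel(s2) == numel(y), ...
        'Labels and both score vectors must have the same length.');
    pos = (y == 1);
    neg = ~pos;
    m = sum(pos);                        % number of positives
    n = sum(neg);                        % number of negatives
    S = [s1, s2];
    auc = zeros(1, 2);
    V10 = zeros(m, 2);                   % structural components, positives
    V01 = zeros(n, 2);                   % structural components, negatives
    for r = 1:2
        sp = S(pos, r);
        sn = S(neg, r);
        % psi(i,j) = 1 if positive i outranks negative j, 0.5 for ties
        psi = double(sp > sn') + 0.5 * double(sp == sn');
        V10(:, r) = mean(psi, 2);
        V01(:, r) = mean(psi, 1)';
        auc(r) = mean(psi(:));           % Mann-Whitney estimate of the AUC
    end
    Sigma = cov(V10) / m + cov(V01) / n; % covariance of the two AUC estimates
    L = [1, -1];                         % contrast: AUC1 - AUC2
    z = (auc(1) - auc(2)) / sqrt(L * Sigma * L');
    p = erfc(abs(z) / sqrt(2));          % two-sided p-value from the normal distribution
end
```

If the two score vectors rank the observations identically, the variance of the difference is zero and z is NaN. With three AUCs, one option is to run the three pairwise comparisons and adjust for multiple testing, for example with a Bonferroni correction. The paired test only applies if all models are scored on the same rows with the same labels.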
To resolve this, refactor the code so that the inputs passed to the 'delongTest' function have compatible dimensions, that is, one score for every label.
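The 98 rows are consistent with the scores of a single test fold in 5-fold cross-validation of 491 rows (491/5 ≈ 98), as would happen if 'score' from the last fold were reused after the loop. A minimal sketch, assuming that is the cause, of writing each fold's test scores back into a full-length vector. The synthetic data and the name 'oofScores' are illustrative and not from your program.

```matlab
% Sketch: collect one out-of-fold score per observation.
% Synthetic data stands in for the CSV files (hypothetical values).
rng(0);                                   % reproducible example
n = 491;
y = [zeros(250, 1); ones(241, 1)];        % binary labels coded 0/1
X = randn(n, 3) + 0.8 * y;                % features, shifted for class 1

k = 5;
cv = cvpartition(y, 'KFold', k);
oofScores = nan(n, 1);                    % same length as y

for i = 1:k
    trainIdx = training(cv, i);
    testIdx = test(cv, i);
    model = fitcsvm(X(trainIdx, :), y(trainIdx), ...
        'KernelFunction', 'polynomial', 'PolynomialOrder', 2, ...
        'KernelScale', 'auto', 'BoxConstraint', 1, 'Standardize', true);
    [~, score] = predict(model, X(testIdx, :));
    oofScores(testIdx) = score(:, 2);     % column 2 is the score for class 1
end

% Every row now has exactly one score, so labels and scores line up.
[~, ~, ~, aucAll] = perfcurve(y, oofScores, 1);
fprintf('size(oofScores) = %dx%d, pooled AUC = %.3f\n', size(oofScores), aucAll);
```

In your program, the same pattern would store 'oofScores' in 'scores_all{fileIndex}' so that it matches 'y_all{fileIndex}' row by row. Pooling scores from different fold models into one AUC is a simplification, because the scores of separately trained models are not guaranteed to be on the same scale.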
I hope this helps you rectify the issue.