Preprocessing to pad out (oversample) images in an ImageDatastore object

Question

NicknameAlpha on 14 Oct 2018

https://ch.mathworks.com/matlabcentral/answers/423884-imagedatastore

Commented: Kenta on 11 Jul 2020

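After creating an ImageDatastore object and labeling each image according to the folder that contains it, is there a preprocessing method that copies/duplicates the files under each label so that every label ends up with as many files as the label with the most files?

Reference URL: https://jp.mathworks.com/help/matlab/ref/datastore.counteachlabel.html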

Answer 1

Kazuya on 14 Oct 2018

https://ch.mathworks.com/matlabcentral/answers/423884-imagedatastore#answer_341318

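I was looking for the same kind of method a while ago and used the following thread as a reference.

What is the best CNN for a small dataset?

It is Alpha Bravo's answer, and I reproduce the code here for reference. First, images whose label is under-represented are simply duplicated so that the label counts become (almost) equal. If you then wrap the result in an augmentedImageDatastore, you should be able to avoid training on exact copies of the same image.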

trainStore = shuffle(trainStore); % I forgot to add the shuffle in the answer before
bootstrap_factor = 1; % how big do you want the new, balanced datastore to be, as a multiple of the size of the trainStore
alphabetical_labels = {'happy', 'sad'}; % labels in alphabetical order, to map label names to their indices, if using the foldernames as labels
labels = trainStore.Labels;
labelCounts = countEachLabel(trainStore);
labelCounts = labelCounts.Count;
weights = labelCounts/sum(labelCounts);
weights = weights.^(-1); % so less is more
weightVec = [];
for lab = 1:length(labels)
  for labidx = 1:length(alphabetical_labels)
    if labels(lab) == alphabetical_labels(labidx)
      weightVec(lab) = weights(labidx);
    end
  end
end
trainFiles = trainStore.Files;
bootstrapSize = round(length(trainFiles) * bootstrap_factor);
Bootstrap = datasample(trainFiles, bootstrapSize, 'Weights', weightVec); % datasample is part of Statistics and Machine Learning Toolbox
bootStrapTrainStore = imageDatastore(Bootstrap, 'LabelSource', 'foldernames', 'IncludeSubfolders', true);

Then follow it with

augmentedResolution = [128 128]; % or whatever image resolution you want to use
augmenter = imageDataAugmenter('RandRotation', [-10 10]); % optional, used to augment data, see documentation for full options
trainStoreAug = augmentedImageDatastore(augmentedResolution, bootStrapTrainStore, 'DataAugmentation', augmenter);

That is the general idea.
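As a short aside on why the inverse-frequency weights above balance the labels: the derivation below is a sketch that assumes datasample draws each file with probability proportional to its weight. The symbols $n_k$ (number of files with label $k$), $N$ (total number of files), $K$ (number of labels), and $w_k$ (weight given to each file with label $k$) are introduced here for illustration and do not appear in the code.

```latex
\[
  w_k = \left(\frac{n_k}{N}\right)^{-1} = \frac{N}{n_k},
  \qquad
  P(\text{label } k)
    = \frac{n_k \, w_k}{\sum_{j=1}^{K} n_j \, w_j}
    = \frac{N}{K N}
    = \frac{1}{K}.
\]
```

Each draw is therefore equally likely to land on any label, so the label counts in the resampled datastore are equal in expectation but will vary somewhat from run to run.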

Kazuya on 20 Oct 2018

weightVec = [];
for lab = 1:length(labels)
  for labidx = 1:length(alphabetical_labels)
    if labels(lab) == alphabetical_labels(labidx)
      weightVec(lab) = weights(labidx);
    end
  end
end

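This part, which I quoted as is, runs nested for loops, which is inefficient and slow. It is better to use logical indexing instead, so please take note. Rewritten with comments added: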

trainStore = shuffle(trainStore); % the original imageDatastore, trainStore (order randomized)
bootstrap_factor = 1; % total number of images after balancing, as a multiple of the original total. With 1 the total does not change, so the labels that had many images end up with fewer.
alphabetical_labels = {'happy', 'sad'}; % example labels, in alphabetical order so they line up with the counts below
labels = trainStore.Labels; % label of each file (before balancing)
labelCounts = countEachLabel(trainStore); % per-label counts as a table (before balancing)
labelCounts = labelCounts.Count; % per-label counts as a vector (before balancing)
weights = labelCounts/sum(labelCounts); % fraction of files per label
weights = weights.^(-1); % its inverse (rarer labels get larger weights)
trainFiles = trainStore.Files; % paths to the image files
bootstrapSize = round(length(trainFiles) * bootstrap_factor); % total number of images after balancing
weightVec = zeros(length(trainFiles),1); % sampling weight per file; datasample needs one entry per file, not per drawn sample
for labidx = 1:length(alphabetical_labels)
    index = labels == alphabetical_labels(labidx); % logical mask of the files with this label
    weightVec(index) = weights(labidx);
end
% weighted random sampling (with replacement, which is the datasample default)
Bootstrap = datasample(trainFiles, bootstrapSize, 'Weights', weightVec);
bootStrapTrainStore = imageDatastore(Bootstrap, 'LabelSource', 'foldernames', 'IncludeSubfolders', true);

Hope this helps.
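The per-file weights can also be computed without any loop and without hard-coding the label names. The following is only a sketch of that idea, not code from the quoted answer: the labels vector is a hypothetical stand-in for trainStore.Labels, and it relies on double() of a categorical array returning each element's category index.

```matlab
% Hypothetical stand-in for trainStore.Labels (assumption for illustration)
labels = categorical({'happy'; 'happy'; 'happy'; 'sad'});

counts    = countcats(labels);     % files per label, in category order
classIdx  = double(labels);        % category index of each file
weightVec = 1 ./ counts(classIdx); % inverse-frequency weight, one entry per file
```

These weights differ from the ones above only by a constant factor, which does not change the relative sampling probabilities, so weightVec can be passed to datasample through the 'Weights' option in the same way.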

Kenta on 11 Jul 2020

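There is also this example. Whereas the method above makes the label counts equal only in a probabilistic sense, this one counts the images in the most frequent class and adjusts every class to that number of images. https://jp.mathworks.com/matlabcentral/fileexchange/78020-oversampling-for-deep-learning-classification-example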
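For illustration, exact oversampling to the size of the largest class might look like the sketch below. It is not the code of the File Exchange submission: the files and labels variables are hypothetical stand-ins for trainStore.Files and trainStore.Labels, and cycling through each class's files is just one possible way to choose which images get duplicated.

```matlab
% Hypothetical stand-ins for trainStore.Files and trainStore.Labels
files  = {'happy/a.png'; 'happy/b.png'; 'happy/c.png'; 'sad/d.png'};
labels = categorical({'happy'; 'happy'; 'happy'; 'sad'});

classes     = categories(labels);     % label names in category order
targetCount = max(countcats(labels)); % size of the largest class

balancedFiles = {};
for k = 1:numel(classes)
    idx  = find(labels == classes{k});                % files of this class
    pick = idx(mod(0:targetCount-1, numel(idx)) + 1); % cycle through them until targetCount is reached
    balancedFiles = [balancedFiles; files(pick)];     %#ok<AGROW>
end

% With real images (assumption: labels come from the folder names):
% balancedStore = imageDatastore(balancedFiles, 'LabelSource', 'foldernames');
```

Every class then contributes exactly targetCount entries. As in the first answer, wrapping the result in an augmentedImageDatastore keeps the duplicated files from being presented to the network as identical images.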
