Hello,

I have data that often looks like this:

"HIST1H2BC" "K13"
"HIST1H2BC" "K13;K16"
"HIST1H2BC" "K16"
"HIST1H2BH" "K13"
"HIST1H2BH" "K13;K16"
"HIST1H2BH" "K16"
"HIST1H2BO" "K13;K16"
"HIST1H2BO" "K16"
"HIST2H2BE" "K13;K16"
"HIST2H2BE" "K16"

I have been trying to write a function that splits the second column at the ';' and then removes any row whose elements are all contained in another row, which would hopefully yield something like this:

"HIST1H2BC" "K13" "K16"
"HIST1H2BH" "K13" "K16"
"HIST1H2BO" "K13" "K16"
"HIST2H2BE" "K13" "K16"

All of the solutions I have tried have been overly complicated and difficult to wrap my head around.

Thank you in advance!

Chunru on 30 Jul 2021

Edited: Chunru on 30 Jul 2021

x = ["HIST1H2BC" "K13"
     "HIST1H2BC" "K13;K16"
     "HIST1H2BC" "K16"
     "HIST1H2BH" "K13"
     "HIST1H2BH" "K13;K16"
     "HIST1H2BH" "K16"
     "HIST1H2BO" "K13;K16"
     "HIST1H2BO" "K16"
     "HIST2H2BE" "K13;K16"
     "HIST2H2BE" "K16"
     "1H1D" "K137"
     "1H1D" "K137|K138"
     "1H1D" "K138"
     "1H1D" "K136"
     "1H1E" "K136|K137"
     "1H1E" "K137"];

% Seed the result with the first row: {name, row vector of split items}
s = split(x(1, 2), {';', '|'});
y = {x(1, 1), s'};

for i = 2:size(x, 1)
    % Split the second column on either delimiter
    s = split(x(i, 2), {';', '|'});
    % Check whether this name is already in the result
    [lix, locy] = ismember(x(i, 1), [y{:, 1}]);
    if ~lix
        % new entry
        y = [y; {x(i, 1), s'}];
    else
        % existing entry: append only the items not stored yet
        lis = ismember(s, y{locy, 2});
        y{locy, 2} = [y{locy, 2} s(~lis)'];
    end
end

y
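For comparison, the same idea can be written by grouping on the first column up front. This is only a sketch under a few assumptions: the data is a two-column string array as above, `unique` with the `'stable'` option is used so the original order is kept, and the names `genes`, `idx`, `sites` and `parts` are mine, not from the question. The sample data is shortened to keep the example small.

```matlab
% Sketch: group rows by the first column, then pool and de-duplicate
% the split items of each group. Variable names are illustrative only.
x = ["HIST1H2BC" "K13"
     "HIST1H2BC" "K13;K16"
     "HIST1H2BC" "K16"
     "1H1D"      "K137"
     "1H1D"      "K137|K138"
     "1H1D"      "K136"];

% idx(r) is the group number of row r; 'stable' keeps first-seen order
[genes, ~, idx] = unique(x(:, 1), 'stable');

sites = cell(numel(genes), 1);
for k = 1:numel(genes)
    % Join all entries of this group, then split on either delimiter
    parts = split(join(x(idx == k, 2), ";"), [";" "|"]);
    % Drop repeated items and store them as a row vector
    sites{k} = unique(parts, 'stable')';
end

% Show one line per group: name followed by its distinct items
for k = 1:numel(genes)
    disp([genes(k) sites{k}])
end
```

Like the loop above, this keeps every distinct item per name rather than testing rows against each other, which gives the same result whenever the "contained" rows are subsets of a longer row for the same name.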

KSSV on 30 Jul 2021

str = ["HIST1H2BC" "K13"
       "HIST1H2BC" "K13;K16"
       "HIST1H2BC" "K16"
       "HIST1H2BH" "K13"
       "HIST1H2BH" "K13;K16"
       "HIST1H2BH" "K16"
       "HIST1H2BO" "K13;K16"
       "HIST1H2BO" "K16"
       "HIST2H2BE" "K13;K16"
       "HIST2H2BE" "K16"];

iwant = strings(0, 3);      % empty 0-by-3 string array for the results
count = 0;
for i = 1:size(str, 1)      % loop over the rows
    s = strsplit(str(i, 2), ';');
    % keep only the rows that split into exactly two items
    if length(s) == 2
        count = count + 1;
        iwant(count, :) = [str(i, 1) s];
    end
end
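The loop above keeps only rows that split into exactly two items, which fits the sample data. If different names can end up with different numbers of items, one possible way to still get a rectangular string array is to pad the shorter rows with empty strings. The sketch below assumes the grouped result is already available as a list of names plus a cell array of row vectors; `genes`, `sites`, `width` and `out` are hypothetical names and the values are made up for illustration.

```matlab
% Sketch: pack per-name item lists of different lengths into one
% rectangular string array, padding short rows with "".
genes = ["HIST1H2BC"; "1H1D"];
sites = {["K13" "K16"]; ["K137" "K138" "K136"]};

% Longest item list decides the number of columns (plus one for the name)
width = max(cellfun(@numel, sites));
out = strings(numel(genes), width + 1);   % pre-filled with ""

out(:, 1) = genes;
for k = 1:numel(genes)
    % Copy this row's items into columns 2..n+1; the rest stays ""
    out(k, 2:numel(sites{k}) + 1) = sites{k};
end

disp(out)
```

Padding with "" is a design choice; `missing` would also work if you prefer the gaps to be explicit.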
