How can I get the index numbers of cells that form a specific string?

I have a column of strings as follows:
Ex1:
str=[{'a.b.c.d'};{'b.c.d.e'};{'a'};{'b'};{'c'};{'d'};{'e'}];
Ex2:
str=[{'a.b'};{'b.c'};{'c.d'};{'d.e'};{'a.b.c'};{'b.c.d'};{'c.d.e'}];
I want to return the index of cells that can form 'a.b.c.d.e'.
In example 1, rows 1&7, 3&2, and 3&4&5&6&7 can form 'a.b.c.d.e'; hence I want the code to return [(1,7);(3,2);(3,4,5,6,7)] (in whatever format).
In example 2, rows 1&7 and 5&4 can form 'a.b.c.d.e'; hence I want the code to return [(1,7);(5,4)].
Can anyone help me with how to do this? I'm quite new to MATLAB and none of my ideas seem to lead anywhere at this point.

Answers (2)

Bruno Luong on 16 Jul 2019
Edited: Bruno Luong on 16 Jul 2019
s='a.b.c.d.e'
str=[{'a.b.c.d'};{'b.c.d.e'};{'a'};{'b'};{'c'};{'d'};{'e'}]
c = idexing(str, s);
c{:}
str=[{'a.b'};{'b.c'};{'c.d'};{'d.e'};{'a.b.c'};{'b.c.d'};{'c.d.e'}]
c = idexing(str, s);
c{:}
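% Recursive algorithm: returns a cell array of index vectors, one per
% combination of entries of str that joins (with '.') into s.
function c = idexing(str, s)
if isempty(s)
    c = {};
else
    % Entries that are a leading piece of s: s must start with the entry,
    % and the entry must end at a '.' boundary (or span all of s).
    b = cellfun(@(p) strncmp(p,s,length(p)) && ...
        (length(p) == length(s) || s(length(p)+1) == '.'), str);
    i1 = find(b);
    c = cell(size(i1));
    if ~isempty(i1)
        for k=1:length(c)
            head = str{i1(k)};
            tail = s(length(head)+2:end);   % remainder after head and its '.'
            i1k = i1(k);
            if ~isempty(tail)
                % Prepend this index to every way of forming the tail
                ctail = idexing(str, tail);
                c{k} = cellfun(@(i) [i1k,i], ctail, 'UniformOutput', false);
            else
                c{k} = {i1k};               % head completes the target
            end
        end
        c = cat(1,c{:});
    end
end
end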
Result
s =
'a.b.c.d.e'
str =
7×1 cell array
{'a.b.c.d'}
{'b.c.d.e'}
{'a' }
{'b' }
{'c' }
{'d' }
{'e' }
ans =
1 7
ans =
3 2
ans =
3 4 5 6 7
str =
7×1 cell array
{'a.b' }
{'b.c' }
{'c.d' }
{'d.e' }
{'a.b.c'}
{'b.c.d'}
{'c.d.e'}
ans =
1 7
ans =
5 4
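
A quick way to sanity-check index lists like the ones printed above is to join the selected entries with '.' and compare the result against the target. The snippet below is only a sketch; checkCombos is a hypothetical helper name introduced here for illustration, not part of the answer's code.

```matlab
% Hypothetical helper: true for each index list whose pieces, joined with
% '.', reproduce the target string s exactly.
checkCombos = @(str, s, c) cellfun( ...
    @(idx) strcmp(strjoin(reshape(str(idx), 1, []), '.'), s), c);

str = {'a.b'; 'b.c'; 'c.d'; 'd.e'; 'a.b.c'; 'b.c.d'; 'c.d.e'};
s   = 'a.b.c.d.e';
c   = {[1 7]; [5 4]};          % index lists reported for Example 2
ok  = checkCombos(str, s, c);  % one logical value per index list
```

Any false entry in ok would point at an index list that does not rebuild the target.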

Walter Roberson on 16 Jul 2019
This is an algorithm question rather than a MATLAB question, really.
Start with an empty cell array of leading substrings, and a corresponding cell array of vectors of leading indices.
Take a copy of the current cell array of leading substrings, say CS. Find the length of the shortest of them. Loop through all entries in str. For any one entry, test whether the shortest length plus the length of the entry is longer than the target string: if it is, skip the rest of the loop body for that entry. If the shortest length plus the length of the candidate is not longer than the target string, append the entry to the end of each of the values in CS. Then go through each result and test whether it is a valid leading substring of the target. If it is, pull out the extended string and put it on a queue, along with its index list extended by the index of the candidate. When all the entries have been gone through, replace the list of leading substrings with the contents of the queue and repeat. Stopping conditions are left as an exercise for the reader.
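
The procedure described above can be sketched roughly as the script below. This is only one reading of it, under some assumptions: the variable names (prefixes, indices, found) are made up for illustration, pieces are joined with '.' to match the format in the question, the length test is applied to each extended candidate rather than via the shortest current prefix, and the loop stops once no partial prefix is left to extend.

```matlab
% Breadth-first sketch: grow partial prefixes of the target one piece at a time.
str = {'a.b.c.d'; 'b.c.d.e'; 'a'; 'b'; 'c'; 'd'; 'e'};
s   = 'a.b.c.d.e';

prefixes = {''};   % leading substrings built so far (start: nothing matched)
indices  = {[]};   % indices of the pieces used to build each prefix
found    = {};     % completed index lists

while ~isempty(prefixes)            % stop when nothing is left to extend
    nextPrefixes = {};
    nextIndices  = {};
    for p = 1:numel(prefixes)
        for k = 1:numel(str)
            % Join with '.' except when starting from the empty prefix.
            if isempty(prefixes{p})
                cand = str{k};
            else
                cand = [prefixes{p} '.' str{k}];
            end
            if numel(cand) > numel(s)
                continue            % too long to be a prefix of s
            end
            if strcmp(cand, s)
                found{end+1, 1} = [indices{p} k];      % complete match
            elseif strncmp(cand, s, numel(cand)) && s(numel(cand)+1) == '.'
                nextPrefixes{end+1} = cand;            % valid partial prefix
                nextIndices{end+1}  = [indices{p} k];
            end
        end
    end
    prefixes = nextPrefixes;        % the queue becomes the new prefix list
    indices  = nextIndices;
end

celldisp(found)
```

For Example 1 this collects the index lists [1 7], [3 2] and [3 4 5 6 7]. Because every extension makes the prefix strictly longer and over-long candidates are discarded, the queue eventually empties and the loop terminates.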

Release

R2019a
