GETCHUNKS

Version 1.1.0.1 (2.99 KB) by Jiro Doke
Get the number of repetitions that occur in consecutive chunks.
2.7K Downloads
Updated 1 Sep 2016

C = GETCHUNKS(A) returns a vector with one element per chunk in A, where a chunk is a run of 2 or more identical consecutive elements and each element of C is the number of repetitions in that chunk. A can be a LOGICAL or numeric vector, or a CELL array of strings. A can also be a character array, which is treated specially (see below).
[C, I] = GETCHUNKS(A) also returns the indices of the beginnings of the chunks.
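
As a rough illustration of what the two outputs mean, run lengths and start indices of a numeric vector can be computed with DIFF and FIND. This is a minimal sketch under the definition above, not the GETCHUNKS source; the variable names starts, lens and isRep are made up for the illustration.

```matlab
% Illustrative sketch only; not the GETCHUNKS implementation.
A = [1 2 2 3 4 4 4 5 6 7 8 8 8 8 9];

A = A(:).';                              % force a row vector
starts = [1, find(diff(A) ~= 0) + 1];    % index where each run begins
lens   = diff([starts, numel(A) + 1]);   % length of each run

isRep = lens >= 2;                       % keep runs of 2 or more repetitions
C = lens(isRep);                         % C = 2 3 4
I = starts(isRep);                       % I = 2 5 11
```

Here C agrees with the first example on this page, and I points at the first 2, the first 4 and the first 8 in A.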

If A is a character array, GETCHUNKS finds words (runs of consecutive non-space characters) and returns the number of characters in each word and the indices of the beginnings of the words.
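
The word-finding behavior can be pictured as locating the edges of a logical mask. The following is a sketch that assumes "space" means whatever ISSPACE reports as whitespace; it is not taken from the GETCHUNKS source, and isWord and d are names chosen for the illustration.

```matlab
% Illustrative sketch only; not the GETCHUNKS implementation.
B = 'This is a generic (simple) sentence';

isWord = ~isspace(B);                  % true at every non-whitespace character
d = diff([0, double(isWord), 0]);      % +1 at a word start, -1 just past a word end

I = find(d == 1);                      % I = 1 6 9 11 19 28
C = find(d == -1) - I;                 % C = 4 2 1 7 8 8
```

Padding the mask with a zero on each side lets words at the very start or end of B be detected without special cases.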

GETCHUNKS(A, OPT) accepts an optional argument OPT, which can be any of the following three:

'-reps' : return repeating chunks only. (default)
'-full' : return chunks including single-element chunks.
'-alpha' : (for CHAR arrays) only consider letters and digits as part of words. Punctuation and symbols are regarded as spaces.
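
To make the effect of the options concrete, the sketch below mimics '-full' by keeping every run, and '-alpha' by building the word mask from ISSTRPROP with the 'alphanum' category. Both are assumptions about how the options behave based on the descriptions above, not the GETCHUNKS source, and the names Cfull, Ifull, Calpha and Ialpha are hypothetical.

```matlab
% Illustrative sketch only; not the GETCHUNKS implementation.

% '-full' style: keep every run, including single-element ones.
A = [1 2 2 3 4 4 4 5 6 7 8 8 8 8 9];
Ifull = [1, find(diff(A) ~= 0) + 1];       % start index of every run
Cfull = diff([Ifull, numel(A) + 1]);       % Cfull = 1 2 1 3 1 1 1 4 1

% '-alpha' style: only letters and digits count as word characters.
B = 'This is a generic (simple) sentence';
isWord = isstrprop(B, 'alphanum');         % parentheses become separators
d = diff([0, double(isWord), 0]);          % +1 at a word start, -1 just past a word end
Ialpha = find(d == 1);                     % Ialpha = 1 6 9 11 20 28
Calpha = find(d == -1) - Ialpha;           % Calpha = 4 2 1 7 6 8
```

The '-alpha' values agree with the last example on this page: the parentheses around "simple" no longer count, so that word shrinks from 8 to 6 characters and starts at index 20.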

Examples:
A = [1 2 2 3 4 4 4 5 6 7 8 8 8 8 9];
getchunks(A)
ans =
2 3 4

B = 'This is a generic (simple) sentence';
[C, I] = getchunks(B)
C =
4 2 1 7 8 8
I =
1 6 9 11 19 28

[C, I] = getchunks(B, '-alpha')
C =
4 2 1 7 6 8
I =
1 6 9 11 20 28

Cite As

Jiro Doke (2024). GETCHUNKS (https://www.mathworks.com/matlabcentral/fileexchange/10038-getchunks), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R13SP1
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Characters and Strings
Acknowledgements

Inspired: FINDSEQ

Version History

1.1.0.1 : Updated license.
1.1.0.0 : Added '-alpha' option. Updated license.
1.0.0.0