Why does the unique() function give an unexpected extra output here?

I'm trying to find the unique elements inside a string array (attached, "datestrings_of_problem_times.mat").
When I run this (after loading "datestrings_of_problem_times.mat"):
[datestrings_of_interest, indices, b] = unique(datestrings_of_problem_times);
The output "datestrings_of_interest" (attached, "datestrings_of_interest.mat") has an empty string ( [""] ) as its first element, but there are no empty strings inside my initial string array.
Also, the output "indices" (attached, "indices") has 50877 as its first element, but my string array only has 50876 elements. So unique() somehow checked whether the 50877th element of a 50876x10 string array was unique. What am I missing, and how can I fix this?

 Accepted Answer

>> load ..\Answers\datestrings_of_problem_times.mat
>> whos d*
  Name                                  Size             Bytes  Class     Attributes
  datestrings_of_problem_times      50876x10          27473136  string
>> datestrings_of_problem_times(1:10,:)
ans =
10×10 string array
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
"12/2017" "" "" "" "" "" "" "" "" ""
>>
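The problem is that you have somehow created a 2-D string array in which every element of columns 2:10 is an empty string, so unique() over the full array is correct to include "" among the unique strings. The index 50877 has the same cause: unique() returns linear (column-major) indices into the whole array, and the first empty string is element (1,2), whose linear index is 50876 + 1 = 50877.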
Using only the first column, one gets the expected result.
>> unique(datestrings_of_problem_times(:,1))
ans =
21×1 string array
"01/2018"
"02/2018"
"02/2020"
"02/2021"
"03/2019"
"03/2020"
"03/2021"
"04/2020"
"04/2021"
"05/2019"
"06/2018"
"06/2019"
"06/2020"
"07/2018"
"08/2019"
"10/2018"
"11/2018"
"11/2019"
"11/2020"
"12/2017"
"12/2020"
>>
We'd have to see the code that created the array to know how the 2D array came into existence, but that's where the symptoms arise.
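
To make the index arithmetic concrete, here is a minimal sketch on a small stand-in array. The variable s and its 4x3 size are assumptions chosen for illustration (the real array is 50876x10), and the sketch relies on unique() returning the index of the first occurrence, which is the documented default in current releases.

```matlab
% Small stand-in for the 50876x10 array: 4 rows, 3 columns, only column 1 filled.
% (s and its size are hypothetical; they are not the poster's data.)
s = strings(4, 3);
s(:, 1) = ["12/2017"; "01/2018"; "12/2017"; "01/2018"];

% unique() over the whole matrix treats it as one long column (column-major).
[u, ia, ic] = unique(s);

% The empty string sorts first, so u(1) is "" and ia(1) is the linear index
% of the first "" in the array: element (1,2), i.e. (rows in column 1) + 1.
[row, col] = ind2sub(size(s), ia(1));

% Restricting the call to the filled column leaves only the date strings.
u1 = unique(s(:, 1));
```

Here ia(1) is 5 (4 + 1), which maps back to row 1, column 2. The same arithmetic on the original array gives 50876 + 1 = 50877.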

1 Comment

Ooh this solved it. I initially didn't catch that I created something like
string_array = strings(50876,10)
and only filled the first column.
Thanks a bunch!
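
For anyone hitting the same thing, two possible ways out are sketched below, assuming the array is built with strings() as described in this comment. The names n, datestrings, s and keep, and the placeholder values, are hypothetical; the original creation code was not posted.

```matlab
n = 5;   % hypothetical row count (50876 in the question)

% Option 1: preallocate a column vector instead of an n-by-10 matrix.
datestrings = strings(n, 1);
for k = 1:n
    datestrings(k) = sprintf("%02d/%d", k, 2018);   % placeholder values
end

% Option 2: if the 2-D array already exists, drop the columns that are
% entirely empty before calling unique().
s = strings(n, 10);
s(:, 1) = "12/2017";              % only the first column is filled
keep = ~all(s == "", 1);          % true for columns with any non-empty string
s = s(:, keep);

u = unique(s);                    % no "" in the result now
```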

Release: R2019a

Asked: 17 Oct 2021
Edited: dpb on 17 Oct 2021