Finding unique values in a table without losing the rest of the data

269 views (last 30 days)
I'm working on a table right now, where the rows are equal except for one column, like this (Departure column has "updated" daytimes):
ID = [153; 153; 153; 153; 124; 124; 124; 211; 211];
Start = {'A';'A';'A';'A';'A';'A';'A';'A';'A'};
End = {'B';'B';'B';'B';'B';'B';'B';'B';'B'};
Departure = datetime({'2014-06-30 12:12'; '2014-06-30 12:15'; '2014-06-30 12:14'; '2014-06-30 12:16'; '2014-06-30 14:32'; '2014-06-30 14:30'; '2014-06-30 14:31'; '2014-06-30 17:31'; '2014-06-30 17:42'});
A = table(ID, Start, End, Departure)
Now I'm trying to use the unique feature to reduce the table, so that I only have each ID once, but without losing the rest of the information of the table, like i would if i just enter
C = unique(A.ID);
This way I receive a 3x1 table/vector and the rest of the information is lost.
The Documentation says for this case (I asume): C = unique(A,vars) returns a dataset that contains only one observation for each unique combination of values for the variables in A specified in vars. vars is a positive integer, a vector of positive integers, a variable name, a cell array containing one or more variable names, or a logical vector. C includes all variables from A. The values in C for the variables not specified in vars are taken from the last occurrence among observations in A with each unique combination of values for the variables specified in vars.
Using this explanation:
C = unique(A, ID);
I received following Error Message: Unrecognized variable name 'Adaptor'.
Please help, if you have any ideas to solve my problem. Thanks
  3 Comments
Philipp Henschel
Philipp Henschel on 30 Nov 2017
Using this syntax I still received an 3x1 table, I still have the loss of information of the other columns. Aiming vor a 3x4 table:
153 'A' 'B' 2014-06-30 12:16
124 'A' 'B' 2014-06-30 14:31
211 'A' 'B' 2014-06-30 17:42
I think I used the wrong Documentation > Statistics and Machine Learning Toolbox for datasets. I used the other one already before with the wrong outcome :P
Stephen23
Stephen23 on 30 Nov 2017
"Now I'm trying to use the unique feature to reduce the table, so that I only have each ID once, but without losing the rest of the information of the table, "
Please explain what you expect to happen with the rest of the data. How would you compress >1 rows into just one row?

Sign in to comment.

Accepted Answer

Elizabeth Reese
Elizabeth Reese on 5 Dec 2017
If you want to keep the data from the rows with unique IDs, you can do that using the indexes that unique returns.
In your case,
[C,IA,IC] = unique(A.ID);
B = A(IA,:)
You can specify the setOrder or occurrence in unique to determine which Departure date is selected and the order of the IDs in the new table.
  2 Comments
QIAO WANG
QIAO WANG on 24 Apr 2019
Hi, thank you so much. I benefit a lot from this answer. Besides, I've got another issue when I play with setOrder and occurrence.
By default, [C,IA,IC] = unique(A.ID); will use 'sorted' and 'first'. However, I want to use 'stable' and 'last' at the same time so that I can get the last occurance of that element in the same order as in the original data. But I got an error, which makes me confused because I assume this is very basic. Sorry, I may explain my question unclearly. I'll show an example here.
>> A
A =
9×4 table
ID Start End Departure
___ _____ ___ ____________________
153 'A' 'B' 30-Jun-2014 12:12:00
153 'A' 'B' 30-Jun-2014 12:15:00
153 'A' 'B' 30-Jun-2014 12:14:00
153 'A' 'B' 30-Jun-2014 12:16:00
124 'A' 'B' 30-Jun-2014 14:32:00
124 'A' 'B' 30-Jun-2014 14:30:00
124 'A' 'B' 30-Jun-2014 14:31:00
211 'A' 'B' 30-Jun-2014 17:31:00
211 'A' 'B' 30-Jun-2014 17:42:00
What I want to obtain is like this (manually made this table). I assumed I could get this table by simply using [C,IA,IC] = unique(A.ID,'stable','last'); B = A(IA,:); However, it didn't work. "You cannot specify 'stable' and 'sorted' with 'first' and 'last'."
B =
3×4 table
ID Start End Departure
___ _____ ___ ___________________
153 'A' 'B' 30-Jun-2014 12:16:00
124 'A' 'B' 30-Jun-2014 14:31:00
211 'A' 'B' 30-Jun-2014 17:42:00
So, can I ask is there any way that can figure this out? I've got a very complicated method which I don't think is efficient.
Elizabeth Reese
Elizabeth Reese on 24 Apr 2019
Please refer to this other MATLAB Answers post regarding "stable" and "last", as specifiying both the "setOrder" and "setOccurence" is not currently supported. There is a workaround on this post and we have recorded the enhancement request for future consideration.

Sign in to comment.

More Answers (1)

Peter Perkins
Peter Perkins on 19 Dec 2017
It's pretty hard to tell what you mean by, "without losing the rest of the information of the table", but an alternative to Amy's suggestion would be to use rowfun, with ID as the grouping variable, and somehow combine all the data from each set of rows that share the same ID.

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!