Cheatsheet for dealing with structures?

One thing that I find hardest in MATLAB is dealing with structures and extracting information from them. For me it seems there are not enough clear examples of how to extract data in different situations. Is there any cheatsheet made for this? Whether it's data, [data], {data}, [data].', data{i} etc., there should be examples for all of these situations...
Here is an example: I'm trying to get all timestamps "timestamp" (for finding duplicate fields) of the substructure "trainLocations" from every first cell of the main structure "trainTraffic". Everything I have tried has been in vain...
trainTraffic(1).trainLocations(1).timestamp == {trainTraffic.trainLocations(1).timestamp}.'
The error is: Intermediate dot '.' indexing produced a comma-separated list with 57 values, but it must produce a single value when followed by subsequent indexing operations.
  2 Comments
Stephen23 on 21 Feb 2024
Edited: Stephen23 on 21 Feb 2024
"For me it seems there are not enough clear examples of how to extract data in different situations. Is there any cheatsheet made for this?"
Perhaps this:
and in the official documentation:
"Whether it's data, [data], {data}, [data].', data{i} etc., there should be examples for all of these situations..."
None of those have anything (directly) to do with structures. Those are various syntactic-sugar operators and some totally unrelated indexing and transpose operations. Just as with every other function or operator, the documentation for MATLAB (or for any other programming language, for that matter) does not exhaustively list every single example [], [A], [A,B], [A,B,C], etc. for every single version of A, B, C, etc., which can be variables, expressions, function calls, comma-separated lists, etc. Nor should documentation that explains the concatenation operator explain what the transpose operator does, just as the documentation that explains comma-separated lists should not explain how to use every other operator and function.
Consider how many permutations of operators your proposed documentation would have...
Your proposed documentation would be ... truly enormous. It would be difficult to find anything in it.
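A small sketch may help show why those notations are just ordinary operators applied to a comma-separated list. The structure array S and its fields val and name are made up for illustration and are not part of the question's data.

```matlab
% Hypothetical 1x3 structure array, used only for illustration.
S = struct('val', {1, 2, 3}, 'name', {'a', 'bb', 'ccc'});

% S.val expands to the comma-separated list S(1).val, S(2).val, S(3).val,
% so each line below is ordinary concatenation applied to that list.
V = [S.val];      % horizontal concatenation into a 1x3 numeric array
C = {S.name};     % cell array construction, giving a 1x3 cell array
R = {S.name}.';   % the same cell array, transposed into a 3x1 column
T = [S.name];     % character vectors concatenate into 'abbccc'
```

Nothing here is specific to structures: the same brackets, braces and transpose would behave identically if the three values were typed out by hand.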
"I'm trying to get all timestamps "timestamp" (for finding duplicate fields) of the substructure "trainLocations" from every first cell of the main structure "trainTraffic""
TRAINTRAFFIC is a structure; it does not have cells.
"Intermediate dot '.' indexing produced a comma-separated list with 57 values, but it must produce a single value when followed by subsequent indexing operations."
That error message is explained in detail here:
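For readers without the attached data, the situation can be reproduced with a minimal sketch. The structure T and its fields loc and t are hypothetical, and the exact wording of the error message depends on the MATLAB release.

```matlab
% Hypothetical data: a 1x2 structure array whose field "loc" holds
% another structure array with a field "t".
T = struct('loc', {struct('t', {'a', 'b'}), struct('t', {'c'})});

% T.loc yields two separate arrays (a comma-separated list), so it cannot
% be followed by further indexing such as (1).t:
try
    x = T.loc(1).t;
catch err
    disp(err.message)
end

% Indexing one element of T first leaves a single value at every step:
x = T(2).loc(1).t;
```

The second form works because T(2) is a single structure, T(2).loc is a single array, and T(2).loc(1) is again a single structure.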


Accepted Answer

Stephen23 on 21 Feb 2024
Edited: Stephen23 on 21 Feb 2024
"I'm trying to get all timestamps "timestamp" (for finding duplicate fields) of the substructure "trainLocations" from every first cell of the main structure "trainTraffic"."
It is unclear exactly what you want because your terminology is a bit mixed up. For a start, structures themselves have no cells. I guess that you want to get the first element of the TRAINLOCATIONS field of each element of the structure TRAINTRAFFIC. Anyway, let's take a first look at your data:
trainTraffic = load('traintraffic.mat').trainTraffic
trainTraffic = 57×1 struct array with fields:
    cancelled
    trainNumber
    trainType
    compositions
    trainLocations
    timeTableRows
    routesetMessages
To get those values you could use a loop (easy), ARRAYFUN, or a comma-separated list with a cell array operator. However, the situation is made a bit more complex by the fact that some of the TRAINLOCATIONS fields contain empty structures (which clearly do not have a first element and would throw an error if you tried to access that non-existent element). So we will use some basic indexing to ignore the empty fields:
C = {trainTraffic.trainLocations}; % comma-separated list
X = ~cellfun(@isempty,C);
find(~X) % Elements of trainTraffic with empty trainLocations field.
ans = 1×2
47 54
F = @(s) s(1).timestamp;
Z = cell(size(C));
Z(X) = cellfun(F,C(X),'UniformOutput',false);
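For comparison, the loop approach mentioned earlier could look roughly like the sketch below. It is not the code that was run for this answer, and the three-element trainTraffic defined here is a made-up stand-in for the real 57-element data.

```matlab
% Hypothetical stand-in for the real data: three elements, one of which
% has an empty trainLocations field.
trainTraffic = struct('trainLocations', { ...
    struct('timestamp', {'2023-09-26T08:16:22.000Z', '2023-09-26T08:20:00.000Z'}), ...
    struct('timestamp', {}), ...
    struct('timestamp', {'2023-09-26T19:58:15.000Z'})});

Z = cell(1, numel(trainTraffic));          % cells stay [] where nothing is found
for k = 1:numel(trainTraffic)
    loc = trainTraffic(k).trainLocations;  % one nested structure array
    if ~isempty(loc)
        Z{k} = loc(1).timestamp;           % first element only
    end
end
```

The loop is longer than the CELLFUN version, but each step handles exactly one nested array at a time, which makes the empty case easy to see.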
Also note that some of those timestamps are empty too:
Z{:}
ans = '2023-09-26T08:16:22.000Z'
ans = '2023-09-26T19:58:15.000Z'
ans = '2023-09-25T23:40:10.000Z'
ans = '2023-09-26T07:13:38.000Z'
ans = '2023-09-26T17:49:16.000Z'
ans = '2023-09-26T17:15:28.000Z'
ans = '2023-09-26T17:04:55.000Z'
ans = '2023-09-26T14:25:14.000Z'
ans = '2023-09-26T18:10:16.000Z'
ans = '2023-09-26T06:13:51.000Z'
ans = '2023-09-26T09:26:42.000Z'
ans = '2023-09-26T16:26:40.000Z'
ans = '2023-09-26T19:14:41.000Z'
ans = '2023-09-25T20:50:51.000Z'
ans = '2023-09-26T03:37:16.000Z'
ans = '2023-09-26T07:52:55.000Z'
ans = '2023-09-26T21:00:24.000Z'
ans = '2023-09-25T22:00:54.000Z'
ans = '2023-09-26T10:30:29.000Z'
ans = '2023-09-25T23:40:10.000Z'
ans = '2023-09-26T13:40:05.000Z'
ans = '2023-09-25T22:22:34.000Z'
ans = '2023-09-25T21:58:50.000Z'
ans = '2023-09-26T13:01:13.000Z'
ans = '2023-09-26T13:00:38.000Z'
ans = '2023-09-26T15:15:44.000Z'
ans = '2023-09-26T19:05:45.000Z'
ans = '2023-09-26T08:16:22.000Z'
ans = '2023-09-26T18:10:16.000Z'
ans = '2023-09-26T06:13:51.000Z'
ans = '2023-09-26T19:58:15.000Z'
ans = '2023-09-26T12:37:41.000Z'
ans = '2023-09-26T13:56:41.000Z'
ans = '2023-09-26T12:17:03.000Z'
ans = '2023-09-26T13:45:34.000Z'
ans = '2023-09-26T09:26:42.000Z'
ans = '2023-09-26T16:26:40.000Z'
ans = '2023-09-26T19:14:41.000Z'
ans = '2023-09-26T13:46:34.000Z'
ans = '2023-09-26T13:28:39.000Z'
ans = '2023-09-26T17:49:16.000Z'
ans = '2023-09-25T23:16:35.000Z'
ans = '2023-09-26T11:14:20.000Z'
ans = '2023-09-26T03:36:37.000Z'
ans = '2023-09-25T22:32:34.000Z'
ans = '2023-09-25T23:13:26.000Z'
ans = []
ans = '2023-09-26T11:44:04.000Z'
ans = '2023-09-26T16:16:07.000Z'
ans = '2023-09-26T14:11:40.000Z'
ans = '2023-09-26T07:47:02.000Z'
ans = '2023-09-25T22:17:03.000Z'
ans = '2023-09-26T11:50:39.000Z'
ans = []
ans = '2023-09-26T13:05:45.000Z'
ans = '2023-09-26T06:40:41.000Z'
ans = '2023-09-26T08:00:02.000Z'
Again we can use some indexing to ignore the empty text and convert the rest to DATETIME:
Y = ~cellfun(@isempty,Z);
D = NaT(size(Z));
D(Y) = datetime(Z(Y), "InputFormat","y-M-d'T'H:m:s.SSS'Z'");
D(:) % let's take a peek
ans = 57×1 datetime array
   26-Sep-2023 08:16:22
   26-Sep-2023 19:58:15
   25-Sep-2023 23:40:10
   26-Sep-2023 07:13:38
   26-Sep-2023 17:49:16
   26-Sep-2023 17:15:28
   26-Sep-2023 17:04:55
   26-Sep-2023 14:25:14
   26-Sep-2023 18:10:16
   26-Sep-2023 06:13:51
   26-Sep-2023 09:26:42
   26-Sep-2023 16:26:40
   26-Sep-2023 19:14:41
   25-Sep-2023 20:50:51
   26-Sep-2023 03:37:16
   26-Sep-2023 07:52:55
   26-Sep-2023 21:00:24
   25-Sep-2023 22:00:54
   26-Sep-2023 10:30:29
   25-Sep-2023 23:40:10
   26-Sep-2023 13:40:05
   25-Sep-2023 22:22:34
   25-Sep-2023 21:58:50
   26-Sep-2023 13:01:13
   26-Sep-2023 13:00:38
   26-Sep-2023 15:15:44
   26-Sep-2023 19:05:45
   26-Sep-2023 08:16:22
   26-Sep-2023 18:10:16
   26-Sep-2023 06:13:51
The inconsistent data (i.e. empty structures and cells) makes this more complex to parse. If the data were consistent it would be easier to parse.
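Since the stated goal was to find duplicates, one possible next step is sketched below. It was not part of the original answer; the short vector D is hypothetical sample data, and UNIQUE with ACCUMARRAY is only one of several ways to count repeats.

```matlab
% Hypothetical timestamps standing in for D; NaT marks missing values.
D = [datetime(2023,9,26,8,16,22), datetime(2023,9,26,19,58,15), NaT, ...
     datetime(2023,9,26,8,16,22), datetime(2023,9,25,23,40,10)];

valid = ~isnat(D);                          % ignore missing timestamps
[U, ~, idx] = unique(D(valid));             % idx maps each valid value to an entry of U
counts = accumarray(idx(:), 1);             % occurrences of each unique value
dupValues = U(counts > 1);                  % timestamps that occur more than once
dupMask = valid & ismember(D, dupValues);   % elements of D that are duplicated
find(dupMask)
```

The logical mask dupMask has the same size as D, so it can be used directly to index back into the original structure array.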
  4 Comments
Sven Larsen on 23 Feb 2024
Thank you @Stephen23 for your very helpful and thoughtful answer. This was the culprit:
"AFAICT, the main cause of confusion in this situation is that beginners incorrectly think of nested structures as one big structure with lots of "layers" which can be all accessed simultaneously. In reality, nested structures are nested structures, just like nested cell arrays are nested cell arrays (and each content cell array is its own completely independent cell array, requiring its own indexing). They are called nested cell arrays and nested structures for a reason."
Indeed, I've been thinking of structures like this, and that's why I have so much trouble with them.
Stephen23 on 23 Feb 2024
@Sven Larsen: take a look at the image/diagram here:
Basically each of those lines (i.e. between s(1), .n(1), .a, etc.) is a pointer to another array in memory. Note how e.g. .n(1) and .n(2) are two totally different arrays. That they are separate arrays is not changed just because they coincidentally have the same type (struct) and fieldnames.
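The same point can be checked at the command line. This is a minimal sketch with made-up data; the names s, n and a follow the style of the diagram but the values are hypothetical.

```matlab
% Hypothetical nested data: each element of s holds its own array in field n.
s = struct('n', {struct('a', {1, 2, 3}), struct('a', {10})});

size(s)          % 1x2: the outer structure array
size(s(1).n)     % 1x3: a structure array stored inside s(1)
size(s(2).n)     % 1x1: a different array, with a different size

% Changing one nested array leaves the other untouched.
s(1).n(2).a = 99;
inner = s(1).n;      % take one nested array out first ...
vals  = [inner.a];   % ... then expand its own comma-separated list
```

Because s(1).n and s(2).n are separate arrays, they are free to have different sizes, and each needs its own indexing.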


More Answers (1)

Hassaan on 21 Feb 2024
Edited: Hassaan on 21 Feb 2024
Understanding MATLAB Data Containers
Structures (struct): A structure is a data type in MATLAB that allows for the storage of data of different types and sizes. Structures are accessed in two primary ways:
  • Field Access: Use dot notation to access fields within a structure, e.g., struct.fieldName.
  • Element Access: For an array of structures, access individual structures using parentheses, e.g., structArray(index).
Cell Arrays ({}): Cell arrays are containers that can hold data of varying types and sizes. There are two main ways to interact with cell arrays:
  • Content Access: Use curly braces to access the contents of individual cells, e.g., cellArray{index} retrieves the content of the cell at index.
  • Element Access: Use parentheses to access the cell itself as a unit, e.g., cellArray(index) refers to the cell at index without extracting its contents.
Arrays ([]): Arrays are collections of elements of the same type. MATLAB supports various types of arrays, such as numeric arrays, character arrays, and logical arrays.
  • Element Access: Use parentheses to access individual elements of an array, e.g., array(index).
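The access rules listed above can be sketched in a few lines. The variables c and s are hypothetical and exist only to show what each kind of indexing returns.

```matlab
% Hypothetical containers, for illustration only.
c = {10, 'text', [1 2 3]};   % 1x3 cell array
s = struct('id', {7, 8});    % 1x2 structure array

a = c(2);        % parentheses: a 1x1 cell array containing 'text'
b = c{2};        % curly braces: the content itself, the char vector 'text'
e = s(2);        % parentheses: a 1x1 structure (one element of s)
f = s(2).id;     % dot indexing on that element: the field value 8

class(a)         % 'cell'
class(b)         % 'char'
class(e)         % 'struct'
class(f)         % 'double'
```

In every case parentheses return a smaller array of the same class, while curly braces and dot indexing return what is stored inside.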
Example Problem
Given a structure array trainTraffic whose field trainLocations holds a nested structure array with a field timestamp, the goal is to get the timestamp from the first trainLocations element of each trainTraffic element.
% Number of elements in trainTraffic
numElements = numel(trainTraffic);
% Preallocate a cell array for timestamps
timestamps = cell(numElements, 1);
% Loop through each element of trainTraffic
for i = 1:numElements
    locations = trainTraffic(i).trainLocations;
    % Skip elements whose trainLocations field is empty, otherwise
    % indexing its first element would throw an error
    if ~isempty(locations)
        % Timestamp of the first trainLocations element
        timestamps{i} = locations(1).timestamp;
    end
end
% timestamps now contains the desired values; cells for skipped elements stay empty
Explanation
  • numel(trainTraffic): Counts how many elements are in trainTraffic.
  • Preallocate a cell array timestamps: It's good practice for efficiency and clarity.
  • Loop through each trainTraffic: Extract the timestamp from the first trainLocations and store it in the timestamps cell array.
Additional Points:
  • Nesting: You can nest these data containers within each other for more complex data structures.
  • Specific Methods: Each container type might have additional methods for specific operations, such as adding, removing, or iterating through elements. Refer to the MATLAB documentation for details.
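As one illustration of such operations for structures, the sketch below adds, tests, removes and iterates over fields. The structure s and its field names are hypothetical; fieldnames, isfield and rmfield are standard MATLAB functions.

```matlab
% Hypothetical structure, for illustration only.
s = struct('id', 7, 'name', 'IC 45');

s.speed = 120;                     % assigning to a new field adds it
hasSpeed = isfield(s, 'speed');    % true while the field exists
s = rmfield(s, 'speed');           % returns a copy without that field

% Iterate over the remaining fields using dynamic field names.
names = fieldnames(s);             % cell array of the field names
for k = 1:numel(names)
    value = s.(names{k});
    fprintf('%s: %s\n', names{k}, mat2str(value));
end
```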
  2 Comments
Stephen23 on 21 Feb 2024
Edited: Stephen23 on 21 Feb 2024
"Access elements with dot notation, e.g., struct.field."
No, this is incorrect.
Fields are accessed using dot indexing.
Elements of a structure are accessed using parentheses (just like every other array class in MATLAB).
"Access elements with curly braces, e.g., cellArray{index}."
Again, incorrect.
The cell content is accessed using curly-braces.
Elements of a cell array are accessed using parentheses (just like every other array class in MATLAB).
Hassaan on 21 Feb 2024
Thanks for your insight. Updated.
