extract data from EEG text file

Question

0 votes

I need help writing a script to extract the MCAP samples, together with the time at which each occurred, into a separate file and to plot them, so that I can use these samples in a signal processing application in MATLAB. This is only part of the data; the file contains tens of CAP samples, so I need general code to extract them.

Time Date Sample # Type Sub Chan Num Aux

[22:16:05.000 01/01/2007] 0 " 0 0 0 ## time resolution: 256

[22:16:05.000 01/01/2007] 0 0 0 0

[22:34:35.000 01/01/2007] 284160 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:35:05.000 01/01/2007] 291840 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:35:35.000 01/01/2007] 299520 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:36:05.000 01/01/2007] 307200 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:36:35.000 01/01/2007] 314880 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:37:05.000 01/01/2007] 322560 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:37:35.000 01/01/2007] 330240 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:38:05.000 01/01/2007] 337920 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:38:35.000 01/01/2007] 345600 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:39:05.000 01/01/2007] 353280 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:39:35.000 01/01/2007] 360960 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:40:05.000 01/01/2007] 368640 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:40:35.000 01/01/2007] 376320 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:41:05.000 01/01/2007] 384000 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:41:35.000 01/01/2007] 391680 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:41:37.000 01/01/2007] 392192 " 0 0 0 MCAP-A3 17 S1 EEG-F4-C4

[22:41:57.000 01/01/2007] 397312 " 0 0 0 MCAP-A3 9 S1 EEG-F4-C4

[22:42:05.000 01/01/2007] 399360 " 0 0 0 SLEEP-S2 30 S2 ROC-LOC

[22:42:13.000 01/01/2007] 401408 " 0 0 0 MCAP-A3 11 S2 EEG-F4-C4

[22:42:28.000 01/01/2007] 405248 " 0 0 0 MCAP-A3 23 S2 EEG-F4-C4

[22:42:35.000 01/01/2007] 407040 " 0 0 0 SLEEP-S2 30 S2 ROC-LOC

[22:42:57.000 01/01/2007] 412672 " 0 0 0 MCAP-A3 10 S2 EEG-F4-C4

[22:43:05.000 01/01/2007] 414720 " 0 0 0 SLEEP-S2 30 S2 ROC-LOC

[22:43:11.000 01/01/2007] 416256 " 0 0 0 MCAP-A2 6 S2 EEG-F4-C

1 Comment

per isakson on 28 Apr 2019

See readtable (create table from file) and fixedWidthImportOptions (import options object for fixed-width text files, introduced in R2017a).


Answer 1

Cedric on 27 Apr 2019

Edited: Cedric on 28 Apr 2019


1 vote

data01.txt

Using the data text file that you provided elsewhere (renamed and attached to this answer), here is a short example of one way to parse it. Note that it is not the best way, but it is good enough for starting the discussion:

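% Read the whole annotation file into one character vector.
buffer = fileread( 'data01.txt' ) ;
% Tokens: time stamp, sample #, three numeric columns, MCAP subtype, and the three trailing fields.
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+(\d+)\s+(\d+)\s+(\d+)\s+MCAP-(\S+)\s+(\S+)\s+(\S+)\s+(\S+)' ;
data = regexp( buffer, pattern, 'tokens' ) ;
% Stack the per-match token cells into an N-by-9 cell array.
data = vertcat( data{:} ) ;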

Running it outputs a cell array of 830 rows associated with MCAP entries, as follows:

EDIT 04/28/2019@1:59pm UTC: I updated the pattern so REGEXP extracts all other "numeric" columns.

>> data
data =
  830×9 cell array
    {'22:41:37.000 01…'}    {'392192' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'17'}    {'S1'}    {'EEG-F4-C4' }
    {'22:41:57.000 01…'}    {'397312' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'9' }    {'S1'}    {'EEG-F4-C4' }
    {'22:42:13.000 01…'}    {'401408' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'11'}    {'S2'}    {'EEG-F4-C4' }
    {'22:42:28.000 01…'}    {'405248' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'23'}    {'S2'}    {'EEG-F4-C4' }
    ...
    {'07:08:22.000 02…'}    {'8175872'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'8' }    {'S4'}    {'EEG-F4-C4' }
    {'07:11:27.000 02…'}    {'8223232'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'8' }    {'S4'}    {'EEG-Fp2-F4'}
    {'07:12:08.000 02…'}    {'8233728'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'6' }    {'S4'}    {'EEG-Fp2-F4'}
    {'07:18:31.000 02…'}    {'8331776'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'6' }    {'S4'}    {'EEG-F4-C4' }
    {'07:18:53.000 02…'}    {'8337408'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'7' }    {'S4'}    {'EEG-F4-C4' }
    {'07:19:27.000 02…'}    {'8346112'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'15'}    {'S4'}    {'EEG-F4-C4' }
    {'07:20:29.000 02…'}    {'8361984'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'11'}    {'S4'}    {'EEG-F4-C4' }
    {'07:20:48.000 02…'}    {'8366848'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'12'}    {'S4'}    {'EEG-F4-C4' }

Now, depending on what you want to accomplish, you may prefer using a TIMETABLE or a TIMESERIES object, or just some conversion of these columns.
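As one possible conversion, here is a minimal sketch that turns the tokens into a table with a datetime column and numeric columns, then writes the MCAP rows to their own file. Several things in it are assumptions rather than facts from the thread: the column names are illustrative, the number after the MCAP subtype is treated as a duration in seconds, the date is read as day/month/year (the date field rolls from 01 to 02 overnight in the output above), and the output file name mcap_only.txt is hypothetical. Two lines copied from the question stand in for the full file.

```matlab
% Two sample lines from the question stand in for fileread('data01.txt').
buffer = sprintf('%s\n%s\n', ...
    '[22:41:37.000 01/01/2007] 392192 " 0 0 0 MCAP-A3 17 S1 EEG-F4-C4', ...
    '[22:41:57.000 01/01/2007] 397312 " 0 0 0 MCAP-A3 9 S1 EEG-F4-C4');
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+(\d+)\s+(\d+)\s+(\d+)\s+MCAP-(\S+)\s+(\S+)\s+(\S+)\s+(\S+)';
tokens = regexp(buffer, pattern, 'tokens');
tokens = vertcat(tokens{:});            % N-by-9 cell array of char vectors

% Assumed timestamp layout: time first, then day/month/year.
t = datetime(tokens(:,1), 'InputFormat', 'HH:mm:ss.SSS dd/MM/yyyy');

% Column names are illustrative; 'Duration' (seconds) is an assumption.
mcap = table(t, str2double(tokens(:,2)), tokens(:,6), ...
    str2double(tokens(:,7)), tokens(:,8), tokens(:,9), ...
    'VariableNames', {'Time','Sample','Subtype','Duration','Stage','Channel'});

% Write only the MCAP rows to a separate tab-delimited text file.
writetable(mcap, 'mcap_only.txt', 'Delimiter', '\t');
```

If a timetable is preferred, table2timetable(mcap) should convert the result, using Time as the row times.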

So now you should define which part of the data you are interested in, and how you are planning to process it.

Let me know if you have any questions.

21 Comments

D. Ali on 28 Apr 2019

It is a function in the PhysioNet WFDB toolbox. Yes, the function reads all the data, and it was easy to use it to convert the signals to physical units and display them in the signal app. It might be a good idea to edit this code so that it extracts only the MCAP samples with their times; I thought I could extract them from the samples text file. The code with the rdmat function needs three data files, which are only provided on PhysioNet.

rdmat

function varargout=rdmat(varargin)

[tm,signal,Fs,siginfo]=rdmat(recordName)

Import a signal in physical units from a *.mat file generated by WFDB2MAT.

Required Parameters:

recorName
    String specifying the name of the *.mat file.

Outputs are:

tm
    A Nx1 array of doubles specifying the time in seconds.
signal
    A NxM matrix of doubles contain the signals in physical units.
Fs
    A 1x1 integer specifying the sampling frequency in Hz for the entire record.
siginfo
    A LxN cell array specifying the signal siginfo. Currently it is a structure with the following fields:
    siginfo.Units
    siginfo.Baseline
    siginfo.Gain
    siginfo.Description

NOTE: You can use the WFDB2MAT command in order to convert the record data into a *.mat file, which can then be loaded into MATLAB/Octave's workspace using the LOAD command. This sequence of procedures is quicker (by several orders of magnitude) than calling RDSAMP. The LOAD command will load the signal data in raw units, use RDMAT to load the signal in physical units.

KNOWN LIMITATIONS: This function currently does support several of the features described in the WFDB record format (such as multiresolution signals) : http://www.physionet.org/physiotools/wag/header-5.htm
If you are not sure that the record (or database format) you are reading is supported, you can do an integrity check by comparing the output with RDSAMP:

[tm,signal,Fs,siginfo]=rdmat('200m');
[tm2,signal2]=rdsamp('200m');
if(sum(abs(signal-signal2)) !=0);
    error('Record not compatible with RDMAT');
end

Written by Ikaro Silva, 2014
Last Modified: November 26, 2014
Version 1.2
Since 0.9.7

%Example:
wfdb2mat('mitdb/200')
tic;[tm,signal,Fs,siginfo]=rdmat('200m');toc
tic;[signal2]=rdsamp('200m');toc
sum(abs(signal-signal2))

See also RDSAMP, WFDB2MAT

Cedric on 1 May 2019

Edited: Cedric on 1 May 2019


But the problem is not extracting the CAP entries; that part is technical, and we know how to do it.

Currently the problem, at least on my side, is that I still don't understand what you need to do with this. Suppose I pick a series of lines associated with CAP from the source file:

[22:41:37.000 01/01/2007]   392192     "    0    0    0	MCAP-A3 17 S1 EEG-F4-C4
[22:41:57.000 01/01/2007]   397312     "    0    0    0	MCAP-A3 9 S1 EEG-F4-C4
[22:42:13.000 01/01/2007]   401408     "    0    0    0	MCAP-A3 11 S2 EEG-F4-C4
[22:42:28.000 01/01/2007]   405248     "    0    0    0	MCAP-A3 23 S2 EEG-F4-C4

OK, now what do I do with this? Are there data to extract from there that need to be converted to numeric for plotting? This file apparently does not contain signal information, so are these lines only defining time stamps for CAP?

If so, where do you need to add these labels? Is it on the plot of the signal(s) that you generate after calling RDMAT?

If so, is it something that is already done for all labels (how?) and you'd like to keep only CAP labels, or is it something that must be implemented from scratch?

D. Ali on 1 May 2019

Perfect, thanks a lot for your time and patience.

Cedric on 2 May 2019

Edited: Cedric on 2 May 2019


No problem!

Next issue though: rdmat outputs arrays that suggest that there are 1e6 samples:

>> [tm,signal,Fs,siginfo]=rdmat('sdb4_edfm');
>> whos
  Name               Size                     Bytes  Class     Attributes
  Fs                 1x1                          8  double              
  siginfo            1x18                     11040  struct              
  signal       1000000x18                 144000000  double              
  tm                 1x1000000              8000000  double     

Here you see tm, the vector of times I suppose, that has 1 million elements and the array of signals has 1 million rows (I guess each corresponding to a sample).

Now, after converting the sample # from your annotation file to numeric:

buffer = fileread( 'annotations sdb4.txt' ) ;
% Same pattern as before: keep only the MCAP lines and tokenize their columns.
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+(\d+)\s+(\d+)\s+(\d+)\s+MCAP-(\S+)\s+(\S+)\s+(\S+)\s+(\S+)' ;
annotations = regexp( buffer, pattern, 'tokens' ) ;
annotations = vertcat( annotations{:} ) ;
% Column 2 holds the sample # as text; convert it to double.
sampleId = str2double( annotations(:,2) ) ;

I see sample numbers (or IDs) up to 8,366,848, which is way above 1 million. So most of the sample IDs correspond to regions that are outside of the plot ..(?)
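To make the mismatch concrete, here is a small sketch of the arithmetic. It takes the rate of 256 from the header line "## time resolution: 256" in the annotation file and assumes that annotation sample 0 lines up with the first row of signal; neither point is confirmed for the output of rdmat, so the Fs value it returns should be checked against 256. The three sample numbers are the first two and the last MCAP entries listed earlier.

```matlab
Fs = 256;                 % from the annotation header "## time resolution: 256" (assumed to match rdmat's Fs)
nLoaded = 1000000;        % number of rows in signal returned by rdmat above
sampleId = [392192; 397312; 8366848];   % first two and last MCAP sample numbers shown earlier

% Elapsed time of each annotation since sample 0, in seconds.
elapsedSec = sampleId / Fs;

% Assumption: sample 0 is row 1 of signal, so sample k is row k+1.
inRange = (sampleId + 1) <= nLoaded;

fprintf('%d of %d annotations fall inside the loaded signal.\n', nnz(inRange), numel(inRange));
fprintf('Loaded signal spans about %.1f minutes.\n', nLoaded / Fs / 60);
```

The first entry gives 392192 / 256 = 1532 s, and 22:16:05 plus 1532 s is 22:41:37, which agrees with the time stamp on that line. Under these assumptions, one million samples cover only a little over an hour of a recording that runs until after 07:00.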
