how do i read just the header row of a csv file and write the field names to a new file

hi guys,
i have a csv file and i'd like to get a listing of all its headers, transposed into one column, and then write them out to a new file. so far what i've done is:
str = fileread('filename.csv')
index = strfind(str, '1');
header = str(1:(index-1));
and that gives me a character array with a single row of all the fields in the header, separated by commas, see below
header='field1,field2,field3,field4,field5'
what i need is a new file with the field names written as a column
field1
field2
field3
field4
field5
thanks for whatever help you can give!
Todd

Accepted Answer

Star Strider on 5 Jul 2022
One approach —
header = 'field1,field2,field3,field4,field5';
headerstring = string(strsplit(header,',')).'
headerstring = 5×1 string array
    "field1"
    "field2"
    "field3"
    "field4"
    "field5"
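The question also asks for the names to be written to a new file, one per line. A minimal sketch of the whole round trip follows. The file paths and the sample file contents are made up so the code runs on its own, and it assumes the header is the first line of the file and that no field name contains a quoted comma.

```matlab
% Sample input so the sketch runs on its own; replace with your real file.
inFile  = [tempname '.csv'];          % hypothetical input path
outFile = [tempname '.txt'];          % hypothetical output path
fid = fopen(inFile, 'w');
fprintf(fid, 'field1,field2,field3,field4,field5\n1,2,3,4,5\n');
fclose(fid);

% Read only the first line of the file (assumed to be the header row).
fid = fopen(inFile, 'r');
assert(fid ~= -1, 'Could not open %s', inFile);
headerLine = fgetl(fid);
fclose(fid);

% Split on commas and trim stray whitespace around each name.
names = strtrim(strsplit(headerLine, ','));   % 1x5 cell array of char vectors

% Write one field name per line.
fid = fopen(outFile, 'w');
assert(fid ~= -1, 'Could not open %s', outFile);
fprintf(fid, '%s\n', names{:});
fclose(fid);
```

Reading the first line with fgetl avoids searching the file text for a character that marks the start of the data, which can go wrong when that character also appears in a field name.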
  7 Comments
Adam Jurhs on 5 Jul 2022
well, now i've got two txt files, each with the string data in one column. i need to compare the fields in the two files to find out which fields are contained in both, like this:
file1.txt has
field1
field2
field3
field4
field5
file2.txt has
field1
apples
bannans
field5
oranges
i should get a new file that has
newfile.txt
field1
field5
Star Strider on 5 Jul 2022
Thank you!
The ismember function is appropriate here —
file1 = ["field1"
"field2"
"field3"
"field4"
"field5"];
file2 = ["field1"
"apples"
"bannans"
"field5"
"oranges"];
Lv = ismember(file1, file2) % Logical Vector
Lv = 5×1 logical array
   1
   0
   0
   0
   1
Result = file1(Lv)
Result = 2×1 string array
    "field1"
    "field5"
... as desired!
You can also use ‘Lv’ here as the row (first) index to refer to multiple columns of the matrix, something like:
newDataExtract = newData(Lv,:)
if necessary.
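To go from the two text files to newfile.txt, the same ismember step can be wrapped in file reading and writing. This is only a sketch: the temporary folder and the sample files are created here so the code is self-contained, and it assumes readlines is available (R2020b or later).

```matlab
% Create the two sample files from the comment above in a temporary folder.
folder = tempname;                              % hypothetical working folder
mkdir(folder);
file1 = fullfile(folder, 'file1.txt');
file2 = fullfile(folder, 'file2.txt');
fid = fopen(file1, 'w');
fprintf(fid, '%s\n', 'field1', 'field2', 'field3', 'field4', 'field5');
fclose(fid);
fid = fopen(file2, 'w');
fprintf(fid, '%s\n', 'field1', 'apples', 'bannans', 'field5', 'oranges');
fclose(fid);

% Read each file into a string column, one element per line.
names1 = readlines(file1);
names2 = readlines(file2);

% Drop empty elements (a trailing newline produces one at the end).
names1 = names1(strlength(names1) > 0);
names2 = names2(strlength(names2) > 0);

% Keep the entries of names1 that also appear anywhere in names2.
common = names1(ismember(names1, names2));

% Write the common names, one per line.
outFile = fullfile(folder, 'newfile.txt');
fid = fopen(outFile, 'w');
assert(fid ~= -1, 'Could not open %s', outFile);
c = cellstr(common);
if ~isempty(c)
    fprintf(fid, '%s\n', c{:});
end
fclose(fid);
```

The comparison is exact, so names that differ in case or surrounding whitespace will not match unless they are normalized first (for example with strtrim or lower).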


More Answers (2)

Adam Jurhs on 6 Jul 2022
when the string arrays are the same length, that works really well
but if the string arrays are not the same length (and in general mine are not), things get tricky. it appears that
Lv = ismember(file1, file2)
still gives me the correct answer, giving zeros for the additional rows if file1 is longer than file2
and in this case Result=file1(Lv) gives the right answer
also, a string that exists in both files may not be in the same row in both files. again
Lv = ismember(file1, file2)
seems to find them both, but Lv = ismember(file2, file1) gives back a different answer, and so do the Result commands
Result=file1(Lv) or Result=file2(Lv)
so the question is which do i assign to which? do i always make file1 the file with the greater length?
i kind of think that's what i should do, and then also use Result=file1(Lv). i've played with it and i can't seem to find the combination that works for all cases of different array lengths and fields existing in both files but not in the same order:
Lv = ismember(file1, file2) with Result=file1(Lv)
or
Lv = ismember(file1, file2) with Result=file2(Lv)
or
Lv = ismember(file2, file1) with Result=file1(Lv)
or
Lv = ismember(file2, file1) with Result=file2(Lv)
  1 Comment
Star Strider on 6 Jul 2022
The ‘Lv’ vector indexes into the first argument of ismember. In the situation you describe, the first argument should be the longest vector, since that would correspond to the ‘Lv’ output.
I initially chose ismember because it appeared to be appropriate for the situation you describe. An alternative to experiment with, that may be closer to what you want, is the intersect function.
A = randi(9, 5, 1)
A = 5×1
     1
     6
     4
     1
     6
B = randi(9, 7, 1)
B = 7×1
     2
     4
     3
     6
     8
     9
     6
[C,ia,ib] = intersect(A,B)
C = 2×1
     4
     6
ia = 2×1
     3
     2
ib = 2×1
     2
     4
Aia = A(ia)
Aia = 2×1
     4
     6
Bib = B(ib)
Bib = 2×1
     4
     6
The disadvantage of using intersect, however, is that it returns only the index of the first occurrence in each vector, and C lists each common value once, in sorted order. If you expect only one match in each vector, that works. If there could be more than one match, and you want all of them, it won’t.
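On the question of which array goes first: a small experiment suggests the order changes only the size of Lv and the order of the result, not which names are found. The arrays below are made-up examples, with a deliberate duplicate in A, and the variable names are illustrative only.

```matlab
% Made-up example data: A is longer than B and repeats "field1".
A = ["field1"; "field2"; "field3"; "field4"; "field5"; "field1"];
B = ["oranges"; "field5"; "field1"];

% Lv always has the size of the FIRST argument, so index that same array.
fromA = A(ismember(A, B));       % common names in A's order, duplicates kept
fromB = B(ismember(B, A));       % common names in B's order

% intersect returns each common value once, sorted.
viaIntersect = intersect(A, B);

% Ignoring order and repeats, all three hold the same set of names.
sameSet = isequal(unique(fromA), unique(fromB), viaIntersect);
```

Here fromA has three elements because "field1" appears twice in A, while fromB and viaIntersect each have two. The combinations that mix arrays, such as Lv = ismember(file1, file2) followed by file2(Lv), are the ones to avoid, because Lv then has the wrong size or marks the wrong rows.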



Adam Jurhs on 7 Jul 2022
Edited: Adam Jurhs on 7 Jul 2022
thanks so much for your help!!!
i don't understand the necessity of ia and ib. maybe i don't need them at all. i think they are the indices in each array where the overlapping values occur, right? after reading the help on intersect, i think in my situation i would just use C=intersect(A,B). if that's the case, and knowing that there will only be one match within each array (in your example, the number 6 would occur only once in each of A and B, though possibly at different indices), i think intersect and ismember will give the same result, yes?
thanks again for your help
Todd
  3 Comments
Adam Jurhs on 7 Jul 2022
Edited: Adam Jurhs on 7 Jul 2022
gotcha
ismember worked perfectly on the "real" files. the only trick (and i've got it figured out, i just have to be careful) is getting the files' headers into string arrays.
hey thanks again for all your help!!!
how do i close out this thread saying i'm really happy with the answers?
Todd
Star Strider on 7 Jul 2022
As always, my pleasure!
You just did! (Also by accepting my answer, for which I thank you!)



Release

R2021b
