You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
Reading only Certain Columns from .CSV
9 views (last 30 days)
Show older comments
I'm trying to read in the 1st and 3rd columns of data from a file.
I know this code shows the 2nd and 3rd:
M = csvread('filename.csv', 2, 2)
How do I skip the 2nd column and just display the 1st and 3rd?
Answers (1)
Walter Roberson
on 20 Mar 2019
Edited: Walter Roberson
on 20 Mar 2019
That cannot be done with csvread() or dlmread().
With textscan() you would use a format specification of '%f %*f %f %*[^\n]' . You would probably use cell2mat() around the result of the textscan() call.
With readtable() the way to proceed would be to use detectImportOptions (new as of R2016b). You might have to give the option 'ReadVariableNames', false . Assign the result to a variable, and set the SelectedVariableNames property of the object to [1 3]. Then use readtable() on the .csv file, passing in that options object.
18 Comments
Megan Stapley
on 20 Mar 2019
Would you be able to provide some sample code? I'm not really too sure what you are referring to
Walter Roberson
on 20 Mar 2019
filename = 'filename.csv';
[fid, msg] = fopen(filename, 'rt');
if fid < 0
error('Failed to open file "%s" because "%s"', filename, msg);
end
data = cell2mat( textscan(fid, '%f,%*f,%f%*[^\n]') );
fclose(fid);
%data is now a numeric table with two columns
or
filename = 'filename.csv';
opts = detectImportOptions(filename, 'ReadVariableNames', false);
opts.SelectedVariableNames = [1 3];
data_table = readtable(filename, opts);
%data_table is now a table object with two variables.
%data_table{:,:} would be a numeric array with two columns
Megan Stapley
on 3 Apr 2019
I am getting the following errors with your second solution:
Error using detectImportOptions
'ReadVariableNames' is not a recognized parameter. For a list of valid name-value pair
arguments, see the documentation for detectImportOptions.
Error in detectImportOptions>getTypedParser/parsefcn (line 287)
p.parse(args{:});
Error in detectImportOptions>textArgs (line 319)
args = parser(otherArgs);
Error in detectImportOptions (line 219)
args = textArgs(p.Unmatched);
Error in FlexTest2 (line 2)
opts = detectImportOptions(lux, 'ReadVariableNames', false);
Megan Stapley
on 3 Apr 2019
And with your first proposed solution I am getting an empty array:
data =
0×2 empty double matrix
Walter Roberson
on 3 Apr 2019
Ah, ReadVariableNames was not a parameter in your R2016b release.
Please show a sample of the first 3 or 4 lines of your file.
Megan Stapley
on 3 Apr 2019
epoc (ms)timestamp (-0400)elapsed (s)illuminance (lx)
15542357188562019-04-02T16.08.38.8560.0008.203
15542357208562019-04-02T16.08.40.8562.0002.323
15542357228552019-04-02T16.08.42.8553.9992.323
15542357248552019-04-02T16.08.44.8555.9992.323
15542357268542019-04-02T16.08.46.8547.9982.323
Walter Roberson
on 3 Apr 2019
Perhaps you should attach the sample as a file, so that we can check for hidden characters such as tab characters.
What you have posted cannot be read by csvread() or xlsread().
In R2017a and later, what you posted could be approached with readtable() and a FixedWidthImportOptions https://www.mathworks.com/help/matlab/ref/matlab.io.text.fixedwidthimportoptions.html . However, you only have R2016b. For your options, see the discussion at https://www.mathworks.com/matlabcentral/answers/453137-sscanf-to-extract-numbers-from-string#answer_367985
Megan Stapley
on 3 Apr 2019
Edited: Megan Stapley
on 3 Apr 2019
See attached CSV file
Walter Roberson
on 3 Apr 2019
t = readtable('BlackModelTest.csv');
times = datetime(t{:,1}/1000, 'convertfrom', 'posixtime');
elapsed = t{:,3};
My tests show that if you do
seconds(times - times(1))
then that exactly matches the elapsed data (column 3).
In order for the second column to match, we would have to assume that there is a 4 hour timezone difference
Megan Stapley
on 3 Apr 2019
Edited: Megan Stapley
on 3 Apr 2019
Hi Walter,
I'm sorry this is getting really confusing. The comment you just posted only outputs the timestamp.
All I want to output is an array with just the TIMESTAMP and LIGHT SENSOR READING (columns 2 and 4).
Can you simply expalin code to achieve this?
Walter Roberson
on 3 Apr 2019
Edited: Walter Roberson
on 3 Apr 2019
t = readtable('BlackModelTest.csv');
timestamp = datetime(t{:,1}/1000, 'convertfrom', 'posixtime');
light_sensor_reading = t{:,4};
In the sample file you provided, the second column was the text version of a timestamp, with the text being equivalent to the information that could be obtained by treating the first column as a POSIX time. There is, however, a timezone difference between the time obtained from the first column, and the text version of the time; the difference would be easiest to explain if the text version is local time in EDT (Eastern Daylight Time). If so, then you could use
t = readtable('BlackModelTest.csv');
timestamp = datetime(t{:,1}/1000, 'convertfrom', 'posixtime', 'TimeZone', 'UTC');
timestamp.TimeZone = 'America/New_York';
light_sensor_reading = t{:,4};
Megan Stapley
on 3 Apr 2019
okay but how do I get ONE SINGLE ARRAY. WIth the timestamp and light sensor reading?
I currently have a seperated timestamp array and light sensor reading array. How do I combine the two into ONE ARRAY 2 COLUMNS?
Megan Stapley
on 3 Apr 2019
final = [timestamp.TimeZone(:), light_sensor_reading(:)]
I am trying this but am getting error "Dimensions of matrices being concatenated are not consistent"
Walter Roberson
on 3 Apr 2019
Your second column is text. How do you intend to represent that in a numeric array?
What numeric value do you want stored for '2019-04-03T14.01.57.690' ?
Megan Stapley
on 3 Apr 2019
Why can't the first column be a string and the second column be a number?
Walter Roberson
on 3 Apr 2019
perhaps
t(:,[2 4])
would work for your purposes. If not, then
[t{:.2}, num2cell(t{:,4})]
Megan Stapley
on 3 Apr 2019
Hi walter,
This works well but for my second application I would prefer not a table. How do I get a matrix?
The table is giving me an error because I am trying to add a greater than operator to find light sensor values greater than 20.
Thanks!
Walter Roberson
on 3 Apr 2019
What would it mean to use a greater than operator with those text timestamps ?
If you want to test the sensor values you can extract them from the table:
mask = t{:,4} > 20;
which could, for example, be used in the context:
t(t{:.4}>20, [2 4])
See Also
Categories
Find more on Data Type Conversion in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)