Issue when using categorical variables with the functions fitrgp, bayesopt, optimizableVariable and predict

Hi, I'm looking for an answer or some sparring on an issue I encounter when performing bayesopt on some training data. I have a very simple trial-phase script: I'm optimizing an experiment that I have performed 3 times under different circumstances (Temp and OverNightColony = ON).
When I assign ON as a binary variable all is good. However, if I change it to a categorical variable, something goes wrong and I get an error regarding the bayesopt command.
I have included the code below. (The script will run and perform a bayesopt with the Statistics and Machine Learning Toolbox. However, if you swap the binary optimizableVariable for the categorical one that is currently commented out, the error will occur.)
clc; clear all; close all;
%3 Sets of training data: experiments A,B & C
n = 0; % Binary variable -- I would like these to be categorical
y = 1; % Binary variable
% ExperimentA (avg of 8 tests)
T_A = 37; O_A = n; VarA = [T_A O_A]; P_A = 100-85;
% ExperimentB (avg of 4 tests)
T_B = 37; O_B = y; VarB = [T_B O_B]; P_B = 100-78;
% ExperimentC (avg of 4 tests)
T_C = 20; O_C = y; VarC = [T_C O_C]; P_C = 100 - ((250+251+236+251)/4)/271;
%Structure Training Data in Array (#Experiments x #TuneVar)
epse = [VarA; VarB; VarC]; %The order of tuning variables must be same as in vars-vector
JJJ = [P_A; P_B; P_C];
% Fit a Gaussian process regression (GPR) model
gprMdl = fitrgp(epse,JJJ,'KernelFunction','squaredexponential'); %Also called a surrogate model
%gprMdl = fitrgp(Tab,'KernelFunction','squaredexponential'); %Also called a surrogate model
%rng default
% Define desired variables to be optimized and the span of values they are allowed to attain
T = optimizableVariable('Temp',[15 45],'Type','integer'); % Also referred to as hyperparameters
%O = optimizableVariable('O_N',{'y' 'n'},'Type','categorical');
O = optimizableVariable('O_N',[0 1],'Type','integer');
vars = [T O];
% Perform Bayesian optimization. Both bayesopt calls fail with a categorical
% variable.
bayesObject = bayesopt(@(tbl)mdlfun(tbl,gprMdl),vars,'Verbose',1,...
'AcquisitionFunctionName','expected-improvement-plus',...
'PlotFcn',@plotMinObjective);
% bayesObject = bayesopt(@(tbl)mdlfun(tbl,gprMdl),vars,'OutputFcn',@assignInBase, ...
% 'SaveVariableName','BayesIterations')
% The following function utilizes the new guess of hyperparameters given from the BO to predict the corresponding cost f.
function f = mdlfun(tbl,gprMdl) % By default this executes 30 times
T = tbl.Temp;
O = tbl.O_N;
vars = [T O];
f = predict(gprMdl,vars);
end

Answers (2)

Hrishikesh Borate on 23 Jul 2021
Hi,
The following code demonstrates a possible approach to declare the O_N variable as a categorical variable and perform the optimization.
clc; clear all; close all;
%3 Sets of training data: experiments A,B & C
% ExperimentA (avg of 8 tests)
T_A = 37; O_A = 'n'; P_A = 100-85;
% ExperimentB (avg of 4 tests)
T_B = 37; O_B = 'y'; P_B = 100-78;
% ExperimentC (avg of 4 tests)
T_C = 20; O_C = 'y'; P_C = 100 - ((250+251+236+251)/4)/271;
%Structure Training Data in Array (#Experiments x #TuneVar)
epse = table([T_A;T_B;T_C],categorical({O_A;O_B;O_C}),'VariableNames',{'Temp','O_N'});
JJJ = [P_A; P_B; P_C];
% Fit a Gaussian process regression (GPR) model
gprMdl = fitrgp(epse,JJJ,'KernelFunction','squaredexponential'); %Also called a surrogate model
%gprMdl = fitrgp(Tab,'KernelFunction','squaredexponential'); %Also called a surrogate model
%rng default
% Define desired variables to be optimized and the span of values they are allowed to attain
T = optimizableVariable('Temp',[15 45],'Type','integer'); % Also referred to as hyperparameters
O = optimizableVariable('O_N',{'n' 'y'},'Type','categorical');
vars = [T O];
% Perform Bayesian optimization. O_N is now declared as a categorical
% variable.
bayesObject = bayesopt(@(tbl)mdlfun(tbl,gprMdl),vars,'Verbose',1,...
'AcquisitionFunctionName','expected-improvement-plus',...
'PlotFcn',@plotMinObjective);
% bayesObject = bayesopt(@(tbl)mdlfun(tbl,gprMdl),vars,'OutputFcn',@assignInBase, ...
% 'SaveVariableName','BayesIterations')
% The following function utilizes the new guess of hyperparameters given from the BO to predict the corresponding cost f.
function f = mdlfun(tbl,gprMdl) % By default this executes 30 times
f = predict(gprMdl,tbl);
end
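As a quick check of the approach above, the sketch below refits the table-based surrogate on the same three experiments, queries it with a one-row table laid out like the one bayesopt passes to the objective function, and then reads back the best point from a short run. The query values (Temp = 25, O_N = 'y'), the variable names query, results and best, and the reduced budget of 10 evaluations are assumptions chosen for illustration; they are not part of the original answer.

```matlab
% Training data from the thread, stored as a table with a categorical column.
Temp = [37; 37; 20];
O_N  = categorical({'n'; 'y'; 'y'}, {'n', 'y'});
X    = table(Temp, O_N);   % variable names are taken from the workspace names
Y    = [100-85; 100-78; 100 - ((250+251+236+251)/4)/271];

% Surrogate model fitted on the table, as in the answer above.
gprMdl = fitrgp(X, Y, 'KernelFunction', 'squaredexponential');

% Hypothetical query point with the same variable names and types as X.
query  = table(25, categorical({'y'}, {'n', 'y'}), ...
    'VariableNames', {'Temp', 'O_N'});
fQuery = predict(gprMdl, query);
fprintf('Predicted cost at Temp = 25, O_N = y: %g\n', fQuery);

% Short optimization run (assumed budget of 10 evaluations, no plots).
vars = [optimizableVariable('Temp', [15 45], 'Type', 'integer'), ...
        optimizableVariable('O_N', {'n', 'y'}, 'Type', 'categorical')];
results = bayesopt(@(tbl) predict(gprMdl, tbl), vars, ...
    'Verbose', 0, 'PlotFcn', [], 'MaxObjectiveEvaluations', 10);

% bestPoint returns the selected point as a one-row table.
best = bestPoint(results);
disp(best)
```

Because the objective receives a table whose variable names match the training table, it can be passed straight to predict without unpacking the columns.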

Jesper Ankersen on 23 Jul 2021
Thank you very much. I see that the problem was in the way I structured the training data for the fitrgp model.
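The mismatch described here can be reproduced in isolation. The sketch below is only a diagnostic under stated assumptions: it fits the surrogate on a numeric matrix as in the original question, builds a one-row table shaped like the one bayesopt supplies when O_N is categorical, and then attempts the original unpack-and-predict step inside try/catch. The names mdlNum, tbl and failed, and the query values, are hypothetical; the exact error text is likely to differ between MATLAB releases, so it is printed rather than assumed.

```matlab
% Surrogate fitted on a numeric matrix (columns: Temp, O_N coded as 0/1).
Xnum   = [37 0; 37 1; 20 1];
Y      = [100-85; 100-78; 100 - ((250+251+236+251)/4)/271];
mdlNum = fitrgp(Xnum, Y, 'KernelFunction', 'squaredexponential');

% One-row table shaped like the input the objective function receives
% when O_N is declared as a categorical optimizableVariable.
tbl = table(25, categorical({'y'}, {'y', 'n'}), ...
    'VariableNames', {'Temp', 'O_N'});
disp(class(tbl.Temp))   % double
disp(class(tbl.O_N))    % categorical

% Original objective logic: concatenate the columns and predict.
failed = false;
try
    vars = [tbl.Temp tbl.O_N];   % mixes a double with a categorical value
    f = predict(mdlNum, vars);   % model expects a numeric N-by-2 matrix
catch err
    failed = true;
    fprintf('Numeric path failed: %s\n', err.message);
end
```

Training the model on a table with a categorical O_N column, as in the answer, keeps the predictor types used by fitrgp and by the bayesopt variables consistent.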
