Increasing the accuracy of an AI model

Christos Papagrigoriou on 8 Dec 2021
Answered: Prince Kumar on 17 Jan 2022
Hello
I have a problem. I am training a model that predicts cancer recurrence based on the factors in the attached file. The best accuracy I get is around 79-80%, and I want to reach 90% or above. I am using the Classification Learner app for this. I have tried different combinations of factors (for example, dropping age, or gender, or both age and gender), but I have hit a wall: the accuracy does not increase. I use the app's train all function to check which model fits best. I don't know whether I have included too many factors, adding noise rather than useful information. Any recommendations on how to determine the best possible model, or on how to adjust the model to get the desired results, would be much appreciated!
Columns 1 and 2 aren't factors under investigation.
cheers
  2 Comments
yanqi liu on 30 Dec 2021
Yes, sir. What is the ratio between the training and test split, for example 80% for training and 20% for testing?
This is a two-class classification problem, so you could try an SVM, a CNN, and so on.
Walter Roberson on 30 Dec 2021
As someone who spent years developing classification software:
79-80% is as good as it gets much of the time for real data; 82-84% if the fit was really good.
Once we got a 90% fit, and that got us a paper in a respected journal, several prizes, and our method pushed into production... because the "Gold Standard", the best that highly experienced pathologists could do, was 86%.


Answers (1)

Prince Kumar on 17 Jan 2022
Hi,
You can try the following things:
  • Try deep learning methods, and try different regularization techniques.
  • Randomly shuffle the data before doing the split, so that the data distribution is nearly the same in each set. If your data is in a datastore, you can use the 'shuffle' function; otherwise you can use the "randperm" function.
  • Make sure each set (training, validation and test) has enough samples, for example a 60%/20%/20% or 70%/15%/15% split for the training, validation and test sets respectively.
  • If you have no separate training, validation and test sets, you can perform k-fold cross-validation instead.
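
The shuffle, split and cross-validation steps above could look roughly like the sketch below. It is only an illustration under some assumptions: X and y are synthetic placeholders (the attached file is not reproduced here), the 70/15/15 ratio and the 5 folds are example values, and fitctree is just one possible classifier, not necessarily the one that suits this data. cvpartition, fitctree and predict require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in data: replace X and y with the predictors and the
% recurrence label from your own file (leaving out columns 1 and 2).
rng(0);                                    % fixed seed so the shuffle is repeatable
n = 200;                                   % assumed number of samples
X = randn(n, 5);                           % placeholder predictors
y = double(X(:,1) + 0.5*randn(n,1) > 0);   % placeholder binary label (0/1)

% 1) Shuffle the rows with randperm before splitting.
idx = randperm(n);
X = X(idx, :);
y = y(idx);

% 2) Hold-out split: 70% training, 15% validation, 15% test.
nTrain   = round(0.70 * n);
nVal     = round(0.15 * n);
trainIdx = 1:nTrain;
valIdx   = nTrain+1 : nTrain+nVal;
testIdx  = nTrain+nVal+1 : n;

% 3) Alternative: k-fold cross-validation over all rows.
k = 5;
c = cvpartition(y, 'KFold', k);            % folds are stratified by class label
acc = zeros(k, 1);
for i = 1:k
    tr  = training(c, i);                  % logical index of training rows in fold i
    te  = test(c, i);                      % logical index of held-out rows in fold i
    mdl = fitctree(X(tr,:), y(tr));        % example classifier only
    acc(i) = mean(predict(mdl, X(te,:)) == y(te));
end
fprintf('Mean %d-fold accuracy: %.3f\n', k, mean(acc));
```

With a small data set, the cross-validated mean is usually a steadier estimate than a single hold-out split, since every sample is used for testing exactly once.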
