What data do I need to be able to plot an ROC curve?

Any help is very much appreciated!
I have a very large proteomic data set which I have analysed using three separate approaches. I would now like to assess which of these approaches has performed best for my data set by plotting an ROC curve, but I cannot work out what data is required for this.
For context, my data is a list of proteins which have been detected as 'interactors' of mutant ion channels. I have curated a 'positive list' of previously confirmed interactors and a 'negative list' of common contaminating proteins. From these lists I would like to calculate the TPR and FPR and plot the ROC curve for each approach. However, from what I can see in examples and previous questions, people who use the ROC function already know their TPR and FPR.
Due to the size of my data set, I could not calculate this manually for every threshold between 0 and 1. I could calculate the rates for a single given threshold.
I was wondering if anyone could tell me what input data I need for this function? Can you simply input your 'positive' and 'negative' lists and have MATLAB calculate the TPR and FPR for each threshold, or do I need to already have this information?
Many thanks to anyone who can give me some advice!

Answers (1)

Image Analyst on 27 Jun 2020
You cannot get a "rate" from a single image. So you have three possible approaches, and for each approach you can test your hundreds or thousands of images, so you can and will have 3 ROC curves - one for each approach. But to map out an ROC curve for a single approach, you need to change something. If you don't change anything, the rate will be a single number, meaning that it's just a single point on the curve. What parameters do you plan on changing? What parameters might affect the TPR and FPR? The threshold? Okay, if it's the threshold then you need to use one threshold on all your images and thus get one point on the curve. Then change the threshold to something else and get another point. Now change it to a bunch of thresholds and you will get a bunch of points on the curve for that approach. Then repeat for the other approach and you'll build up 3 curves, which you can then plot.
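The threshold sweep described above can be sketched roughly as follows, using proteins rather than images. This is only an illustration: the variable names, the made-up scores, and the assumption that each approach gives every protein a score between 0 and 1 (higher meaning "more likely an interactor") are not from the original question.

```matlab
% Hypothetical example data for ONE approach (all values are made up).
% scores:     one score per protein, assumed higher = more likely an interactor
% isPositive: true if the protein is on the curated positive list,
%             false if it is on the negative (contaminant) list
scores     = [0.95 0.80 0.70 0.55 0.40 0.30 0.20 0.10];
isPositive = logical([1 1 0 1 0 1 0 0]);

thresholds = 0:0.1:1;
tpr = zeros(size(thresholds));
fpr = zeros(size(thresholds));

for k = 1:numel(thresholds)
    % Proteins called "interactor" at this threshold
    called = scores >= thresholds(k);
    % Fraction of positive-list proteins that were called
    tpr(k) = sum(called &  isPositive) / sum(isPositive);
    % Fraction of negative-list proteins that were called
    fpr(k) = sum(called & ~isPositive) / sum(~isPositive);
end

% Each threshold gives one point on the ROC curve for this approach
plot(fpr, tpr, '-o');
xlabel('False positive rate');
ylabel('True positive rate');
```

Repeating the loop with the scores from each of the other two approaches (and `hold on`) would give the three curves on one plot.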
  2 Comments
Alice Gold on 28 Jun 2020
Thank you for your response! Yes, this all makes sense, and you are correct that I was planning on changing the threshold to produce the curves and expected just three curves, one for each approach. My question was more about how I would calculate the TPR and FPR for, say, each 0.1 threshold between 0 and 1 without doing this manually. I wondered if this is done within the ROC function in MATLAB and, if so, which inputs are required, or whether I need to calculate these rates beforehand and, if so, whether there is a way of doing this in MATLAB so that it tests each threshold for me?
Image Analyst on 28 Jun 2020
There is a function perfcurve() that you might want to look into.
You can't compute the rate unless you have run the experiment and seen the result and compared that result to the known, ground truth result for that image. Then do that for a bunch of images and count the number that got the correct result and divide by the number of images to get the rate.
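As a rough sketch of how perfcurve() could be used here: it takes the ground-truth labels and the scores, and does the counting for every threshold itself. It needs the Statistics and Machine Learning Toolbox. The data below is made up, and the assumption that each protein has a numeric score (higher meaning "more likely an interactor") is mine, not something stated in the question.

```matlab
% Requires the Statistics and Machine Learning Toolbox.
% Hypothetical data for one approach (values are made up):
% labels: 1 = protein is on the positive list, 0 = on the negative list
% scores: assumed score per protein, higher = more likely an interactor
labels = [1 1 0 1 0 1 0 0];
scores = [0.95 0.80 0.70 0.55 0.40 0.30 0.20 0.10];

% Third input names the positive class.
% Outputs: FPR values, TPR values, the thresholds used, and area under the curve.
[fpr, tpr, thr, auc] = perfcurve(labels, scores, 1);

plot(fpr, tpr);
xlabel('False positive rate');
ylabel('True positive rate');
title(sprintf('ROC curve (AUC = %.3f)', auc));
```

Calling perfcurve() once per approach with the same labels and that approach's scores, then plotting with `hold on`, would overlay the three curves for comparison.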
