At this point in the course, you have seen a number of different ways to use individual and combined data types for computational phenotyping. The number of possible combinations and permutations we can apply to identify a patient population is enormous. In this video, we will discuss the different ways that you can evaluate and pick a final algorithm.

In previous exercises, you have been calculating four different statistics to determine how well each data type works: sensitivity, specificity, and positive and negative predictive values. A helpful way of understanding the relationship between these measures is a receiver operating characteristic, or ROC, plot. An ROC plot shows sensitivity along the y-axis: as you go higher, you are finding more of the true cases. The x-axis is one minus specificity: as you move to the right, you are finding more and more false positives. The diagonal line from the origin is equivalent to a random guess, and a point in the upper left corner represents perfect prediction, capturing all cases with no false positives.

Typically, ROC curves are used in machine learning, where different model parameters yield different sensitivity and specificity values, so performance appears as a curve, as shown here. In computational phenotyping, however, we only have a single value of each measure per algorithm. Plotting these as single points on the ROC plot is still useful for comparing algorithm performance.

Now that we can see and compare algorithms on a single plot, how do we know which one is best? As with everything else in clinical data science, it all depends on your long-term goal for the data. If you have a very large data set and an algorithm is your only way to identify a patient population, you likely want to select an algorithm with a good balance of sensitivity and specificity. If you are doing detailed analyses, you may prioritize a high-specificity algorithm, even though you will miss a number of true cases. Other times, you may be trying to identify a very serious condition and have the ability to review records or talk to the patients directly. In that case, a high-sensitivity algorithm that captures every single case but produces a lot of false positives may be best.

Other factors you should consider are algorithm complexity and portability. If your algorithm is very complex and requires a lot of computational effort and time, it may not be the best choice for tasks that need to be run every day or multiple times a day. In these cases, a simpler model, even one that isn't quite as good, may be perfectly acceptable and preferable for the end goal. Similarly, if the algorithm is meant to be used in clinical care, you want to use data types that are easily available in the EHR. This is also true if you plan to apply your algorithms at different hospitals or health systems, which may store their data differently. It's important to understand whether your algorithm needs to be portable to other systems as you select your final model.

In the readings, we will provide real-world examples of computational phenotyping algorithms and have you evaluate algorithm performance, complexity, and portability to help determine the trade-offs and drawbacks of each one.
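
To make the evaluation concrete, here is a minimal sketch, not taken from the course materials, of how the four statistics can be computed from confusion-matrix counts and how each phenotyping algorithm can be plotted as a single point in ROC space. The algorithm names and counts are hypothetical example values, and the plotting is done with matplotlib.

```python
import matplotlib.pyplot as plt

def confusion_stats(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # proportion of true cases captured
    specificity = tn / (tn + fp)   # proportion of non-cases correctly excluded
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return sensitivity, specificity, ppv, npv

# Hypothetical algorithms, each evaluated once against a gold-standard chart review.
algorithms = {
    "Diagnosis codes only":     (80, 30, 20, 870),
    "Codes + medications":      (90, 60, 10, 840),
    "Codes + meds + NLP notes": (95, 15,  5, 885),
}

fig, ax = plt.subplots()
for name, (tp, fp, fn, tn) in algorithms.items():
    sens, spec, ppv, npv = confusion_stats(tp, fp, fn, tn)
    # Each algorithm is a single point: x = 1 - specificity, y = sensitivity.
    ax.scatter(1 - spec, sens, label=name)

# The diagonal from the origin is equivalent to random guessing.
ax.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Random guess")
ax.set_xlabel("1 - Specificity (false positive rate)")
ax.set_ylabel("Sensitivity (true positive rate)")
ax.set_title("Phenotyping algorithms in ROC space")
ax.legend(loc="lower right")
plt.show()
```

Points closer to the upper left corner capture more true cases with fewer false positives; which point you prefer still depends on your goal, complexity constraints, and portability needs, as discussed above.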