[MUSIC] Hi, I'm Elizabeth Sugar, and today we're going to cover some advanced topics in clinical trials. Most of these topics require a little more knowledge of biostatistics. We're going to talk about simulations, adaptive trials, and Bayesian approaches to trials.

Simulations are an important tool for evaluating clinical trial designs. Simulation studies are a computer-intensive method to answer questions that cannot be, or are hard to be, answered theoretically. You design a simulation study as you would an experiment. You have your aims, which define the question to be asked; the procedures for performing and analyzing the simulation, where you generate the data and evaluate it; and the presentation of results, where you report your findings, preferably in a reproducible manner.

There are two primary applications in clinical trials: study design and trial monitoring. With study design, the goal of a simulation study is to determine the characteristics of a proposed trial before actually starting. This allows you to identify any problems before they affect your trial. You can look at design characteristics such as power, sample size, detectable difference, or type I and type II error. You can look at the initial estimates and also robustness to the assumptions you make about your trial design. For randomization, you can examine the expected balance you might achieve with a given randomization scheme, or you can look at how different methods impact your balance, for example, the inclusion of stratification variables or an adaptive randomization design. You can also evaluate the impact of stopping rules, including the likelihood of stopping early and the number of individuals who might be enrolled: either the full cohort, or a fraction of it if you stopped early. This is one of the best applications of simulation studies, because stopping rules are often mechanistic, so they're hard to evaluate without simulations. You can examine the impact of deviations from the protocol, for example, crossovers, noncompliance, loss to follow-up, or missing data. And finally, you can also do one of the more common applications, which is statistical evaluation: looking at the bias, variability, or coverage probability of your intervals, as well as robustness to distributional assumptions, for example, determining the impact of non-normal data when using a linear model.

We look at different factors for study monitoring. Here the aim is to evaluate the characteristics of the trial, during or at the end of the trial. You can use simulation studies to determine whether or not to continue: what is the probability of a significant result from your trial? This can be evaluated, for example, with conditional power or Bayesian posterior probabilities. You can also project what your late outcomes might be based on earlier outcomes. For example, if your primary outcome is the one-year change in visual acuity, you can see how the six-month change in visual acuity looks to try to make a guess at what the one-year outcomes might be. Both during and at the end of the trial, you can look at estimation and inference: What is the impact of deviations from the protocol? How robust am I to modeling assumptions? And what is the effect of missing data?

Let's look at the steps for a simulation. First, you must specify the structure. This includes both the trial design as well as the data structure and distribution. The next step is to actually generate your data. The computer randomly generates the data for a simulated trial as if those were the actual participants. This can include the predictors and the outcome variables. The third step is to run the trial. Pretend that your generated data are real participant data and follow the steps that you would during the trial. This includes evaluating stopping rules and staged recruitment. Finally, you evaluate your characteristics. In this case, you determine the "outcome" of the trial: Do I have a significant result? Did I stop early? Then you repeat this process many times, generating multiple realizations of your trial. After you've repeated it enough times (and we'll discuss how many times is enough), you summarize your results, for example, the percent of the time that you had a significant result or stopped early.
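To make that generate, run, evaluate, repeat loop concrete, here is a minimal sketch in Python of a simulation-based power calculation for a two-arm trial. The effect size, standard deviation, significance level, and number of repetitions are illustrative placeholders, not values from the lecture.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def one_trial(n_per_arm=86, effect=0.5, sd=1.0, alpha=0.05):
    # Steps 1-2: specify the structure and generate the data.
    # Outcomes are normal with an assumed (illustrative) treatment effect.
    control = rng.normal(0.0, sd, n_per_arm)
    treated = rng.normal(effect, sd, n_per_arm)
    # Step 3: "run" the trial -- here, just the final analysis.
    _, p_value = stats.ttest_ind(treated, control)
    # Step 4: evaluate the "outcome": did we get a significant result?
    return p_value < alpha

# Repeat many times and summarize: the fraction of significant results
# estimates the power of this design under the assumed truth.
n_reps = 5000
results = [one_trial() for _ in range(n_reps)]
print(f"Estimated power: {np.mean(results):.3f}")
```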
How do we generate the data? Well, we assume that we know the underlying truth. So what are we assuming we know? We know the outcome variables, the predictor variables, and their distributions, as well as the structure: the parameters in your model, such as the effect size or the correlation between measurements. There are many sources to use when determining what the data should be, such as existing data from other studies and biological knowledge, and it's always important to consider a range of possibilities. When we're creating data, we will either create the entire data set, which is needed for pre-trial evaluation, or partial data sets, which are used when you're evaluating during or after the trial. These partial data sets may include additional follow-up or missing data. Again, it's very important to consider a wide range of possibilities, not just a single case, because we are assuming that we know what the truth is, but we don't actually have any concrete evidence to support our assumptions.

Let's walk through an example of how simulation studies can be used. In this case, we're evaluating a toxicity monitoring plan. The pancreatic cancer radiotherapy study was designed to randomize 172 participants to receive one of two treatments: modified FOLFIRINOX, or stereotactic body radiation therapy plus FOLFIRINOX. Eighty-six participants would be assigned to each treatment arm. We are concerned about the potential for grade three and four toxicities within the first six months following the radiation therapy. Therefore, we want to enroll ten participants to gain an idea of what the toxicities might be, and then monitor continuously thereafter. To provide guidance, we set two thresholds: the maximum allowed toxicity of 15%, which we do not want to go above, and the expected toxicity level of 12%. The 12% level was based on data from prior studies that showed roughly 8% of patients had grade three or four toxicities within six months. However, we chose to be conservative and increased this rate to account for the potential influence of the FOLFIRINOX, and the fact that multiple sites would be recruiting participants, not just one. Our rule was that we would reevaluate the safety of the combination therapy if the chance that the toxicity exceeded the maximum allowed level of 15% was greater than 0.65.

So what does this actually mean? Well, these are our toxicity stopping guidelines. The table shows the number of toxicities that we would have to observe in order to reevaluate the safety of the combination therapy, for different cohort sizes. For example, after 10 to 14 patients, if we had observed five toxicities, we would reevaluate. Similarly, we would reevaluate after six toxicities occurred in 15 to 20 participants. And as the cohort size increases, the number of toxicities that we would have to observe before reevaluating also increases.

Let's look at the steps in the simulation used to evaluate this toxicity rule. Our structure sets the maximum toxicity at 15%, the sample size at 86, our stopping threshold at 0.65, and the model that we will use for the true toxicity, which is a Bernoulli(p) distribution, basically a coin toss. To generate the data, we generate a toxicity, yes or no, for 86 patients using that Bernoulli distribution with proportion p. We chose a range of different values for the proportion to evaluate this rule under different circumstances. To run the trial, we look at our 86 fake patients. First, we look at the first ten participants: is the number of toxicities below the threshold? If it's not, the trial would stop based on our stopping rules; we would conclude that we stopped early, and note that we only enrolled ten participants. If the number of toxicities after the first ten was below the threshold, we would look at the next participant and repeat the question: is the toxicity level below our threshold? We would continue adding participants until we either reached the stopping threshold or recruited all 86. At the end of the day, we would record, for that one instance of the simulation, whether we had stopped early and how many participants had been recruited. Now, looking at one simulated trial is not enough. Instead, we repeated this 5,000 times. Typically, we repeat simulations many thousands of times in order to get accurate precision on our estimates. So we would do this process of 5,000 repetitions for each different underlying true probability of having an adverse event.
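Here is a rough sketch of that monitoring simulation in Python. One loud caveat: the lecture does not state the prior distribution behind the 0.65 posterior-probability rule, so a uniform Beta(1, 1) prior is assumed below purely for illustration. The study's actual boundaries (such as five toxicities in 10 to 14 patients) imply a different, more informative prior, so the stopping probabilities this sketch produces will not match the lecture's table.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

MAX_TOX = 0.15               # maximum allowed toxicity rate
THRESHOLD = 0.65             # reevaluate if P(p > MAX_TOX | data) exceeds this
N = 86                       # participants in the combination-therapy arm
PRIOR_A, PRIOR_B = 1.0, 1.0  # assumed Beta prior -- see caveat above

def should_stop(tox, n):
    # Posterior for p is Beta(a + tox, b + n - tox);
    # sf(MAX_TOX) gives the posterior probability that p > MAX_TOX.
    return stats.beta.sf(MAX_TOX, PRIOR_A + tox, PRIOR_B + n - tox) > THRESHOLD

def one_trial(true_p):
    # Generate a yes/no toxicity for all 86 patients: Bernoulli(p) coin tosses.
    outcomes = rng.random(N) < true_p
    # First look after the initial ten participants...
    tox = outcomes[:10].sum()
    if should_stop(tox, 10):
        return True, 10
    # ...then monitor continuously, one participant at a time.
    for n in range(11, N + 1):
        tox += outcomes[n - 1]
        if should_stop(tox, n):
            return True, n
    return False, N

# Repeat 5,000 times for each assumed true toxicity rate and summarize
# how often we stopped early and how many participants were enrolled.
for true_p in [0.08, 0.12, 0.15, 0.20, 0.25, 0.30]:
    sims = [one_trial(true_p) for _ in range(5000)]
    stop_rate = np.mean([stopped for stopped, _ in sims])
    mean_n = np.mean([n for _, n in sims])
    print(f"true p = {true_p:.2f}: stopped early {stop_rate:.1%}, "
          f"mean enrollment {mean_n:.1f}")
```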
So let's look at the results of this simulation. The table on the left shows, for each assumed level of true toxicity, the probability of stopping the trial early. You'll notice that for the expected level of toxicity, between 8% and 12%, we would stop only between 1.4% and 11.3% of the time, so not very often, which is what we want to happen. We do not want to stop early if we are within our safe range. In contrast, when we exceed the maximum allowed toxicity, for example, in those scenarios with 20%, 25%, or 30% of participants having a toxicity, we are quite likely to stop early. Depending upon how high the level of toxicity is, we would stop between 74% and 99% of the time, which is again what we want to happen. The more unsafe the combination therapy, the more likely we want to be to stop. So this simulation showed us that our stopping rules had good characteristics and behaved as we would like. That serves as an introduction to simulation studies. [MUSIC]