[MUSIC] Welcome to the lecture on outcomes in the clinical trials class. This is Janet Holbrook again. So we are going to start with some definitions and some categorizations of different types of outcomes. The first thing that I want to define is that I am using the term outcome, but that term is synonymous with the term endpoint, which is more commonly used in the literature than outcome. The reason that I use outcome is that in my training it was stressed that you always continue to follow patients for the planned duration of the patient's follow-up, and that may be longer than when they have an endpoint. So an outcome doesn't imply that you're going to stop follow-up. So, in our group, we encourage people to use the term outcome. Now, an outcome is usually a quantitative measure that somehow allows us to evaluate a treatment. It puts some metric on how effective a treatment is, and so usually clinical notes or people's subjective opinions are not very useful for this purpose. It's hard to do statistical analysis on those types of things. So usually, we try to use the definition of the outcome to make what may be a qualitative thing a quantitative thing. It's important that an outcome reflects the objectives of the trial. You may choose different outcomes based on whether you're looking at an efficacy versus an effectiveness trial, or whether your primary motivation is to evaluate the safety of a therapy or different therapies. Some clinical trials evaluate the process of treatment, or the process of an intervention in a broader sense, or the cost. Usually outcomes are determined in each study participant, or we would say in each unit of randomization. And the objectives of the trial are met by aggregating those outcomes across participants or randomization units by treatment.
So there's typically a hierarchy of outcomes in a clinical trial, because once you start to put together all of the resources needed to conduct a clinical trial, you're probably going to want to measure more than one thing. So there's a primary outcome that should reflect the overall objectives of the trial and whatever is stated as the primary hypothesis, and we'll go through some examples in a few slides. And that's the design variable, and what I mean by the design variable is that it's the variable on which the sample size is calculated, usually. And it is related to the stage or type of research you're doing. What I mean by that is that in preliminary phase one studies, the outcome may be safety related, such as any adverse events. In later stages of research the outcome may be a blood level, such as cholesterol, and then in another stage it may actually be a clinical outcome. So it depends on the type of research that's being done. Secondary outcomes are measured to evaluate other potentially important treatment effects. Certainly they could be adverse events or toxic effects of the treatment, and usually if you want to do that well in the context of a trial and you have some sense of an agent's toxicity, you would try to make sure you had appropriate ways to measure that. For example, if you're doing a trial of a steroid regimen versus a steroid-sparing regimen, you probably want to build into the trial some measures of steroid toxicity or other toxicities of the new treatment, so that you can really evaluate the treatments in terms of safety. You may also be evaluating the mechanism of effect. This is very common, I think; even in large-scale trials funded through NIH there is usually some effort to incorporate some mechanistic hypotheses into the trials as ancillary studies or sub-studies, and these may be quite well-defined secondary outcomes.
And having other outcomes besides the primary outcome allows for a more complete evaluation of the treatment, for example the risk/benefit ratio or the cost effectiveness of a treatment. And beyond the primary and secondary outcomes, there's a third layer of outcomes that may include data related to patient health or study participation, where you want to sort of measure the process of the trial: their compliance with the drug treatment, drug levels. And you may have some exploratory hypotheses that you collect data on without a lot of prior research, but with some thoughts as to why this outcome may be influenced by the treatments. But this is just an exploratory, preliminary look at that. So for that most important outcome, the primary outcome, what are the criteria? Well certainly, you have to have defined the primary outcome to a great level of detail, including what it is, how you are going to measure it, and even how you're going to analyze it, before you start the trial. It should not be based on data that are generated during the trial. Now it is possible that you'll have a primary outcome and maybe not have a complete analysis plan when you start the trial. But before you look at the data that have been collected, you want to have things clearly defined so that you won't be influenced by the data in inappropriate ways. It should be a relevant outcome and likely to be influenced by the treatments. You have to be able to measure it accurately and reliably in all study participants. So you don't want to take an unvalidated, new, cutting-edge technology to use as an outcome in a clinical trial. It has to be a proven outcome that you're able to evaluate in everyone. The assessment should be able to be done independent of treatment assignment, meaning that you can measure it in both treatment groups and through the same procedures. And you may ask, when would that not be true? Well, think of a trial of a surgery versus a systemic treatment.
Your primary outcome shouldn't be something very specific to the surgery that's not going to happen in the other group. And then you have to think about the power considerations, and by power I'm really referring to the size of the clinical trial; the primary outcome really is a key component in driving that. And so the nature of the outcome, its variability, its standard deviation, the frequency that you would expect to observe, and what you think is a likely difference between the treatment groups are all important for the power calculations. So, some examples. If you have a trial of asthma treatment, and the objective was to evaluate asthma control, well, there are a number of possible outcomes you could have. You could have something like exhaled nitric oxide, which may be a measure of inflammation, and you might think that if that was lower, the treatment was a good treatment. You could have lung function, such as FEV1 or peak flow, where people actually get on a spirometer, a machine that you breathe into, to measure how well their lungs are working. Or you may have something more related to how a person is actually feeling: their symptoms. Are they wheezing, do they have night awakenings? And someone with poor lung function might not have many symptoms, so symptoms are sort of a higher-level outcome. And also, commonly, at least in asthma trials, there may be composite measures you use. That's something like an exacerbation, a lack of control of asthma, which may be some combination of a drop in lung function, increasing symptoms, night awakenings, those kinds of symptoms, and the need for medical care, hospitalizations, or ER visits. Or sometimes they use symptom indices, which are scores that again blend together a few different facets of asthma control, like symptoms and medication use.
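As a rough illustration of how the outcome's standard deviation, its expected frequency, and the anticipated treatment difference drive the size of a trial, here is a minimal Python sketch using the standard normal-approximation formulas for comparing two means and two proportions. The function names, the two-sided alpha of 0.05, the power of 0.80, and the example numbers are all illustrative assumptions, not values from the lecture, and a real trial would rely on a statistician's full calculation.

```python
import math
from statistics import NormalDist


def _z_sum(alpha: float, power: float) -> float:
    """Return z_(1 - alpha/2) + z_(power) for a two-sided test."""
    z = NormalDist().inv_cdf
    return z(1 - alpha / 2) + z(power)


def n_per_group_means(sd: float, delta: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants per arm to detect a difference `delta`
    between two means, assuming a common standard deviation `sd`."""
    return math.ceil(2 * (_z_sum(alpha, power) * sd / delta) ** 2)


def n_per_group_proportions(p1: float, p2: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants per arm to detect event proportions
    p1 versus p2 (unpooled-variance normal approximation)."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(_z_sum(alpha, power) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical inputs: a half-standard-deviation difference in a continuous
# outcome, and a 30% versus 20% event frequency in a dichotomous outcome.
print(n_per_group_means(sd=1.0, delta=0.5))
print(n_per_group_proportions(p1=0.30, p2=0.20))
```

In both formulas the required size grows with the inverse square of the difference you want to detect, which is why a smaller expected difference pushes the trial size up so quickly.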
Another example, just to think about, is if you were evaluating a perioperative procedure where the objective was to reduce perioperative morbidity, so maybe it's a different kind of suture or different care for the wound. If you were going to be defining this, what are the considerations for the outcome? Well, first the time window. How long is postoperative? When are you going to consider something related to the operation, versus no longer likely to be affected by it? And what are the specific events that you would consider an outcome? What type of adverse events are you looking for? And what are the procedures you're going to use to establish the outcome? So are you going to be interviewing patients? Are you going to just be going back to the surgeon's office and looking at the follow-up records? And then the specific metric. Is it going to be any adverse event, and are you going to evaluate the severity, if the goal is to reduce morbidity? Does it have to be a defined infection that can be assigned to a particular organism, or is a fever of unknown origin going to be considered a perioperative adverse event? So those were just two examples to put some concreteness around what we've been talking about. So what are the metrics used for evaluating outcomes when you have events as outcomes? What I mean by an event is something like an adverse event, a death, or a heart attack. There are a few different ways of evaluating an event. For the most part, events either happened or didn't. So it's the presence or absence of something, or whether it's normal or abnormal. Did the person die or didn't they? Sometimes you can transform data that are continuous by nature, something like hemoglobin levels, into an event by asking whether they had anemia or not, so just by instituting a cutoff value.
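To make the idea of instituting a cutoff concrete, here is a minimal Python sketch that turns a continuous hemoglobin measurement into a yes/no anemia event. The threshold of 12.0 g/dL, the function name, and the sample values are purely illustrative assumptions; a real trial would fix the cutoff in the protocol before looking at any data.

```python
# Hypothetical threshold in g/dL, for illustration only.
ANEMIA_CUTOFF_G_DL = 12.0


def has_anemia(hemoglobin_g_dl: float,
               cutoff: float = ANEMIA_CUTOFF_G_DL) -> bool:
    """Dichotomize a continuous hemoglobin value into an event."""
    return hemoglobin_g_dl < cutoff


# Made-up hemoglobin values, one per participant.
hemoglobin = [13.5, 11.2, 12.0, 9.8]

# One True/False event per participant.
events = [has_anemia(h) for h in hemoglobin]

# Proportion of participants who had the event.
proportion = sum(events) / len(events)
print(events, proportion)
```

Once the measurement is reduced to an event like this, it can be analyzed with any of the event-based metrics, at the cost of discarding how far above or below the cutoff each person was.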
And if you just look at whether an event occurred or didn't occur, you would consider that a dichotomous outcome. For instance, the defined outcome could be: did a person have a remission at six months, or not? So you usually want to measure something at a specific time. But without relation to time in another sense; it's just, did it occur or not? You can take that one step farther and add the dimension of time to a dichotomous outcome, and that's usually a time-to-event analysis or survival analysis. That usually will be more powerful for detecting treatment differences than just a straight dichotomous outcome. And it also allows you to account for people in whom you didn't observe the outcome, who are censored. So that would be sort of the second level of looking at events as outcomes. And actually, survival outcomes are very typical in clinical trials. Or you could look at one event, but allow for repeats. So you could compare rates of events between treatment groups. It's still a one/zero event, did they have an asthma exacerbation or not, but you follow them for a period of time and a particular patient could have more than one event. Or it could be a fever or any number of types of events. In that situation, you need to have the follow-up time, because if you're going to calculate a rate it's usually in events per person-year. So you have to be sure to keep track of how long they're being followed, before and after these events. And you tend to analyze the rates of events. One important caveat in this type of analysis is that events within people, or within randomization units, usually may not be independent. So they may not follow the strict rules for analysis, and you need to account for that non-independence in the analysis; there are plenty of procedures to do that. And finally, you can look at a number of events and say that if one of these happened, then the outcome was achieved.
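The events-per-person-year calculation described above can be sketched in Python as follows. The data layout (one record per participant holding the treatment arm, the event count, and the follow-up time in years), the arm labels, and all numbers are invented for illustration. This crude rate does not by itself handle the non-independence of repeated events within a person; that needs one of the dedicated procedures mentioned above.

```python
from collections import defaultdict

# (treatment arm, number of events, years of follow-up); made-up data.
records = [
    ("A", 2, 1.0),
    ("A", 0, 0.5),
    ("A", 1, 1.5),
    ("B", 3, 1.0),
    ("B", 1, 1.0),
]


def event_rates(records):
    """Return events per person-year for each treatment arm."""
    events = defaultdict(int)
    person_years = defaultdict(float)
    for arm, n_events, years in records:
        events[arm] += n_events
        person_years[arm] += years
    return {arm: events[arm] / person_years[arm] for arm in events}


rates = event_rates(records)
print(rates)
```

Note that the participant followed for only half a year still contributes to the denominator, which is why the follow-up time has to be recorded for everyone, including people who never have an event.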
I gave you an example of such a composite outcome in the asthma treatment before: did the person have an ER visit? Did they need an increase in their medication, or miss work? So it's two or more events that are related to the disease process or would be considered appropriate for evaluating the treatment. And in that case, you could consider that composite measure to be either a dichotomous measure, a time to event, or a rate. It could be any of those three that we've gone through before. All it's adding to the definition of the outcome is that it could be two relatively disparate things, like an increase in medication use versus a drop in lung function. It could be one or the other of those things that measure somewhat different things, but all are related to asthma. So the other type of measure that you can have is a continuous variable, something like cholesterol levels in the blood. In that case, the outcome is usually either the value itself, so you compare the cholesterol levels across the treatment groups, or the change from baseline. So you may have a baseline level on everyone, and then after you institute treatment, how did that change? There are usually some standard units for that, like for laboratory values, or some sort of continuous metric, like a score, that may have a range of reasonable units. You need to define a difference between the treatment groups that you would consider clinically important. If you're talking about a pharmacological trial, you could get a small difference between two measures and find it to be statistically significant, but that difference may have no real consequences on broader measures. One thing with continuous variables is that, again, you can measure them repeatedly over time, and use those measurements either to project a slope of how things changed over time or, if you think there'll be an initial change and then no change thereafter, to combine them.
That gives you a better estimate of what the change is. Typically continuous outcomes are more powerful than discrete outcomes, in that you need smaller numbers of people to do your clinical trial. But it really hinges on what you think the important detectable difference is between, let's say, the mean values in each group, so the mean value of cholesterol in one group versus the mean value in the other group. If you want to detect a small difference, then you'll need more people. Because you can calibrate the difference in the continuous case, where you can say you want to detect a small difference or a larger difference, you have a little more ability to influence the sample size, whereas with a dichotomous outcome, where it's yes or no, it's a little more difficult to justify differences in sample sizes. And usually, just because of the nature of the beast, those types of trials with a dichotomous outcome require more patients. But it's important, when you're looking at continuous variables, to be checking the distributional assumptions. Because if you're doing the power calculations based on a normal distribution, you have to make sure that the outcome really follows that. And sort of a subset of continuous outcomes is an ordinal scale, where there are categories, kind of qualitative categories, that are ranked, maybe 1 to 5 or A to D. This is commonly used in adverse event grading, where it's grade one to grade five. But the differences between these categories are usually qualitative and not necessarily equal across the categories, so they're summarized in this ordinal ranking scale that creates some order in them. It isn't necessarily the case that a two is twice as bad as a one, and a four is four times as bad as a one. That relationship is not necessarily linear in that sense. But they can sometimes be used in clinical trials, and they can allow for grading between, you know, complete remission, partial remission, no remission.
And so they allow for a little more flexibility in the outcome. And finally, in this section, I just want to briefly go over objective versus subjective outcomes. Really it's a spectrum, from a very objective outcome to a subjective one. And there are very few outcomes that are absolutely objective, that can't be influenced by the observer or the person reporting the outcome. One that would be is total mortality. People are usually dead or not. The evaluator's bias can't really be introduced there. Objective outcomes tend to be clinical events or measurements that are pretty definitive, that require little or no subjective judgment. They may require some, like what is a myocardial infarct? But you can apply rigorous definitions to limit the interpretation of the grader of the outcome. You could have infections with confirmed cultures, where they were culture positive as well. And so you can make things that are pretty well standardized, and as long as you have a very clear protocol for how they'll be evaluated, you'll encourage it to be an objective outcome. Subjective ones are ones that rely more on judgment and a kind of gestalt that can't really be quantified quite as well. One is the Karnofsky score, which, if you're not familiar with it, is just a score from 0 to 100, in steps of 10, that tries to incorporate how patients are doing with daily living skills; 0 is dead, and 100 is you're having no problems. But even when you get to evaluations of slides, or tests like CT scans, there's the question of what bias is brought by the evaluator. And so, if you have a more subjective outcome, that makes masking more important, so that whoever's evaluating the outcome won't be influenced by knowledge of the treatment assignment. Ways to enhance accuracy and objectivity are to clearly define criteria for evaluation.
Make sure you are using a validated measure, or do some pre-testing of it, and train the people who are going to be evaluating the outcome, to make sure they're all using the same criteria the same way. So you can have test cases, to see what the agreement is across evaluators and against standards. And these people may, you know, be operating at different clinics. So it's not like they're going to evaluate the same patients over the course of the trial. So what you're trying to do is to have them evaluate the same cases to train them. You can have a panel of assessors, to have more than one person do the assessment, and have them operate independently, and if they come back with different results then you would have some method for adjudication. And you certainly want ongoing quality assurance throughout the trial to make sure there isn't a drift in how things are evaluated. And finally, just a comment on patients' opinions, which are generally subjective but are increasingly important in evaluating treatments: how someone's doing, their health status or their change in status. You can imagine that for these types of outcomes, where you have to rely on the patient's assessment of how they're feeling or what has happened to them over the past interval, masking is even more important, so that they are not influenced by their knowledge of the treatment assignment. And one thing that is important to recognize is the effect of being studied. Once a patient is in a trial and has some more attention focused on them, there's usually a positive effect of that. Paying attention to people in a positive way, which we would hope would be true in a clinical trial, will usually have a positive effect on people; they may be feeling better because you've taken the time to ask them how they're feeling. These types of data are difficult to quantify.
They typically use scales like, you know, how are you feeling on a scale of 1 to 10, but it can be a challenge to get really accurate data. So that finishes up this section, which was a little long, but was just meant to introduce you to some concepts about outcomes, and in the next section we'll talk more about how they influence the design of the trial.