Let's look at some more examples related to the idea of confounding and discuss them in the context of what we've done in this lecture set. The first example I'm going to pull up is something you looked at in the first term as well: a study from JAMA in 2012 looking at gender differences in the salaries of physician researchers. Here, by gender they mean biological sex, not gender identity. What they did was take a representative sample of US academic physicians and collect characteristics on them, including their biological sex and their annual salary.

The general results show that the mean annual salary within the cohort was $167,669 for women, with a confidence interval that goes from 158,000 to 177,000, and $200,433 for men, with a confidence interval that goes from 194,000 to 206,600. So you can see that the estimated difference is on the order of $33,000 per year more for men than for women, and the confidence intervals do not overlap. But this comparison is unadjusted; it's the crude association. It is certainly possible that other characteristics differ between men and women that are also related to salary. Those differences may themselves be a problem, to be fair, but in terms of estimating the difference between comparable physicians in the sample we have, it's important to take them into account.

So what they did was go ahead and estimate the adjusted association between salary and biological sex, adjusting for specialty, academic rank, leadership positions, publications, and research time. What they found is that, after they did this, the difference between men and women who were comparable in terms of these other things attenuated to on the order of $13,400, which is certainly less than the unadjusted average difference of on the order of $33,000.
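To make the idea of attenuation after adjustment concrete, here is a minimal sketch using made-up numbers. The counts and salaries below are hypothetical and are not taken from the JAMA study, and the adjustment shown is simple stratification on one factor (a weighted average of within-specialty differences), not the regression adjustment the authors used.

```python
# Hypothetical data (salaries in $1000s); NOT from the JAMA study.
# Each tuple: (sex, specialty, number of physicians, mean salary)
groups = [
    ("M", "high_paying", 70, 210.0),
    ("M", "low_paying", 30, 160.0),
    ("F", "high_paying", 30, 200.0),
    ("F", "low_paying", 70, 150.0),
]


def crude_mean(sex):
    """Overall mean salary for one sex, ignoring specialty."""
    rows = [(n, mean) for s, _, n, mean in groups if s == sex]
    return sum(n * mean for n, mean in rows) / sum(n for n, _ in rows)


def adjusted_difference():
    """Average of within-specialty differences (men minus women),
    weighted by the number of physicians in each specialty."""
    specialties = sorted({sp for _, sp, _, _ in groups})
    weighted_sum = 0.0
    total = 0
    for sp in specialties:
        by_sex = {s: (n, mean) for s, sp2, n, mean in groups if sp2 == sp}
        size = by_sex["M"][0] + by_sex["F"][0]
        weighted_sum += size * (by_sex["M"][1] - by_sex["F"][1])
        total += size
    return weighted_sum / total


crude_diff = crude_mean("M") - crude_mean("F")
adj_diff = adjusted_difference()
print(f"crude difference:    {crude_diff:.1f}")
print(f"adjusted difference: {adj_diff:.1f}")
```

In this toy data, men are concentrated in the higher-paying specialty, so the crude difference (30) mixes the sex difference with the specialty difference. Comparing men and women within the same specialty leaves a smaller gap (10), which is the same pattern as the attenuation described above.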
Nevertheless, this adjusted difference of roughly $13,400 is still statistically significant and large. So even after accounting for differences between men and women that may be related to salary, there was still a gap: a sizable gap, albeit not as large as what was originally shown in the unadjusted comparison. What they used, essentially, was a method called multiple linear regression, and we'll get to that in the next section. But basically, they first fit a simple linear regression model where the outcome was annual salary and the only predictor was sex. The slope for sex, when coded as one for males and zero for females, was that $32,764 unadjusted mean difference in salaries between men and women. When they fit a model that adjusted for these other things, the resulting slope for sex was $13,399, where sex was again coded as one for males and zero for females.

Let's look at another example and think about what we can expect in terms of confounding here. This is the primary biliary cirrhosis trial data, a randomized study where patients with primary biliary cirrhosis were randomized to receive either the drug D-penicillamine, abbreviated as DPCA, or a placebo. We've looked at this many times. The incidence rate ratio of death for these patients over the 12-plus year follow-up period, for the drug group compared to the placebo group, was 1.06, with a confidence interval that spanned from 0.75 to 1.5. So, ostensibly, it looked like there was no benefit of the drug. We would already have a heads-up that the drug would not be a good thing, because even if the result were statistically different from placebo, it would not be in the direction we were hoping for: it would imply a higher risk of death. Ultimately, though, the result wasn't statistically significant, not that significance would matter much when the estimated association showed more deaths in the drug group. But what we might want to ask is: are we comparing comparable people in these two groups?
You may recall, though, that the 312 patients in the study were randomized to either the DPCA (drug) group or the placebo group. In a moment we're going to present the adjusted incidence rate ratio, adjusted for both sex and baseline bilirubin levels of the subjects in the study, although we could adjust for more if we wanted to. Let me ask you this: how do you expect this adjusted incidence rate ratio to compare in value to the unadjusted estimate of 1.06 from the previous slide? You may want to pause here and think about this.

Let's think about what the necessary conditions are for either sex or baseline bilirubin, or both, to be confounders of the association between death and treatment. In order to be a confounder, each of these would have to be related to both death, the outcome of interest, and treatment. So sex would have to be related to both death and treatment. What would that mean? Maybe males have a higher risk of death, and maybe females are more likely to get DPCA than males, so that there is a disproportionate number of females in the drug group compared to the placebo group. Similarly, baseline bilirubin levels would have to be related to both death and treatment. I know from other analyses that higher baseline bilirubin levels are related to a higher risk of death. But what would it mean if baseline bilirubin levels were related to treatment? That might mean, for example, that the average baseline bilirubin level in the DPCA group is systematically lower than that in the placebo group.

So let's think about how patients were assigned to the treatment groups. They were randomized. What should that mean about the links between each of these potential confounders and the exposure, treatment? Well, what randomization does is minimize the chances that these things are linked.
So it essentially removes this piece from the confounding triangle we've drawn. Because of that, we no longer have the necessary conditions for confounding: if the randomization worked, each of these potential confounders is not likely to be associated with the exposure of interest. By eliminating, through randomization, the link between each of these potential confounders and the exposure of interest, we minimize the chance that the overall relationship between death and treatment is confounded by either sex or bilirubin or both.

So let's see what happens when we compute the incidence rate ratio adjusted for these things. It turns out to be 1.01, so very similar in value to the unadjusted ratio of 1.06, with a similar confidence interval. We could have anticipated that, like I said, because the study was randomized. If persons had been allowed to self-select into the drug group or the placebo group and, for example, those who were sicker based on their bilirubin levels were more likely to self-enroll in the drug group than the placebo group, then the original association we saw might differ after adjustment for bilirubin levels. But randomization should minimize the potential for confounding, and we're seeing very little evidence of confounding, as well we should, when we compare the adjusted and unadjusted associations.
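The contrast between randomized and self-selected treatment can be sketched with a small simulation. Everything here is an assumption made for illustration: the probabilities are invented, the data are simulated rather than taken from the trial, the drug is given no true effect on death, and the measure is a risk ratio with a Mantel-Haenszel adjustment for a single binary factor (high versus low baseline bilirubin) rather than the adjusted incidence rate ratio from the lecture.

```python
import random


def simulate(n, p_drug_if_high, p_drug_if_low, seed):
    """Simulate n patients; return counts[stratum][arm] = [deaths, total].
    Stratum 1 = high baseline bilirubin, arm 1 = drug. The drug has no
    true effect: risk of death depends only on bilirubin (assumed values)."""
    rng = random.Random(seed)
    counts = {s: {a: [0, 0] for a in (0, 1)} for s in (0, 1)}
    for _ in range(n):
        high = 1 if rng.random() < 0.3 else 0
        p_drug = p_drug_if_high if high else p_drug_if_low
        drug = 1 if rng.random() < p_drug else 0
        p_death = 0.6 if high else 0.2
        died = 1 if rng.random() < p_death else 0
        counts[high][drug][0] += died
        counts[high][drug][1] += 1
    return counts


def crude_rr(counts):
    """Risk ratio (drug vs placebo), ignoring bilirubin."""
    d1 = sum(counts[s][1][0] for s in counts)
    n1 = sum(counts[s][1][1] for s in counts)
    d0 = sum(counts[s][0][0] for s in counts)
    n0 = sum(counts[s][0][1] for s in counts)
    return (d1 / n1) / (d0 / n0)


def mh_rr(counts):
    """Mantel-Haenszel risk ratio, adjusted for bilirubin stratum."""
    num = den = 0.0
    for s in counts:
        a, n1 = counts[s][1]  # deaths and total, drug arm
        c, n0 = counts[s][0]  # deaths and total, placebo arm
        total = n1 + n0
        num += a * n0 / total
        den += c * n1 / total
    return num / den


# Randomized: chance of getting the drug does not depend on bilirubin.
randomized = simulate(200_000, 0.5, 0.5, seed=1)
# Self-selected: sicker (high-bilirubin) patients choose the drug more often.
self_selected = simulate(200_000, 0.8, 0.3, seed=1)

for name, c in [("randomized", randomized), ("self-selected", self_selected)]:
    print(f"{name}: crude RR = {crude_rr(c):.2f}, adjusted RR = {mh_rr(c):.2f}")
```

Under randomization the crude and adjusted ratios should both sit close to 1, because bilirubin is not linked to treatment. Under self-selection the crude ratio is pushed well above 1 even though the drug does nothing, and adjusting for bilirubin brings it back toward 1.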