In this lesson, we're going to reframe the problem of mediation using potential outcomes. Following Holland and Robins and Greenland, we define several estimands of possible interest. As in the previous lesson, we assume treatment assignment Z is binary, but the results can be generalized to the case where it's not. At this point, we won't place any restrictions on the mediator M or the outcome Y. It'll prove convenient, even if it's redundant, to rewrite the potential outcomes Y_i(0) and Y_i(1) as Y_i(0, M_i(0)) and Y_i(1, M_i(1)) respectively, where M_i(0) is the value of the mediator that unit i would take when not assigned to treatment and M_i(1) the value if assigned to treatment. We then define the unit effects of Z on M and Y as M_i(1) minus M_i(0) and Y_i(1, M_i(1)) minus Y_i(0, M_i(0)) respectively. It's a lot to say.

The unit effect of Z on Y is sometimes called the total unit effect. It compares the outcome when unit i is treated and responds to treatment with the intermediate outcome value M_i(1) with the outcome when i is not treated and responds with the intermediate outcome M_i(0). The total unit effect can be viewed as operating through two channels. One, treatment may affect M, that is, M_i(0) is not equal to M_i(1), and this may in turn affect the outcome Y. Two, even if there is no effect on the mediator, that doesn't imply that the outcomes will be the same under treatment and control, because Z may affect Y directly, meaning through channels other than M.

In order to separate these two channels, we'll consider a hypothetical study with potential outcomes Y_i(z, m). That's new: z can be zero or one, control or treatment, and m can take any value in the range of the random variable M. With this new potential outcome, we can define several kinds of new effects. First, we can define unit effects, where we compare Y under z and m with Y under z star and m star, that is, Y_i(z, m) minus Y_i(z star, m star).
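For reference, the quantities defined so far can be written out as follows. This is only a transcription of the verbal definitions into LaTeX (assuming the `amsmath` package for `align*`); the text labels on the left are descriptive and not part of the lecture's notation.

```latex
\begin{align*}
\text{unit effect of } Z \text{ on } M:\quad
  & M_i(1) - M_i(0) \\
\text{total unit effect of } Z \text{ on } Y:\quad
  & Y_i\bigl(1, M_i(1)\bigr) - Y_i\bigl(0, M_i(0)\bigr) \\
\text{unit effect in the hypothetical study}:\quad
  & Y_i(z, m) - Y_i(z^*, m^*)
\end{align*}
```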
In the literature on mediation, controlled direct effects of Z on Y, Y_i(z, m) minus Y_i(z star, m), are commonly considered. But effects where z is held at a value and m is varied, Y_i(z, m) minus Y_i(z, m star), which may be called controlled direct effects of M on Y, aren't commonly considered. If I average the unit effects, that leads to an average controlled effect, so CDE, controlled direct effect, or to average controlled effects conditional on X. The conditional version just takes the expectation of those two comparisons within levels of the covariates X.

Far more attention has been given in the literature to decomposing the total effect into so-called direct and indirect components, so now you see why we rewrote Y_i(1) as Y_i(1, M_i(1)). There are two decompositions of the total unit effect. In the first, the total unit effect is decomposed into a unit total indirect effect with treatment held at Z equals one, Y_i(1, M_i(1)) minus Y_i(1, M_i(0)), so we're comparing M_i(1) and M_i(0), plus a unit pure direct effect of treatment at the mediator value M_i(0) that occurs when Z is zero, Y_i(1, M_i(0)) minus Y_i(0, M_i(0)). The middle term Y_i(1, M_i(0)) is just added and subtracted. In the second, the total unit effect is decomposed into a total direct effect, Y_i(1, M_i(1)) minus Y_i(0, M_i(1)), plus a pure indirect effect when Z is zero, Y_i(0, M_i(1)) minus Y_i(0, M_i(0)). So, those are two different decompositions of the same thing, the total effect.

The terminology above is due to Robins and Greenland, but one often simply sees reference to so-called natural direct and indirect effects, following Pearl. Pearl renamed the effects but didn't change their definitions. In the following, where it's not necessary to distinguish between pure and total direct and indirect effects, I'm simply going to refer to these as direct and indirect effects. Now, the two decompositions will generally give different values for the direct and indirect components, unless additivity, or no interaction, holds. The no-interaction condition says that the unit direct effect Y_i(1, m) minus Y_i(0, m) is the same for every value m.
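In symbols, the controlled direct effect, the two decompositions, and the no-interaction condition can be sketched as below. The slide formulas are not part of this transcript, so this is a reconstruction from the verbal definitions; it assumes `amsmath`, and the `\underbrace` labels are descriptive only.

```latex
% Controlled direct effect of Z on Y at mediator value m,
% and its average within covariate level X = x
\begin{align*}
  & Y_i(z, m) - Y_i(z^*, m), \\
  & E\bigl[\, Y_i(z, m) - Y_i(z^*, m) \mid X_i = x \,\bigr]
\end{align*}

% Two decompositions of the total unit effect
\begin{align*}
Y_i\bigl(1, M_i(1)\bigr) - Y_i\bigl(0, M_i(0)\bigr)
 &= \underbrace{Y_i\bigl(1, M_i(1)\bigr) - Y_i\bigl(1, M_i(0)\bigr)}_{\text{total indirect}}
  + \underbrace{Y_i\bigl(1, M_i(0)\bigr) - Y_i\bigl(0, M_i(0)\bigr)}_{\text{pure direct}} \\
 &= \underbrace{Y_i\bigl(1, M_i(1)\bigr) - Y_i\bigl(0, M_i(1)\bigr)}_{\text{total direct}}
  + \underbrace{Y_i\bigl(0, M_i(1)\bigr) - Y_i\bigl(0, M_i(0)\bigr)}_{\text{pure indirect}}
\end{align*}

% No interaction (additivity): the direct effect does not depend on m
\[
Y_i(1, m) - Y_i(0, m) = Y_i(1, m^*) - Y_i(0, m^*)
\quad \text{for all } m, m^* .
\]
```

Taking m = M_i(1) and m star = M_i(0) in the last condition shows that the total and pure direct effects coincide, and since both decompositions sum to the same total effect, the total and pure indirect effects coincide as well.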
So, it doesn't matter what value of M you're at. Now, Robins points out that additivity is unlikely to hold in most applications, but a special case where additivity holds is when the unit direct effect is zero at every value of m. In that case, Z is also an instrumental variable, as it affects the outcome only through M, and we'll have more on instrumental variables later.

From the decomposition that we just saw, we can easily define average direct and indirect effects. For example, using the first decomposition, we get the average indirect effect, the expectation of Y_i(1, M_i(1)) minus Y_i(1, M_i(0)), and the average direct effect, the expectation of Y_i(1, M_i(0)) minus Y_i(0, M_i(0)), or the same expectations conditional on X.

Before we take up the identification conditions for the new estimands defined above, it's important to note that if the new potential outcomes are not well-defined in the first place, it's meaningless to consider such estimands. Some dogmatic positions have been taken on this in the literature; I don't intend to take a dogmatic stance. But it seems to me there will be contexts in which it's reasonable to consider such random variables and contexts in which it's not. So, for example, consider the encouragement study in the previous lesson: subject i is assigned to receive encouragement or not, and they then choose how much time to study under each condition, M_i(0) or M_i(1). Now, in that case, to consider outcomes Y_i(z, m), the investigator must at least be able to imagine the situation in which each subject might have studied m hours. That doesn't mean that the investigator must be able to design and/or implement a study in which it's possible to assign units to receive that particular combination of z and m, although when that is possible, and there are some cases where it might be, the resolution of this issue is more clear-cut.
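To make the two decompositions concrete, here is a minimal simulation sketch in the spirit of the encouragement example. Everything numeric in it is assumed for illustration and does not come from the lesson: the distribution of study hours, the hypothetical function `potential_outcome`, and its coefficients. Because it is a simulation, both potential mediator values and all potential outcomes are visible for every unit, which is never the case with real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential mediator values: hours studied without (m0)
# and with (m1) encouragement, for each of n subjects.
m0 = rng.uniform(0.0, 4.0, size=n)
m1 = m0 + rng.uniform(0.0, 2.0, size=n)


def potential_outcome(z, m, interaction):
    """Hypothetical Y_i(z, m): a test score under assignment z and m hours.

    The coefficients are arbitrary; `interaction` controls whether the
    effect of z depends on m (zero means additivity holds).
    """
    return 50.0 + 3.0 * z + 4.0 * m + interaction * z * m


def decompose(m0, m1, interaction):
    """Return unit-level total, direct, and indirect effects as arrays."""
    def y(z, m):
        return potential_outcome(z, m, interaction)

    return {
        "total": y(1, m1) - y(0, m0),
        "tie": y(1, m1) - y(1, m0),  # total indirect effect (Z held at 1)
        "pde": y(1, m0) - y(0, m0),  # pure direct effect (M held at M_i(0))
        "tde": y(1, m1) - y(0, m1),  # total direct effect (M held at M_i(1))
        "pie": y(0, m1) - y(0, m0),  # pure indirect effect (Z held at 0)
    }


for interaction in (0.5, 0.0):
    effects = decompose(m0, m1, interaction)
    averages = {name: float(values.mean()) for name, values in effects.items()}
    print(f"interaction={interaction}: {averages}")
```

With a nonzero interaction the averages of the total and pure versions differ, while with `interaction=0.0` they agree, which is the additivity case discussed above.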
Now, that said, an investigator might argue that the experiment did not manipulate the amount of time studied, and as it is not possible to force students to study literally m hours, consideration of outcomes Y_i(z, m) is irrelevant, at least in a practical if not theoretical sense. As a second example, suppose an investigator wants to study the effect of education on earnings, as mediated through occupational choice. So, let Z_i equal zero if at age 30 the respondent has less than a college education, and let Z_i be one otherwise. Let the intermediate outcome M denote occupation at age 31, let m star denote physician, and let Y denote earnings at age 32. So clearly, the outcome Y_i(0, m star), the earnings of a physician with less than a college education, is not possible; it doesn't make sense to consider such a thing.

Average direct and indirect effects have received much more attention in the literature than the average controlled direct effects. One reason is that the average direct and indirect effects indicate how the treatment or treatment assignment operates in the population to which the units belong. Large values of the indirect effects indicate that the mediator is a significant channel through which Z operates. Small values suggest that Z doesn't substantially impact the mediator and/or that the mediator doesn't substantially impact the outcome. In the former case, an investigator might look for a treatment that more effectively targets the mediator. In the latter case, where the effect on the mediator is substantively significant, which suggests that the mediator does not substantially impact the outcome, the investigator might wish to rethink the experiment and try a different treatment aimed at a different channel. For a more extended discussion, you should see the encyclopedic book on mediation by VanderWeele.
Now, it's also been argued that the average direct and indirect effects are of more fundamental scientific significance because they indicate how the treatment operates through the mediator, in contrast to controlled direct effects, which fix the value of the mediator for all subjects. So, while this is true of the controlled direct effects of treatment, that is not the case for the controlled direct effects of the mediator. These controlled direct effects are also the building blocks of the average direct and indirect effects, and in theoretical scientific contexts, it may be of more fundamental interest to know the effect of a mediator when its values are controlled than to know the effect of a treatment through mixtures of the mediating variables. At a practical level, for outcomes that measure beneficial quantities, if it were possible to manipulate both z and m, one would want to know how the controlled effects vary with z and m, and also which values of z and m yield the maximum expected benefit.

Economists often consider variables Z that are hypothesized to affect Y only indirectly, through a mediator of interest, M. Here, while the mediator does not behave as if randomly assigned, Z may have been randomly assigned or may behave as if randomly assigned. In this case, the controlled direct effect of the mediator is of interest, but the indirect effect itself is not. Z is just a tool to ascertain the effect of M when the relationship between M and Y is confounded. So, I think there should be a lot more interest in controlled direct effects in the literature than there actually is. In the next lesson, we consider identification conditions for the effects defined in this lesson.