Welcome. In this lecture, you will learn how to transform data and incorporate nonlinearity into your models.

We start again with the model where we explain a dependent variable y with one or more explanatory variables collected in a vector x. A relevant question is: what is the most appropriate form of the data? An important consideration is that the variables should be incorporated in a compatible manner. If our y variable is a level, such as the number of unemployed individuals, it makes more sense to relate it to x variables that also capture levels, such as the level of production. Similarly, if our y variable is a growth rate, then it makes most sense to relate it to an x variable that is also a growth rate. It makes less sense to explain the growth rate of unemployment with the level of production. If the variables are not similar in nature, one should consider transforming the data.

We discuss two very common transformations. The first transformation is taking the logarithm of a series. This is a sensible transformation when there is exponential growth. In the case of exponential growth, as is commonly found in the levels of macroeconomic and financial quantities, the properties of the series are not stable. The logarithmic transformation then restores stability, in the sense that the explosive behavior is removed.

The second transformation is taking the difference of a variable relative to its previous observed value. This transformation makes most sense when the data capture observations of a variable at different points in time and are thus ordered. Sometimes such a data set, often referred to as a time series data set, shows a trend. Such a trending pattern may affect the stability properties of the series, which causes statistical assumptions to fail. Fortunately, stability is often easily restored by taking the difference. Mathematically, we define the difference delta(y(i)) to be the level of y at observation i minus the level of y at observation i-1.

Now I invite you to study this transformation a bit further and derive the first difference of the series y(i) = i. In this case, the variable y(i) grows by 1 at every observation. Intuitively, this already gives the answer: the difference of the series should be equal to 1. We can also show this mathematically. The first difference is y(i) minus y(i-1), which in this case is i - (i-1), and thus 1.
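To make these two transformations concrete, here is a minimal sketch in Python using pandas. The series is made up purely for illustration; only the calls to log and diff matter.

```python
import numpy as np
import pandas as pd

# Hypothetical exponentially growing level series (about 2% growth per period).
rng = np.random.default_rng(0)
level = pd.Series(100 * np.exp(0.02 * np.arange(50) + rng.normal(0, 0.01, 50)))

log_level = np.log(level)    # log transform: removes the explosive behavior
growth = log_level.diff()    # first difference of the log: roughly the growth rate

# The derivation above: the first difference of y(i) = i equals 1 everywhere.
y = pd.Series(np.arange(10), dtype=float)
print(y.diff().dropna())     # every entry is 1.0
```

Note that differencing the log of a series gives approximately its relative growth rate, which is why the two transformations are so often combined.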
So far, we have considered nonlinear transformations of the variables. Let us study nonlinearity a bit further. At the top here is the usual setting, where the dependence of y on a constant and k-1 other explanatory variables is written out separately. The marginal effects are constant and simply equal to the beta parameters. We can extend this setting to get nonlinear effects. For example, we can consider the squares of the explanatory variables. We can also consider cross products of the explanatory variables, which we often refer to as interaction terms. Taking both together in our usual linear model, we get a set-up such as in the middle of the slide.

There are two reasons to consider this structure. First, it allows for a nonlinear functional form, here quadratic. We can extend this further by adding cubic or even higher-order terms, which allows for very rich nonlinear relationships. The nice thing is that for all such variations the relationship from x to y is nonlinear, but the setup remains linear in the unknown parameters beta. Taking the square of a series, or the cross product of two series, does not depend on the parameters, and these terms enter linearly. Thus, ordinary least squares can still be used.

A second reason for such a set-up is that, even though the structure itself may seem somewhat contrived, it can actually provide a meaningful economic specification. As an example, let us go back to the second series of lectures, where attention was paid to wage regressions. One of the specifications considered there is repeated here, where log(Wage) is explained by a constant, a dummy for whether the i-th individual is female, age, education level, and a dummy for part-time work. We extend this model with quadratic and interaction terms. In this specification, there is an interaction term for the gender dummy and the education level, and a quadratic term for age.

This small extension allows for two extra effects. First, because of the new interaction term, the partial wage differential is allowed to depend on education. The gender effect is now beta2 plus gamma1 times the education level. This allows for the possibility that the wage differential relative to men is different for higher-educated women than for lower-educated women. In fact, in this setting such a hypothesis can simply be tested by studying the significance of gamma1. Second, the squared term allows for a nonlinear effect of age. The effect of an increase in age is beta3 plus two times gamma2 times age. This allows for the possibility that wages increase more at relatively young ages, when climbing the career ladder, and less at older ages. Naturally, we could have added other squared and interaction terms to this specification as well. In fact, it is possible to start with a very general set-up containing all squares and interaction terms, and use the model selection methods of Lecture 3.2 to arrive at a more specific model.
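As an illustration of how such a specification can be estimated, here is a sketch using simulated data and the statsmodels formula interface. The data set, column names, and coefficient values are all made up; the point is that the squared and interaction terms are just extra regressors, so ordinary least squares applies unchanged.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the lecture's wage data; names and numbers are
# assumptions for illustration only.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "female":   rng.integers(0, 2, n),
    "age":      rng.uniform(20, 65, n),
    "educ":     rng.integers(1, 5, n),
    "parttime": rng.integers(0, 2, n),
})
df["logwage"] = (1.5 - 0.25 * df["female"] + 0.06 * df["age"]
                 - 0.0006 * df["age"] ** 2 + 0.10 * df["educ"]
                 + 0.03 * df["female"] * df["educ"]
                 - 0.10 * df["parttime"] + rng.normal(0, 0.2, n))

# Squares and interactions are just extra columns: the model stays linear
# in the parameters, so OLS estimates it directly.
fit = smf.ols("logwage ~ female + age + I(age ** 2) + educ "
              "+ female:educ + parttime", data=df).fit()

print(fit.pvalues["female:educ"])   # test whether gamma1 is significant

# Marginal effect of age, beta3 + 2 * gamma2 * age, evaluated at age 30:
print(fit.params["age"] + 2 * fit.params["I(age ** 2)"] * 30)
```

The gender effect at a given education level can be read off in the same way, as the coefficient on female plus the interaction coefficient times that education level.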
We can also use dummy variables to get a somewhat richer model structure and add nonlinearities. The mean level of data that are measured quarterly may differ across the four quarters. This can be captured by replacing the constant term by a quarter-specific mean level alpha. We can easily formulate this in our usual framework by means of dummy variables. A dummy variable takes the value 1 if a certain condition holds, and 0 if it does not. In this application we define a dummy D(hi) for each quarter, where h runs from 1 through 4 over the quarters. That is, D(hi) is 1 if observation i falls in quarter h, and 0 otherwise. With this notation, we obtain an equation much like before. We simply add the dummies to our X matrix and use linear regression to get estimates of the quarter-specific constants alpha, as well as of the parameters beta of the explanatory variables.

Now I invite you to consider whether or not we can add a constant term to this specification with dummies for each quarter. The answer is that a constant term cannot be added to this general model. If we were to add a constant and four quarterly dummies to our X matrix, there would be linear dependence among the columns of X: adding up the four dummy variables gives exactly the intercept, because every observation falls in exactly one quarter. So X'X cannot be inverted. We can solve this, however, by simply taking out one of the dummies; a short numerical sketch of this dummy-variable trap follows at the end of this lecture. If we omit the first quarterly dummy, this is what the model becomes. The model is equivalent to the model at the top of the slide, but the dummy coefficients have a different interpretation. As before, alpha1 measures the mean level for the first quarter. In this specification, however, the mean level of the second quarter is given by gamma2 plus alpha1. In the specification on the previous slide, the mean level of the second quarter was given by alpha2. We can thus easily relate the gammas and the alphas to each other through the relationship gamma2 = alpha2 - alpha1. Similar results hold for the third and the fourth quarter.

Now I invite you to make the training exercise, to train yourself with the topics of this lecture. You can find this exercise on the website. And this concludes our lecture on transformations.
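As promised, here is a small numerical sketch of the dummy-variable trap and of the relationship between the gammas and the alphas. The quarterly numbers are invented; the mechanics are the point.

```python
import numpy as np

# Three years of made-up quarterly observations.
n = 12
quarter = np.tile([1, 2, 3, 4], 3)
D = np.column_stack([(quarter == h).astype(float) for h in (1, 2, 3, 4)])

# Constant plus all four dummies: D1 + D2 + D3 + D4 equals the intercept
# column, so the columns of X are linearly dependent and X'X is singular.
X_trap = np.column_stack([np.ones(n), D])
print(np.linalg.matrix_rank(X_trap))   # 4, not 5

# Fix: drop the first dummy. The design matrix has full column rank again.
X_ok = np.column_stack([np.ones(n), D[:, 1:]])
print(np.linalg.matrix_rank(X_ok))     # 4 = number of columns

# Estimate both parameterizations and verify gamma2 = alpha2 - alpha1.
rng = np.random.default_rng(2)
y = np.array([10.0, 12.0, 9.0, 11.0] * 3) + rng.normal(0, 0.1, n)
alphas, *_ = np.linalg.lstsq(D, y, rcond=None)      # quarter means alpha1..alpha4
coefs, *_ = np.linalg.lstsq(X_ok, y, rcond=None)    # alpha1, gamma2, gamma3, gamma4
print(alphas[1] - alphas[0], coefs[1])              # the two numbers agree
```

In practice, regression software handles this automatically: for instance, formula interfaces that encode a categorical variable drop one category by default, for exactly the reason shown here.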