Hi again, everybody. In this module we're going to look at linear basis sets for hemodynamic modeling and continue our model-building enterprise.
Let's first review some key concepts. Last time we talked about model building for multiple predictors, using the example of a two-by-two repeated-measures factorial design. We have indicator functions and convolution, and those let us build a design matrix. We also talked about the assumptions. We have to assume that we have the correct neural response function, that the HRF is correct, and that we have a linear time-invariant (LTI) system. I said before that we'd learn how to relax these assumptions, and that's what today is going to be about.
One of the key ideas in relaxing those assumptions is the idea of basis sets. A single canonical HRF is often used to model the response; that's still what's done most typically. And this is optimal if it is exactly the correct model. But if it's wrong, it leads to loss of power and, in some cases, to bias. It's unlikely that the canonical HRF is actually correct for all voxels.
In fact, here is an example of four cases where the HRF is demonstrably not correct. In the top left you see results from a visual experiment with a flashing checkerboard and ten subjects. In each case the dashed black line is what we'd assume from the canonical HRF in the standard LTI system. The solid black line is the response estimated using a finite impulse response model, which we'll talk about later in the lecture, so this is the more accurate model of the response. In each of those cases we can see that the two don't quite match. In the visual experiment on the top left, we still capture most of the response, but we're off in timing, and we lose some of the amplitude because of that. The top right is a thermal pain study, and we can see here that the model fits fairly well, but we've missed the shape of the response.
The true response peaks later and it increases across time, just as actual pain does. But the canonical model doesn't know that, so it mismodels the response to a small degree. On the bottom left, we can see responses to aversive pictures in the orbitofrontal cortex. Here the actual response is likely to have a sustained duration. It lasts much longer than the stimulus itself, so we've missed the duration. And finally, in the bottom right, this is a case in the orbitofrontal cortex where we're looking at aversive anticipation. Here our assumed impulse response to anticipatory cues is completely wrong. Anticipation has a very different time course leading up to a stimulation.
Another way to see this is within one experiment. HRFs can vary substantially across brain regions. Why is this? Because different brain regions are doing different things, and they're activated for different periods of time during the task. In this study, which is a memory experiment, we can see that the lateral occipital complex shows a fairly typical hemodynamic response, peaking at 6 to 8 seconds and gone by about 12 to 14 seconds. But the hippocampus shows a very different profile. It peaks at 10 seconds or later, and it still shows a substantial response at 16 seconds post-stimulus. So a canonical HRF model is not going to capture this response very well.
Enter temporal basis functions. The idea is that I'm going to model the overall hemodynamic response as a linear combination of a set of fixed basis functions, such that the overall response for one event type is the sum of three activation parameter estimates, each multiplied by its respective basis function. In this case, this is an example basis set with three temporal basis functions. The first one looks like the canonical response, the second one is its derivative with respect to time, and the third one is its derivative with respect to dispersion. This is a very common basis set.
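As a rough sketch of the three-function basis set just described, here is one way it could be written down in Python. Everything specific in it is an assumption made for illustration and does not come from the lecture: the double-gamma form with constants 6, 16 and 1/6, the treatment of dispersion as the scale of the gamma densities, the finite-difference derivatives, and the beta values at the bottom.

```python
import math

def gamma_pdf(t, shape, scale=1.0):
    """Gamma density at time t (seconds); zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)

def canonical_hrf(t, dispersion=1.0):
    """Double-gamma HRF: an early peak minus a smaller, later undershoot.
    The constants 6, 16 and 1/6 are assumed values, not taken from the lecture."""
    return gamma_pdf(t, 6.0, dispersion) - gamma_pdf(t, 16.0, dispersion) / 6.0

def temporal_derivative(t, eps=0.01):
    """Central finite-difference derivative of the HRF with respect to time."""
    return (canonical_hrf(t + eps) - canonical_hrf(t - eps)) / (2 * eps)

def dispersion_derivative(t, eps=0.01):
    """Forward finite-difference derivative of the HRF with respect to dispersion."""
    return (canonical_hrf(t, 1.0 + eps) - canonical_hrf(t, 1.0)) / eps

def modeled_response(t, betas):
    """Linear combination: sum of beta_k times basis function k at time t."""
    basis = (canonical_hrf(t), temporal_derivative(t), dispersion_derivative(t))
    return sum(b * f for b, f in zip(betas, basis))

def peak_time(betas, step=0.1, n_steps=300):
    """Time (s) at which the modeled response is largest on a regular grid."""
    times = [i * step for i in range(n_steps + 1)]
    return max(times, key=lambda t: modeled_response(t, betas))

# Hypothetical beta values, chosen only to show what the derivative term does.
canonical_only = (1.0, 0.0, 0.0)
shifted_later = (1.0, -1.0, 0.0)   # negative weight on the time derivative

print(peak_time(canonical_only))
print(peak_time(shifted_later))
```

With these assumed values, putting a negative weight on the time-derivative function moves the peak of the combined curve later than the canonical peak, which is how a fixed set of functions can absorb some timing differences without changing the functions themselves.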
So my overall response is a combination of each of those three functions times their respective beta hats, or activation parameter estimates. That gives the overall estimated HRF shape: a linear combination of the three basis functions, each multiplied by its beta value and summed together.
Basis sets vary in the degree to which they make a priori assumptions about the shape of the response, and thus in the HRF shapes that they can model. So let's look at some data fit with three choices of basis set. We'll look at our visual evoked response data again. On the left we see the canonical HRF. In the middle we see the 3-parameter basis set that we just looked at. In this case the actual response, estimated here in black, is fit by a combination of those three curves times their respective amplitudes, summed up. And that gives us an overall fit that matches the data fairly well. On the right we have the same response estimated with a finite impulse response model, otherwise known as a deconvolution model, and we'll look in more detail at how that works next.
So let's look at what these basis sets are. The canonical HRF is what you see on the left here. If we look at the image, it's just a single predictor, and that's the fit. With the HRF plus derivatives, I end up with three parameters for every event type, and there's the fit. And for the finite impulse response (FIR) model, I'm going to develop a whole set of predictors that I'll explain in more detail in a moment. There's essentially one predictor per time point following event onset, and that lets me model the response very flexibly.
So this is a little deeper look at the FIR model. Here we see some idealized data, a little bit of noise, and then four event onsets. The design matrix for the FIR model is shown over on the right, and this is just for one event type. So here, there are 30 predictors to capture that event type very flexibly across time.
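The idea of one predictor per time point after event onset can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's own code: the function names, the onsets, the number of scans, and the use of 6 lags instead of 30 are all made up to keep the printed matrix small.

```python
def fir_design_matrix(n_scans, onsets, n_lags):
    """Return an n_scans x n_lags matrix of 0/1 indicators for one event type.

    Column k is 1 at scan (onset + k) for every onset, so its coefficient
    estimates the average response k scans after event onset.
    """
    X = [[0] * n_lags for _ in range(n_scans)]
    for onset in onsets:
        for lag in range(n_lags):
            scan = onset + lag
            if scan < n_scans:          # ignore lags that run past the end
                X[scan][lag] = 1
    return X

def predict(X, betas):
    """Model prediction: each row of X dotted with the coefficient vector."""
    return [sum(x * b for x, b in zip(row, betas)) for row in X]

# Hypothetical example: 4 onsets (in scans), 6 lags, 40 scans.
onsets = [2, 11, 20, 29]
X = fir_design_matrix(n_scans=40, onsets=onsets, n_lags=6)

# An arbitrary response shape, one value per lag after onset.
shape = [0.0, 0.4, 1.0, 0.7, 0.2, -0.1]
y_hat = predict(X, shape)

# Print the first ten rows to see the diagonal stripe after the first onset.
for row in X[:10]:
    print(row)
```

Because each column switches on at exactly one lag after every onset, the coefficient vector is itself the estimated response shape, with no assumption about what that shape should look like.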
If we look at just the first four columns of that design matrix, the first four predictors, then we can see that the regressors are one column per time point, locked to the stimulus onset. The first predictor is shown here in purple, and it captures what's happening on average in the first couple of seconds following event onset. The second predictor is shown in orange, and it captures what happens in the next two seconds following event onset. The yellow is the third predictor, which captures what's happening a few seconds later after stimulus onset. And then the green is the fourth, and so on until I've modeled the entire response with a very flexible, arbitrary shape.
Now let's look at what these look like for multiple event types. Here are our three model choices again: the canonical, the 3-parameter basis set, and the FIR model. And here is a model with just four events per event type; we'll see those in a second. This is the image of the design matrix for two event types. So let's look at how this breaks down. It's a little bit easier if, instead of looking at the image of each of those matrices, we look at some line plots of the conditions. Here we can see there are two event types, one and two, indicated by the blue and the brown colors. There are four onsets per event type, so four blue onsets and four brown onsets, and time runs from top to bottom. With the canonical HRF, we see that there are just two predictors that track those onsets. With the 3-parameter model, there are now six regressors: the three basis functions are convolved with the event onsets to yield three predictors per event type. And finally, the FIR model in this example has six regressors per event type. So six regressors for event type 1, and six regressors for event type 2.
Now finally we'll talk about choosing a basis set. How do I know what's the best basis set to choose? Let's consider two criteria. The first is accuracy.
Can the model capture the true response in this participant, voxel, and condition without a systematic bias? Second, let's consider precision. Every model parameter is estimated with error from noisy data. So the question about precision is: are the model parameters, and thus the shape, estimated with very little error variance, or are they noisy? There is a fundamental tradeoff between accuracy and precision, also called the bias-variance tradeoff in statistics, and it shows up in virtually every area of statistics.
So let's look at the accuracy of these different models. The canonical HRF makes strong assumptions about shape, so it has the most bias. The FIR model makes very weak assumptions about shape, so there's very little systematic bias. It can be very flexible, and the three-parameter basis set is somewhere in between.
Now let's look at the precision. The canonical HRF has very few parameters, so there's very high precision and high reliability of those estimates. The FIR model has many parameters, so with limited data there's very low precision and very noisy estimates of each of those parameters. And again, the three-parameter model is somewhere in between. So on accuracy the FIR model wins, but on precision the canonical HRF wins. And that's the bias-variance tradeoff.
So what would we like? We'd like a balance: a model that's simple but accurate in the ways that count. By simple I mean few parameters, so high precision. Also by simple I mean that the parameters are interpretable measures of neuroscientific interest, i.e., measures of the response amplitude. We'll talk more about that in later sections. And finally, when I say accurate in the ways that count, I mean that it captures the true response amplitude in the physiological range, and that depends on the task and on the brain region.
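The tradeoff between the canonical and FIR models can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, none of which come from the lecture: a double-gamma HRF sampled once per second, a true response that is the same curve delayed by 3 seconds, four non-overlapping events, Gaussian noise with standard deviation 0.1, and no intercept or drift terms. With non-overlapping events the FIR least-squares fit reduces to the event-locked average, which is what `fit_fir` computes.

```python
import math
import random

def hrf(t):
    """Assumed double-gamma HRF (peak near 5 s, small later undershoot)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

N_LAGS, N_SCANS, NOISE_SD, N_SIMS = 20, 105, 0.1, 200
ONSETS = [5, 30, 55, 80]                           # four non-overlapping events
CANONICAL = [hrf(k) for k in range(N_LAGS)]        # shape the model assumes
TRUE_SHAPE = [hrf(k - 3) for k in range(N_LAGS)]   # true shape peaks 3 s later

def simulate(rng):
    """One noisy time series: true response at each onset plus Gaussian noise."""
    y = [rng.gauss(0.0, NOISE_SD) for _ in range(N_SCANS)]
    for onset in ONSETS:
        for k in range(N_LAGS):
            y[onset + k] += TRUE_SHAPE[k]
    return y

def fit_canonical(y):
    """One-parameter least squares: scale the canonical shape to fit y."""
    x = [0.0] * N_SCANS
    for onset in ONSETS:
        for k in range(N_LAGS):
            x[onset + k] += CANONICAL[k]
    beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return [beta * c for c in CANONICAL]

def fit_fir(y):
    """FIR fit; with non-overlapping events this is the event-locked average."""
    return [sum(y[onset + k] for onset in ONSETS) / len(ONSETS)
            for k in range(N_LAGS)]

def bias_and_variance(fit, n_sims=N_SIMS, seed=0):
    """Mean squared bias and mean variance of the fitted shape, over lags."""
    rng = random.Random(seed)
    fits = [fit(simulate(rng)) for _ in range(n_sims)]
    sq_bias = variance = 0.0
    for k in range(N_LAGS):
        values = [f[k] for f in fits]
        mean = sum(values) / n_sims
        sq_bias += (mean - TRUE_SHAPE[k]) ** 2 / N_LAGS
        variance += sum((v - mean) ** 2 for v in values) / n_sims / N_LAGS
    return sq_bias, variance

canon_bias, canon_var = bias_and_variance(fit_canonical)
fir_bias, fir_var = bias_and_variance(fit_fir)
print("canonical: squared bias %.5f, variance %.5f" % (canon_bias, canon_var))
print("FIR:       squared bias %.5f, variance %.5f" % (fir_bias, fir_var))
```

In this setup the canonical fit is stable from one simulated run to the next but is systematically off, because it cannot move its peak, while the FIR fit is right on average but noisier at every lag.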
So in this case, with the responses we see here, the three-parameter model is a very sensible choice because it captures the peak of the hemodynamic response quite well. It doesn't model the undershoot, but we probably don't care about that. For another task or condition, the three-parameter model might be less appropriate, and another model might be a better choice. So ideally, models have to be chosen in a way that's adapted to the task we're studying. That's the end of this module.