So we just saw a demo of overfitting in the context of polynomial regression, but this issue is something that we see much more generally in complex models; it's not specific to polynomial regression. In particular, we can also face overfitting when we have lots and lots of inputs. For example, in our housing data, if we fit a model that has square feet, number of bathrooms, number of bedrooms, lot size, year built, and a massive list of other possible features, that represents a very flexible model that can run into the same issues that we saw in our demo for polynomial regression. Or more generally, we can say this happens whenever we have lots of features, so we'll say that capital D is very large. And these features could be different functions of our inputs. But when you include lots and lots of these functions of our inputs in our regression model, then again we're in this place where the model has a lot of flexibility to explain the data, and we're subject to becoming overfit.
But this issue of overfitting with respect to increasing model complexity is really relative to how much data we have. So let's talk about overfitting as a function of the number of observations that we have, as well as a function of the number of inputs, or the complexity of the model. In particular, if we have very few observations, so N, the number of observations, is small, then our models can rapidly become overfit to the data. Because we have only a few points, as we increase our model complexity, like the order of the polynomial, it becomes very easy to hit all of our observations, but in between those observations, things can go very wild. On the other hand, if we have lots and lots and lots of observations, even with really, really complex models, we're not gonna become overfit as quickly, because we have dense observations across our input, in this example across square feet, so the function is pinned down basically everywhere.
And the model is not able to hit every observation, so it's not able to do these really crazy wiggly things. Okay.
So when we have just one input, like the square feet of a house, in order to avoid overfitting we need to have observations that are very dense across square feet. We need lots of representative examples of square feet and house value pairs. And this is actually pretty hard to do, to have lots of examples of houses of every possible square footage that you might see. So this is already a hard problem, but it becomes even harder when I increase the number of inputs in my model. For example, just think of a model where I have square feet and number of bathrooms, and I want to cover all possible combinations of those two inputs in order to provide representative examples and avoid overfitting. Well, that's really, really hard. But now imagine a model where you have square feet, number of bathrooms, number of bedrooms, lot size, year built, and a massive list of other features. If you wanna cover all possible combinations of these things that you might see, that's basically impossible to do. So this is a much, much harder problem, and you're much more subject to your models becoming overfit in these situations.
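To make the two points above a bit more concrete, here is a minimal sketch, not anything from the lecture's own demo. It assumes a made-up "true" price curve (`true_price`), made-up noise, and evenly spaced square footage values; the helper names `held_out_rmse` and `cells_to_cover` are hypothetical. The first helper fits the same high-order polynomial to few versus many observations and measures how far the fit strays from the true curve in between the observations. The second just counts how many combinations you would need to cover if each input were split into a fixed number of bins.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_price(sqft):
    """Hypothetical 'true' price curve (in $1000s), invented for illustration."""
    return 100.0 + 0.15 * sqft + 30.0 * np.sin(sqft / 400.0)


def held_out_rmse(n_obs, degree, noise_sd=25.0, seed=0):
    """Fit a polynomial of the given degree to n_obs noisy observations and
    return its RMSE against the true curve on a dense grid of square feet."""
    rng = np.random.default_rng(seed)
    # Evenly spaced square footage values (an assumption; real data is messier).
    sqft = np.linspace(500.0, 3500.0, n_obs)
    price = true_price(sqft) + rng.normal(0.0, noise_sd, size=n_obs)
    # Polynomial.fit rescales the input range internally, which keeps the
    # least-squares problem well conditioned for high degrees.
    model = Polynomial.fit(sqft, price, deg=degree)
    # Evaluate between the observations, where an overfit model goes wild.
    grid = np.linspace(500.0, 3500.0, 400)
    return float(np.sqrt(np.mean((model(grid) - true_price(grid)) ** 2)))


def cells_to_cover(bins_per_input, num_inputs):
    """Number of input combinations if each input is split into the same
    number of bins (a crude stand-in for 'all possible combinations')."""
    return bins_per_input ** num_inputs


if __name__ == "__main__":
    # Same model complexity (degree 9), very different amounts of data.
    print("RMSE with 10 observations:  ", held_out_rmse(10, 9))
    print("RMSE with 1000 observations:", held_out_rmse(1000, 9))
    # Coverage needed as the number of inputs grows, at 10 bins per input.
    for d in (1, 2, 5):
        print(d, "inputs ->", cells_to_cover(10, d), "combinations")
```

With 10 observations, a degree 9 polynomial can pass through every point, so nothing constrains it between them; with 1000 observations the same polynomial is pinned down across the whole range. The bin count grows as the number of bins raised to the number of inputs, which is why covering all combinations gets out of reach so quickly as inputs are added.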