Let's move on to the next topic, which is linear regression. This is one of the most common, widely used, and completely acceptable methods for doing either inference or prediction on some data. Generally, when we use linear regression, we make some assumptions, and there are situations where linear regression will, in most cases, beat other models. That's not always the case, but it's a great place to start. Again, it's widely used, especially in business and other settings where you need to give solid numbers and interpretability. Remember back at the beginning, we talked about interpretability versus flexibility: how flexible a model is versus how interpretable it is. Linear regression falls on the extremely interpretable end. You can interpret the results and explain them to people who need to make business decisions or other decisions based on them. But it's not a very flexible model. It makes the assumption that some type of linear function is underlying the data.

We start out with y, which is a continuous variable, something like sales, or something continuous like height or weight. It's not discrete like a count, and it's not zeros and ones. It's just something ordinary, like, again, sales. This is simple linear regression, which means that our x is just one variable. We're starting with simple linear regression. It is simple. Then we'll get to multiple linear regression. For now we'll just cover the basics. If we have a continuous target, then we're going to assume a linear relationship between our target and our single x variable. We usually set it up with Beta_0, which is our intercept, and then Beta_1, which is the coefficient on x. These two are going to tell us everything we need to know about our simple linear regression model.

We have been talking about this example where a company has a few advertising budgets: a radio budget, a TV budget, and a newspaper budget. We want to know how each of these increases our sales. A great example of simple linear regression would be: we have sales, and we have some Beta_0 coefficient. Beta_0 really represents what your sales would be if you put nothing into advertising. Maybe word of mouth is getting you some sales, maybe just having physical stores that people walk by is getting you sales, something like that. Beta_0 says, if all of these variables (in this case only one) are out of the equation, what are your baseline sales going to be? It's just our normal baseline term. Then after that we get to Beta_1, which says: if I increase TV by a certain amount, how much does sales increase? That's my really simple linear regression model, and we've been talking about, again, the sales example, so we want to know how increasing my TV budget increases my sales. Pretty simple question, simple linear regression model. Everything with this is pretty standard. We say, okay, our sales will be a certain amount with no advertising; that's Beta_0. Then every unit of money I put into TV advertising will increase sales by a certain amount. We don't know that amount; that's what we're trying to figure out: how effective is TV advertising in increasing our sales. These Beta terms, Beta_0, Beta_1, and eventually all the way up to Beta_p when we deal with more than one variable, are coefficients, or parameters.
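Written out, the model described above looks like the following. The error term epsilon isn't spelled out in the lecture, but it's part of the standard formulation, so it's included here for completeness:

```latex
% Simple linear regression for the sales example:
\mathrm{sales} \approx \beta_0 + \beta_1 \times \mathrm{TV}
% In general, with the usual error term made explicit:
y = \beta_0 + \beta_1 x + \varepsilon
```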
In the beginning, when we spoke about modeling, we talked about parametric methods and non-parametric methods. Here we are introducing a model that is parametric. It has parameters that we're going to estimate, and once we've done that, we're pretty much done, because we're assuming that the underlying function is a linear one. What happens is we use some data to get estimates for Beta_0 and Beta_1; that's using training data. The little hats on these estimates are traditional in statistics, and they indicate that we're working with estimates rather than the true values. We're going to use training data: plug in an x, plug in a y from the training data that we know, and do this over and over again until Beta_0 hat and Beta_1 hat flush themselves out. We'll deal with the math behind how to get these Beta_0 hat and Beta_1 hat estimates; we'll figure all of that out, and there are great methods to do it. This is just the simple stuff, the underlying idea. Again, it's simple linear regression: we have a target and, in this case, one x variable, and we just want to fit a pretty simple linear function that says how TV budget and sales relate to each other.
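As a minimal sketch of what that estimation looks like in practice, here is one way to compute Beta_0 hat and Beta_1 hat with ordinary least squares in Python. The TV and sales numbers below are made up purely for illustration; they are not the course's actual dataset.

```python
import numpy as np

# Hypothetical training data: TV advertising spend (x) and observed sales (y).
# These values are invented for illustration only.
tv = np.array([10.0, 30.0, 50.0, 70.0, 90.0, 110.0, 130.0, 150.0])
sales = np.array([8.1, 9.5, 11.2, 12.8, 13.9, 15.6, 16.8, 18.4])

# Ordinary least squares for simple linear regression has a closed form:
#   Beta_1 hat = cov(x, y) / var(x)
#   Beta_0 hat = mean(y) - Beta_1 hat * mean(x)
beta_1_hat = np.cov(tv, sales, ddof=1)[0, 1] / np.var(tv, ddof=1)
beta_0_hat = sales.mean() - beta_1_hat * tv.mean()

print(f"Intercept (Beta_0 hat): {beta_0_hat:.3f}")  # baseline sales with zero TV spend
print(f"Slope     (Beta_1 hat): {beta_1_hat:.3f}")  # added sales per unit of TV budget

# Fitted values use the estimates: y hat = Beta_0 hat + Beta_1 hat * x
sales_hat = beta_0_hat + beta_1_hat * tv
```

In practice you would usually let a library do this for you (for example, scikit-learn's LinearRegression or statsmodels' OLS), but the closed-form version makes the point from the lecture concrete: the estimates come straight out of the training x's and y's.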