Last week, you learned that the sensitivity of a cell voltage measurement to resistance is quite high, which makes it relatively simple to estimate the series resistance of a battery cell. You also learned that the sensitivity of a voltage measurement to the total capacity of the cell is very low, which makes it difficult to estimate total capacity accurately. In particular, modeling errors and noise tend to bias the results. Remember also that the sensitivity of voltage to total capacity is maximized when you are able to monitor voltage over an extended interval of time during which the state of charge of the cell changes appreciably. With that in mind, you can imagine that the state of charge equation is going to be important for capacity estimation, and so I reproduce that equation here. The way I've written the equation, it says that the state of charge at some time k_2 is equal to the state of charge at some previous time k_1, minus the accumulated ampere-seconds of current that have passed through the cell during that interval (the sampling interval times the sum of the current samples), divided by the capacity. We can rearrange this expression by bringing the two state of charge terms together on one side and by multiplying both sides of the equation by Q, the total capacity. Then, if we denote the accumulated-current side of the resulting equation as y and the change in state of charge on the other side as x, the equation has the linear structure y equals Q times x. This linear structure is wonderful because it allows us to use linear regression techniques, with datasets combining measured values of y and measured values of x, to determine a value for the total capacity Q. But we need to be careful.
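To make the rearrangement concrete, here is a minimal sketch of how one (x, y) data pair might be formed. It takes the equation as described in words above, z[k_2] = z[k_1] - (dt / Q) * sum of i[k], so that y = -dt * sum of i[k] and x = z[k_2] - z[k_1]. The function name, the argument names, the discharge-positive sign convention, and the conversion to ampere-hours are assumptions made for illustration, not something fixed by the lecture.

```python
from typing import Sequence, Tuple


def capacity_regression_pair(
    current_a: Sequence[float],
    dt_s: float,
    soc_start: float,
    soc_end: float,
) -> Tuple[float, float]:
    """Return (x, y) such that y = Q * x, with Q in ampere-hours.

    Assumes current_a holds the samples i[k_1] .. i[k_2 - 1] in amperes,
    with discharge current positive (an assumed sign convention).
    """
    x = soc_end - soc_start               # change in state of charge (unitless)
    y = -dt_s * sum(current_a) / 3600.0   # negated accumulated charge, converted to Ah
    return x, y


# Usage: one hour of 2 A discharge that moves state of charge from 0.9 to 0.5
x, y = capacity_regression_pair([2.0] * 3600, 1.0, 0.9, 0.5)
q_single_pair = y / x  # capacity implied by this single pair, about 5 Ah
```

A single pair like this gives only a crude estimate, which is why the rest of the lesson is about combining many pairs by regression.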
The most commonly used type of regression is known as least squares regression, or sometimes as standard least squares or ordinary least squares regression. In my experience, whether or not it is phrased that way, if you look at the details, this is the most common way that battery management systems estimate total capacity. But it's wrong. The diagram on this slide presents the paradigm, the worldview, that is assumed by standard least squares. Measured values of x are located on the horizontal axis and measured values of y are located on the vertical axis, so we have data pairs of x and y that are plotted as these filled circles. The standard least squares approach to fitting an equation to data assumes that the measured values of y might have measurement errors in them, but that the values of x are known perfectly and have no errors in them whatsoever. So when fitting a straight line through the data points, the errors in the vertical direction are taken into consideration, but there is no consideration for any possibility of errors in the horizontal, or x, direction. On this diagram, I've shown the uncertainties in the vertical direction as confidence intervals, or error bounds, surrounding each data point. For the capacity estimation problem, these errors would be the accumulated current sensor error over the time interval involved in a particular measurement. But in our problem, there are also errors in the x direction, since we never know state of charge perfectly. Instead, we rely on estimates of state of charge, perhaps from a Kalman filter, and we know that these estimates always have some amount of estimation error associated with them. So the picture on this slide is fundamentally incorrect for the problem that we're dealing with, and you'll see later that using least squares to estimate total capacity produces estimates that are biased. The drawing on this slide illustrates the actual problem we are trying to solve.
We have data points where the horizontal coordinate is a difference between two state of charge estimates and the vertical coordinate is the noisy summation of measured current. Both coordinates have uncertainties, or error bounds, or confidence intervals, and here I show them as error bars on this illustration. Our objective is to find the estimate of total capacity Q such that y equals Q times x is our best fit, taking into consideration the confidence that we have in every data point in both the horizontal and vertical directions on this figure. The standard least squares approach does not take errors in the x-coordinate, the horizontal coordinate, into account, and so it produces a biased estimate of the battery cell total capacity. When we do take errors in both directions into consideration, we cannot use standard least squares. We need to use a different method that is known as total least squares, and you will learn all about total least squares in the remainder of this course. The honors track of this course will also show you how you might use a Kalman filter to estimate the total capacity of the battery cell, but since Kalman filters are also least squares methods, I believe that they will also be biased by noise. So I always recommend that you use the more direct total capacity estimation methods based on total least squares, which you're going to learn throughout the remainder of this course, instead of ordinary least squares or even Kalman filter-based methods. The Kalman filter-based methods in the honors section might instead be used to estimate other cell model parameter values as they change as a cell ages. So, if ordinary least squares methods are biased, what do we do? Many researchers and practitioners have noticed that their estimates of total capacity are quite poor if their state of charge inputs are noisy.
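The effect of noisy state of charge inputs on ordinary least squares can be seen in a small simulation. The sketch below is only an illustration: the true capacity of 10 Ah, the noise levels, the number of points, and the function names are all assumed values chosen for the demonstration. It fits y = Q times x through the origin twice, once with noise on x and once without.

```python
import random


def ols_through_origin(xs, ys):
    """Ordinary least squares slope for y = Q * x (no intercept term)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


def simulate(q_true=10.0, n=20000, sigma_x=0.05, sigma_y=0.05, seed=0):
    """Generate noisy (x, y) pairs scattered around the line y = q_true * x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x_true = rng.uniform(-0.5, 0.5)                       # true change in state of charge
        xs.append(x_true + rng.gauss(0.0, sigma_x))           # noisy state of charge difference
        ys.append(q_true * x_true + rng.gauss(0.0, sigma_y))  # noisy accumulated charge, Ah
    return xs, ys


# Noise on both x and y: the situation in the capacity estimation problem
xs_noisy, ys_noisy = simulate()
q_noisy_x = ols_through_origin(xs_noisy, ys_noisy)

# Noise on y only: the situation that ordinary least squares assumes
xs_clean, ys_clean = simulate(sigma_x=0.0)
q_clean_x = ols_through_origin(xs_clean, ys_clean)
```

With these assumed settings, `q_clean_x` should land very close to the true 10 Ah, while `q_noisy_x` should come out noticeably low, because noise in x inflates the sum of x squared in the denominator and pulls the slope toward zero.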
The usual approach that they take to counteract this problem is to try to ensure that the state of charge inputs to the capacity update method are as noise-free and as accurate as possible, and then simply to use standard least squares methods anyway. For example, we might put constraints on how capacity is estimated. Before the first data point is collected, we might wait for current to be zero for a long period of time, so we have confidence that the cell is in electrochemical equilibrium and therefore that the state of charge estimate is probably very accurate. Or we could put constraints on the second data point to make sure that it is taken after fully charging the battery pack, so we have confidence that the state of charge of the cell is equal to 100% with very little error on that estimate. By putting a rest constraint on the beginning point and a fully charged constraint on the end point, we can eliminate to a large extent the errors on the x variable, which is the change or difference in state of charge, though we never eliminate those errors completely. Using this approach, standard least squares regression can be fairly accurate, but there is still no reason to do this when total least squares is essentially no more difficult. It takes about the same number of floating point operations to implement, and it properly accounts for errors in both variables. In other words, the ad hoc approach does not correctly handle any residual error that there might be in the x variable, the difference in state of charge. It reduces the noise in the inputs to the least squares method, but it never totally eliminates it. So my very strong recommendation is to always use total least squares methods instead of ordinary least squares, and that's what you're going to learn about for the next two weeks in this course.
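For reference, the ad hoc gating described above might look something like the following sketch. Everything specific here is hypothetical: the function names, the 30-minute rest time, the 0.05 A current tolerance, the `charge_complete` flag, and the discharge-positive sign convention are placeholders chosen for illustration, not values from the lecture.

```python
import math
from typing import Optional, Sequence, Tuple


def is_rested(recent_current_a: Sequence[float], dt_s: float,
              min_rest_s: float = 1800.0, tol_a: float = 0.05) -> bool:
    """True if |current| stayed within tol_a for the last min_rest_s seconds."""
    needed = math.ceil(min_rest_s / dt_s)
    if len(recent_current_a) < needed:
        return False
    return all(abs(i) <= tol_a for i in recent_current_a[-needed:])


def gated_pair(pre_start_current_a: Sequence[float], dt_s: float,
               soc_start: float, interval_current_a: Sequence[float],
               charge_complete: bool) -> Optional[Tuple[float, float]]:
    """Return (x, y) only if the rest and full-charge constraints both hold."""
    if not is_rested(pre_start_current_a, dt_s):
        return None                        # start point not trusted: cell was not at rest
    if not charge_complete:
        return None                        # end point not trusted: cell not fully charged
    x = 1.0 - soc_start                    # end point taken as 100 % state of charge
    y = -dt_s * sum(interval_current_a)    # ampere-seconds, discharge current positive
    return x, y


# Usage: 30 minutes of rest, then one hour of 3 A charging from 40 % state of charge
rest = [0.0] * 1800
charge = [-3.0] * 3600
pair = gated_pair(rest, 1.0, 0.4, charge, charge_complete=True)
rejected = gated_pair([0.0] * 100, 1.0, 0.4, charge, charge_complete=True)
```

Note that even an accepted pair still carries whatever residual error remains in `soc_start`, which is exactly the weakness of this approach that the paragraph above points out.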
To summarize this lesson, you've learned that we can write an equation that involves measurable quantities and is linear in the total capacity Q. This linearity is really wonderful because it enables us to use a wide variety of tools developed for linear regression to estimate total capacity, and it's very tempting to use the most common of these tools, which is ordinary least squares. But I've stated to you, and I will show you, that the errors and noises in our problem are different from those assumed by ordinary least squares, and that using ordinary least squares will bias the estimate and give an incorrect result. So I'm going to share with you a total least squares solution that should be used instead, and that's what we're going to study next.
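As a preview of the idea, here is a minimal sketch of one simple form of total least squares for a line through the origin. It assumes that every data point has the same known noise standard deviations `sigma_x` and `sigma_y`, in which case minimizing the sum of (y - Q x)^2 / (sigma_y^2 + Q^2 sigma_x^2) has a closed-form solution. This is a simplified special case chosen for illustration and is not necessarily the formulation developed later in the course. The function names and simulation settings are assumptions.

```python
import math
import random


def ols_through_origin(xs, ys):
    """Ordinary least squares slope for y = Q * x (errors assumed in y only)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def tls_through_origin(xs, ys, sigma_x, sigma_y):
    """Slope Q minimizing sum((y - Q*x)**2 / (sigma_y**2 + Q**2 * sigma_x**2)).

    Assumes all points share the same sigma_x > 0 and sigma_y, and that
    sum(x * y) is positive, so the positive root is the one wanted.
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    syy = sum(y * y for y in ys)
    delta = (sigma_y / sigma_x) ** 2   # ratio of the noise variances
    b = syy - delta * sxx
    # Positive root of sxy*Q**2 - b*Q - delta*sxy = 0
    return (b + math.sqrt(b * b + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)


# Simulated data with an assumed true capacity of 10 Ah and noise on both axes
rng = random.Random(1)
q_true, sigma_x, sigma_y = 10.0, 0.05, 0.05
xs, ys = [], []
for _ in range(20000):
    x_true = rng.uniform(-0.5, 0.5)                       # true change in state of charge
    xs.append(x_true + rng.gauss(0.0, sigma_x))           # noisy state of charge difference
    ys.append(q_true * x_true + rng.gauss(0.0, sigma_y))  # noisy accumulated charge, Ah

q_ols = ols_through_origin(xs, ys)
q_tls = tls_through_origin(xs, ys, sigma_x, sigma_y)
```

Under these assumptions, `q_tls` should sit close to the true 10 Ah while `q_ols` comes out low, and the two estimators use the same three running sums plus one extra square root, which is consistent with the remark that total least squares costs about the same to compute.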