Hello, and welcome to the honors section of this course. This week, you will learn about using Kalman filters to estimate battery pack state of health. You already know that different kinds of non-linear Kalman filters can be used to estimate the state of a dynamic system, including state of charge, if you already happen to know the parameter values of the model and you make some noisy measurements. It turns out that it's also possible to use a non-linear Kalman filter to estimate the parameters of a system model if you happen to know the state vector of that system and you also have noisy measurements. This week, we're first going to consider how to estimate a system's parameter values if the state is known exactly. Then, after that, we will consider how to estimate both the state of the system and its parameter values simultaneously, using two different approaches. Let's begin with some background information. We will denote the true parameters of a particular model by the Greek variable Theta. We will use a non-linear Kalman filter to estimate the value of this vector Theta, and we will call the estimate Theta hat at time k, much like you learned how to estimate x hat at time k, which was the best guess of a system's state vector at that time. As you might expect, to use a Kalman filter we need to have a model of the dynamics of how the parameter values change over time, and also a measurement equation. We're going to assume that the parameters that we want to track change very, very slowly. So, we will model these parameter values as being essentially constant, but having some small driving noise that allows them to move over time. This dynamics equation says that the present vector of parameters is equal to the previous vector of parameters plus some noise that we call r_k minus one. This last input in the equation, r, is considered to be zero mean, small in value, and completely uncorrelated, or white. It is fictitious.
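As a quick illustration (not from the lecture itself), the random-walk dynamics described above might be sketched as follows; the variable names, dimensions, and covariance value are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def parameter_dynamics(theta_prev, Sigma_r):
    """Theta_k = Theta_{k-1} + r_{k-1}, where r is the small,
    zero-mean, white, fictitious driving noise."""
    r = rng.multivariate_normal(np.zeros(len(theta_prev)), Sigma_r)
    return theta_prev + r

theta = np.array([1.0, 0.5])       # illustrative "true" parameters
Sigma_r = 1e-6 * np.eye(2)         # small covariance: slow drift
theta_next = parameter_dynamics(theta, Sigma_r)
```

A larger `Sigma_r` lets Theta wander quickly; a smaller one forces it to change slowly, exactly as described above.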
I mean, this is not the actual way that a parameter value changes over time, but generally speaking, it's far too complicated to try to come up with detailed expressions for how parameters change over time, especially because they change so slowly. So, this random input r simply describes how a nearly constant value Theta might be allowed to change, and it also describes, in some sense, the infidelity of our model structure by allowing for a noisy parameter. In the end, it's not the value of r that we use, but the covariance of r, or the uncertainty of r. If we have a big covariance, that means that Theta can change quickly. If we have a small covariance, that means that Theta must change slowly. On the previous slide, I gave an example equation for the dynamics of how parameters change over time, and you know that a Kalman filter needs a dynamics equation like we've just talked about, but it also needs an output equation or a measurement equation. We are free to describe any measurable output from our system that is somehow connected to the parameters of that system. Here we use the letter h to describe this function, and in this equation, the variable e models some kind of measurement noise and perhaps, to some extent, also the modeling error of the measurement equation. The output from the function h might be a voltage estimate, and if that is the case, then it is exactly the same as the voltage equation of a battery cell model. But it is not required to be a voltage measurement; it can be any measurable quantity that you desire. So, perhaps you have other sensors that measure stresses or strains or who knows what else; those are all possible measurements that could be used in a Kalman filter to adjust parameter values. If you choose to let the output d be the same as the output y, then we're stating that this equation measures voltage.
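As a sketch of the measurement equation just described, using a hypothetical output function h of my own choosing (the form of h and all numeric values here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

def measure(h, x, u, theta, Sigma_e):
    """d_k = h(x_k, u_k, theta) + e_k: any measurable output that is
    connected to the parameters, corrupted by sensor noise e."""
    e = rng.multivariate_normal(np.zeros(Sigma_e.shape[0]), Sigma_e)
    return h(x, u, theta) + e

# Hypothetical voltage-like output: a state-scaled term minus a resistive drop
h = lambda x, u, theta: np.array([theta[0] * x - theta[1] * u])
d = measure(h, x=3.6, u=1.0, theta=np.array([1.0, 0.01]),
            Sigma_e=np.array([[1e-4]]))
```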
But, to be more general than that, I'm going to maintain a distinction in this course between d and y, just in case you want to use different measured outputs. So, our Kalman filter is going to require a sequence of measurements to adapt the parameter values, and I'm going to include those in a set of all measurements from the beginning of time up until now, and I call that fancy D. Those are all of the measurements that we make. Also in the measurement equation, there is the variable e that might be the same as the sensor noise of the voltage equation. But, for generality, I consider them to be distinct here, in case it is a different noise on a different type of measurement. We have now defined an equation for the dynamics of the parameters and a measurement equation through which, somehow, we are going to observe the parameter values. We also need to specify a mathematical model of this system, which describes both its state changing over time and its voltage measurement itself. This set of two equations comprises the state-space model of the battery cell. The new state is a nonlinear function of the prior state and the prior input and the parameter values and the prior process noise. The measurement equation is a function of the state of the system and the input and the parameter values and the sensor noise. The only difference between this general model and what you learned about in the third course in the specialization is that now we include the parameter vector in the model explicitly, whereas we did not include it in the definition so explicitly before. So, at this point it is just a change in the notation to show you Theta in that notation. Whenever the model has numeric values that do not depend on time, they may still be built into the model equations, but those do not need to be included in the parameter vector, because we're not trying to adapt the values of parameters that we know do not change.
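A minimal sketch of such a state-space model with the parameter vector made explicit; the one-state model and its parameter values are invented here purely for illustration, not taken from the course's cell model:

```python
import numpy as np

# Hypothetical one-state model with theta = [a, b] made explicit:
#   x_k = f(x_{k-1}, u_{k-1}, theta, w_{k-1}) = a*x_{k-1} + b*u_{k-1} + w_{k-1}
#   y_k = h(x_k, u_k, theta, v_k)             = x_k + v_k
def f(x_prev, u_prev, theta, w_prev):
    a, b = theta
    return a * x_prev + b * u_prev + w_prev

def h(x, u, theta, v):
    return x + v

theta = np.array([0.95, 0.1])
x1 = f(x_prev=1.0, u_prev=2.0, theta=theta, w_prev=0.0)
y1 = h(x1, u=2.0, theta=theta, v=0.0)
```

Any fixed constants (like Pi or the gas constant) would be hard-coded inside f and h rather than placed in theta.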
So, for example, this is not really true for our battery cell model, but we might have the value Pi in an equation, or the universal gas constant in the equation, or something like that, which we know is never going to change. Those values would never be in Theta. The parameter vector Theta contains only those things that do change over time and that we wish to adapt. Using the definitions of the nonlinear state-space models that we've seen so far in this lesson, we can implement parameter estimation using any nonlinear type of Kalman filter, such as the extended Kalman filter or the Sigma-point Kalman Filter. In order to do so, we need to remember the six basic steps of general Gaussian sequential probabilistic inference, and apply those six steps to the model equations that you have seen already in this lesson. Let's begin by seeing how to apply the SPKF principles for parameter estimation. It turns out that developing parameter estimation using a Sigma-point Kalman Filter is pretty straightforward. So, I'm going to discuss it before we talk about the extended Kalman filter. We begin by defining an augmented parameter vector that combines the randomness of the parameters as well as the randomness of the sensor noise. This augmented vector is used in the estimation process in the way that I will discuss now. As you should expect by now, we're going to proceed by deriving the six steps of sequential probabilistic inference. If you remember, the first step computes a prediction of the parameter values at this point in time, using only prior information. The predicted augmented parameter vector at this point in time is equal to the expected value of the parameters given the previous measurements. Since the expected value of the fictitious noise, r, is equal to zero, this expected value is simply equal to the previous parameter estimate. I think that this result actually makes a lot of sense.
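In code, the augmented vector and the first (prediction) step might look like this; the dimensions and numeric values are illustrative assumptions only:

```python
import numpy as np

theta_hat_prev = np.array([0.95, 0.1])   # previous parameter estimate
n_e = 1                                   # assumed sensor-noise dimension

# The augmented vector stacks the parameter randomness with the
# (zero-mean) sensor-noise randomness.
aug_mean = np.concatenate([theta_hat_prev, np.zeros(n_e)])

# Step 1: since E[r] = 0, the predicted parameter vector is simply
# the previous parameter estimate.
theta_pred = theta_hat_prev.copy()
```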
We are assuming that the parameters are essentially constant over time, so, to predict new values for the parameters without any other information, we would simply say that our new prediction is equal to the prior estimate. The next step computes the prediction-error covariance. First, we write an expression for the parameter-vector error as equal to the true parameter values minus the predicted parameter values. The true values are equal to the previous true values plus the driving noise. When we subtract the predicted parameters from the true parameters, we have the error in the predicted parameters, as is shown here. Using the expression for the parameter prediction error above, we can directly compute the expectation for the covariance of this parameter vector. When we multiply out the quadratic form inside of the expectation, we find that a number of terms will disappear, because we are assuming that the fictitious noise is uncorrelated with previous errors and also that it is zero mean. So, the covariance of the prediction is equal to the covariance of the previous estimate plus the covariance of the fictitious driving noise. So, the time-update covariance, the prediction covariance, has additional uncertainty, because the fictitious noise that we are modeling causes parameter values to change in not truly predictable ways. So, the uncertainty has gone up from the previous estimate to the present prediction. The third step predicts the measurement that will be made. In order to predict this measurement, we require sigma points that describe the output variable that we expect to measure. This in turn requires that we define a set of p plus one sigma points describing the parameter vector at this point in time, which we will denote using the fancy symbol W. I use W because I have a previous life in neural networks, and the parameters of a neural network are called the weights of the model. So, I think of these parameters as weights, or W.
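Steps 1 and 2 together form the time update, which can be sketched as follows (the numeric values are illustrative assumptions):

```python
import numpy as np

def time_update(theta_hat, Sigma_theta, Sigma_r):
    """Step 1: prediction = previous estimate (since E[r] = 0).
    Step 2: prediction covariance = previous estimate covariance
    plus the covariance of the fictitious driving noise."""
    theta_pred = theta_hat.copy()
    Sigma_pred = Sigma_theta + Sigma_r
    return theta_pred, Sigma_pred

theta_pred, Sigma_pred = time_update(
    np.array([0.95, 0.1]), 1e-3 * np.eye(2), 1e-6 * np.eye(2))
# The diagonal of Sigma_pred exceeds that of the previous covariance:
# the uncertainty has gone up from estimate to prediction.
```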
The definition here for this set of sigma points should look really familiar. The first sigma point is simply the predicted value of the parameter vector, the next grouping of sigma points is equal to that predicted value plus some offsets based on a Cholesky decomposition of a covariance matrix, and the final set is equal to the predicted vector minus offsets based on the Cholesky decomposition of the covariance matrix. Each one of these sigma points has a component that describes the uncertain parameter vector, and another component that describes the uncertain measurement error. We separate out the parameter portion into one variable and the measurement-error portion into a separate variable. We continue with our description of how to compute the predicted measurement. We take the sigma points from the previous slide that described the uncertainties of the parameters and the uncertainties of the measurement errors, and we pass those pairs through the measurement equation, one pair at a time, to generate output sigma points, which I call fancy D. Finally, we predict the output as the expected value of the output equation given prior inputs. Using a sigma-point approach, this is the summation of the weighted output sigma points. The next step of the filter produces the estimator gain matrix. Remember that this is equal to the ratio of two covariance matrices. So, first we need to compute both of these covariance matrices. The covariance matrix of the measurement uncertainty is shown by the first line of this equation, and the cross covariance between the parameter uncertainty and the measurement uncertainty is shown as the second line of this equation. Once we've computed those, we simply compute the gain of the estimator as the cross covariance multiplying the inverse of the covariance of the measurement uncertainty. The next step of the filter updates the parameters from a prediction to an estimate.
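Steps 3 and 4 might be sketched as follows; the sigma-point scaling and weights chosen here are a simple unscented-style assumption of mine, not the course's exact tuning, and the scalar example at the end is invented for illustration:

```python
import numpy as np

def output_prediction_and_gain(theta_pred, Sigma_pred, Sigma_e, h, u):
    """Step 3: build sigma points for the augmented vector [theta; e],
    pass each (parameter, noise) pair through h, and take the weighted
    sum to predict the output. Step 4: form the two covariance matrices
    and the gain L = Sigma_theta_d * inv(Sigma_d)."""
    n_t, n_e = len(theta_pred), Sigma_e.shape[0]
    mean = np.concatenate([theta_pred, np.zeros(n_e)])
    cov = np.block([[Sigma_pred, np.zeros((n_t, n_e))],
                    [np.zeros((n_e, n_t)), Sigma_e]])
    n = n_t + n_e
    gamma = np.sqrt(n)                      # simple scaling assumption
    S = np.linalg.cholesky(cov)
    # Sigma points: mean, mean + offsets, mean - offsets (Cholesky columns)
    X = np.vstack([mean, mean + gamma * S.T, mean - gamma * S.T])
    wts = np.full(2 * n + 1, 1.0 / (2 * n))
    wts[0] = 0.0                            # weights consistent with gamma
    W, E = X[:, :n_t], X[:, n_t:]           # parameter / noise portions
    D = np.array([h(Wi, u) + Ei for Wi, Ei in zip(W, E)])  # output sigma pts
    d_pred = wts @ D                        # predicted measurement
    Sigma_d = sum(w * np.outer(Di - d_pred, Di - d_pred)
                  for w, Di in zip(wts, D))
    Sigma_td = sum(w * np.outer(Wi - theta_pred, Di - d_pred)
                   for w, Wi, Di in zip(wts, W, D))
    L = Sigma_td @ np.linalg.inv(Sigma_d)
    return d_pred, Sigma_d, L

# Hypothetical scalar example: output is theta * u plus sensor noise
h = lambda theta, u: np.array([theta[0] * u])
d_pred, Sigma_d, L = output_prediction_and_gain(
    np.array([2.0]), np.array([[0.01]]), np.array([[1e-4]]), h, u=1.5)
```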
This equation has exactly the same form in every kind of Kalman filter that we have studied. The estimate we find is equal to the prediction plus the gain factor multiplying the innovation. The final step updates the uncertainty of the prediction into the uncertainty of an estimate. The uncertainty of the estimate is equal to the uncertainty of the prediction minus the estimator gain vector, multiplying the covariance of the innovation, multiplying the gain vector transpose, which is once again the same equation that we have seen in other places. So, that brings us to an end of this description regarding how to use a Sigma-point Kalman Filter to estimate the parameter vector of a model. First, we must define a relevant nonlinear state-space model, which we've done, and you've seen one method that we can use to do so. You have also learned how to derive a Sigma-point Kalman Filter for parameter estimation. It turns out that the prediction steps are similar to, but actually simpler than, those of the Sigma-point Kalman Filter for state estimation. That's because we've constrained the parameter update equation to have a linear form. The update steps of the Sigma-point Kalman Filter for parameter estimation are essentially identical to those of the Sigma-point Kalman Filter for state estimation, at least in their structure. I'm not going to spend time in this lecture video describing and summarizing all of these steps in detail, but I have included an appendix to this lesson that lists all of these steps for your reference. So, instead of doing that, what I will do is move on to the next lesson, where I will show you how we can derive an extended Kalman filter for parameter estimation as well.
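The final two steps can be sketched like this, with made-up numbers purely to show the shapes of the update:

```python
import numpy as np

def measurement_update(theta_pred, Sigma_pred, L, d, d_pred, Sigma_d):
    """Step 5: estimate = prediction + gain * innovation.
    Step 6: estimate covariance = prediction covariance
    minus L * Sigma_d * L^T."""
    theta_hat = theta_pred + L @ (d - d_pred)
    Sigma_hat = Sigma_pred - L @ Sigma_d @ L.T
    return theta_hat, Sigma_hat

theta_hat, Sigma_hat = measurement_update(
    theta_pred=np.array([2.0]), Sigma_pred=np.array([[0.01]]),
    L=np.array([[0.5]]), d=np.array([3.1]),
    d_pred=np.array([3.0]), Sigma_d=np.array([[0.02]]))
```

Note that the estimate covariance is smaller than the prediction covariance: the measurement has reduced our uncertainty about the parameters.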