We're now going to review some of the basic concepts from probability. We'll discuss expectations and variances, we'll discuss Bayes' theorem, and we'll also review some of the commonly used distributions from probability theory. These include the binomial and Poisson distributions, as well as the normal and log-normal distributions.

First of all, I just want to remind all of us what a cumulative distribution function is. A CDF, a cumulative distribution function, is F of x; we're going to use F of x to denote the CDF. We define F of x to be equal to the probability that a random variable X is less than or equal to little x. For discrete random variables, we also have what's called a probability mass function, which we'll denote with little p. It satisfies the following properties: p is greater than or equal to zero, and for all events A, the probability that X is in A is equal to the sum of p of x over all those outcomes x that are in the event A.

The expected value of a discrete random variable X is then given to us by this formula here. It's the sum of the possible values of the random variable X, the x i's, weighted by their probabilities, p of x i. So that's the expected value of X. To give you an example, suppose I toss a die. It takes on six possible values, one, two, three, four, five, and six, and it takes on each of these values with probability one-sixth. So in this case, for example, the probability that X is greater than or equal to four is equal to one-sixth for four, plus one-sixth for five, plus one-sixth for six. That's one-sixth plus one-sixth plus one-sixth, which equals one-half. Likewise, we can compute the expected value of X in this case. It is equal to one-sixth times one, plus one-sixth times two, and so on, up to one-sixth times six, and that comes out to be three and a half.
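As a quick sketch of the die example above (my own illustration, not part of the lecture), we can write down the probability mass function of a fair die and compute both the event probability and the expected value directly from the definitions:

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die:
# each outcome 1..6 occurs with probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# P(X >= 4): sum the probabilities of the outcomes in the event {4, 5, 6}.
p_at_least_4 = sum(p for x, p in pmf.items() if x >= 4)

# E[X]: each possible value weighted by its probability.
expected_value = sum(x * p for x, p in pmf.items())

print(p_at_least_4)    # 1/2
print(expected_value)  # 7/2, i.e. three and a half
```

Using exact fractions rather than floats makes the two answers from the lecture, one-half and three and a half, come out exactly.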
We also have the variance of a random variable, which is defined as the expected value of X minus the expected value of X, all squared. If you expand this quantity out, you can see that you'll also get this alternative representation: the variance of X is also equal to the expected value of X squared, minus the expected value of X, all squared. So those are discrete random variables, the probability mass function, and so on.

Let's look at a couple of distributions. The first distribution I want to talk about is the binomial distribution. We say that a random variable X has a binomial distribution, and we write it as X ~ Bin(n, p), if the probability that X is equal to r is equal to n choose r, times p to the r, times one minus p to the n minus r. For those of you who've forgotten, n choose r is equal to n factorial divided by r factorial times n minus r factorial. The binomial distribution arises, for example, in the following situation. Suppose we toss a coin n times and we count the number of heads. Then the total number of heads has a binomial distribution. We're assuming here that these are independent coin tosses, so that the result of one coin toss has no impact or influence on the outcome of other coin tosses. The mean and variance of the binomial distribution are given to you by these quantities here: the expected value of X equals np, and the variance of X equals np times one minus p.

Now, there's actually an interesting application of the binomial distribution to finance, and it arises in the context of analyzing fund manager performance. We'll return to this example later in the course, but let me just give you a little flavor of it now. Suppose, for example, a fund manager outperforms the market in any given year with probability p, and that she underperforms the market with probability one minus p.
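To make the binomial formulas above concrete, here is a small sketch (an illustration of my own, with n and p chosen arbitrarily) that computes the pmf from its definition and checks it against the stated mean and variance formulas:

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p): (n choose r) * p^r * (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.3  # arbitrary example parameters

# The pmf sums to one over r = 0, ..., n.
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))

# Mean and variance computed directly from the pmf agree with the
# closed forms from the lecture: E[X] = n*p and Var(X) = n*p*(1-p).
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
var = sum(r**2 * binomial_pmf(r, n, p) for r in range(n + 1)) - mean**2

print(round(total, 6), round(mean, 6), round(var, 6))
```

Note that the variance is computed via the alternative representation from the start of this section: the expected value of X squared minus the expected value of X, all squared.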
So we're assuming here that the fund manager either outperforms or underperforms the market, only two possible outcomes, and that they occur with probabilities p and one minus p respectively. Suppose this fund manager has a track record of 10 years, and that she has outperformed the market in eight of these 10 years. Moreover, let's assume that the fund manager's performance in any one year is independent of her performance in other years. So a question that many of us would like to ask is the following: how likely is a track record as good as this, i.e. outperforming in eight years out of 10, if the fund manager had no skill? Of course, if the fund manager had no skill, we could assume that p is equal to a half.

We can answer this question using the binomial model, or the binomial distribution. Let X be the number of outperforming years. If the fund manager has no skill, then over the 10 years the total number of outperforming years X is binomial with n equal to 10 years and p equal to a half. We can then compute the probability that the fund manager does at least as well as outperforming in eight years out of 10 by calculating the probability that X is greater than or equal to eight. What we're doing here is calculating the probability that the fund manager would have eight, nine, or 10 years out of 10 in which she outperformed the market. That is given to us by the sum of these binomial probabilities here. These are the original binomial probabilities from earlier in the slide, and we sum them from r equals eight to n; n in this case, of course, is 10. So that's one way to try and evaluate whether the fund manager has just been lucky or not. One can compute this probability, and if it's very small, then you might conclude that the fund manager was not just lucky, and that she had some skill. But actually, this opens up a whole can of worms. There are a lot of other related questions that are very interesting. Suppose there are M fund managers.
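The no-skill probability described above can be computed directly by summing the binomial pmf from r equals eight to 10. A minimal sketch:

```python
from math import comb

n, p = 10, 0.5  # 10-year track record, no-skill probability of outperforming

# P(X >= 8) = sum over r = 8, 9, 10 of (n choose r) * p^r * (1-p)^(n-r)
prob = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(8, n + 1))

print(prob)  # 0.0546875
```

So under the no-skill model there is roughly a 5.5% chance of a track record this good, which is small but perhaps not small enough on its own to conclude the manager is skilled.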
How well should the best one do over the 10-year period if none of them had any skill? In this case, you don't have just one fund manager, as we had in the example so far; we now have M of them. It stands to reason that even if none of them had any skill, then as M gets large, you would expect at least one of them, or even a few of them, to do very well. How can you analyze that? Again, you can use the binomial model, and what are called order statistics of the binomial model, to do this. We'll return to this question later in the course.

So let's talk about another distribution that often arises in finance and financial engineering: the Poisson distribution. We say that X has a Poisson Lambda distribution, so Lambda is the parameter of the distribution, if the probability that X equals r is equal to Lambda to the power of r, times e to the minus Lambda, divided by r factorial. For those who've forgotten factorials (we also used them in the binomial model a moment ago), r factorial is equal to r times r minus one times r minus two, all the way down to two times one. So this is the Poisson distribution. The expected value and the variance of a Poisson random variable are identical and equal to Lambda.

We'll actually just show this result for the mean here; it's very simple. We know that the expected value of X is equal to the sum of the possible values of X, so these are the r's, times the probability that X is equal to r, where r runs from zero to infinity. We know the probability that X equals r from up here, so we can substitute that down in here, and now we just evaluate the sum. The first thing to notice is that when r equals zero, that term in the sum is equal to zero. So we can ignore the zeroth element and let the summation run from r equals one. Then we get this quantity here.
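As a numerical sketch of the claim that the Poisson mean and variance both equal Lambda (again my own illustration, with an arbitrary Lambda and the infinite sum truncated at a point where the tail is negligible):

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Poisson(lam): lam^r * e^(-lam) / r!."""
    return lam**r * exp(-lam) / factorial(r)

lam = 2.5  # arbitrary example parameter
R = 100    # truncation point; the tail beyond this is negligible for lam = 2.5

# Mean and variance computed directly from the pmf.
mean = sum(r * poisson_pmf(r, lam) for r in range(R))
var = sum(r**2 * poisson_pmf(r, lam) for r in range(R)) - mean**2

print(round(mean, 6), round(var, 6))  # both come out to 2.5
```

Both quantities match Lambda, consistent with the result the lecture is about to derive for the mean.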
We can cancel this r with the first r up here and write the factorial as r minus one factorial. We can also pull one of the Lambdas out, leaving us with a Lambda to the r minus one inside the sum. Now if we look at this summation here, we see that it's the same as shifting the index so that the sum runs from r equals zero instead of r equals one, replacing r minus one with r, and r minus one factorial with r factorial. But that sum is just the sum of the probabilities: the probability that X equals zero, plus the probability that X equals one, plus the probability that X equals two, and so on. So it is equal to one; the total sum of probabilities must be equal to one. So the expected value is equal to Lambda.
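The derivation just described can be written out compactly, with the index shift made explicit by substituting s for r minus one:

```latex
\begin{align*}
\mathbb{E}[X] &= \sum_{r=0}^{\infty} r \, \frac{\lambda^r e^{-\lambda}}{r!}
  = \sum_{r=1}^{\infty} \frac{\lambda^r e^{-\lambda}}{(r-1)!}
  % the r = 0 term vanishes, and r cancels into r!
  \\
&= \lambda \sum_{r=1}^{\infty} \frac{\lambda^{r-1} e^{-\lambda}}{(r-1)!}
  = \lambda \sum_{s=0}^{\infty} \frac{\lambda^s e^{-\lambda}}{s!}
  = \lambda \cdot 1 = \lambda,
\end{align*}
```

where the final sum is the total probability over all outcomes of a Poisson(Lambda) random variable, which is one.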