Recall our stylized decision rule. A manager makes a prediction V̂ of the future uncertain value V of an investment. S/he then sets a threshold V* and decides to commit resources today to make the investment if V̂ is higher than the threshold V*. Otherwise, s/he does not invest. We also said that this decision can produce two errors: a type I error if s/he invests and finds out later that the investment is unprofitable, and a type II error if s/he does not invest, but the investment would have been profitable. A high V* implies a more conservative rule: the manager is less likely to invest, more likely to make type II errors, and less likely to make type I errors. Vice versa if s/he sets a low V*.

Before we discuss the mechanics of this decision process, it is useful to provide some background on probabilities. In an article of January 22, 2018 in the Harvard Business Review, Walter Frick argues that managers have to think probabilistically and learn basic probability. This session provides these basics.

We start with the notion of a random variable. A random variable x is a variable that takes values x1, x2, x3, up to xn, with probabilities p1, p2, up to pn. A classical example is the toss of a die, which takes the values one to six, each with probability one-sixth. There are, of course, more complex phenomena, and there are subjective probabilities, in which you make a subjective assessment of the probability that a certain event occurs. For example, the probability that it rains tomorrow is 30 percent. Probabilities are numbers between zero and one, such that the sum of the probabilities of all the realizations of a random variable is equal to one. Probability distributions represent these probabilities. There are many distributions. In the uniform distribution, all the realizations of the random variable between two generic numbers, a and b, have the same probability.
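The threshold rule and its two error types can be sketched in a short simulation. This is an illustrative sketch, not part of the original discussion: it assumes the true value V is standard normal, that the prediction V̂ is V plus noise, and that an investment is "profitable" when V is positive. All function and parameter names are invented for this example.

```python
import random

def simulate_decision_rule(v_star, n=10_000, noise=1.0, seed=0):
    """Simulate the threshold rule: invest when the prediction V_hat
    exceeds the threshold v_star. Returns the rates of the two errors.
    (Distributional assumptions are illustrative, not from the text.)"""
    rng = random.Random(seed)
    type_1 = type_2 = 0  # invest & unprofitable / skip & profitable
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)            # true uncertain value V
        v_hat = v + rng.gauss(0.0, noise)  # noisy prediction V_hat
        invest = v_hat > v_star
        if invest and v < 0:
            type_1 += 1                    # invested, turned out unprofitable
        elif not invest and v > 0:
            type_2 += 1                    # passed on a profitable investment
    return type_1 / n, type_2 / n

# A higher threshold is more conservative:
# fewer type I errors, more type II errors.
low = simulate_decision_rule(v_star=-1.0)
high = simulate_decision_rule(v_star=1.0)
```

Running both calls shows the trade-off described above: moving V* up trades type I errors for type II errors.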
Graphically, this is represented by a diagram in which the x-axis reports the realizations of the random variable and the y-axis reports the corresponding probabilities. The uniform distribution takes the form of a line parallel to the x-axis, whose height is equal to the constant probability of each realization, as shown in the figure. A classical probability distribution is the normal distribution, whose realizations go from minus infinity to plus infinity and which is symmetric around the mean, which is also the value with the highest probability. The Poisson distribution is, instead, asymmetric: it has high probabilities for small realizations of the random variable and typically accounts for rare events.

Any function can represent a probability distribution, as long as it satisfies two properties. One, each value of the distribution, which represents a probability, has to be a number between zero and one. Two, the area under the probability distribution, which represents the sum of all probabilities, has to be equal to one. Variables can be discrete instead of continuous, in which case probability distributions are represented by histograms. In this case, too, the probabilities, which are represented by the heights of the bars, have to sum to one. The areas of intervals under the probability distribution represent the probability that the random variable falls in the interval. For example, in the figure we represent the probabilities that the random variable is smaller than a, higher than b, or falls between c and d.

A probability distribution is characterized by its moments. The first three moments are the mean, the variance, and the index of skewness. The mean is equal to the sum of each realization of the random variable multiplied by its probability.
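The two properties, and the idea that interval probabilities are sums (areas) of individual probabilities, can be checked directly for a discrete variable. The helper names and the example numbers below are illustrative, not from the text.

```python
def is_valid_distribution(probs, tol=1e-9):
    """Check the two properties from the text: every probability lies
    between zero and one, and all probabilities sum to one."""
    return all(0.0 <= p <= 1.0 for p in probs) and abs(sum(probs) - 1.0) < tol

def interval_probability(values, probs, lo, hi):
    """P(lo <= x <= hi): add up the probabilities of the realizations
    falling in the interval -- the discrete analogue of the area
    under the distribution over that interval."""
    return sum(p for x, p in zip(values, probs) if lo <= x <= hi)

# An illustrative discrete random variable.
values = [1, 2, 3, 4, 5]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]

valid = is_valid_distribution(probs)          # True: both properties hold
p_mid = interval_probability(values, probs, 2, 4)  # 0.2 + 0.4 + 0.2
```

A distribution like `[0.5, 0.7]` would fail the check because its probabilities sum to more than one, even though each is between zero and one.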
For example, if a random variable takes the values one, two, three, four, five, with probabilities 0.1, 0.2, 0.4, 0.2, and 0.1, the mean is equal to one times 0.1, plus two times 0.2, plus three times 0.4, plus four times 0.2, plus five times 0.1, which is equal to three. The mean tells you where the distribution is centered. The variance is equal to the sum of the squared differences between each realization of the random variable and the mean, each multiplied by the probability of that realization. In our example, the variance is equal to 1.2. The variance gives you an idea of the spread of the distribution. The skewness tells you whether the distribution has more weight in its left or right tail. The index of skewness is positive or negative according to whether the distribution is skewed to the right (a longer right tail) or to the left (a longer left tail). The figures give you a visual representation of distributions with different moments.

You can think of these moments as three numbers that help you make decisions by summarizing many numbers. Suppose you have to think of the heights of all the citizens in your town. If I tell you the mean, you begin to have a sense of them. If I tell you the variance, you also have an idea of the spread of the heights of your fellow citizens. With the index of skewness, you figure out whether most of your fellow citizens are short or tall, or whether the distribution is symmetric around the mean. With these three numbers, and even with just the first two, you already have a good sense of what hundreds, thousands, or even millions of numbers look like.
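The worked example above can be reproduced in a few lines. This sketch follows the definitions in the text; for the skewness it uses one common standardized index (third central moment divided by the standard deviation cubed), which is an assumption since the text does not pin down a formula.

```python
def moments(values, probs):
    """First three moments of a discrete random variable, following the
    text: mean, variance, and a standardized index of skewness."""
    mean = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    # Assumed skewness index: third central moment over sigma cubed.
    skew = sum((x - mean) ** 3 * p for x, p in zip(values, probs)) / var ** 1.5
    return mean, var, skew

values = [1, 2, 3, 4, 5]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
mean, var, skew = moments(values, probs)
# mean is approximately 3, var approximately 1.2, and skew approximately 0,
# since this distribution is symmetric around its mean.
```

The zero skewness confirms what the symmetry of the example already suggests: equal weight in the two tails.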