So, now we get to this very important topic of sampling distributions, and I'm going to take you through some mental exercises to try and understand where a p-value comes from. That's what it's all about, because from the sampling distribution we've got to establish what a p-value is. We're not going to calculate it in any way, but we can understand what it really means.

Now, from the central limit theorem we know that we have a normal distribution. It is a Gaussian, or bell-shaped, curve, and for us that curve represents a sampling distribution, usually of sample means. Remember, though, that we've got a problem to deal with initially: we are dealing with continuous data types. So we've moved away from being able to ask, what is the probability of rolling an eleven if I were to roll a pair of dice? I can no longer ask, what is the probability of getting an Hb, a hemoglobin value, of exactly 14.5 grams percent? That no longer exists, because 14.5 on its own has no meaning: I can ask for 14.55, 14.555, 14.55555, and so on. For practical purposes, it goes on forever.

So I've got to change the way I look at it. I've got to ask: what is the probability of finding a value between 14 and 15? What is the probability of finding a value greater than 18? If I knew what the probability was of finding a value of 18 or more, I could then ask where a value of 20 would fall. Is it going to fall within that section or outside of that section? That's how we start to look at the p-value. It is not the probability of finding an exact single value, but of a value falling within some range.

Now, what is this p-value actually? Well, we know it's p: it stands for probability, the likelihood of finding some result, some event, if you do an experiment. But how was it conceptualized? How did the mathematicians and statisticians come up with this?
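To make that shift from exact values to ranges concrete, here is a minimal sketch. The hemoglobin mean of 14 grams percent and standard deviation of 1.5 are purely hypothetical numbers chosen for illustration, not values from the lecture:

```python
from statistics import NormalDist

# Discrete case: an exact outcome still has a probability.
# There are 2 ways to roll an eleven with a pair of dice,
# (5, 6) and (6, 5), out of 36 equally likely outcomes.
p_eleven = 2 / 36

# Continuous case: a single exact value such as Hb = 14.5 has
# probability zero, so we ask about ranges instead. The mean of
# 14 and standard deviation of 1.5 are assumed for illustration.
hb = NormalDist(mu=14.0, sigma=1.5)

p_between_14_and_15 = hb.cdf(15) - hb.cdf(14)  # P(14 < Hb < 15)
p_above_18 = 1 - hb.cdf(18)                    # P(Hb > 18)

print(p_eleven)
print(p_between_14_and_15)
print(p_above_18)
```

A value of 20 would fall well inside the region beyond 18, the part of the curve that occurs very infrequently; that tail probability is exactly the kind of quantity a p-value reports.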
I just want to take you through some of the thought processes of how this was developed, how the curve comes to be, and what eventually goes into that really nice curve that we see. And that beautiful equation that we looked at, the one that actually draws the curve: how was it discovered? How was it conceptualized?

Well, it all starts with a histogram. Now, this will remind you of the throwing of the die, which would be discrete data points, but remember that eventually we're going to make those rectangles narrower and narrower and narrower until they have no width at all. Let's start conceptualizing things this way, though. We start with a histogram, and a histogram just counts how many times something occurred. So we've got a whole set of occurrences and we just count how many there are. If we look at this little example (none of these examples are to scale, so don't be too concerned about that; it's just the concept), we look at values between 50 and 60. In this sample set there were about 80 values that fell between 50 and 60. Between 40 and 50, and between 60 and 70, a few less occurred, but that's the total number of data points within those bounds.

Now we change something ever so slightly: we move on to relative frequencies. We just divide each count, the 80 values between 50 and 60 for instance, by the total number of data points. Remember, there were 6 ways to get a 7, but there were 36 possible outcomes, so it was 6 divided by 36; we are just changing to relative frequencies. The y-axis then changes too. That tallest bar now makes up just over 20%, or 0.2, and as the counts become smaller, each bar is just a fraction of the whole, a percentage of 100.

Then we have to move on one more step, and now we get to densities: probability densities or frequency densities. And what happens there is we take that value of 0.2.
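As a rough sketch of this counting step, assuming a hypothetical set of bin counts invented so that the 50-to-60 bin holds 80 out of 400 values:

```python
# Hypothetical bin counts; not to scale, invented so that the
# 50-60 bin holds 80 out of 400 values, as in the lecture.
counts = {
    "20-30": 40, "30-40": 55, "40-50": 70,
    "50-60": 80, "60-70": 65, "70-80": 50, "80-90": 40,
}

total = sum(counts.values())  # 400 data points in all

# Relative frequency: each count divided by the total, just as
# 6 ways to roll a 7 out of 36 outcomes gives 6 / 36.
relative = {bin_: n / total for bin_, n in counts.items()}

print(relative["50-60"])  # the 50-60 bin is 0.2 of the whole
```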
We divide it by the width of that section. That section ran from 50 to 60, so we divide by 10, which brings us to 0.02. This is a probability density. And that little line the computer draws, that normal distribution, is a probability density function.

Now, a sampling distribution has a mean. We've seen it with the central limit theorem: some values of the sample mean occur most commonly, and eventually we get down to the densities, the distribution densities, of those means. And it has its very own standard deviation, so for every kind of parameter there's going to be a different standard deviation. If we knew these values, we could construct that nice curve. We could also construct which area would represent a total area of 5% under the curve, putting it on both sides, so 2.5% on either side. We could do all of that. We could work out what it would take to get a value which is very rare, one that falls in a section that occurs very infrequently, while other values occur a lot more commonly and fall closer to the middle.

In order for all of that to be done, we have to understand one extra concept, and that is the standard error. When we talk about sampling distributions, the sampling distribution of sample means for instance, it does have a mean. It does have a standard deviation, but it's a special type of standard deviation: we call it the standard error, as I said. And it's just the standard deviation divided by the square root of the sample size. So there's a little equation attached to it, but we just use a different language. You don't have to understand the mathematics behind it; it's just a different language. We now talk about so many standard errors away from the mean, as opposed to so many standard deviations away from the mean.
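A minimal sketch of these last two steps, the division by bin width and the standard error. The sample standard deviation of 12, sample size of 36, and mean of 100 are all assumed purely for illustration:

```python
from math import sqrt
from statistics import NormalDist

# From relative frequency to density: divide by the bin width.
rel_freq = 0.2          # 20% of the values fell between 50 and 60
bin_width = 60 - 50     # that section ran from 50 to 60
density = rel_freq / bin_width   # 0.2 / 10 = 0.02

# Standard error: the standard deviation of the sampling
# distribution of sample means. These numbers are hypothetical.
sd = 12.0               # sample standard deviation
n = 36                  # sample size
se = sd / sqrt(n)       # 12 / 6 = 2

# The cut-offs leaving a total of 5% under the curve, 2.5% in
# each tail, sit about 1.96 standard errors from the mean.
z = NormalDist().inv_cdf(0.975)
mean = 100.0            # hypothetical sampling-distribution mean
lower, upper = mean - z * se, mean + z * se

print(density)
print(se)
print(round(lower, 2), round(upper, 2))
```

Sample means landing outside `lower` and `upper` are the rare ones, the values that fall in the sections occurring very infrequently.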
If we used standard deviations we would get a different, incorrect answer, one that wouldn't make practical sense. So we talk about so many standard errors away from the mean. Now, in the next sections we're going to talk about two distributions: first the z-distribution, which is quite easy to explain and easy to understand, and then the t-distribution, which is what we really want to get at.