In this section, we'll be discussing basic statistics. We will define several techniques used to measure central tendency and apply them through examples. The topics of this discussion are: What is statistics? What are the different types of statistics? And what are some ways to measure data using the mean, median, and mode of a dataset?

So, what is statistics? It's a gathering of facts or data, typically numerical, that, once tabulated or organized, can present significant information about a given subject. Let's move on with some other definitions. An observation is an individual piece of data that you've collected. For example, your home address is an individual piece of data. A dataset is a collection of all observations, for instance the home addresses of everyone in your neighborhood.

There are generally two types of statistics: descriptive and inferential. Descriptive statistics involves the organization and summarization of information using charts, graphs, tables, measures of centrality, variation, and percentiles. An example might be just to report the percentage of students in this class who own a smartphone. Simply reporting that percentage is an example of a descriptive statistic. Inferential statistics, on the other hand, involves methods to measure the reliability of conclusions regarding a population based on a sample of that population. You can compute descriptive statistics from the sample, but it's more involved statistically if you use that sample to infer a conclusion about an entire population. We will get into that a little bit later.

An arithmetic mean, or simply a mean, is a popular way to describe a dataset. We've always called it an average; haven't we all averaged our grades in school? The mean is calculated the same way: the sum of the observations divided by the number of observations. The second most common measure of the center of a dataset is the median.
This is the middle of the ordered data: half of the data lies above the median, and the other half lies below it. We have an example later that shows you how to find this number. A mode, not to be confused with à la mode, is determined by how frequently observations repeat. If there is no repetition, then there is no mode. The value with the greatest frequency is the mode of the dataset. In case of ties, both are modes.

This example is a collection of batting averages from two professional baseball leagues. The seven players on the left represent the American League and the seven on the right represent the National League. We want to know which league has the better batting average. We will also glean additional information from this dataset. Let's find the mean for each league using our formula. We find that the mean batting averages are .473 for the American League and .305 for the National League, clearly indicating that the American League has the better batting average.

What else can we tell from this table? The median indicates the middle of the data. Since we have seven players, an odd number, it's easy to tell that the fourth one down from the top, or the fourth one up from the bottom, is the median. That means that for the American League the median is .500: half of the data lies above .500 and half lies below .500. For the National League, the median is .250, which is lower but also indicates that half is below and half is above. Notice that the mean shown on the last row would not tell you this information.

The mode is the batting average with the highest frequency. We see that .667 appears twice in both leagues, so .667 is the mode. If we combine both leagues, we find that the .667 batting average is still the mode.
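
If you'd like to try the three measures yourself, here is a minimal Python sketch. The seven batting averages in it are made up for illustration and are not the values from the table in this example. The `modes` helper is a hypothetical name introduced here; it follows the convention described above, where a dataset with no repeated value has no mode and ties produce more than one mode.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical batting averages for seven players.
# These are illustrative values, NOT the table from the lecture.
league = [0.200, 0.250, 0.300, 0.333, 0.400, 0.500, 0.500]


def modes(data):
    """Return every value tied for the highest frequency.

    Returns an empty list when no value repeats, matching the
    convention that such a dataset has no mode.
    """
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []
    return [value for value, count in counts.items() if count == top]


# Mean: sum of the observations divided by the number of observations.
league_mean = mean(league)

# Median: middle value of the sorted data (the 4th of 7 here).
league_median = median(league)

# Mode(s): the most frequent value(s), if any value repeats.
league_modes = modes(league)

print(f"mean   = {league_mean:.3f}")
print(f"median = {league_median:.3f}")
print(f"modes  = {league_modes}")
```

With this made-up list, the fourth value in sorted order, .333, is the median, and .500 is the only value that repeats, so it is the mode. Python's standard library also offers `statistics.multimode`, but it returns every value when nothing repeats, which is why the sketch uses its own helper instead.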