So, data visualization can consist of some very simple charts, but the success of a data visualization often depends on how we map our data variables to the elements of those charts.
We can start with a bar chart. A bar chart typically has two axes, a horizontal axis and a vertical axis. You're usually measuring discrete values horizontally and some either discrete or continuous value vertically. The bar chart benefits from the fact that you're mapping a data variable to both position, the actual height of the tops of these bars, and length, the size of the bar. So you do a really good job of seeing not only that, for example, the orange bar is larger than the blue bar, but also how much larger the orange bar is than the blue bar, because position and length are both at the top of perceptual effectiveness for displaying quantitative values. Usually, vertically, we have some sort of quantitative dependent variable, and horizontally we have categories: some nominal variable, or at least some discrete variable, indicating the individual bars that we're plotting. The horizontal variable is an independent variable, a dimension, and the vertical variable is some kind of measure of that dimension. It's a dependent variable, depending on the value of the independent variable.
Similarly, you have a line chart. A line chart has data points that are connected by a line, so it is very similar to a bar chart. The data points are at the same altitude as the tops of the bars, so they benefit from position, but they don't have the length that you visually see with the bars in a bar chart. You still do a pretty good job of discerning quantitative values, and the relationships between them, from the altitudes of these data points in a line chart. So again we have a quantitative dependent variable vertically that's changing based on some quantitative independent variable horizontally.
But now the horizontal value is some quantitative continuous variable, and the vertical value also needs to be a quantitative continuous variable, because we're drawing lines between these data points, and these lines imply that there's a continuity of values between the data points. The data points have a horizontal and a vertical component, and so do the lines. So you don't want to use a line chart to display data across categories, because that implies there are in-between values between the categories, and if they're nominal categories, if they're discrete, then there should not be in-between values. Your visualization shouldn't imply that there are.
If we remove the lines, we get a scatter plot, and a scatter plot gives us some other flexibility. When we display a line plot, we're displaying a function: some dependent variable that's changing according to an independent variable, so that there's one dependent value for every independent value. There's basically one measure for each value of the dimension. When we do a scatter plot, we have two independent variables, so I can have the same horizontal value and have two different vertical values associated with it, and that can be powerful. You usually don't connect these points with a line unless there is some order in which the data is coming in that you want to show, and that would be an additional dimension you could indicate on a scatter plot. But that line doesn't imply that you're plotting a function, because a scatter plot doesn't plot a function unless the data is organized that way. So you have two independent variables, a horizontal one and a vertical one, and you're getting an indication of position, both horizontally and vertically, for the quantitative values on each of the two axes.
You also get some cues based on density if these points tend to cluster in certain areas.
You can also create a Gantt chart, which kind of looks like a sideways bar chart, except the bases of the bars don't line up with one of the axes like they would in a bar chart. In this case, we have two independent variables, things that are no longer related as a function, but you still get the benefits of position and length. Gantt charts are usually process diagrams that tell you the various stages of a project. Horizontally, a Gantt chart would usually be some display of time: this may be a quarter, or a date, or some other time axis. Vertically, you have some categorical, often discrete or nominal, independent variable, and this is typically the tasks. So you'll have the first task and then the second task, and the second task may start before the first task finishes. Tasks may stop and then start up again, and so you get this overlap. Again, it benefits from both position and length, but it operates on two independent variables. One could be quantitative and one could be nominal, similar to a bar chart. But in a bar chart you have one dependent variable plotted over an independent variable; in a Gantt chart you have two independent variables.
And, finally, you have a table. In this case, you have two nominal variables, two categories, for example, and they're independent variables. One doesn't necessarily depend on the other; you're just looking at two separate dimensions and plotting some value as the entry in each of the table cells. So it really benefits from position only, and again, that position is discrete or nominal. It's not a continuous position, as it would be in a scatter plot. It's in discrete, quantized regions. You might also notice, if you look at this long enough, some flashing happening at the intersections.
And it's, again, important to remember your perceptual psychology when you're laying these things out, and to pay attention to contrast, to make sure that you don't get some unwanted perceptual features.
So here's a table that visualizes the decision you need to make about which chart to use in various situations, depending on the data that you want to display. Very often you have at least one independent variable, and then you may have a dependent variable that depends on it, or you may have a second independent variable. Your independent variable might be discrete or nominal, some category, or it might be some quantity that varies continuously. Your dependent variable could similarly be continuous or discrete, and a second independent variable, independent of your horizontal axis, could be a category or a continuously changing value. Depending on each of these configurations, you can look up in this table which chart you want to use.
If you have an independent variable and a dependent variable, then most often you want to use a bar chart. You can use a line chart, but only when you have a continuous dependent variable and a continuous independent variable, because the lines indicate that there are in-between values both horizontally and vertically. You want to use a Gantt chart if you have two independent variables, one continuous and one categorical: either a continuous axis horizontally and a categorical axis vertically, or a categorical axis horizontally and a continuous axis vertically. Either one of those will form a Gantt chart. If you have two categorical axes, you want to make a table. And if you have two continuous axes that are both independent, you want to make a scatter plot.
So we use the kind of data that we're trying to visualize, whether it's nominal, ordered, or quantitative, whether it's continuous or discrete, and whether variables are dependent or independent, not only to figure out how they map to chart elements, but more importantly to decide which chart best displays them.
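
The chart-selection rules described in this lecture can be written down as a small lookup, which may make the decision table easier to follow. This is only a sketch of the rules as they are stated here, not code from the course: the names `Scale` and `choose_chart` and the two-level "categorical" versus "continuous" classification are assumptions made for illustration.

```python
from enum import Enum


class Scale(Enum):
    """Hypothetical two-way classification of an axis variable."""
    CATEGORICAL = "categorical"  # discrete or nominal values
    CONTINUOUS = "continuous"    # quantitative, continuously varying values


def choose_chart(x_scale, y_scale, y_depends_on_x):
    """Return the chart types the lecture suggests for this configuration.

    x_scale, y_scale: Scale of the horizontal and vertical variables.
    y_depends_on_x: True if the vertical variable is a dependent measure
        of the horizontal one, False if both variables are independent.
    """
    both_continuous = (
        x_scale is Scale.CONTINUOUS and y_scale is Scale.CONTINUOUS
    )
    both_categorical = (
        x_scale is Scale.CATEGORICAL and y_scale is Scale.CATEGORICAL
    )

    if y_depends_on_x:
        # A bar chart is the usual choice for a dependent variable.
        charts = ["bar chart"]
        # Lines imply in-between values on both axes, so a line chart
        # is only an option when both variables are continuous.
        if both_continuous:
            charts.append("line chart")
        return charts

    # Two independent variables.
    if both_categorical:
        return ["table"]
    if both_continuous:
        return ["scatter plot"]
    # One continuous and one categorical axis, in either orientation.
    return ["Gantt chart"]


if __name__ == "__main__":
    # Walk through every configuration in the decision table.
    for dependent in (True, False):
        for x in Scale:
            for y in Scale:
                print(x.value, y.value, dependent, choose_chart(x, y, dependent))
```

Returning a list rather than a single name keeps the case where both a bar chart and a line chart are acceptable, instead of forcing one answer.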