Another heuristic that Edward Tufte introduces us to is called chartjunk. Now, Tufte is much more damning of chartjunk than he is of other forms of non-data ink. Indeed, he suggests that artistic decorations on statistical graphs are like weeds in our data graphics. He suggests there are really three kinds of chartjunk. The first is unintended optical art: for instance, excessive shading or patterning of chart features, such as shown in this economics graphic which Tufte shares in his book, The Visual Display of Quantitative Information. Here, the patterns make the human eye jump and cause visual fatigue. These phenomena are called moiré patterns. This is the same reason you don't usually see people wearing striped shirts when filming video content like this MOOC, since the low resolution of the video accentuates the moiré issue. Tufte suggests that instead of patterning this content, one is better off labeling the chart graphics directly. We saw that being used in the data-ink ratio example by Darkhorse Analytics. The second form of chartjunk is the grid. Tufte suggests that the grid is not only unnecessary as data ink, but also competes with the actual data being shared. Thinning, removing, or desaturating grid lines makes it easier to see the data, instead of being overwhelmed by the number of lines on the page. Direct labeling of data is another great way to reduce this form of chartjunk. But it's the third form of chartjunk that I want to focus on here. It's the one which typically comes to mind when talking about chartjunk. Tufte calls this the duck. Broadly, he's referring to non-data creative graphics, whether they be line art or photographs, and their inclusion in the chart. Newspapers and news magazines are a place where this kind of imagery is often used. One well-known graphic artist, Nigel Holmes, has used the duck to display data in a way which is both memorable and aesthetically interesting.
One of the most memorable images he's created, to me, is entitled Diamonds Were a Girl's Best Friend, and it showed up in Time magazine in 1982. This graphic shows the trend of the price of diamonds from 1978 to 1982, and while the dollar amounts are easily forgotten, it's easy to remember that the trend had a spike because of the shape of the woman's leg. So is the duck really a useful heuristic, or is there something about ducks which makes them memorable? I was part of a team, led by Scott Bateman, that wanted to understand this issue in more detail. So we set up some user testing. I've linked the full academic paper in this week's readings. In short, we provided participants a variety of Holmes's images, including the one which you were shown, which could be considered chartjunk. We also included a variety of plain graphs with high data-ink ratios and no decorative embellishments. We brought 24 subjects into the laboratory and assigned them to view either the Holmes or the high data-ink ratio conditions. We asked the subjects to describe and summarize the charts with a series of guided questions. This was followed by a recall test, conducted both immediately and again two to three weeks later. Finally, we used eye tracking to determine where subjects were paying attention when they looked at these charts. Our findings were interesting. While there was no recall difference when tested immediately, recall after two to three weeks was significantly better for the Holmes charts, which showed the duck chartjunk. Further, subjects indicated subjectively that the Holmes charts were more enjoyable, more attractive, easier to remember, and easier to remember details of. And they felt it was faster to describe and remember the Holmes charts. The non-embellished designs didn't perform better than any of the Holmes designs. Does this mean that you should use embellishments in your charts? Well, maybe.
I don't know of anyone who has replicated our findings on a larger and more diverse population, and there are certainly other kinds of chartjunk, like the unintended optical art and the grid, where Tufte's guidance seems like a really good heuristic. But I think there's more to be told in the story and more nuance to be worked out. At this point, you should have at least some thoughts from designers as well as from a user study on specific kinds of charts. And as you go about creating your data science graphics, it's worth reflecting not only on the principles you use and the results you are sharing, but also on the process by which you came to create the graphics. Setting up a user study is much simpler today with crowdsourcing services such as CrowdFlower and Amazon Mechanical Turk. It's now very reasonable for the average data scientist to test whether more embellished visuals might be more effective, whether in terms of time, memorability, or accuracy, than those with a high data-ink ratio. In the next lecture, we're going to go on to touch on two more items from Tufte: the Lie Factor and sparklines.
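If you do run your own comparison of embellished and plain charts, one simple way to check whether a difference in recall could be due to chance is a permutation test. The sketch below is only an illustration under stated assumptions: it is not the analysis used in the study described above, the function name `permutation_test` is made up for this example, and the scores are invented placeholders rather than real results.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean scores.

    Returns (observed_difference, p_value), where the difference is
    mean(group_a) - mean(group_b).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign scores to the two conditions.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical recall scores (0-10), invented purely for illustration.
# They are NOT data from any real study.
embellished = [7, 8, 6, 9, 7, 8, 6, 7, 9, 8, 7, 6]
plain = [5, 6, 7, 5, 6, 4, 6, 5, 7, 6, 5, 6]

difference, p = permutation_test(embellished, plain)
print(f"mean difference = {difference:.2f}, p = {p:.4f}")
```

A permutation test is used here because it makes few assumptions about how the scores are distributed, which suits small samples. With your own data, you would replace the two placeholder lists with the scores collected under each condition.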