Now I'd like to talk about where programmers are likely to make mistakes, and where you should focus your testing effort. So just to recap, there are five principles of testing analysis: What, Where, When, Who, and How. And in this case, we're looking at Where.

So to start with, programmers make mistakes. We know this; it's why we test. And there are lots of reasons: time pressure, complex existing code that can be difficult to understand, complex infrastructure that you have to hook into, complex system interactions, misunderstandings over requirements, incorrect requirements, and unexpected physical and environmental conditions, where something that you're supposed to control breaks. On average, there are 15 to 50 bugs per 1,000 lines of code, according to Steve McConnell in Code Complete.

But one interesting thing is that programmers tend to make similar mistakes. Even though there are lots of reasons for programmers to make mistakes, how those mistakes manifest themselves can be similar. Certain kinds of language constructs tend to be error prone, and when we think about testing systematically, these are the places where we want to focus our attention.

The first one is floating-point numbers. This is more of a concern if you're dealing with large-scale, scientific-type software. We use floating-point numbers to model real numbers, but while there are infinitely many real numbers, there are only finitely many floating-point numbers. What this means is that floating-point numbers are imprecise, and the farther you get from zero, the less precise they are: out at really big numbers, there's a long gap between two consecutive floating-point values. People tend not to think about this, and it leads to invalid comparisons between numbers. What's worse is when you do computations involving floating-point numbers: if you don't take specific steps to reduce imprecision, results get more and more imprecise, so over time what starts as a tiny floating-point error turns into a big one. This is, in fact, what caused the Patriot missile system to misperform during the Gulf War, killing 28 people at a US Army barracks in Dhahran: a small floating-point error in the system's clock, left to accumulate for days at a time while the system stayed on, grew so large that the system could no longer track incoming missiles. So floating-point numbers are problematic.
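To make this concrete, here's a minimal sketch of the comparison problem. The class name is invented just for illustration:

```java
// A minimal sketch of the floating-point comparison problem.
public class FloatDrift {
    public static void main(String[] args) {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;                   // 0.1 has no exact binary representation
        }
        System.out.println(sum);          // prints 0.9999999999999999
        System.out.println(sum == 1.0);   // false: the "obvious" comparison fails

        // A safer comparison uses a tolerance appropriate to the computation.
        double epsilon = 1e-9;
        System.out.println(Math.abs(sum - 1.0) < epsilon);  // true
    }
}
```

The direct == comparison fails even though the math says it shouldn't. A tolerance-based comparison is the usual workaround, though choosing the right tolerance is itself specific to the computation.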
Another problematic aspect, as you probably know if you've done any serious C or C++ programming, is pointers. Pointers allow you to directly access areas of memory. When you're using a language like C or C++, you have to ask for memory using malloc(), and when you're done with it, you have to give it back using free(). A lot of times people get this wrong in one of two ways. Either they keep using memory after they've freed it, which leaves them with a dangling pointer, a pointer to memory they no longer own, and that leads to memory corruption. Or they forget to free memory when they're done with it, and then they eventually run out: the program keeps grabbing more and more and performs more and more poorly. So figuring out how your program is managing memory is one of the big concerns when you're writing programs in C and C++.

Now, the focus of this course is Java, and in Java you don't have as many of these pointer problems. But you can still run out of memory: if you forget to get rid of all the references to a particular object, it just sticks around forever. So we still have the run-out-of-memory problem, which is a kind of pointer problem.

Another thing that's really problematic for most people is parallelism. In a parallel program, you have multiple threads of execution that are all running, at least logically, at the same time, and if you have multiple cores, they may in fact physically be running at the same time. What this leads to, in some cases, is subtle timing errors. Suppose two threads both access the same memory location at some point. Most of the time thread A gets there first and then thread B, but every once in a while thread B gets there first, because depending on what else is running on the processors, different orderings occur. And perhaps when thread B writes first, the program has a bug. So now you have something where you have to run tests over and over again to try to find the one execution where that thread actually reaches the memory location first.

On the flip side of that, you can have something called deadlock. In a correctly written program, you usually want something that prevents two threads from trying to write to the same piece of memory at the same time; that's called a synchronization primitive. In Java, you've seen the synchronized keyword, which keeps two threads from running the protected code at the same time. But this can also be problematic: maybe thread A needs to get to something that's locked by thread B, and thread B needs to get to something that's locked by thread A. Then they both get stuck, neither one can make any progress, and this is called a deadlock. And again, these kinds of things only happen in rare circumstances. So when you're testing parallel programs, you have to be aware that running a test once doesn't tell you much; oftentimes you have to rerun the test over and over again to find the corner cases where a bad thread ordering causes a problem. Let me make each of these failure modes concrete with a few small Java sketches.
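First, the Java memory "leak". This is an invented sketch, not code from any real system:

```java
import java.util.ArrayList;
import java.util.List;

// A sketch of a Java memory leak: no pointers, but a forgotten reference
// keeps every object alive. The class and method names are invented.
public class EventLog {
    // Grows forever: nothing ever removes entries.
    private static final List<byte[]> history = new ArrayList<>();

    static void record(byte[] event) {
        history.add(event);   // the static list keeps a reference forever
    }

    public static void main(String[] args) {
        while (true) {
            // Each event stays reachable from the static list, so the
            // garbage collector can never reclaim it. Eventually:
            // java.lang.OutOfMemoryError: Java heap space
            record(new byte[1_000_000]);
        }
    }
}
```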
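Next, a timing-dependent bug of the kind just described. Again, a minimal invented sketch:

```java
// A sketch of a race condition: two threads increment a shared counter
// without synchronization. count++ is a read-modify-write sequence, so
// updates can be lost depending on how the threads interleave.
public class RaceDemo {
    static int count = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                count++;   // not atomic: load, add, store
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        // Expected 200000, but most runs print something less, and the
        // exact value changes from run to run -- which is why a single
        // passing test tells you very little here.
        System.out.println(count);
    }
}
```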
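And finally, a deadlock in miniature, built from exactly the lock-ordering mistake described above:

```java
// A sketch of deadlock: thread A holds lock1 and waits for lock2, while
// thread B holds lock2 and waits for lock1. Neither can ever proceed.
public class DeadlockDemo {
    static final Object lock1 = new Object();
    static final Object lock2 = new Object();

    public static void main(String[] args) {
        new Thread(() -> {
            synchronized (lock1) {
                pause();                    // give the other thread time to grab lock2
                synchronized (lock2) {      // blocks forever
                    System.out.println("A finished");
                }
            }
        }).start();
        new Thread(() -> {
            synchronized (lock2) {
                pause();
                synchronized (lock1) {      // blocks forever
                    System.out.println("B finished");
                }
            }
        }).start();
    }

    static void pause() {
        try { Thread.sleep(100); } catch (InterruptedException e) { }
    }
}
```

Run it and both threads hang: no exception, no error message, the program just stops making progress. That silence is what makes these bugs so hard to catch in testing.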
Another aspect that causes trouble is numeric limits and boundaries. This is one that programmers get wrong all the time; in fact, if there is a number one programmer mistake area, it's numeric limits and boundaries. Everyone's heard of an off-by-one error. We think the buffer should be size 32 but we only made it size 31, or we thought we'd stop at position 99 but we actually went on to position 100, which is past the end of whatever we're working with. Or we thought an altitude check should fire at or above 10,000 feet, but really it should only fire above 10,000 feet. These off-by-one conditions happen all the time, both in low-level things like buffers and in high-level things like requirements.

Another thing that happens is that programmers don't think about how the program will behave at the limits of its numbers. In software, when we say integer, there's some maximum value that number can hold. Java has different sized integers, short, int, and long, and depending on which you use, the maximum might be about 2.1 billion for an int or 32,767 for a short. Most of the time, a program works correctly if you pass in a number that's in the normal range. You think the thing you're building will work for any number, but really it works for any number up to somewhere near the maximum. If you're using a short, whose maximum is 32,767, and you pass in 32,766, one less than the maximum, then any computation that adds even two to that value wraps around, and your program fails. People just don't think about how integer computations will behave at the very top of the range.

Another thing that causes trouble, though again only for a specific class of systems, is interrupts. Interrupts are a concern when you're writing embedded software; they're how you interact with devices in the external world. An interrupt is like a goto: your program is running along, and all of a sudden the processor interrupts it to say, hey wait, there's something for you to do with this hardware device. That totally disrupts the normal flow of control. Something happens, and when your code starts back up, the world has changed. If you don't deal with interrupts correctly, you're likely to have a problem. And these are similar to parallel computations in that, depending on when the interrupt arrives, it may or may not land at a bad point in the program, so again you have to run tests over and over.

Another area where people make mistakes is complex Boolean expressions. If you have something like "if a and b, or c and d and f and g and h", chances are it's very difficult to figure out all the cases under which that expression is true or false. So maybe sometimes you take the if branch when you shouldn't, or vice versa. Whenever you have a complex Boolean expression, you have a likely point of error.

One final place where people make mistakes is converting between types within a program. Oftentimes people convert between integers and floating-point numbers: they have a value represented as a float, and a routine they want to call expects an integer, so they cast it to an integer. But when they do that, the digits after the decimal point get thrown away, and there are different ways those can get thrown away, so you can lose precision, sometimes a lot of it. And in some cases, you can overflow: floating-point numbers can represent values that are bigger than any representable integer, so converting from one to the other fails. So casts can be problematic.

So I've given you lots and lots of things to look for. Before we look at real code, let me sketch a few of these numeric and logical pitfalls as well.
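First, off-by-one errors. Here's a minimal sketch with two of them, one at the buffer level and one at the requirements level; the names and the 10,000-foot rule are just the examples from above:

```java
// A sketch of two classic off-by-one mistakes. Names are invented
// for illustration.
public class OffByOneDemo {
    // Requirement: warn only ABOVE 10,000 feet. The >= quietly turns
    // "above" into "at or above" -- wrong at exactly 10,000.
    static boolean shouldWarn(int altitudeFeet) {
        return altitudeFeet >= 10_000;   // bug: should be > 10_000
    }

    public static void main(String[] args) {
        System.out.println(shouldWarn(10_000)); // true, but the spec says false

        int[] buffer = new int[32];
        // Bug: <= walks one slot past the end, so buffer[32] throws
        // ArrayIndexOutOfBoundsException on the final iteration.
        for (int i = 0; i <= buffer.length; i++) {
            buffer[i] = i;
        }
    }
}
```

Notice that both bugs show up only at the boundary itself: the altitude check is right for every input except exactly 10,000. That's exactly why boundary values belong in your test set.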
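Second, behavior at integer limits. This sketch shows an arithmetic step that's fine for normal inputs and silently wrong near the top of the int range; midpoint() is invented here, but the pattern is the same class of bug as the well-known binary-search midpoint overflow:

```java
// A sketch of integer overflow at the limits.
public class LimitDemo {
    static int midpoint(int low, int high) {
        return (low + high) / 2;   // low + high can exceed Integer.MAX_VALUE
    }

    public static void main(String[] args) {
        System.out.println(midpoint(0, 100));  // 50, as expected

        // Near the limit, the sum wraps around to a negative number:
        System.out.println(midpoint(2_000_000_000, 2_100_000_000)); // negative!

        // One standard fix computes the midpoint without the big sum:
        int low = 2_000_000_000, high = 2_100_000_000;
        System.out.println(low + (high - low) / 2);  // 2050000000
    }
}
```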
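Third, complex Boolean expressions. The condition below is made up, but it illustrates the point: with six inputs there are 64 combinations, and guessing which ones make the expression true is unreliable, while enumerating them is not:

```java
// A sketch of checking a complex Boolean expression exhaustively.
// The eligible() condition is invented for illustration.
public class BooleanDemo {
    static boolean eligible(boolean a, boolean b, boolean c,
                            boolean d, boolean f, boolean g) {
        // Quick: for exactly which inputs is this true?
        return a && b || c && d && !f || g;
    }

    public static void main(String[] args) {
        int trueCount = 0;
        for (int bits = 0; bits < 64; bits++) {
            if (eligible((bits & 1) != 0, (bits & 2) != 0,
                         (bits & 4) != 0, (bits & 8) != 0,
                         (bits & 16) != 0, (bits & 32) != 0)) {
                trueCount++;
            }
        }
        System.out.println(trueCount + " of 64 input combinations are true");
    }
}
```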
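And fourth, casts. One wrinkle worth knowing: in Java, a narrowing cast from double to int doesn't crash. It truncates toward zero, and values outside the int range are silently clamped, which can be even harder to notice than a loud failure:

```java
// A sketch of what Java actually does when you cast a double to an int.
public class CastDemo {
    public static void main(String[] args) {
        double price = 19.99;
        System.out.println((int) price);     // 19: the .99 is silently dropped

        double negative = -3.7;
        System.out.println((int) negative);  // -3: truncation toward zero, not floor

        double huge = 3.0e10;                // representable as a double...
        System.out.println((int) huge);      // ...but clamps to 2147483647
    }
}
```

So in Java the failure mode is silent bad data rather than an exception; in C and C++, the same out-of-range conversion is undefined behavior, which is worse still.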
Now let's actually take a look at a program. This is a piece of code that comes from the Java HashMap class. I can tell you that this code is correct, because it's been banged on and banged on by lots of people, but it illustrates a lot of the problem areas we've just discussed. So I'd like you to go through and answer some questions about where in this code you'd be likely to find problems.

Okay, well, hopefully that was interesting, and you can start to see that there are lots of places within code where there may be problems. Real code does things like convert between integers and floats in perfectly legitimate circumstances, for good reason. But there are lots of things that you need to test, and if you want to test things thoroughly, there are lots of places you have to look.

So that covers the language-level issues you're likely to find. There's another level to consider as well: the module level. It turns out that for testing and debugging, there's an 80/20 rule, meaning a small number of the modules in a larger system tend to have the lion's share of the bugs. It could be because those modules are central to the system's performance, or because they're the trickiest part, or simply because the people writing them were junior and didn't quite understand what they were doing. But as a general rule of thumb, 20% of the modules contain 80% of the bugs. So once you've started to test your system, use that feedback to focus your attention where the bugs actually are, especially once you're in a maintenance or update cycle.

That's some guidance on where to look when you're doing your testing. Programmers make mistakes for lots of reasons, but how those mistakes manifest themselves in the code often falls into predictable patterns, and some language constructs and modules are more likely than others to contain bugs. So if we want to test effectively, we want to focus our attention on those aspects.