Today I'm going to talk to you about virtualization technologies. I think you've been doing cloud computing already, but this is core technology that some of you have seen, and one thing I have to warn you about is that I'm going to go through a lot of concepts in very quick progression just to get you situated with everything. The first thing I'll do is talk about virtualization concepts, and this is the bulk of the lecture today. Then I'll give you a little bit of container technology, and a little bit about VM migration. So that's the game plan. If you look at the origin of virtualization, it dates back to the 70s: IBM VM 370, and 360. The IBM VM 370 is the first one that talked about this virtualization technology. Then the microkernels of the late 80s and the 90s were another development, one that was looking at extensibility of operating systems. SimOS was something that was done in the late 90s. The intent there was really to ask the question: if I want to simulate a new processor that I'm designing and evaluate it, I want to evaluate it in the context of not just the applications that run on the processor but also the operating system, because all of those come into play. That was the origin of SimOS. Then came Xen in the early 2000s, and VMware, which really was a follow-on to what SimOS did, by the same authors. They came up with this idea of virtualization, which is similar to the original vision behind IBM VM 370. Why this resurrection in the 2000s, and why is it now so pervasive? It's because, if you think about the data centers that we've been talking about, you want to run several different operating systems there, and you want accountability for all of the applications that run on the different operating systems. So that is one of the main reasons why virtualization has taken such strong root now in data centers.
Digging a little bit deeper, what we are saying is that you have shared hardware resources, and you want to be able to run multiple operating systems on the same hardware resources. So you want to give each of these operating systems the illusion that it is completely in charge of the shared hardware resources. First of all, you've got to have a thin layer between the hardware resources and the operating systems that serves as a mediator between each operating system and the physical resources, and that's what is called a hypervisor. There are two ways you can realize this hypervisor. One is a native version of the hypervisor, meaning the hypervisor is running directly on bare metal, and the guest operating systems are living on top of the hypervisor, whether it is a Windows operating system or Linux or what have you. The second approach is something called hosted, and I think most of you know what this is because you're all using it, VirtualBox and VMware and so on, where you have a host operating system, and the hypervisor is really running as a process on top of the host operating system, and it is supporting virtualization of all the guest operating systems running on top of it. So these are two different approaches, and in this lecture we will primarily focus on the bare metal version, and on all the things that the hypervisor has to do in order to support multiple operating systems on the shared resources. There are two ways you can think of bare metal, or native, virtualization. One is full virtualization. Here the idea is that the hypervisor is of course running on the bare metal, but we want to run each of these operating systems unchanged. That is, you take the operating system binary and run it directly on top of the hypervisor. So that is what is called full virtualization, because in this case we are not touching the operating system itself.
It's just running unchanged, and all the processes on top of it are of course also running unchanged. So that is one way to do it. The pro of this approach is of course the fact that the operating system vendor doesn't have to think about it; they can simply put it in a data center. The con is that there could be some performance disadvantages, because of the gap between the way each of these operating systems thinks about the physical hardware down below and what exactly is the physical hardware that's available through the hypervisor. That can cause a little bit of a mismatch, especially in I/O subsystem performance; the operating system may see some inefficiency if it is fully virtualized. In paravirtualization, what is happening is that there is a well-defined interface that is made visible to the operating system through the hypervisor, and each of these operating systems knows that it is not running on bare metal but on top of a hypervisor. Therefore there is some change that has to be made, a minimal change, wherever the operating system has to do something that requires intervention by the hypervisor. So that is what is called paravirtualization. VMware is a good example of full virtualization, and Xen is a good example of paravirtualization, and we'll talk about distinctions between these two as we go along, but at the highest level these are the two different ways you might realize virtualization on top of bare metal: either full virtualization, meaning the operating system doesn't have to be changed at all, or paravirtualization, where there is a small change that needs to happen. Typically, in a paravirtualized setup, the change to the operating system could be as small as one or two percent of the code base, which is not a huge thing to worry about, right? So that is a targeted change that you're making to the operating system to run efficiently on top of the hypervisor.
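To make the distinction concrete, here is a minimal toy sketch in Python, not how VMware or Xen is actually implemented. It contrasts an unchanged guest, whose privileged operation is intercepted by the hypervisor, with a paravirtualized guest whose code was edited to ask the hypervisor explicitly. Every name here (`Hypervisor`, `trap`, `hypercall`, `set_page_table`) is hypothetical, and the interception path labelled `trap` is an assumption, since nothing so far has said how the hypervisor regains control from an unmodified guest.

```python
class Hypervisor:
    """Toy hypervisor that mediates access to the shared hardware."""

    def __init__(self):
        # Records how each request reached the hypervisor.
        self.log = []

    def trap(self, guest, instruction, arg):
        # Full virtualization (assumed mechanism): the guest never asked
        # for help; its privileged operation is intercepted and emulated.
        self.log.append((guest, "trap", instruction))
        return self._emulate(guest, instruction, arg)

    def hypercall(self, guest, request, arg):
        # Paravirtualization: the guest knows it runs on a hypervisor and
        # uses the well-defined interface the hypervisor exposes.
        self.log.append((guest, "hypercall", request))
        return self._emulate(guest, request, arg)

    def _emulate(self, guest, op, arg):
        # Stand-in for doing the real work on the physical hardware.
        return f"{op}({arg}) done for {guest}"


class UnmodifiedGuest:
    """Guest OS binary run unchanged; it believes it owns the hardware."""

    def __init__(self, name, hv):
        self.name = name
        self._hv = hv  # the guest code below never refers to this by intent

    def set_page_table(self, base):
        # The guest issues what it thinks is a privileged hardware
        # operation. In this toy model the "hardware" hands it over.
        return self._hv.trap(self.name, "set_page_table", base)


class ParavirtualizedGuest:
    """Guest OS with a small, targeted source change."""

    def __init__(self, name, hv):
        self.name = name
        self.hv = hv

    def set_page_table(self, base):
        # The one changed line: call the hypervisor directly.
        return self.hv.hypercall(self.name, "set_page_table", base)


if __name__ == "__main__":
    hv = Hypervisor()
    UnmodifiedGuest("guest-full", hv).set_page_table(0x1000)
    ParavirtualizedGuest("guest-para", hv).set_page_table(0x2000)
    for entry in hv.log:
        print(entry)
```

In both cases the hypervisor ends up doing the same work; what differs is whether the guest's code had to change to get there, which is the one-or-two-percent edit mentioned above.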
So regardless of whether we're using full virtualization or paravirtualization, what needs to be done? Well, we have to virtualize the hardware, and when we talk about the hardware, primarily we are asking: how do you virtualize the memory, how do you virtualize the CPU, and how do you virtualize the devices that are supported by the hypervisor? If you think about it, a process is really a virtualization too, right? Even if you think about a native operating system like Linux with processes running on top of it, a process also embodies a notion of virtualization of the physical resources. It is just that the granularity at which we're talking about virtualization is bigger here: it's an entire operating system as opposed to individual processes. So we will talk about how each of these resources is virtualized to support multiple operating systems, and we will also talk about how to effect the data transfer and the control transfer between the guest operating system and the hypervisor. So that's the big picture.
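One way to picture the change in granularity, taking memory as the example, is the toy sketch below. It is an illustration under stated assumptions, not the mechanism any particular hypervisor uses: the guest operating system maps a process's virtual pages to what it believes are physical pages, and the hypervisor adds a second mapping from those guest "physical" pages to real machine pages. The class names, the `give_page` helper, and the 4096-byte page size are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


class GuestOS:
    """Virtualizes memory for processes, as any native OS does."""

    def __init__(self):
        # pid -> {virtual page number: guest-"physical" page number}
        self.page_tables = {}

    def map(self, pid, vpage, gpage):
        self.page_tables.setdefault(pid, {})[vpage] = gpage

    def translate(self, pid, vaddr):
        # Process virtual address -> guest-"physical" address.
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        return self.page_tables[pid][vpage] * PAGE_SIZE + offset


class Hypervisor:
    """Virtualizes memory for whole operating systems."""

    def __init__(self, machine_pages):
        self.free = list(range(machine_pages))  # unallocated machine pages
        # guest name -> {guest-"physical" page: machine page}
        self.guest_maps = {}

    def give_page(self, guest, gpage):
        # Back one guest-"physical" page with the next free machine page.
        mpage = self.free.pop(0)
        self.guest_maps.setdefault(guest, {})[gpage] = mpage
        return mpage

    def translate(self, guest, gaddr):
        # Guest-"physical" address -> real machine address.
        gpage, offset = divmod(gaddr, PAGE_SIZE)
        return self.guest_maps[guest][gpage] * PAGE_SIZE + offset


if __name__ == "__main__":
    hv = Hypervisor(machine_pages=4)
    guests = {"A": GuestOS(), "B": GuestOS()}
    for name, os_ in guests.items():
        hv.give_page(name, 0)      # each guest believes it owns page 0
        os_.map(pid=1, vpage=0, gpage=0)
        gaddr = os_.translate(1, 100)
        print(name, gaddr, hv.translate(name, gaddr))
```

Both guests use guest-"physical" page 0, yet they land on different machine pages, which is the same trick an operating system plays on its processes, applied one level up.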