[MUSIC] Using virtually any popular web application today means you're communicating with a server in a data center. However, your connectivity to this service depends on the internet, and it is a crucial determinant of the application's performance. Data centers also communicate with each other over the internet, for example, to replicate client data across multiple data centers. When we refer to the cloud, wide area connectivity, that is, the internet, is as crucial a piece of the picture as the data center infrastructure. So in this lesson, we'll start looking at inter-data center networking, and in later lessons, we'll look at client connectivity as well.

What you're looking at here is part of Google's wide area network as of 2011. It has 12 locations spread across three different continents. So why does a provider like Google need such an extensive infrastructure, with so many locations across a wide expanse of the globe? There are several reasons. One is better data availability: if one of these facilities were to go down, perhaps due to a natural disaster, the data could still be available at some other location if it was replicated. That's a natural advantage of using widely spread facilities. It also aids load balancing: with multiple facilities, you can spread incoming and outgoing internet traffic across a wider set of providers and a wider geographic expanse. In the same vein, this can help you mitigate denial-of-service attacks on a facility. It also helps improve latency to clients: if you're present in multiple parts of the globe, you can reach clients in different locations over smaller distances, thus cutting latency to them. So if Google operates some data centers in Asia, a wide range of clients across Asia can use those instead of having to come all the way to the US data centers. Over time, another aspect of this could become very interesting: local data laws. Several jurisdictions might require that companies store data from that country in that jurisdiction itself, which would obviously necessitate operating several geo-distributed facilities. Another reason to use multiple sites is the hybrid public-private cloud infrastructure model. Under the right circumstances, this can be quite cost effective: you handle average demand for your service from your private infrastructure, and offload peak demand to the public cloud.

For all of these reasons, inter-data center connectivity is becoming very important, and providers move a lot of data between their data centers. Here we look at some data from a 2011 study of five Yahoo! data centers. The study is based on anonymized traces from border routers connected to these data centers. The details of how the study was carried out are not particularly relevant to our discussion here; we'll focus on this one result. What you're looking at here are two plots showing the number of flows: on the left, between clients and the data centers, and on the right, between the data centers themselves. The three lines on each plot are for three different data center locations. Notice that the y-axes on the two plots are different. In terms of the number of flows, the traffic between data centers is 10% to 20% of the traffic between data centers and clients. Now, keep in mind that flows between data centers could be very long lived and carry more bytes than those between clients and data centers, so flow counts don't necessarily give us a good idea of traffic volumes.
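To see why, here's a quick back-of-the-envelope sketch in Python. All of the numbers, the 15% flow share and the per-flow sizes, are made up for illustration; they are not from the Yahoo! study.

```python
# Toy illustration: a small share of flows can still carry most of the bytes.
# All numbers below are made up; they are not from the Yahoo! study.

total_flows = 1_000_000
inter_dc_share_of_flows = 0.15           # suppose inter-DC flows are 15% of all flows

avg_bytes_client_flow = 100 * 1024       # assume ~100 KB per client-facing flow
avg_bytes_inter_dc_flow = 50 * 1024**2   # assume ~50 MB per long-lived inter-DC flow

inter_dc_flows = total_flows * inter_dc_share_of_flows
client_flows = total_flows - inter_dc_flows

inter_dc_bytes = inter_dc_flows * avg_bytes_inter_dc_flow
client_bytes = client_flows * avg_bytes_client_flow

print(f"Inter-DC share of flows: {inter_dc_share_of_flows:.0%}")
print(f"Inter-DC share of bytes: {inter_dc_bytes / (inter_dc_bytes + client_bytes):.0%}")
# With these assumptions, roughly 15% of the flows carry about 99% of the bytes.
```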
More likely, this 10% to 20% figure underestimates the volume of traffic between data centers. Also, the study's methodology classifies everything that is not Yahoo!-to-Yahoo! traffic as Yahoo!-to-client traffic, while in some cases third parties partner with Yahoo!, such as ad networks, in which case that traffic could arguably be counted as data center-to-data center traffic as well. Google's assessment of the volume of this traffic is unambiguous, though: Google has noted that it now carries more traffic between its data centers than on its public-facing wide area network, and the traffic between its data centers is growing at a higher rate as well.

Now that we've talked about these networks carrying large amounts of data and being very useful, let's talk about what makes them different from the data center networks we've been discussing so far. The goal of these networks is to provide persistent connectivity between a small set of endpoints; in Google's case, that was just 12 different data center sites. Also, these networks are typically dedicated and provide hundreds of gigabits per second of capacity across these sites. If you compare networking over the Internet in general (for example, from an end host to a service, or between arbitrary end hosts), private data centers, and wide area networks such as these, you'll also find differences in the flexibility you get in network design. Over the Internet, network design is constrained by the coordination required among several parties to make any change. In private data centers, on the other hand, you have complete freedom to implement your own routing, your own topology, your own congestion control protocols, and such. Where does a WAN sit on this design-flexibility spectrum? That depends on how much of the WAN you own privately. If you own or lease your own fiber lines, you have the same flexibility as a private data center, perhaps even more, because the number of endpoints involved is smaller; you have creative freedom in design without some of the limitations of scale that big data center networking brings. On the other hand, if you depend on a transit ISP, your freedom in terms of routing or traffic engineering might be somewhat more limited. But usually these facilities will be multihomed, so with multiple transit ISPs you'll still be able to manipulate routes and do some form of traffic engineering. Also worth noting is that WAN bandwidth is a very expensive resource: Microsoft has noted that the amortized cost of its inter-data center networks runs to hundreds of millions of dollars.

In this lesson, we'll focus on traffic engineering in the WAN context, particularly with regard to bandwidth. We'll come to latency later, when we talk about content distribution networks. The traditional approach to traffic engineering in such networks is to use MPLS, which is Multiprotocol Label Switching. So let's see how this works. You have this network here, with several different sites spread over a wide area, connected to each other perhaps over long-distance fiber links. You use a link-state protocol, such as OSPF or IS-IS, to flood information about the network's topology to all nodes, so at the end of such a protocol, every node has a map of the network. For traffic engineering, what we'd like to do is also spread information about the bandwidth usage on the different links in the network. Given that there's already traffic flowing in the network, some links will have spare capacity and some won't. Both IS-IS and OSPF have extensions that allow this available-bandwidth information to be flooded together with their other protocol messages.
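As a concrete picture, here's a minimal sketch in Python of the kind of per-link state a router might end up with after this flooding, and how it could pick a tunnel path from it. This is a toy model, not the actual OSPF-TE or IS-IS TE encodings or any vendor's path computation; the node names, capacities, and reservations are all made up.

```python
from collections import deque

# Toy traffic engineering database: the kind of state a router might hold
# after topology and bandwidth information has been flooded.
# Directed links (u, v) -> capacity and reserved bandwidth, in Gb/s (made-up values).
te_db = {
    ("A", "B"): {"capacity": 100, "reserved": 60},
    ("B", "C"): {"capacity": 100, "reserved": 90},
    ("A", "D"): {"capacity": 40,  "reserved": 10},
    ("D", "C"): {"capacity": 40,  "reserved": 10},
}

def residual(link):
    """Bandwidth still available on a link."""
    return link["capacity"] - link["reserved"]

def shortest_feasible_path(src, dst, demand):
    """Fewest-hop path on which every link has at least `demand` of headroom.

    Real MPLS-TE path computation works on richer metrics and constraints;
    this only illustrates combining the topology map with per-link
    available-bandwidth information.
    """
    frontier = deque([[src]])
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == dst:
            return path
        for (u, v), link in te_db.items():
            if u == node and v not in path and residual(link) >= demand:
                frontier.append(path + [v])
    return None  # no path has enough headroom

# A 20 Gb/s request from A to C can't use A-B-C (only 10 Gb/s free on B-C),
# so the longer A-D-C path is chosen instead.
print(shortest_feasible_path("A", "C", 20))   # ['A', 'D', 'C']
print(shortest_feasible_path("A", "C", 50))   # None
```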
Knowing the state of the network, when a router receives a request to set up a new flow, it'll set up a tunnel along the shortest path on which enough capacity is available, sending protocol messages to the routers on that path to establish the tunnel. Further, MPLS also supports the notion of priorities, so if a higher-priority flow comes in with a request for a path, lower-priority flows might be displaced. Those flows might then use higher-latency or higher-cost paths through the network. After a flow is assigned a tunnel, the routers also update the network state.

With the routing state in place, let's see how data packets flow through this network. When a data packet arrives at the ingress router, the router looks at the packet's header and decides which label, that is, which tunnel, the packet belongs to. It then encapsulates the packet with that tunnel's label and sends it along the tunnel. The egress router decapsulates the packet, looks at the packet header again, and sends it to the destination. In this scheme, only the ingress and egress routers read the packet header; every other router on the path just looks at the assigned label, making forwarding along the path very simple. That's the "label switching" part of the name. Also, MPLS can run over several different protocols: as long as the ingress and egress routers understand a protocol and can map it onto labels, you can use that protocol. That's why the name is Multiprotocol Label Switching.

This approach has two big problems. The first is inefficient use of the WAN's expensive bandwidth. Typically, these networks are provisioned for peak traffic. As this image shows, with time on the x-axis and utilization on the y-axis, you provision the network for the peak. Now, the mean usage of the network might be much smaller; in this example, it's 2.17 times smaller than the peak. So you're wasting that much bandwidth most of the time. If all of this traffic were latency sensitive, you really couldn't do much better: you'd have to provision for the peak, because those packets need to get to their destinations. However, the observation made in this Microsoft research paper is that most of this traffic is actually background traffic, with some latency-sensitive traffic as well. So you can provision for the peak of the latency-sensitive traffic, and then fill the gaps with the background traffic, which is not latency sensitive. But unless you differentiate traffic by service, you cannot do this kind of optimization. That's not easy with the MPLS approach, because it does not have a global view of which services are running in the network, which parts of the network they are using, and so on. A related point is that, regardless of whether there are multiple services or not, with MPLS the routers make local, greedy choices about scheduling flows, so traffic engineering is suboptimal. For these reasons, such networks typically run at around 30% utilization to have enough headroom for these inefficiencies, and this is expensive.
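To illustrate the gap-filling idea, here's a toy Python sketch with made-up demand numbers (they are not from the paper): provision the link for the peak of the latency-sensitive traffic, and let background traffic soak up whatever capacity is left in each interval.

```python
# Toy illustration of provisioning for peak vs. filling the gaps with background traffic.
# Demands are hypothetical, in Gb/s, one value per time interval.
latency_sensitive = [20, 35, 80, 100, 60, 25, 15, 40]  # must be served immediately
background_backlog = 300                               # bulk data that can wait (e.g., replication)

link_capacity = max(latency_sensitive)  # provision for the peak of the sensitive traffic

# Without differentiating services: only the latency-sensitive traffic uses the link.
util_without = sum(latency_sensitive) / (link_capacity * len(latency_sensitive))

# Differentiating services: background traffic fills whatever capacity is left over.
carried = []
remaining = background_backlog
for demand in latency_sensitive:
    filler = min(link_capacity - demand, remaining)
    remaining -= filler
    carried.append(demand + filler)

util_with = sum(carried) / (link_capacity * len(carried))

print(f"Provisioned capacity: {link_capacity} Gb/s")
print(f"Average utilization, latency-sensitive traffic only: {util_without:.0%}")   # ~47%
print(f"Average utilization, with background filling the gaps: {util_with:.0%}")    # ~84%
```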
Another big problem with the MPLS approach is that it only provides link-level fairness: at any given link, the flows crossing it can share that link's capacity fairly, but this does not mean network-wide fairness. For example, here the green flow shares capacity on one link with the red flow, and the blue flow shares capacity on another link with the red flow. But the blue and green flows each get half the capacity of the red flow, because the red flow uses multiple paths: with unit-capacity links, the red flow gets half of each of its two paths, one unit in total, while the green and blue flows each get half a unit. So we have link-level fairness, but we do not have network-wide fairness. This is something that's hard to achieve unless you have a global view of the network. In the next lesson, we'll talk about how cutting-edge approaches from Google and Microsoft address these shortcomings. [MUSIC]