Welcome. Thank you for joining us at CloudU. We're going to be covering the data platform module today. Specifically, we're going to start with why data matters, what "move, process, and store" means, some basic terminology around data, and how Intel products relate to the overall architecture and requirements around move, process, and store. What is data? Let's circle in a little more on what data is. In today's lesson, we're going to talk about an overview of data, the growth in data, the data pipeline and how it behaves, some basic terminology around data, and how moving, processing, and storing with Intel products actually functions. When we talk about data, we realize that data is essential in addition to hardware and software. In today's module, we're going to be talking explicitly about the data platform itself. Data is an input to virtually all facets of computing, and we're finding that it really does drive how computing behavior is being engineered, both today and in the future. Data lives in many, many states and categories: from the most foundational, low-level substrates, such as the semiconductor or processor layer of the gates, all the way up through and including the application layer, which is where most folks interact with data. Data really is a fundamental portion of computing, and both hardware and software come into play with it. It's an important ingredient in the sense that it needs to be processed; that's a distinct behavior. Moved; that's another distinct behavior. And then analyzed and stored; those are another two behaviors that show up with data. These are interleaved to create something called a data pipeline. Intel has a strong pedigree and history with data management and the data pipeline, as it were. It's a fundamental technology element for us.
We have a lot of deep experience in this premise of the moving, processing, storing, and analyzing elements of data, and the entire data pipeline, as it were. Here we're talking about the processing velocity and the volumetric growth of data in the world. In the next five years, it's expected that data will grow 600 to 800 percent, arguably even more. Most of that growth is non-linear, closer to exponential. If you look at your charts, for example, you'll see the blue and the orange. Both of those indicate growth in data, whether it's the hyperscaler cloud, the public cloud, the hybrid cloud, the private cloud, or any other configuration of that. That volume is going to increase considerably. The utility of that data is super important, because most of the data in the world today isn't actually analyzed; less than 2 or 3 percent is analyzed today. Imagine what would happen if we had the ability to harness and harvest all of that data and process it. Now we're going to talk about the data pipeline. The data pipeline is a construct, a thought construct, but also a logical computing construct, where you see the actual lifecycle of how data behaves. You see ingest; ingesting generally relates to the generation of data, and of course there's the transmission and the actual consumption portion of it. The storage portion: where do you put it? How do you put it? How do you stage it, archive it, prepare it, integrate it into other elements? Processing: processing generally includes cleansing, normalizing, structuring, and parsing, all of the elements that help you actually make sense of that data. Then finally, the analyzing and delivering part of it, which is where you're taking the compute elements, deploying them into your data assets and artifacts, getting some insights out of that, and building, whether it's a model or an algorithm tuning or a recommender engine, whatever it is you're building, you're taking that and moving it forward.
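The ingest, store, process, analyze, and deliver stages described here can be sketched as a tiny pipeline. This is purely an illustrative toy, not any Intel tooling; the function names and the fake sensor records are assumptions made for the example.

```python
# Minimal sketch of the data pipeline stages described above:
# ingest -> store -> process -> analyze -> deliver (with a feedback loop
# where delivered insights inform the next round of ingestion).
# All names and sample data are illustrative, not a real product API.

def ingest():
    # Generation/transmission of raw records (here, fake sensor readings).
    return ['temp=21.5', 'temp=bad', 'temp=23.0', 'temp=22.1']

def store(records):
    # Staging the raw data before processing.
    return list(records)

def process(records):
    # Cleansing, parsing, and normalizing the raw feed into clean values.
    cleaned = []
    for rec in records:
        _, _, value = rec.partition('=')
        try:
            cleaned.append(float(value))
        except ValueError:
            pass  # drop records that fail cleansing
    return cleaned

def analyze(values):
    # Derive an insight (here, a simple average) from the processed data.
    return sum(values) / len(values)

def deliver(insight):
    # Hand the result to a downstream consumer (an app, CRM, dashboard...).
    return f'average temperature: {insight:.2f}'

print(deliver(analyze(process(store(ingest())))))
```

Each stage only sees the output of the one before it, which is the point of the construct: you can swap any stage (a different store, a different analysis) without disturbing the rest of the lifecycle.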
Then finally, deliver. How are we delivering this? Is it being distributed through social media, or is it being deployed in an application? How is it being leveraged by another application, say a CRM-type application or another database? All of these are the delivery layers. Then of course it's a nice feedback loop. You see this cycle showing up over and over again in data management platforms, at the very top layers and at the lowest-level substrates. Today we're continuing to cover more on this. We've talked a little bit about the universe of data and the volume of data; now we're going to talk about the structured and unstructured types of data that we have. Generally, the volume of data in the world is greatest at the unstructured layer. That means all of the different types of feeds that are coming in, whether it's JPEGs or MPEGs or videos or any other type of file; there's that whole unstructured world. Then of course there's the classic databases environment, and the other computing environments through which data is propagated in different form factors. Unstructured data, semi-structured data, and structured data are the three layers through which we generally look at data. At the unstructured layer, data is growing rapidly, and most data today is at the unstructured layer. The next layer you see is the semi-structured layer, and then the final one, the structured layer. What are these types of data? As you can see on the visual, unstructured data is many different categories of data, many different types of data, often not organized, labeled, parsed, or given a schema, all of those things that help make sense of that data. The sense-making isn't always there; it's just coming in. With structured data, you have very nicely confined, scoped, cleansed, parsed data. Oftentimes in transaction processing systems and databases, you see a lot of structured data.
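The three layers can be made concrete by putting the same information into each form. The order details below are made up for illustration; the point is the difference in how much sense-making is built into the data itself.

```python
import json

# The same order information in the three layers described above.
# All content here is invented for illustration.

# Unstructured: free text. No labels, no schema; a program has to do
# all the sense-making itself (parsing, entity extraction, etc.).
unstructured = "Alice bought 3 widgets on 2021-04-01 for $29.97 total."

# Semi-structured: tagged fields (e.g. JSON). Fields are labeled, but
# there's no fixed, enforced schema; records can vary in shape.
semi_structured = json.loads(
    '{"customer": "Alice", "item": "widget", "qty": 3, "date": "2021-04-01"}'
)

# Structured: a fixed schema, as in a transaction-processing database
# row. Every record has the same columns, types, and units.
schema = ("customer", "item", "qty", "unit_price")
row = ("Alice", "widget", 3, 9.99)
record = dict(zip(schema, row))

print(semi_structured["qty"], record["unit_price"])
```

Notice that querying the structured and semi-structured forms is a direct field lookup, while the unstructured form would need parsing first; that processing cost is exactly why so little of the world's unstructured data gets analyzed.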
Of course, the volume is going to continue to grow at all layers, and most at the unstructured layer. So that's the distinction in how we bucket structured versus semi-structured versus unstructured. In the next modules, we'll go further into this. We're going to continue on with the moving, processing, and storing of data with Intel products. As you can see, we have a portfolio around the movement of data, the storage of data, and the processing of data, and the software architecture that supports the smooth pipelining of all of these ingredients, but also the capability to actually perform the end function that we're looking at across all three families. Incidentally, Intel invests tremendously in software, and I believe we're one of the largest software companies in the world, I think sixth or seventh. When we look at data and its movement, storage, and processing with our products: the family of movement products is about actually getting data from point A to point B, whether it's through a telecom network, or within the actual processing bodies themselves, the motherboards themselves, the computers themselves, or the data center itself. This is where you see the movement portion of our portfolio really shine. It's around Ethernet and photonics, and we recently acquired the Barefoot and Omni-Path products. All of these products make up our portfolio for movement. Coupled with that, you have the storage portfolio, which is around persistent memory, the hottest, highest tier of capability for consuming data; very special, and very important for us to talk about later. Below that are the SSDs and the classic 3D NAND. We have many different kinds of drives and families that support the overall requirements of storage and memory as you're getting ready to process data. Then for processing, we have everything from our tiny IoT and Atom cores all the way up through our most sophisticated Xeon families, the 8000 and 9000 series.
You see that full continuum there. But we also have some other directed elements of the portfolio, such as FPGAs and ASICs, and our VPU products, which are the Movidius products. Then of course we have a GPU family coming. Really, all four ways of computing, the scalar, the vector, the matrix, and the spatial, are covered inside our portfolio. The software stack, as I was mentioning earlier, covers all of that as well, from a software application delivery layer, not just the hardware layer. Moving along, as we look at our portfolio, we're looking at the processing families themselves. We'll take a little more of an intimate look. You can see here, we have quite a broad portfolio. But I'd like to draw your attention specifically to the Xeon Scalable family, because that tends to be the focal point for orchestration of many accelerator technologies, such as a GPU or an ASIC, so that you have that orchestration function there. Then of course, you'll see on the bottom some of our processor and storage families highlighted from an acceleration or GPU perspective. This is our full portfolio of all the different offerings that can be brought together in many unique ways for the entire processing requirements of a given customer. Of course, along with that, at the very tip of the complexity, but also the most vetted portion of this sphere of compute ingredients, are the Select Solutions. The Select Solutions enable us to have a really broad, crisp, tight, well-credentialed, well-vetted, and well-tested set of Intel architecture ingredients around distinct applications. Those are the Select Solutions families.
On the IoT side, we have Market Ready Solutions that often work with Select Solutions, but from a data center perspective, these are the products that we position with customers where there's actually the ability to boot all the way from the bottom up through the application layer, giving you a full, rich IA experience. Let's move on to a little bit of a computational understanding of data. We often hear terms such as data structures, algorithms, and data shapes. What do those actually mean? Well, when we look at the processing of data, data has to be consumed by an end application. Let's say, for instance, we were talking about a CRM engine earlier, or eBay; you're doing some online purchase. Generally, in that environment, you have an algorithm that has processed a lot of data. What is an algorithm? An algorithm is a mathematical or technical way of handling how the data is processed and delivered to gain a desired output. Whether it's your smallest and most distinct use case, such as a video analytics application, all the way up through and including something like a business analyst requirement, you'll see algorithms at each place in the datasphere. Data structure: how does your data look? How is it physically organized, and how do you package it? That's the actual structuring of data in terms of how big, what size, what units, and what the requirements are to deliver it effectively such that it can be consumed by the end application. Then the data shape itself. The shape of data is the actual physical shape of that artifact. Is it a waveform-style function, or does it look like a Gaussian distribution? You'll see very quickly that each and every type of data generally takes on a shape. The computer needs to map to the algorithms, the structures, and the shapes of data that are expected by the application. Then we move into the datasphere.
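The "data shape" idea can be seen with a quick experiment. The sketch below is an illustrative assumption, not anything Intel-specific: summing many small random effects produces data with a roughly Gaussian (bell-curve) shape, which is one of the shapes mentioned above.

```python
import random
import statistics

# Illustrative sketch of "data shape": data from a process often takes
# on a characteristic distribution. Averaging many independent random
# effects yields an approximately Gaussian bell curve (mean ~0, stdev ~1).

random.seed(0)  # make the sketch repeatable
samples = [sum(random.random() for _ in range(12)) - 6 for _ in range(10_000)]

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)

# A hallmark of the Gaussian shape: roughly 68% of samples fall within
# one standard deviation of the mean.
within_one_sigma = sum(abs(s - mean) <= stdev for s in samples) / len(samples)
print(f"mean={mean:.2f} stdev={stdev:.2f} within 1 sigma={within_one_sigma:.0%}")
```

Knowing that incoming data takes this shape (versus, say, a waveform or a heavy-tailed distribution) is what lets you pick algorithms and data structures that match what the application expects.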
The datasphere, which is the post-processed world, has its own language. You have a tremendous amount of analysis going on; that's the actual harvesting of knowledge or information from the data you're feeding a processing system. There's the simulation element, which is, in some way, shape, or form, taking the analyzed artifacts, or the in-process analytical artifacts, and moving them into some emulation of reality, if you will. You can simulate stock market behavior. You can simulate cell behavior. You can simulate a prediction of a plant, how it grows perhaps, or how trees or clouds might transform. These are all different workloads and applications that you'll see heavily present in the simulation environment. Visualization: yeah, exactly what you imagine. It's the visual depiction of the data. For example, in your image here, you'll see a sample of the genetics of a coronavirus and how and where it's mutating. Here you'll see all of these workloads around visualization, simulation, analytics, and the mining work, the mining that was done to harvest the insights of that data. In data mining, you're generally taking massive data repositories and trying to glean one or two very important outcomes, or many, but it's generally a finite number of variables that you're looking to understand. Then the workload itself: the amount of computing that's being dedicated to a given application or task. All of that is language for the datasphere that you'll often notice is an important element. Where is this all headed? What's happening is that data is driving demand for Intel products vociferously and aggressively. Unprecedented amounts of computing are showing up today, and Intel is really well positioned to deliver to our customers the overall value that they can harvest from their data. Computation is absolutely essential to even making sense of all these vast repositories of data.
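The stock-market simulation mentioned above is commonly done with Monte Carlo random walks. This is a hedged toy sketch, not any particular Intel workload; the drift and volatility numbers are arbitrary assumptions chosen only to make the example run.

```python
import random

# Toy Monte Carlo sketch of "simulate stock market behavior": generate
# many random-walk price paths, then summarize them. The drift and
# volatility parameters are made-up assumptions for illustration.

def simulate_path(start=100.0, days=252, drift=0.0003, vol=0.01, rng=None):
    rng = rng or random.Random()
    price = start
    for _ in range(days):
        # Each day's return is a small deterministic drift plus
        # Gaussian noise; the path is one "emulation of reality".
        price *= 1 + drift + rng.gauss(0, vol)
    return price

rng = random.Random(42)  # fixed seed so the sketch is repeatable
finals = [simulate_path(rng=rng) for _ in range(1_000)]
avg_final = sum(finals) / len(finals)
print(f"average simulated final price over 1,000 paths: {avg_final:.2f}")
```

The same pattern, many independent randomized runs averaged into a prediction, applies to the cell-growth and climate examples as well, which is why simulation is such a compute-hungry part of the datasphere.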
Then the movement, the storage, the processing, and the other minutiae that come with stitching together these very, very important data pipelines, analytical pipelines, and AI pipelines; that's really where you see the richness of the Intel portfolio coming in. Let's not forget that data is on the rise, so analytics will also rise proportionately with it. On analyzing data, yet again, we're finding that only 2 to 3 percent of data is processed today, not as much as we would like, so we're going to be finding a ramp in that area. Then the overall demand for computing is what's compelling us to deliver faster, better, more efficient, more effective technologies on the whole. Ultimately, what's that leading to? Data as the most valued asset in business today. We're finding that most businesses are realizing that and leveraging it, and we're well positioned to help them. In summary, data is essential to all forms of computing. As we discussed, data has physical, logical, and behavioral properties. We can see them in the physical artifacts, we can see them in the mathematical or algorithmic artifacts, and we can see them in the outcomes that they put out. Data is also very strongly dependent on the logical design of processors. The way we build processing infrastructure, whether it's scalar or vector, matrix or spatial, the processor itself is what crunches, manages, and analyzes the data and helps make sense of it; the sense-making of the data comes from that. We have an over 50-year pedigree as a company in looking at all forms of computing, stitching these pieces of computing together to make discrete and distinct applications that customers can use, and that we really enjoy delivering to the world. This is our mission, this is our purpose, and we've embraced it. In addition to computing, oftentimes, as we talked about, you see things like memory, storage, and hard drives; the entire stack of availability of data is going to play a role.
The entire solution of ingredients around data management, data storage, and data memory is going to be important in the upcoming future, and where and how you tier that storage and memory as it relates to the processing layers is an important part of that journey. Then ultimately, the Cloud. Let's not forget to talk about that. Whether you're talking hybrid, public, private, or hyperscaler, the clouds, the servers, all of this infrastructure is dependent on data: terabytes of data, petabytes of data, tremendous amounts of data. Making sense of all of that data is why we're here. That's it. That's a wrap on a little bit of Intel's view on data.