All right. Without further ado, let's take a quick demo of Cloud Dataprep and see how we can quickly and visually organize a lot of our eCommerce data and find insights. So, here we are back inside our Google Cloud Platform project. You'll practice this a lot more in your lab, but if you scroll all the way down, under Big Data, where you found BigQuery before, at the bottom we have Cloud Dataprep. Opening that up, you'll have to accept a one-time Terms of Service and a few other pop-ups, and then, after Cloud Dataprep initializes, you'll be taken to the flows, which are your data preparation pipelines.

Now, I already have a flow that I created here, but let's pretend we didn't have one of these. If you look at all the flows I have, there are quite a few, again just from playing around with the different pipelines, but let's start from scratch. So, under Flows, I'm going to create a flow and call it eCommerce exploration. After you've done that, the first thing you need to do is add your dataset so you can explore it. Clicking on Add Datasets, I'm going to import a dataset. There are a couple of different ways, of course: you can bring in a CSV or another file type, or pull from Google Cloud Storage, but all of my data is going to live inside of BigQuery. Within here, the table I want to look at is the revenue transactions table. It can actually be previewed so you can see what it looks like; this must be a new feature they just added, and that's pretty cool. So, much like previewing a table in BigQuery, you can preview the transactions before you bring them in. It actually has about 57,000 rows and 28 columns. Again, much like in BigQuery, where you had information about the table before you brought it in, that metadata is available there for you. So, I'm going to click the plus sign on that, it's going to bring this in as a new dataset, and I'm going to click the magical button, Import and Add to Flow.

Once you have that as part of your flow, this is the visual pipeline that we're going to be creating. From this raw dataset, we're going to add transformation recipes. But first and foremost, we want to explore the dataset. So, this is our raw data. I'm going to click on it, click Add New Recipe, then Edit Recipe, and this will bring you into what's called the transformer view. Behind the scenes, what Dataprep is doing is loading a sample; right now, at the time of this recording, it's about ten megabytes that gets loaded, and that's the big caveat. It didn't load a million records right here into the web UI. It's just loading a small sample of your data, and then it helps you get a feel for some of the statistics, the frequency of values, and the quality of your data. So, if you've been working with SQL for a long time, seeing a visual representation of a lot of your values is a great tool. For example, you've previewed the data, but now, inside the transformer view, you get access to these histograms, which show frequent values. For the data that we have, we pulled in only about 10,000 rows; if you remember, there are about 56,000, so that's roughly one-fifth. So, take a look at some of the enumerated fields, like the channel grouping. If you weren't familiar with the different channels you had there, how could you get that in SQL?
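(As a quick aside, here's a hedged sketch of what that query could look like in BigQuery Standard SQL. The project, dataset, and column names below are assumptions based on this demo dataset, so treat them as placeholders rather than the exact names from the lab.)

```sql
-- Sketch: frequency of each channel grouping, computed manually in SQL.
-- Table and column names are assumptions for this demo dataset.
SELECT
  channelGrouping,
  COUNT(*) AS visit_count,
  ROUND(100 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS pct_of_total
FROM `your-project.ecommerce.rev_transactions`
GROUP BY channelGrouping
ORDER BY visit_count DESC;
```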
Yes, you could do a SELECT COUNT(*) of the visitors, GROUP BY channel grouping, ORDER BY it descending; you'd absolutely get that answer, and I've done it many times. But being able, within a few clicks, to just hover over the channel grouping and see those values: 49 percent came in through referral, 27 percent came to your eCommerce website directly (referral, again, being some referring source like a link), and 18 or 19 percent found it through organic search. Not only do you see the different values, you also get paid search, display ads, social media, and then affiliates at 0.01 percent. You get a feel for the values and the frequency of occurrence, and that's just one field.

So, let's take a look at something like a date field. Again, this is the revenue transactions table. Even without visualizing this in any other tool, you can quickly see which are your most popular dates for transactions. Going up here to the top one, you can see, as you might expect for the holidays, a spike around December 2016 (and again, this dataset is just one year of data); that's a quick insight. Then one of the lower values is September 2016, and, as you might expect, August 2017 may not have had a full month of data loaded in. You can generally see that visual trend. Again, the caveat here is that this is just a sample, about one-fifth of the data, that's been loaded in.

You can look at other fields that have been loaded, like the type of hit; all of them are associated with pages. Have there been any refund amounts? Not in the demo dataset that we have here. What's the price for the different products? What are the different product names that we're familiar with? Here is one of the queries we did before in an earlier lab: what are the different numbers of products associated with transactions that belong to a particular category? Here, we have 34 percent belonging to Nest, then apparel at 23 percent, 12 percent that are not set, 10 percent associated with office, then drinkware, lifestyle, bags, and so on and so forth. And here, if you remember, there's that product item placeholder, the dynamic value, that's been set there as well. It's a very quick and easy visual way to see a lot of the frequent values there too.

For numerics, take transaction revenue. Again, for this demo dataset, these are completely mocked-up numbers, but we can still get some interesting statistics. You can hover over and look at bucketized values: how many transactions fell between zero and, in this particular case with fake data, a hundred million. At the low end you have 35 percent, and then in the longer tail you have some very high dollar-amount transactions. How do you see what the average is without having to run an aggregate query? You can click on the drop-down here, go into the column details, and take a look at some of the statistics: the median value, the maximum transaction value, the lowest value, the average transaction value, the standard deviation. One very useful field is mismatched values. If you had a string or something else show up in this field instead of a numeric value, you'd easily see that represented here, basically saying, "Hey, 99 percent of these are numbers and then a string is thrown in here; take a look at that particular value."
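(For reference, those column details map to ordinary aggregate functions. Here's a minimal sketch of how you might compute the same statistics, plus a mismatched-value count, in BigQuery Standard SQL; the table and column names, and the assumption that the revenue field arrives as a string, are illustrative rather than taken from the lab.)

```sql
-- Sketch: the kind of column statistics Dataprep surfaces, computed manually.
-- Table/column names (and the string-typed revenue field) are assumptions.
SELECT
  MIN(SAFE_CAST(totalTransactionRevenue AS NUMERIC))    AS min_revenue,
  MAX(SAFE_CAST(totalTransactionRevenue AS NUMERIC))    AS max_revenue,
  AVG(SAFE_CAST(totalTransactionRevenue AS NUMERIC))    AS avg_revenue,
  STDDEV(SAFE_CAST(totalTransactionRevenue AS NUMERIC)) AS stddev_revenue,
  APPROX_QUANTILES(
    SAFE_CAST(totalTransactionRevenue AS NUMERIC), 2)[OFFSET(1)] AS approx_median_revenue,
  -- "Mismatched values": rows where the field can't be parsed as a number.
  COUNTIF(totalTransactionRevenue IS NOT NULL
          AND SAFE_CAST(totalTransactionRevenue AS NUMERIC) IS NULL) AS mismatched_values
FROM `your-project.ecommerce.rev_transactions`;
```

The SAFE_CAST calls are what stand in for Dataprep's mismatch check: anything that can't be converted comes back NULL instead of failing the query, so you can count those rows separately.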
So, it's a very good tool not only for exploration, but for cleansing as well. Let's go back to our grid view and pull open a few more. One of the things you can do, of course, which you're going to explore inside the second lab that we're going to cover, is actually transforming these fields and then cleaning them through the use of these recipes. As a quick preview, you can add a step in your transformation pipeline, doing things like deriving different fields and applying different formulas; pretty much anything you might imagine is going to be available through here. You can change the types of columns, drop columns that are no longer needed, and add calculated columns; anything you can imagine doing through SQL, you can do through this web view here, and we'll practice that more in the next lab.
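(To connect that back to SQL one more time, here's a hedged sketch of what a few of those recipe-style steps, changing a column's type, deriving a calculated column, and dropping columns you no longer need, might look like as a single query. The field names, and the idea that the raw revenue value needs scaling, are assumptions for illustration only.)

```sql
-- Sketch: recipe-style steps expressed as SQL: cast a type, derive a
-- calculated column, drop unneeded columns. Names here are illustrative.
SELECT
  * EXCEPT (totalTransactionRevenue, unneeded_column),                    -- drop columns
  SAFE_CAST(totalTransactionRevenue AS NUMERIC)           AS revenue,     -- change column type
  SAFE_CAST(totalTransactionRevenue AS NUMERIC) / 1000000 AS revenue_usd  -- derived column
FROM `your-project.ecommerce.rev_transactions`;
```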