Okay, all right, here's the really fun part. We get to introduce a new tool to a lot of you called Cloud Dataprep, where you can take a lot of the best practices you've learned for cleaning up your data and execute them through a fun drag-and-drop user interface. Okay, we're going to explore some tools.

So, Cloud Dataprep is a web UI tool for building pre-processing data pipelines that's part of Google Cloud Platform. Behind the scenes, what it actually does is kick off a Cloud Dataflow job (remember that data engineering tool we talked about?) without you having to write any of the Java code that data engineers would normally have to write. So it's a UI tool that allows you to access a lot of these functions, much like you would do deduplications inside of BigQuery, and then, behind the scenes, it creates that flow for you.

Flow is the key first word to know. A flow is one of these visual maps, as you can see here: data being ingested, and then scripts that are run against it. You can create those scripts, which are called recipes, inside the UI, and this entire picture here is called a flow. So the data flows together as you're transforming it.

And how do you transform the data? Through transformation steps, which the tool calls wranglers. The wranglers include things that are very common and that map to SQL, except you're not writing any SQL: aggregations, deduplications, removing rows that satisfy a condition, or creating derived fields, much as you would in SQL. And you can chain multiple of these transformation steps together into what's called a recipe.

So three key terms, right? You have the flow itself, which is that very pretty visual map. Then you have the wranglers, which are those transformation steps. And then you have the recipes, which can be multiple wranglers chained into a repeatable set of steps that preprocess your data.

Okay, so we covered recipes. At the end of the day, if you want a recipe to actually run, you submit it as a job. The job then looks up the recipe and goes and fetches your entire data set, because when you're working inside of Cloud Dataprep, you're only working on a sample of your data set, up to 10 megabytes. You build the recipe on that sample for the sake of speed, and then the actual job runs against all of your data and fires off that Cloud Dataflow job behind the scenes, which you can actually access and look at, and which we'll cover in our lab review.

After that job has run, you can look at some pretty cool statistics, like how many rows of data were processed; in this particular screenshot here, it's over 300,000 rows. And what we haven't shown you yet are the really fun histograms that are built into the tool's visuals, where you can look at the frequency of values as well, right? And here's where you can track those jobs.
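To make that wrangler-to-SQL mapping a little more concrete, here is a rough sketch of what a few of those same steps might look like if you wrote them by hand in BigQuery SQL. The project, dataset, table, and column names are placeholders for illustration only; they are not the lab data, and a Dataprep recipe would do all of this for you without any SQL:

```sql
-- Hypothetical example only: table and column names are placeholders.
-- Deduplicate, filter out rows that match a condition, and add a derived field,
-- the same kinds of steps a Dataprep recipe chains together as wranglers.
SELECT DISTINCT                          -- deduplication
  filer_id,
  organization_name,
  revenue,
  revenue - expenses AS net_income       -- derived field
FROM
  `my_project.my_dataset.filings`
WHERE
  revenue IS NOT NULL;                   -- remove rows that fail this condition
```

The point of the tool is that each of those clauses becomes a click-and-configure wrangler step instead of hand-written SQL, and the whole chain is saved as a recipe you can rerun.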
All right, I'm going to show you a quick demo of the tool, and then I want to get you launched into your first lab, where you're going to be ingesting data. After that lab, we're going to go through creating a lot of those transformations.

So, just a quick tour of Cloud Dataprep. Here is one of the flows that you're actually going to be creating as part of the second lab. Let's find a data set so we can show you how cool it is to explore this data. So here are your organizational details, and here is your 2014-2015 filing information.

I'm going to click on one of these data sets, and a picture's worth a thousand words here. As soon as this loads, it pulls a 10-megabyte sample of the data into what's called the transformer in Cloud Dataprep. This is where you can spend a ton of your time. We'll cover more of this in the labs to come, but here you can see things like how many electronic filers there are. So imagine BigQuery, where you have that preview we saw before of some of the rows and columns, but complete with a histogram on top of each column showing how frequent those values are. It's just as if you ran a little SQL query with a GROUP BY, and it shows you that for every single one of these fields. Also, at the top, it shows you a horizontal data quality bar where it'll say, all right, out of this data set you have 7 valid values and 24 missing values, or values that don't match the data type you said the column should be. So this exploring view inside the transformer is very powerful; it's for exploratory analysis and looking for those anomalies in your data. You'll get a lot more into that in the second part of this lab, but first, you have to get your data into Cloud Dataprep.
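As a rough illustration, that per-column histogram is essentially a value-frequency count. If you wanted to approximate it by hand for one column, a hypothetical BigQuery query (again with placeholder table and column names) might look something like this:

```sql
-- Hypothetical sketch: count how often each value appears in one column,
-- which is roughly what the transformer's per-column histogram visualizes.
SELECT
  electronic_filer,              -- placeholder column name
  COUNT(*) AS value_count
FROM
  `my_project.my_dataset.filings`
GROUP BY
  electronic_filer
ORDER BY
  value_count DESC;
```

Dataprep builds that kind of profile for every column automatically on the sample, which is what makes the transformer view so handy for spotting anomalies before you run the full job.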