Welcome to another week. This week: deep learning. The thing you've been waiting for, the thing everybody is talking about. But I'm going to keep you waiting a little bit longer, because before we get to deep learning, we're going to be looking at merging data tables: merges and joins. And before we get to that, I want to talk about a feature of H2O that is so useful you'll wonder why I didn't introduce it back in week one.

Do you remember this chart? We looked at it a couple of weeks ago when we were talking about training, validation, and overfitting. We have the training curve, which just keeps getting lower and lower. And then we have the validation curve, measured on the data the learning algorithm hasn't seen yet. It comes down too, though it's never as good as training; then it flattens out, and then it starts going up. As soon as it starts going up, our model has overfitted, which is bad. So what we need to do is constantly watch this chart as our model is learning, and as soon as we reach the flattening-out bit, our best model, we need to stop the learning, take that model, and go and use it. Doing that by hand will drive you crazy; it will test your sanity to the limit. And any time someone says a task is mind-numbing, your first thought should be: how can the computer automate it? That is exactly what early stopping is. I'm introducing it this week because deep learning uses early stopping by default, but it is available in all the other algorithms.

Let's go back and look at our example from week two: GBMs. You remember this code? It's a default GBM. We then added a couple of lines to get overfitting: we increased the trees to 1,000 and max_depth to 10. And it did overfit, badly. We were looking at MAE, and from the best model to the final model the MAE got about 200 worse.

To add early stopping to this model, this is the line we add: stopping_rounds, which I've set to 4 here. If stopping_rounds is 0, early stopping is off; that's how you switch it off. To explain what the 4 means, I'm going to write out a couple of the defaults explicitly. I'm happy with the defaults here, but it makes things easier to explain. First we have stopping_metric, the metric H2O will use to decide when to stop learning. The default is AUTO, which on a regression problem means mean squared error. I could change it to MAE to match week two, but it doesn't really matter. Then we have stopping_tolerance, whose default is 0.001, i.e. 0.1%. Taken together they mean: average the score over the last four scoring rounds, compare that to the average over the previous four scoring rounds, and if the model hasn't got at least 0.1% better, stop learning and use the current model. The idea being that if we carried on any further, we would start overfitting. And that's it.

Now, I don't want you to worry too much about these exact numbers; don't over-tweak them. I might change the stopping metric from MSE to MAE. I often reduce the tolerance from the default to 0, which means stop as soon as the model stops improving at all. But that's going to make a difference of maybe four or five trees, maybe a difference of one in your MAE. The point of using early stopping is to save us the 960 or so trees that the overfitted run would otherwise build, and to save us the 200 MAE that the overfitted model was worse by. Generally, you should always be using early stopping, just like you should always be using a validation dataset or cross-validation.
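To make that concrete, here is a minimal sketch using H2O's Python API. The stopping_rounds, stopping_metric, and stopping_tolerance parameters are H2O's actual names for the settings discussed above; the dataset file, target column, and split are placeholders I've made up, not the course's week-two example.

```python
# Minimal sketch: adding early stopping to an overfitting GBM.
# "my_data.csv" and the "target" column are hypothetical placeholders.
import h2o
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()

data = h2o.import_file("my_data.csv")
train, valid = data.split_frame(ratios=[0.8], seed=123)
y = "target"
x = [c for c in data.columns if c != y]

model = H2OGradientBoostingEstimator(
    ntrees=1000,               # deliberately too many, as in the week-two example
    max_depth=10,              # deep trees, to encourage overfitting
    stopping_rounds=4,         # 0 switches early stopping off
    stopping_metric="MAE",     # default is "AUTO" (MSE for regression)
    stopping_tolerance=0.001,  # the default: demand at least 0.1% improvement
)
model.train(x=x, y=y, training_frame=train, validation_frame=valid)

# The model summary shows how many trees were actually built; with early
# stopping it should be far fewer than the 1,000 we asked for.
print(model.summary())
print(model.mae(valid=True))
```

Early stopping needs something to score against, which is why a validation_frame is passed to train(); with cross-validation (nfolds) H2O scores against the folds instead.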