Another important design consideration when we're designing our machine learning system is whether model training and prediction can be done on a fixed schedule, or whether they need to happen in real time. We make this distinction between scheduled and real-time operation both for model training and retraining, and for generating predictions. If our training and retraining is done on a scheduled basis, at fixed time intervals, we call it offline learning. If our model is trained in real time as new data comes in, dynamically updating or retraining the model on each new data point, we call it online learning. Likewise, if our model's predictions are delivered on a fixed time schedule, we call it batch prediction. If our predictions are generated in real time as a user sends a new data point to our model, we call it online prediction. Let's start by understanding the differences between offline learning and online learning. In offline learning, our model retraining is done on a fixed time schedule, usually on the order of weeks or months. Each time we retrain our model, we use all of our available historical data points and run multiple iterations of training. The benefits of offline learning are that it's generally considered easier to implement in production, and it's also easier to evaluate the performance of our model. Since we're using our full data set for each retraining, we can carve out a portion of that data set to use as a test set for evaluating our model's performance. One of the key challenges of offline learning systems is that they're slower to adapt to changes in the environment or in the distribution of our data. If the data feeding our model is changing rapidly and dynamically, it may take some time for our model to catch up to the changes in the distribution of our input data.
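As a rough illustration, the offline pattern can be sketched as a job that runs on a fixed schedule, retrains on the full history, and carves out a held-out test split for evaluation. This is only a minimal sketch: the "model" here is just a trivial mean predictor, and the function names are hypothetical.

```python
import random

def train_model(train_data):
    # Stand-in for a full training run over the entire historical data set;
    # here the "model" simply predicts the mean label it saw in training.
    mean = sum(y for _, y in train_data) / len(train_data)
    return lambda x: mean

def scheduled_retrain(history, test_fraction=0.2, seed=0):
    # Run on a fixed schedule (e.g. weekly): shuffle the full history,
    # carve out a held-out test set, retrain on the rest, and evaluate.
    data = list(history)
    random.Random(seed).shuffle(data)
    split = int(len(data) * (1 - test_fraction))
    train, test = data[:split], data[split:]
    model = train_model(train)
    # Mean absolute error on the held-out split
    mae = sum(abs(model(x) - y) for x, y in test) / len(test)
    return model, mae
```

Because the whole data set is on hand at every retrain, evaluating on a held-out split is straightforward, which is exactly the evaluation benefit of offline learning.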
Most current machine learning systems use offline learning and retrain their models on a periodic basis. In online learning, we continue to retrain our model as each new data point arrives. Generally, we retrain on the order of minutes or hours, using each new data point, or each small batch of new data points, exactly once to update the model. One of the key benefits of online learning is that it can effectively handle big data, where it's infeasible to retrain models on our entire data set; instead, we retrain iteratively, each time using a single new data point or a small batch of new data points. Another key benefit is that online learning can adapt very quickly to changes in the environment and in the input data. Since the model is continually retrained as new data comes in, it can quickly account for changes in our input data distributions. On the other hand, online learning can be harder to implement in production, and it's also more difficult to evaluate the performance of the model each time we train it. One example of an online learning system might be flagging spam on social media, where we have a high volume of new data coming in and spam keeps changing in shape and form, sometimes adversarially, as the people creating spam react to our model and try to stay ahead of it. In this use case, we need to continually update and retrain our model as new data comes in, so that it can keep up with changes in the distribution of our data. Another example of an online learning system might be a news site that delivers personalized recommendations to a user. Suppose the model behind the personalized recommendations on our news site has learned that I'm particularly interested in sports, and so most days when I go to the news site, I want to read sports stories.
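The per-point update idea can be sketched with a toy one-feature linear model trained by a single stochastic gradient step per incoming point. The class and parameter names here are hypothetical, and a real system would use a proper incremental learner rather than this toy.

```python
class OnlineLinearModel:
    # Toy online learner: one SGD step per incoming data point, after which
    # the point can be discarded -- we never need the full historical data set.
    def __init__(self, lr=0.1):
        self.w = 0.0  # weight
        self.b = 0.0  # bias
        self.lr = lr  # learning rate

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y):
        # Single gradient step on the squared error for this one point.
        err = self.predict(x) - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err
```

Streaming points through `update` lets the model track a shifting distribution within a handful of steps, which is the adaptivity benefit described above. The flip side is that there is no fixed held-out test set, so evaluation is typically done by scoring each new point before using it to update the model.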
Now, let's suppose there's a particular day when a major political event happens. When I open up the news site, I pull up a story about that political event. It's likely that the recommendation system is still feeding me recommendations about sports, given that it's been trained on historical data to recognize that I like reading sports articles. However, on this particular day, what I really care about is this event. Once I've read an article about it, I'm more likely to read follow-up articles about the event than to read about sports. If my model is not updated in an online learning setting, it might continue recommending sports articles to me, and it will take some time to catch up to the fact that on this particular day what I really want is more articles about this event. Let's now dive into the distinction between batch prediction and online prediction. When we run models that use batch prediction, we generate predictions on batches of observations on a recurring, fixed time schedule. The benefits of batch prediction are that we can leverage more efficient computation, such as vectorized or distributed processing, which can make generating predictions much more efficient. It's also easier to monitor drift in the data sets we're feeding to the model. The key downside of batch prediction is that predictions are not immediately available for new data sent to the model. Instead, we have to wait until a new batch is assembled and fed into the model before predictions for the data points in that batch become available. Batch prediction is typically used for recommendation systems with very large sets of historical data, as well as for something like demand prediction, where we're predicting demand for a certain product and can run on batches of data at a time.
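The batch pattern can be sketched as a scheduled job that scores every observation in one pass and writes the results to a lookup table, which the serving layer then reads from. This is a minimal sketch with hypothetical names; in practice the table would live in a database or cache.

```python
def run_batch_job(model, batch):
    # Scheduled job (e.g. nightly): score the whole batch in one pass and
    # return a table of precomputed predictions keyed by observation id.
    return {obs_id: model(features) for obs_id, features in batch}

def serve_prediction(prediction_table, obs_id):
    # Serving just looks up the precomputed result. Observations that
    # arrived after the last batch run have no prediction yet -- this is
    # the key downside of batch prediction.
    return prediction_table.get(obs_id)  # None until the next batch runs
```

Note how a brand-new observation gets no prediction until the next scheduled run, which is exactly the freshness trade-off described above.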
The alternative to batch prediction is online prediction, where we generate real-time predictions in response to requests from users. When a user requests a prediction by sending data to the model as input, the model runs in real time and immediately returns a result to that user. The key benefits are that we decrease latency and that predictions are immediately available to the user. The key challenges of online prediction are minimizing the latency of the model in generating predictions, and the fact that monitoring for model drift is more difficult, making it harder to identify when model performance is degrading or changing. One example of an online prediction system might be a translation app on our phone that's used for real-time translation. We don't want to wait until our model accumulates a batch of requests before generating its predictions; we want a translation available immediately, so a system like this might use online prediction. Likewise, autonomous vehicles would use online prediction to get results back from the model immediately. Another example of an online prediction system might be a model used by a food delivery service to give a user a time-to-arrival estimate for the food they've ordered. Again, in this case, when a user orders food, they want a time-to-arrival estimate immediately. They don't want to wait until the model has accumulated a batch, generated its predictions, and then returned them. This is a scenario where we'd likely use a model with online prediction: as each new user orders food, they can immediately receive a predicted time to arrival for their delivery.
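The online pattern can be sketched as a function that computes a prediction synchronously for each incoming request and tracks its latency, since latency is the key constraint. Both the time-to-arrival model and the latency budget below are hypothetical illustrations, not a real service's numbers.

```python
import time

def online_predict(model, features, latency_budget_ms=200.0):
    # Compute the prediction synchronously for this single request and
    # record how long it took -- latency is the key concern online.
    start = time.perf_counter()
    prediction = model(features)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return prediction, latency_ms, latency_ms <= latency_budget_ms

def eta_model(features):
    # Hypothetical time-to-arrival model: fixed prep time plus travel time,
    # assuming roughly 4 minutes of travel per kilometer.
    prep_minutes, distance_km = features
    return prep_minutes + distance_km * 4.0
```

Each incoming order gets its estimate immediately rather than waiting for a batch; in a real system the model call would sit behind an API endpoint, and the recorded latencies would feed a monitoring dashboard to catch degradation.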