[MUSIC] Remember that learning data is the stuff that we feed into a learning algorithm in order to produce a question answering machine. But how do you find learning data? It turns out you can pull in lots of data from many sources for the learning. By the end of this video, you should have a good sense of potential sources for learning data and how it must connect to your operational data.
Imagine we want to build a machine learning system to detect cancer. The question the QuAM would answer would be, does this patient currently have cancer? Of course, to answer a question about a specific patient, we need to have some data on that patient. So let's talk more about where this data might come from. You'll probably have demographic information such as age and gender. You'll know the results of some common medical tests, like white blood cell count. For some cases, you could have more detailed tests, like the results of a biopsy. In fact, this might be the source of the label for your learning data. Then you can build a QuAM that looks at the demographic information and common test results to predict cancer.
Let's look at another example. Say you're a farmer from central Alberta. The question we want to answer now is, how much fertilizer should I use? The data you might use for such a problem can be information about your specific field, information about weather projections, and some Statistics Canada data. There are many publicly available data sources about topography and historical performance. You can use the data from a variety of sensors, such as combine sensors, field sensors, and water sensors at the quadrants of your field. You can use a combination of the sensors you already have, and you can also add new sensors. The point is, it can be useful to use a combination of your own private data and data that's publicly available. Later in this course, we'll discuss how to combine different data sources.
When sourcing your data, a very important thing to keep in mind is that learning data and operational data will need to be in the same format. For the Alberta farming example, you could source learning data from Statistics Canada or the Alberta provincial government. The data doesn't have to be from your field. It can be from whatever fields you can get access to in order to build your QuAM. However, you have to make sure this other data is based on features that will be available for your operational data.
In some cases you can buy data sets. If you choose to go this direction, keep in mind that these are sometimes synthesized data. You can use them as part of your learning data, but they might skew your results. So you don't want to use them blindly and just assume they'll be relevant to your problem. You can build your question answering machine using learning data that comes from a different source, but you may have to adjust for it. In a future video, we'll go into a lot more detail about bias and learning from different distributions.
The main takeaway is that you can source learning data from historic data that you have access to, from publicly available data sets such as government sources, and from your own data. So even if you don't have a lot of operational data, all is not lost. You just have to be careful how you make use of your data.
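To make the point about matching features concrete, here is a minimal sketch of checking which columns of an outside learning data set will also be available in your operational data. The column names and the helper `usable_features` are hypothetical, made up for illustration; they are not from any real Statistics Canada or Alberta government data set.

```python
def usable_features(learning_columns, operational_columns, label):
    """Split learning-data columns into those also present in the
    operational data and those that are missing from it.

    The label column is excluded, since it is what the QuAM predicts
    and is not expected to be present at operation time.
    """
    operational = set(operational_columns)
    candidates = [c for c in learning_columns if c != label]
    shared = [c for c in candidates if c in operational]
    missing = [c for c in candidates if c not in operational]
    return shared, missing


# Hypothetical columns in a publicly sourced learning data set.
public_columns = [
    "soil_moisture",
    "rainfall_mm",
    "elevation_m",
    "satellite_index",
    "fertilizer_kg_per_ha",  # the label we want to predict
]

# Hypothetical columns we can actually measure on our own field.
my_field_columns = ["soil_moisture", "rainfall_mm", "elevation_m"]

shared, missing = usable_features(
    public_columns, my_field_columns, label="fertilizer_kg_per_ha"
)
print("Features safe to learn from:", shared)
print("Features not available operationally:", missing)
```

In this sketch, any feature that shows up as missing would either have to be dropped from the learning data or collected operationally, for example by adding a new sensor.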