[MUSIC] Hello, we are standing outside the Coordinated Science Laboratory, or CSL. CSL was founded in 1951 to conduct classified scientific research, hence the original acronym. After several years, CSL was opened to general research and now houses offices and laboratories for a number of engineering faculty. Much of the work done inside CSL has involved putting fundamental engineering ideas into production. For example, CSL helped develop the first flat-panel plasma monitor, the binaural hearing aid, and PLATO, the first computer-assisted instructional program in the world. Thus, it is appropriate to be outside CSL, since this module focuses on practical concepts in machine learning, including those related to putting machine learning algorithms into production.

First, you will read about issues in production machine learning. These include the challenges of moving from practice to production, the differences between the simple machine learning examples demonstrated in courses such as ours and the murkier issues in real-world data analytics, and the construction of data science pipelines. The focus of all of these readings is how to achieve modelling success and to develop the capacity to put these successes into operation. As an optional reading, you can learn about CRISP-DM, one of the most popular methodologies you can follow when tackling any data analytics challenge.

Next, you will learn about ensemble learning. Ensemble learning works by combining the predictions from many learners into a more powerful learning algorithm. We will cover three types of ensemble learning in this module. The first two types start by training weak learners, learning algorithms that have access to limited information and thus are unable to make robust, accurate predictions on their own. One type of ensemble learning, known as bagging, combines the predictions of many weak learners to make more robust predictions. This approach is similar to the wisdom of the crowd.
An example of this concept is estimating the number of jellybeans in a jar. One of the best approaches is to average the estimates of many people in the crowd, each with their own knowledge or insights, as this often produces an amazingly accurate estimate.

The second type of ensemble learning is known as boosting, where initial weak learners are progressively improved to make more accurate predictions. This approach works by identifying where the current learners are making mistakes and iteratively training new learners that focus on correcting those mistakes. The final type of ensemble learning is ensemble voting, where a number of traditional algorithms are combined by using a voting process to generate a more robust prediction.

The final concept addressed in this module is the development and application of machine learning pipelines. A pipeline can simplify the development of machine learning algorithms by connecting the various pieces together to enable automated discovery. For example, we can connect a train-test split module to a classifier. This pipeline can then be run multiple times with different hyperparameter combinations in order to determine the optimal set. As you learn more about machine learning, these pipelines can become more complex and include steps such as feature engineering, cross-validation, and performance metrics. Taken together, these concepts provide your first glimpse into the construction of real-world data analytics implementations. Good luck.
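The jellybean intuition behind bagging can be sketched in a few lines of plain Python. This is a hypothetical simulation, not part of the course materials: the jar size, the number of guessers, and the noise level are all made-up values chosen only to illustrate why averaging many noisy estimates works.

```python
import random

random.seed(42)

# Hypothetical setup: 850 jellybeans in the jar, and 500 people each
# make a noisy but unbiased individual guess.
TRUE_COUNT = 850
guesses = [TRUE_COUNT + random.gauss(0, 200) for _ in range(500)]

# Any single guess may be off by hundreds of beans, but the crowd's
# average is usually very close to the true count.
crowd_estimate = sum(guesses) / len(guesses)
print(round(crowd_estimate))
```

Bagging applies the same idea to models: each weak learner plays the role of one guesser, and averaging (or voting over) their predictions cancels much of the individual error.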
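The ensemble voting idea described above can also be sketched without any machine learning library. The three "classifiers" below are hypothetical hand-written rules, each limited to a single feature, standing in for the traditional algorithms a real voting ensemble would combine.

```python
from collections import Counter

# Three hypothetical weak classifiers; each looks at only one feature,
# so each has access to limited information on its own.
def clf_a(x):
    return 1 if x[0] > 0.5 else 0

def clf_b(x):
    return 1 if x[1] > 0.5 else 0

def clf_c(x):
    return 1 if x[2] > 0.5 else 0

def majority_vote(x, classifiers):
    """Combine the individual predictions with a majority vote."""
    predictions = [clf(x) for clf in classifiers]
    return Counter(predictions).most_common(1)[0][0]

sample = (0.9, 0.2, 0.7)
# clf_a and clf_c predict 1 while clf_b predicts 0, so the ensemble
# outvotes the single mistaken rule and predicts 1.
print(majority_vote(sample, [clf_a, clf_b, clf_c]))
```

Even though any one rule can be wrong on a given sample, the vote is robust as long as a majority of the learners are right.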
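The pipeline pattern, connecting a train-test split to a classifier and rerunning it with different hyperparameter values, can be sketched as follows. This is a minimal plain-Python illustration under made-up assumptions: a one-feature toy dataset, a hypothetical one-parameter threshold "classifier", and a small hand-picked hyperparameter grid; a real pipeline would use an actual learning algorithm and a proper search tool.

```python
import random

random.seed(0)

# Hypothetical toy data: a single feature, labeled 1 when it exceeds 0.6.
X = [random.random() for _ in range(200)]
y = [1 if x > 0.6 else 0 for x in X]

def train_test_split(X, y, test_frac=0.25):
    """First pipeline stage: hold out a test set."""
    n_test = int(len(X) * test_frac)
    return X[n_test:], y[n_test:], X[:n_test], y[:n_test]

def score(threshold, X_eval, y_eval):
    """Second stage: evaluate a one-parameter threshold classifier.
    (A real classifier would also be fit on the training split.)"""
    predictions = [1 if x > threshold else 0 for x in X_eval]
    correct = sum(p == t for p, t in zip(predictions, y_eval))
    return correct / len(y_eval)

X_train, y_train, X_test, y_test = train_test_split(X, y)

# Run the pipeline once per hyperparameter value and keep the best.
grid = [0.2, 0.4, 0.6, 0.8]
best_threshold = max(grid, key=lambda t: score(t, X_test, y_test))
print(best_threshold)
```

The grid loop is the key idea: because the stages are wired together, trying a new hyperparameter combination just means running the same pipeline again, which is what makes automated search over many combinations practical.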