Welcome to Designing Adaptable ML Systems, the third module in the second course of the Advanced Machine Learning on GCP specialization. In this module, we'll learn how to recognize the ways that our model is dependent on our data, make cost-conscious engineering decisions, know when to roll back our model to earlier versions, debug the causes of observed model behavior, and implement a pipeline that is immune to one type of dependency.

In the 17th century, John Donne famously wrote that no man is an island, by which he meant that human beings need to be part of a community to thrive. For similar reasons, it has also long been the case that few software programs are islands. In software engineering terms, we would say that few software programs adopt a monolithic, island-like design. Instead, most software today is modular, which is to say that most software today depends on other software. One of the reasons that so many programs are modular is that modular programs are more maintainable. Modular programs are easier to reuse, test, and fix, because engineers can focus on small pieces of code rather than the entire program.

Though there are many benefits to modular design, there are still complications. Modular programs are, by design, dependent on other software. As more and more software was developed, the dependency trees within modular programs became large, and for a time this proved to be a cost that offset the benefits of modularity. Imagine having to find and install hundreds of libraries on your computer every time you wanted to use a new library, or being unable to reproduce results and not knowing why. There was even a name for this problem; developers called it dependency hell. Thankfully, the tooling caught up to developer needs, and these days managing dependencies is much easier because of tools like Maven, Gradle, and pip. The reason these tools have been so helpful is that they let us specify exactly which versions of the libraries our programs depend on (there's a short example of this kind of version pinning at the end of this passage). Dependency management became mostly a matter of declaring our software's dependencies and recreating the same environment, and the result is that we now know precisely which code paths will be executed at run-time.

Containers are another piece of technology that makes it easier to manage dependencies. A container is an abstraction that packages apps and libraries together so that applications can run on a greater variety of hardware and operating systems, which ultimately makes hosting large applications easier. To learn more about Kubernetes, Google's open-source container orchestration software, check out the Getting Started with Google Kubernetes Engine course.

But what if we had no way of specifying a particular version of a library and instead had to rely on finding similar libraries at run-time? What if, furthermore, someone else got to choose which version got run, and they didn't know or particularly care about our program? Then we'd have no way of knowing what the run-time behavior would look like. Unfortunately, this is precisely the case for ML, because the instructions that will be run at run-time, for example the model weights, depend on the data that the model was trained on. Additionally, similar data will yield similar instructions. And finally, other people, including other teams and our users, create our data.
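To make the idea of pinning dependency versions concrete, here is a minimal sketch of how it might look with pip. The file name, the particular libraries, and the version numbers are illustrative assumptions rather than anything specified in the course.

```
# requirements.txt -- each library is pinned to an exact version,
# so recreating the environment yields the same code at run-time.
numpy==1.26.4
pandas==2.1.4
scikit-learn==1.4.2

# Recreate the environment from the pinned file:
#   pip install -r requirements.txt
```

Because every version is stated explicitly, anyone who installs from this file gets the same libraries, and therefore the same run-time behavior, as the original author.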
Just as in traditional software engineering, mismanaged dependencies can be expensive: say, code that assumes one set of instructions will be called when another ends up being called instead. Your model's accuracy might go down, or the system as a whole might become unstable. Sometimes the errors are subtle, and your team may end up spending an increasing proportion of its time debugging. The good news is that, with a better understanding of how to manage data dependencies, many problems can either be detected quickly or circumvented entirely.
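As one illustration of what "detected quickly" might look like in practice, here is a minimal Python sketch that compares simple summary statistics of incoming data against statistics recorded at training time and flags any feature whose mean has shifted. The function names, the threshold, and the use of a mean-shift check are assumptions made for this example, not the specific approach taught in this module.

```python
import numpy as np

def training_stats(train_features):
    """Record per-feature means and standard deviations at training time."""
    return {"mean": train_features.mean(axis=0), "std": train_features.std(axis=0)}

def flag_drifted_features(stats, new_features, threshold=3.0):
    """Flag features whose new mean lies more than `threshold` training
    standard deviations from the training mean (a simple z-score check)."""
    new_mean = new_features.mean(axis=0)
    z = np.abs(new_mean - stats["mean"]) / (stats["std"] + 1e-9)
    return np.where(z > threshold)[0]

# Hypothetical usage: both arrays have shape (num_examples, num_features).
train = np.random.RandomState(0).normal(size=(1000, 3))
serving = train.copy()
serving[:, 1] += 5.0                             # simulate an upstream change to feature 1
stats = training_stats(train)
print(flag_drifted_features(stats, serving))     # -> [1]
```

A check like this runs cheaply on each batch of incoming data and surfaces an upstream data change before it silently degrades the model's predictions.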