Welcome to Module 3, which focuses on portfolio diversification and how to maximize its benefits using machine learning techniques. Now, portfolio diversification is the so-called only free lunch in finance. What we mean by this is that the most efficient way to construct a portfolio is to try and diversify away unrewarded risk. In the end, no investor should be holding unrewarded risk. Holding rewarded risk is okay. Well, it's actually the only chance for us to generate extra return above and beyond the risk-free rate. But holding unrewarded risk is just not good. We want to diversify it away. Now, to achieve proper portfolio diversification, the challenge is to include in an investor's portfolio assets that tend to behave differently, ideally and especially in periods of crashes, when things go wrong. You may want to call that maximizing diversity. Now, in the early days, the most basic portfolio construction techniques were based on using correlation as a measure of diversity or proximity between assets or asset classes. So the goal was to try and find assets that exhibit low pairwise correlations. Now we understand the limitations of looking at correlation as a single measure of diversity or lack of diversity. It actually turns out that unsupervised machine learning techniques now offer much more powerful new insights and new perspectives on the subject. John will walk you through some of these techniques and how they are being applied to investment problems.
Thanks, Lionel. So let me first mention the idea of comparing supervised learning with unsupervised learning. We saw in the last module that you can use supervised learning in environments where there are labels. In particular, we've seen the results beforehand. In the area of unsupervised learning, we no longer have labels. So what we're looking for is interesting patterns in the data.
We want to learn something from the data and try to identify the patterns that occur, and we don't necessarily have labels for it. So there are many examples of this. One of the most popular areas that it's been applied to is the notion of identifying behavioral patterns. So if you go to a store and you buy something that's unusual, the machine learning approach used underneath by the credit card company will identify whether this is an unusual purchase or not. In particular, we're all used to the idea that you may get a call and have to answer certain questions; that is identifying fraud by using the notion of patterns that occur. Another example would be in the area of pricing algorithms for online shopping. When people are online, they're buying one product. There are ads and other ways to identify what you're interested in, and you might see that ad in place during the next segment of your time online. In our application, we're going to use these techniques of unsupervised learning to achieve this diversification, the maximum diversification. In particular, here we see a picture of the world, and we see categories of risk according to different countries. So when I go to France or Lionel comes to the US, we see green countries, and those are the safest countries in that context. When I go over to China for my project with the financial part of Alibaba, you see a slightly higher level of risk. Of course, in other areas in Africa and other parts of the world, there's much less safety and in particular much higher risk. The techniques that are used to identify these problems fall in the realm of unsupervised learning. Here's another example, which comes about in the area of banking. The approach here is to use what are called graphical networks, where we have nodes in this graph identifying banks, and the arcs, the connections between the nodes, represent the relationships between the banks.
In particular, the stronger the relationship, the larger the arc is, and the darker it is. Where there's much less of a relationship between them, we see no arcs. So this is another approach, which is used in social networking and other applications. It's traditional in social networking, but we're going to apply it in the area of finance to achieve diversification. In particular, we can see here in the next set of slides, the next set of graphs, that these patterns can change over time. We can have periods where there's a fair amount of diversification. Then as time goes on, as we get closer to a contraction or a crash, we see much higher correlation environments. So we can look at these graphs over time and we can identify how those patterns change, and in particular get a sense of how we might identify conditions for crashes. This is the area of machine learning, and this is what we're going to look at in this module.
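To make the graph idea a bit more concrete, here is a minimal sketch in Python. It is not the method used in the slides, just one simple way to build such a network: nodes are banks, an arc is drawn only when the absolute pairwise correlation of returns reaches a cutoff, and the share of arcs present is compared between a calm period and a stressed period. Everything specific here is an assumption made for illustration: the one-factor simulated returns, the names `bank_0`, `bank_1` and so on, the 0.5 threshold, and the helper functions themselves.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation between two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_graph(returns, threshold=0.5):
    """Return the arcs of an undirected graph as {(node_a, node_b): correlation}.

    Nodes are the keys of `returns`. An arc is kept only when the absolute
    correlation is at least `threshold` (an assumed cutoff, not from the lecture).
    """
    edges = {}
    for a, b in combinations(sorted(returns), 2):
        rho = pearson(returns[a], returns[b])
        if abs(rho) >= threshold:
            edges[(a, b)] = rho
    return edges


def density(edges, n_nodes):
    """Fraction of all possible arcs that are present in the graph."""
    return len(edges) / (n_nodes * (n_nodes - 1) / 2)


def simulate(n_assets, n_days, common_weight, rng):
    """Hypothetical returns: a shared market shock plus bank-specific noise.

    A larger `common_weight` means the banks move together more strongly.
    """
    out = {f"bank_{i}": [] for i in range(n_assets)}
    for _ in range(n_days):
        market = rng.gauss(0, 1)
        for name in out:
            noise = rng.gauss(0, 1)
            out[name].append(common_weight * market + (1 - common_weight) * noise)
    return out


rng = random.Random(0)  # fixed seed so the illustration is reproducible
calm = simulate(6, 250, common_weight=0.2, rng=rng)    # mostly idiosyncratic
stress = simulate(6, 250, common_weight=0.8, rng=rng)  # mostly common shock

calm_density = density(correlation_graph(calm), 6)
stress_density = density(correlation_graph(stress), 6)
print("arc density, calm period:  ", calm_density)
print("arc density, stress period:", stress_density)
```

In this toy setup the stressed period produces a much denser graph than the calm one, which mirrors the pattern described above. With real data, one could recompute the graph on a rolling window of returns and watch how the density or the average correlation evolves over time.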