Hi everyone. Today we will learn type unification. The relevant part in the textbook is Chapter 24: Type Inference. We define this new language called TIFAE in the previous lecture, which has this type of notation with question mark. The question mark is not really a type, it's just a place holder to represent that there might be a type, but the programmer did not provide it, and we'd like to infer the type. That's what the question mark means. When we implement that question mark, we are going to make it as an optional type. We added one more case for type, which is varT. This varT is for type variable, and its field is an optional type. Meaning that in the beginning, when we say the question mark, we have no idea what it is. That's going to be none. When we have some hints about this type variable, we're going to get that hint ins that option type. To understand what I just said, we will see some examples. Again, this varT has a field of type option type. In the beginning, it's going to be none. When we have more information, it's going to be some, so this field should be variable This should be var ty. Let's see some examples. First, when we have some question mark, we have no idea, so we assign a type variable T. When we have valT, it says VarT None. Yeah, when we have some information like T should be a number type, NumT, then we can keep that information like this VarT of Some NumT. This is how we keep this information. When we have no information, it's going to be a varT of None. When we have some information, it is going to be VarT of the Some that type. So far so good? Okay. Then what if we have some information in sequence like this? Let's say Tao_1 is what type variable, so it's going to be T_1, which is varT None. Then we say, oh, you know what? Tau, T is some type variable T_2, but that should be the same as Tao_1. Then we can describe that like t1 is VarT None, and t2 is VarT t1. How about T_1? Has this information that it should be numT and Tao_2 is Tao_1. Then again, t1 is going to be Some numT and t2 is going to be Some t1. That simple. This is how we represent question mark in our Scala implementation. Then when we have VarT Some t, you can guess that it could be varT Some, Some t. There could be some wrapping of information, so we'd like to get rid of all those redundant records by defining this helper function called resolve. Resolve takes a type and if it's not type variable, return it as it is. If the input is a type variable, then look at that. If it's VarT None, that's the bottom, so we just return it. If it's VarT of Some type, we get rid of this wrapper, and call this resolve function recursively with the type T. It's simple. Then how can we use that? When we have this Tau is T, which is VarT None, resolving this T just remains the same. How about when we have this T is, Tau is numT. If we resolve this type, we are going to, one, we are going to get rid of this part, and resolve t. We're going to get rid of thing, and what do we get? Can you guess? It's going to be just NumT. Next question. When tau1 is VarT, None and tau2 is VarT, Some, t1, what's going to happen? What do you think we need to do. When we call resolve on t2, it's going to be again, get rid of this part and resolve t1. What is to resolve t1? You know the answer. It's going to be t1. Last question. When do we have this, tau1 is NumT and tau2 is again T1, what's going to happen? What I just did with the previous example is going to be the same thing with this one. You can do this by yourself. When we call resolve T2, we are going to get rid of VarT, Some, we're going to call reserve function on T1, which is going to call resolve one more time, and get rid of this VarT, Some, and what do we get? NumT. That's how it works. Then type unification. When we have these equations are in type variables, we'd like to solve them to infer the types. Our method to do that is unification. Type unification, and let's see what it is. For a given two types, tau1 and tau2, we first resolve them, meaning that we'd like to simplify those two inputs. We do not want to have those repeats, VarT, Some like that. First we resolve two types, tau1 and tau2, and then let's see what cases are possible. First, at least one of them is a type variable, that's the one case, or both are not type variables. One more time. When you have two types, tau1 and tau2, we can consider two possibilities. When at least one of them is a type variable or not. What do we mean by or not? That's both a type variables. You can see the case. You can see these two possibilities. Then when one of them is a type variable, there are again two possibilities. Because we have tau1 and tau2, and we said that at least one of them is a type variable, so tau1 is a type variable, or tau2 is a type variable. What if both are type variables? Both cases can handle them. First again. Then in the previous case, we said that there are two cases. and in the first case, we have two cases, meaning that we have 1, 2, 3 cases. Let's fled the structure. Now we have three possibilities. When tau1 is a type variable, when tau2 is a type variable, or both are not type variables. We are going to investigate each case. First, when tau1 is a type variable T, if tau2 is also type variable T, they are the same, then we're done succeed because the goal of type unification is to make two types equal. We try to unify those two types. If tau1 and tau2 are already the same type variable T, we're done. Otherwise, if tau1 is a type variable T and this tau2 is different from tau1, then we can say that tau1 is tau2. Not always, except that tau1 occurs in tau2, then we cannot resolve this equation as we discussed in the previous lecture. If you do not remember that, go back to the previous lecture, and watch that again or read the textbook. We do not want T to occur in tau2. The rule is this, when we unify two types, tau1 and tau2, and tau1 is a type variable T, then if tau2 is already the same, we're done. If tau1, T or currency and tau2, we cannot make them the same, we can make them unified, fail. Otherwise, we say that tau1, which is a type variable T, is going to be the same with tau2. If I said, good. How about the second case when tau2 is a type variable T? Then swap tau1 and tau2 and do the same thing. We will see this in the Scala implementation in a minute. When both are not type variables, they are actually easy. They are both num, then we are good. If they are both zero types, then it all types have primary types and return types. In order for two arrow types to be unified, we need to check primary types are unified and return types are unifiable. Finally, when one is number type and the other is an arrow type, then we can make them unified. It's going to be a failure. This is the entire algorithm of type unification. Are you with me? Good. Let's see some examples. We will revisit the algorithm for each case using concrete examples. Remember the first case is when power is a type variable T. Let's look at this concrete example. When we have a function call and x's type is this arrow type wise time is T. Then we need to unify the primary type of x and the type of y, which are already the same. We're done. That's the first case. Very good. How about the second case? When we have this function call. Again, we say that the arguments type T and the primary type of this function part should be the same, but we can not make these the same because T occurs in this arrow type. It's a failure. Finally, when we have a type T, and when we have some concrete primitive basic type, it doesn't need to be that, but when you have a type variable and a type, then we need to unify them, meaning this type variable should be a number type, and we're done. Look at that. We added one more equation. We saw this first case with concrete examples. That's good, and this is how we implement that. One more time. We are reading this multiple times. We are unifying two types, T_1 and T_2. Do you remember what we did first? Yes. We first resolve them to make them simple. The first case is when tau1 is a type variable. If tau1 and tau2 are already equal, we're done. Succeed. If tau1 occurs in tau2, we cannot make them the same, so it's a failure. It's an error. Otherwise, we say that, hey, tau1, you should be unifiable with tau2. This is how we say that. T_1 tau field will have some of tau2, this is how we implement the algorithm in the previous slide. Then, what was the second case? When tau2 is a type variable, then swap, and do the same thing. Really? Yeah. Really, we can just do the same thing when tau2 is a type variable. See whether tau2 and tau1 they already the same. Otherwise, check whether tau2 occurs in tau1. Otherwise make them unifiable. This is just the same. We do not want to have these many lines for that, so just swap them. Unify T_2 and T_1. Then we're going to go back to the previous case, and do the same thing. Isn't it cool? Yeah, it's cute. Finally, when both are not type variables, it's clear. If both are number type, we are done. If both are arrow types, we're going to check parameter types and return types. Otherwise, failure. This is how we implement that when two types are not. When both are not type variables, then they should be either number type or an arrow type. If both are number types, they are already the same. We are good. If they are arrow types, we are going to unify parameter types and return types. When one is a number type and the other is an arrow type, error. We cannot make them the same. We've used two helper functions. We already saw the resolve function before. What is the occurs function? You could guess. We checked whether this type variable occurs in this type. The t1 is VarT, and t2 is any type. Then, this t2 is going to be any type, meaning either, NumT, ArrowT, or VarT. If t2 is NumT, there is no way for a type variable to occur in a number type, so it's false. How about arrow type? Does the type variable t1, occur in an arrow type? An arrow type has two sub types, I mean, two types in its component; primary type and return type. We need to check whether t1 occurs in the parameter type or in the return type. Finally, if Tau 2, t2 is a VarT, then we need to check whether they are equal. What do we mean by a type variable t1 occurs in a type variable t2 when they are the same. So far so good, great. There are several frequently asked questions about the unify algorithm. First, the unification algorithm doesn't seem too right. How about this? How about that? When I teach this algorithm to my students, they asked various questions. I love them. You may have more questions. I challenge you to tackle this algorithm. Then that's going to help you understand this algorithm better. Second question, what if the type of the whole program is a type variable? Is it possible? What if the type of the whole program is a type variable? It could be thrown exception. When the exception is thrown, what's the type of it? It tests what? Type error, I mean, not really type error, exception. It's going to be some Alpha type. What is that? We are going to discuss that with parametric polymorphism in some lecture later. Finally, what does the occurs function mean? We just saw that. Occurs type variables Tau 1 and type Tau 2, meaning that whether this type variable Tau 1 occurs in Tau 2. We tried to check that because we cannot support recursively defined types in this type unification. When we have an advanced type system, we may be able to do that. If you are interested in that, you may want to look up the literature. Here's quiz. Consider the following code. The occurs function that we just saw. Which of the following is correct for the last case, when Tau 1 is a type variable and Tau 2 is also a type variable. What do we mean by occurs here? When to type variables are the same, the type variable t1 occurs in the type variable t2. Thank you.