In the last segment, we looked at a padding oracle attack that completely breaks an authenticated encryption system. I hope this attack convinces you that you shouldn't implement authenticated encryption on your own, because you might end up exposing yourself to a padding oracle attack or a timing attack or any other such attack. Instead you should be using standards like GCM or any other of the standardized authenticated encryption modes as implemented in many crypto libraries. In this segment, I'm gonna show you another very clever attack on an authenticated encryption system. And I hope after you see this attack, you'll be completely convinced not to invent and implement your own authenticated encryption systems, but instead always use one of the standard schemes, like GCM or others.

So this particular attack that I want to show you is an attack on the SSH binary packet protocol. SSH is a standard secure remote shell application that uses a protocol between a client and the server. It has a key exchange mechanism, and once the two sides exchange keys, SSH uses what's called the binary packet protocol to send messages back and forth between the client and the server.

Now here is how SSH works. Recall that SSH uses what we called encrypt-and-MAC. Technically, what happens is every SSH packet begins with a sequence number, and then the packet contains the packet length, the length of the CBC pad, the actual payload follows, then the CBC pad follows. Now this whole red block here is CBC encrypted, also with a chained IV, so this is also vulnerable to the CPA attacks that we discussed before. But nevertheless, this whole red packet is encrypted using CBC encryption. And then the entire cleartext packet is MAC-ed, and the MAC is sent in the clear, along with the packet. So I want you to remember that the MAC is computed over the plaintext packet, and then the MAC is sent in the clear. This is what we call encrypt-and-MAC.
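To make the packet layout concrete, here is a minimal sketch of the plaintext side of an SSH-style packet and the encrypt-and-MAC tag computed over it. The helper names (`build_plaintext_packet`, `mac_over_plaintext`), the choice of HMAC-SHA256, and the exact padding rule are assumptions made for illustration; the CBC encryption step is left out because it plays no role in how the tag is formed.

```python
import hashlib
import hmac
import os
import struct

BLOCK = 16  # AES block size in bytes


def build_plaintext_packet(payload: bytes) -> bytes:
    """Lay out: packet length (4 bytes) | pad length (1 byte) | payload | pad."""
    # Choose the pad so the whole packet is a multiple of the block size.
    # Assumption for this sketch: at least 4 bytes of padding are always added.
    pad_len = BLOCK - ((4 + 1 + len(payload)) % BLOCK)
    if pad_len < 4:
        pad_len += BLOCK
    # The length field counts everything that follows it.
    packet_length = 1 + len(payload) + pad_len
    return struct.pack(">IB", packet_length, pad_len) + payload + os.urandom(pad_len)


def mac_over_plaintext(mac_key: bytes, seq: int, packet: bytes) -> bytes:
    """Encrypt-and-MAC: the tag covers the sequence number and the PLAINTEXT packet."""
    return hmac.new(mac_key, struct.pack(">I", seq) + packet, hashlib.sha256).digest()


packet = build_plaintext_packet(b"ls -la")
tag = mac_over_plaintext(b"k" * 32, 0, packet)
# In encrypt-and-MAC, `packet` would now be CBC encrypted and `tag` sent in the clear.
```

Note that the tag depends only on the plaintext, so the receiver cannot check it until the whole packet has been decrypted.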
And we said that this is not a good way to do things, because MACs have no confidentiality requirements. And by sending the MAC of the cleartext in the clear, you might be exposing information about the cleartext. But this is not the mistake that I want to show you here. I want to show you a much more clever attack.

So first, let's look at how decryption works in SSH. What happens is, first of all, the server decrypts the encrypted packet length field only. So it only decrypts these particular first few bytes. Then it will go ahead and read from the network as many bytes as specified in the packet length field. It's gonna decrypt the remaining ciphertext blocks using CBC decryption. Then, once it's recovered the entire SSH packet, it will go ahead and check the MAC of the plaintext, and report an error if the MAC happens to be invalid.

Now the problem here is that the packet length field is decrypted and then used directly to determine the length of the packet before any authentication has taken place. In fact, it's not possible to verify the MAC at the point where the length field is used, because we haven't recovered the entire packet yet, and the MAC is computed over the entire packet. But nevertheless the protocol uses the packet length before verifying that the MAC is valid.

So it turns out this introduces a very, very cute attack. And I'm only gonna describe a very simplified version of this attack, just to get the idea across. So here's the idea. Suppose the attacker intercepted a particular ciphertext block, namely the direct AES encryption of the message block M. And now he wants to recover this M. And I emphasize that this intercepted ciphertext is only one block long. It's one AES block. So here's what the attacker is gonna do. Well, he's gonna send a packet to the server that starts off as normal. It basically starts off with a sequence number, and then he's going to inject his captured ciphertext as the first ciphertext block that's sent to the server.
Now, what is the server going to do? The server is gonna decrypt the first few bytes of this first AES block and he's going to interpret the first few bytes as the length field of the packet. The next thing that's gonna happen is, the server is gonna expect this many bytes before it checks that the MAC is valid. And so what the attacker is gonna do is, he's gonna feed the server one byte at a time. So the server will read one byte, and then another byte, and then another byte. Eventually, the server will read as many bytes as the length field specifies, at which point it will check that the MAC is valid. And of course, the attacker was just feeding the server junk bytes. And as a result, the MAC is not gonna verify, and the server will send a MAC error.

But you realize what happened here: the attacker was counting how many bytes it sent to the server. And so it knows exactly how many bytes were sent at the time that it receives the MAC error from the server. So that tells it that the first 32 bits of the decryption of its ciphertext C are exactly equal to the number of bytes that were sent before it saw the MAC error. So this is a very clever attack.

So let me say it one more time to make sure this is clear. Again, the attacker has a one-block ciphertext C that it wants to decrypt, and let's pretend that when C is decrypted, the 32 most significant bits of the plaintext happen to be the number five. In this case, what the attacker will see is the following behavior. The server is gonna decrypt the challenge block C and he's gonna obtain the number five as the length field. So now the attacker is gonna feed the server one byte at a time, and after the attacker feeds the server five bytes the server says, hey, I've just recovered the entire packet, let me check the MAC. The MAC is almost certainly invalid, and then the server will send a bad MAC error.
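The simplified attack can be simulated in a few lines. This is only a sketch under loud assumptions: `toy_encrypt` and `toy_decrypt` are a made-up 4-round Feistel construction standing in for AES so that the example runs with the standard library alone, `ToyServer` is a hypothetical server that behaves exactly as described above, and the secret block is chosen so that its top 32 bits are small enough to count quickly.

```python
import hashlib
import hmac
import struct

HALF = 8  # a 16-byte block is split into two 8-byte halves


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy cipher (NOT AES, illustration only).
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


class ToyServer:
    """Hypothetical server: uses the decrypted length before checking any MAC."""

    def __init__(self, key: bytes):
        self.key = key
        self.remaining = 0

    def receive_first_block(self, block: bytes) -> None:
        plaintext = toy_decrypt(self.key, block)
        # The flaw: the top 32 bits are trusted as a length, unauthenticated.
        self.remaining = struct.unpack(">I", plaintext[:4])[0]

    def receive_byte(self, _byte: int):
        self.remaining -= 1
        if self.remaining == 0:
            return "MAC error"  # junk bytes never carry a valid MAC
        return None


def recover_top_32_bits(server: ToyServer, captured_block: bytes) -> int:
    """Attacker: inject the captured block, then count bytes until the error."""
    server.receive_first_block(captured_block)
    sent = 0
    while True:
        sent += 1
        if server.receive_byte(0) == "MAC error":
            return sent


key = b"server-side secret key"
secret_block = struct.pack(">I", 5) + b"twelve bytes"  # top 32 bits are 5
captured = toy_encrypt(key, secret_block)
leaked = recover_top_32_bits(ToyServer(key), captured)
```

The attacker never touches `key`; the only signal it uses is when the error arrives. A length of zero is not handled in this sketch.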
So after five bytes are read off the network, the attacker is gonna see a bad MAC error, and then the attacker learns that the most significant 32 bits of the decrypted block are equal to the number five. So there. You just learned the 32 most significant bits of the decryption of C. So this is a very significant attack, because the attacker just learned 32 bits of the decrypted ciphertext block. And since he can apply this attack to any ciphertext block he wants, he can basically learn the first 32 bits of the decryption of every ciphertext block in a very long message.

So what just happened here? Well, there are actually two things that were wrong in this design. The first one is that the decryption operation is non-atomic. In other words, the decryption algorithm doesn't take a whole packet as input and respond with a whole plaintext as output, or with the word reject. Instead, the decryption algorithm partially decrypts the ciphertext, namely to obtain the length field, and then it waits to receive as many bytes as needed, and then it completes the decryption process. So these non-atomic decryption operations are fairly dangerous, and generally they should be avoided. In this example, this non-atomic decryption happens to break authenticated encryption. The other problem is that the length field was used before it was properly authenticated. And this is another thing that should never be done. A decrypted field should never be used before that field is actually authenticated.

So let me ask you, if you had the option of redesigning SSH, what is the minimum change that you would make so that SSH resists this attack? And let me tell you that multiple answers might be correct. One option is to send the length field in the clear, just as in the case of TLS. In this case, there's no opportunity for an attacker to mount a chosen ciphertext attack, because, well, the length field is never decrypted. And so there's no decryption taking place that the attacker can abuse.
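Here is a minimal sketch of what an atomic receiver with a cleartext length field might look like. Everything in it is an assumption for illustration: the names `seal_frame` and `open_frame`, the use of HMAC-SHA256, and the choice to compute the tag over the sequence number, the cleartext length, and the ciphertext. The actual decryption is omitted; the point is only that nothing is decrypted, and the whole frame is either accepted or rejected, before any field influences the receiver's behavior.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # HMAC-SHA256 output size


def _tag(mac_key: bytes, seq: int, header: bytes, ciphertext: bytes) -> bytes:
    return hmac.new(
        mac_key, struct.pack(">I", seq) + header + ciphertext, hashlib.sha256
    ).digest()


def seal_frame(mac_key: bytes, seq: int, ciphertext: bytes) -> bytes:
    """Frame layout (assumed): cleartext length | ciphertext | tag."""
    header = struct.pack(">I", len(ciphertext))
    return header + ciphertext + _tag(mac_key, seq, header, ciphertext)


def open_frame(mac_key: bytes, seq: int, frame: bytes):
    """Return the ciphertext if the tag verifies, otherwise None (reject)."""
    if len(frame) < 4 + TAG_LEN:
        return None
    (length,) = struct.unpack(">I", frame[:4])
    if len(frame) != 4 + length + TAG_LEN:
        return None
    header = frame[:4]
    ciphertext = frame[4:4 + length]
    tag = frame[4 + length:]
    # Constant-time comparison; decryption would happen only after this check.
    if not hmac.compare_digest(tag, _tag(mac_key, seq, header, ciphertext)):
        return None
    return ciphertext


mac_key = b"m" * 32
frame = seal_frame(mac_key, 7, b"\x00" * 32)
accepted = open_frame(mac_key, 7, frame)
tampered = frame[:10] + bytes([frame[10] ^ 1]) + frame[11:]
rejected = open_frame(mac_key, 7, tampered)
```

Because the length is read in the clear and is covered by the tag, a modified length only leads to a rejection, and the receiver never acts on a decrypted value that has not been authenticated.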
Replacing encrypt-and-MAC by encrypt-then-MAC doesn't have any impact, because this attack would apply either way. The problem is that the length field is used before it's authenticated, and that would have to happen either way. So a better mode of encryption doesn't actually help. Another option is to MAC the length field separately, so that now the server can read the length field, check that the MAC for just the length field is valid, and then it would know how many subsequent bytes to read before checking the MAC on the entire packet. The last option is actually one that works, but is terribly inefficient, and it would expose the server to a denial of service attack, so I'm not going to mark it as a valid answer.

So the main lesson to remember is, don't implement or design your own authenticated encryption system. Just use the standards like GCM. But if for some reason you can't use the standards, and you have to implement your own authenticated encryption system, then use encrypt-then-MAC. And make sure not to repeat the mistakes of the last two segments, namely don't use a length field before the length field is authenticated. And more generally, don't use any decrypted field before that field is authenticated.

Okay, so this is the end of our discussion of authenticated encryption. I wanted to point out a couple of papers on authenticated encryption that you could use as further reading. The first one is a very cute one on the order of encryption and authentication that talks about whether one should do encrypt-then-MAC or MAC-then-encrypt, and it shows that one is correct and one is incorrect. It's a good read and there's a lot of information in that paper. The second discusses OCB mode, which is a very efficient way of building authenticated encryption. In particular, it looks at a variant of OCB with associated data, as we discussed when we described OCB. The last three papers are attack papers.
The first one describes the padding oracle attack that we discussed in the last segment. This one here describes the length attack that we just described in this segment. And the last one describes a number of attacks on encryption schemes that provide only CPA security without adding integrity. So this last paper actually provides a number of good examples for why CPA security by itself should never, ever, ever be used for encryption. Remember, the only thing you're allowed to use for confidentiality is authenticated encryption. Or if all you need is integrity with no confidentiality, then you just use a MAC.