As a warm up, we're going to look at run-length encoding, which is actually an effective method in many applications, a simple method that's very effective. And it's based on the idea that it's very often the case in a bitstream that you have long runs of repeated bits. So in this case there's a long run of 0s followed by a medium-length run of 1s, another medium-length run of 0s, and then more 1s. So there's 40 bits, but only switches from 0 to 1 three places. And so what you can do is, rather than write out all the bits, we can write counts to represent alternating runs of 0s than 1s, very simple method. So in this case, there's 15 0s, 7 1s, 7 0s, and 11 1s. And with 4 bits to represent each one of these counts, we can just write 15, 7, 7 and 11, to get instead of 40 bits, get down to 16 bits. That's effective when there's long runs of 0s and 1s in a bitstream. Now, I have to decide how many bits to use to store the counts. It's not necessarily a good idea to use a 32 bits and maybe 4 is too small. So, in our code we use 8 bits, so that'll handle runs up to 256 bits. We used 4 in the example above, but in a realistic thing it's fine to use 8. And then if we have longer runs, then we have to figure out what to do. If the run-length's bigger than the max count, well, we can just intersperse runs of length 0. And there's one way to handle it. But there's other ways to handle it too, but this is the very simple scheme that covers all the bases and can be very effective. [COUGH] For example, consider a bit map representation of this slide. There's huge long runs, say, with black and white, there's huge long runs of white that depending on the resolution might be hundreds or a thousand bits but could be represented with just a few counts. And so many applications of bitmaps of text and other things like that, this is very effective in its use in all kind of technologies like JPEG and fax and others. It's very simple to implement, so this is our warm up data compression algorithm that implements run-length encoding. Actually, we left the compress for the book, this is just the expand. So I'm given a bunch of counts, how do I reproduce the original uncompressed text string? And so, it's as simple as that. So log R is the number of bits per count. And so basically, what we do is read log R bits at a time into an int, whatever the value is, so we put that into the int run. So that's a number between 0 and 256, which is the maximum you can get in 8 bits. And then starting with 0, the first count, that's the number of 0s we need to write. So we just write them out one bit at a time, 0s one bit at a time. And then we flip the bit to make it 1 and read the next count. And now, we write out that many 1s and so forth. So, this is a ten line program that does expansion for run-length encoding. And you can think about or look at the book for how to do compression. It's just as simple. So, this is just an example of the effectiveness of run length encoding for one letter, the letter Q, in a typical black and white scanned image. Even for a single letter, there's lots of redundancy, lots of runs of 0s. So with a relatively small number of counts, we can represent a bitmap. And this is the hard case, actually most of a printed page is all this blank space, as I said. So typically, if you just don't compress at all and you have an 8.5 x 11 piece of paper with 300 pixels per inch, that would be 8 million bits. But most of those are white and typically with run-length encoding you can get a substantial savings simply by basically counting the white bits. And even when there's letters involved, you can get, say, maybe there's only 3,000 characters. That's another example. If it's all text then maybe the text itself is a great way to compress it or the programmer or the document processor that produced the text, but now we're starting to get into undecidability issues. So let's think more in terms of a page that has an image, a drawing, and some text, and so forth. So it makes sense to start with a bitmap and then use as few bits as possible. And then run-length encoding is going to be very effective. That's our warm up case for data compression.