Let's try a new example and bring together some of the things that we've learned. Here's an image of a storefront. Let's load it and try to get the name of the store out of that image. So from PIL, we'll need the Image package of course, and then let's bring in pytesseract as well. So let's read in the storefront image I've loaded into the course and display it. I put this in read_only/storefront.jpg, and we'll just open that as an image and display it inline. Then finally, let's try and run tesseract on that image and see what the results are. So we'll call image_to_string on that. We see at the very bottom that there's just an empty string. Tesseract is unable to take this image and pull out the name. But we looked at how to crop an image in the last set of lectures. So let's try and help tesseract by cropping out certain pieces. So first, we have to set the bounding box. In this image, the store name is in a box bounded by roughly 315, 170, 700, and 270. So I'll make a bounding box equal to this tuple. Remember, that's the upper left corner followed by the lower right corner, and you can go back to the PIL lecture if you want to be reminded how this works. Now, let's crop the image. So we just call image.crop, and we pass in the bounding box. It doesn't change the image; it returns a new image. So we save this to a title_image variable that we'll use later. Now, let's display it and pull out the text. So we'll call display, and then we'll call pytesseract's image_to_string and pass in the title image. Great. So we see how with a bit of problem reduction, we can make this work. So now we've been able to take an image, pre-process it where we expect to see text, and turn that text into a string that Python can understand. If you look back up at the image though, you'll see that there's a small sign inside of the shop. That also has the shop name on it. I wonder if we are able to recognize the text on that sign. Let's give it a try.
First, we need to determine a bounding box for that sign. I'm going to show you a shortcut to make this easier in an optional video in this module. But for now, let's just use the bounding box that I decided on. For the bounding box, we'll set this to a tuple of (900, 420) for the upper left, and then (940, 445) for the lower right. Now, let's crop the image. So we just call image.crop, pass in the bounding box, and we'll call this little_sign for fun and display that little sign. All right. This is a little sign. OCR works better with higher resolution images, so let's increase the size of this image by using the Pillow resize function. Let's set the width and the height equal to ten times the size they are now, in a (w, h) tuple. So we'll take the new size and make it equal to little_sign.width times 10 and little_sign.height times 10. Now, let's check the docs for resize. We can see here that there's a number of different filters for resizing the image. The default is Image.NEAREST. Let's see what that looks like. So we'll take our little_sign.resize, we'll pass in the new size, then we'll say Image.NEAREST, all in caps, and pass that to display. So here you can see that it actually resized the image, and now it's maybe much more readable. I don't know, I didn't really have trouble seeing it before even though it was little, and it says the word "FOSSIL". I think we should be able to find something better though. I can read this, but it looks really pixelated. Let's see what all the different resize options look like. You can go back up to the documentation to look at the names. So here I'm going to make a list of all the different names as options: Image.NEAREST, Image.BOX, Image.BILINEAR, Image.HAMMING, Image.BICUBIC, and Image.LANCZOS, which is how you say that. So for each of the options, I'm just going to iterate over these. Let's print out the option name.
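Sketched out with the same stand-in fallback (Image.NEAREST, Image.BICUBIC, and so on are Pillow's own filter constants):

```python
import os

from PIL import Image

if os.path.exists("read_only/storefront.jpg"):
    image = Image.open("read_only/storefront.jpg")
else:
    image = Image.new("RGB", (1000, 600), "white")  # stand-in outside the course

# Crop out the small sign inside the shop
bounding_box = (900, 420, 940, 445)
little_sign = image.crop(bounding_box)

# Blow it up to ten times its current width and height; resize() takes a
# (width, height) tuple plus an optional resampling filter, and
# Image.NEAREST is the default
new_size = (little_sign.width * 10, little_sign.height * 10)
nearest = little_sign.resize(new_size, Image.NEAREST)
# in a notebook, display(nearest) would show the pixelated result
```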
So print out whatever the option name is, and then let's display what that option looks like on our little sign. So here we're actually going to call little_sign.resize, pass in the new size, pass in the option that we're looking at, and call display. So you can see that this has run, and we have a whole bunch of different numbers printed, and then different images, which is interesting. So from this, we can notice two things. First, when we print out one of the resampling values, it actually just prints an integer. It's actually really common that the API developer writes a property such as Image.BICUBIC and then assigns it to an integer value to pass it around. Some languages use enumerations of values, which is common in, say, Java. But in Python, this is a pretty normal way of doing things. The second thing we learned is that there's a number of different algorithms for image resampling. In this case, the Image.LANCZOS and Image.BICUBIC filters do a good job. Everything else, not so much. So let's see if we're able to recognize the text off this resized image. So first, let's resize to the larger size. So I'm going to create something called bigger_sign, and I'm going to take little_sign.resize and pass in our new size that we want. Then I'm going to use Image.BICUBIC for lack of any personal preference. Feel free to try one of the different methods. Then let's print out the text. So we'll call pytesseract's image_to_string and pass in the bigger sign. Well, not really any text there. Let's try and binarize this. So first, let me just bring in the binarization code we did earlier. Now, let's apply binarization with, say, a threshold of 190, and try to display that as well as do the OCR work. So binarize, remember, is a function that takes in the sign, or the image I guess, that we want to binarize, and then a value between zero and 255. It's going to walk through the image pixel by pixel and set each one to either full black or full white.
So it converts the image to straight-up black and white. Then we'll display what the binarized sign looks like, and then let's actually try and get the text out with pytesseract too, in the hopes that 190 is actually a good number for us to use. Well, that looks pretty abysmal, I would say. It doesn't look at all like "FOSSIL". I guess you can see some of the S's there, but really not much in that image at all. So the text is pretty useless. How should we pick the best binarization to use? There's a number of different methods, but let's just try something very simple to show how this can work. We have an English word that we're trying to detect, and it's "FOSSIL". If we tried every binarization threshold from zero through 255 and looked to see whether any English words showed up, that might be one way to do it. So let's see if we can write a routine to do this. So we're problem-solving on our own here. First, let's load a list of English words into a list. I put a copy in the read_only directory for you to work with. So I'll create something called eng_dict, which is just an empty list. Then I'm going to open read_only/words_alpha.txt for reading; you can go back to one of the previous courses if this doesn't look very familiar to you, on how to work with files. We're going to call the file f. Then I'm just going to read all of f in one giant chunk and put that in data. Now we actually want to split this into a list based on the newline characters. So if you go look in that data file, words_alpha, you'll see it's one word per line. So I'll call data.split on "\n", which is the newline character, and this will return a new list of all the different words, and I'll put this into the English dictionary. Now, let's iterate through all the possible thresholds and look for an English word, printing it out if it exists.
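Before the threshold loop, the dictionary loading can be sketched like this; the read_only/words_alpha.txt filename follows the lecture, and a tiny stand-in word list is written first if the course copy isn't available:

```python
import os

# Write a tiny stand-in word list when the course copy isn't available,
# just so the sketch runs anywhere
if not os.path.exists("read_only/words_alpha.txt"):
    os.makedirs("read_only", exist_ok=True)
    with open("read_only/words_alpha.txt", "w") as f:
        f.write("fossil\nstore\nw")

# Read the whole file in one giant chunk, then split on the newline
# character "\n" to get a list with one word per entry
eng_dict = []
with open("read_only/words_alpha.txt", "r") as f:
    data = f.read()
eng_dict = data.split("\n")
print(len(eng_dict), "words loaded")
```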
So for i in range(150, 170), I'm just going to binarize at each threshold in that range and convert the result to a string value. So strng will be set to pytesseract.image_to_string of the binarized image, passing in the bigger sign and our given i value to binarize. So this is binarized with 150, 151, 152, 153, and so forth. I'm going to try them all between these two threshold values, 150 and 170. Now we want to remove all non-alphabetical characters, so that includes parentheses, brackets, percentage signs, dollar signs, et cetera, from the text. Here's a short method to do that. So first, let's convert our string to lowercase only. So strng.lower, and we'll just change strng. Then let's import the string package; it's got a nice list of lowercase characters. So import string, and now let's just iterate over our string, looking at it character by character and putting the matches in the comparison text. So we'll create some new value, comparison, and then for every character in our string, remember this is all lowercase now, we check whether that character is in string.ascii_lowercase. So this is actually just checking to see if a single character is in a list of characters. Remember, a string and a list of characters behave the same when you use in. If so, then comparison is set equal to comparison plus that character. So we just append it to our output string. All right. Finally, let's search for the comparison in the dictionary file. That's easy in Python; in other languages, you would have to do a lot of work, but here we just use the in comparator and see if comparison is in eng_dict. Then we're going to print it out if we find it. So we'll print comparison. All right, let's run that. So you should start to see that various characters come up, and in my case "fossil" came up and "w" came up. So "w" is also in this dictionary, and a W was detected in the data that we sent in for at least one of the binarizations.
Well, this is not perfect, but we can see "fossil" there among other values, and this is actually not a bad way to clean up OCR data. In practice, it can be useful to use a language- or domain-specific dictionary instead of all of the English language words, especially if you're generating a search engine for a specialized knowledge base, such as medical terms, or for locations, like cities. If you scroll up and look at the data we're working with, this tiny little wall hanging on the inside of the store is really not so bad. A lot of this comes down to the purpose that you're actually doing the OCR for. If you're using it, for instance, to back a search engine, that's one thing. If you're using it to do text-to-speech, for instance, and somebody is going to use this to listen to a lecture, that's completely different, and you have to have a very, very strong method for generating the actual data. So at this point, you've now learned how to manipulate images and convert them into text. In the next module in this course, we're going to dig deeper into a computer vision library, which allows us to detect faces among other things. Then, we'll go on to a culminating project. I'll see you there.