So how did Benzer compare his map with the other kind of map, the classical map? Well, what he did is the following. He asked two sets of colleagues for rII mutants that had been mapped classically. He asked Gus Doermann, who had been at Oak Ridge with him, and Martha Chase, who worked with Hershey, for a series of mapped mutants. The mappings done by Hershey and by Chase and Doermann were classical maps. That is, they would cross, plate, count all the progeny, and screen for parental [INAUDIBLE], which is quite tedious for frequencies below a few percent. So they have the map. They send the mutants, and what Benzer did was to map Chase and Doermann's mutants into his segments. For instance, that means that mutant 287 was localized to this segment. That is the meaning of these dashed lines. And so forth for the other ones.
He also got another series, where the recombination frequencies were measured in a slightly different way. It's a slightly larger collection too. The collection was provided by Edgar and Feynman. Now, this is the real Feynman. This is Richard Feynman the physicist, the Nobel Prize laureate in physics. He was doing a sabbatical in Delbrück's lab, and we'll meet him again later on in the course. It was a very short sabbatical, but he did manage to finish it. And he let Benzer use this collection, and they found that all the mutants could be assigned to a linear order of segments, according to their linear map position.
Of course, there are size differences between the two maps. In this case the map is based on recombination frequency. And in this case you assume, although you have no evidence, that the probability of recombination is the same all along the sequence. The probability of getting a recombination is unaffected by the local sequence, which may be true, may not be true. In fact, we know now that it's not necessarily true, and that there are ways the sequence influences recombination frequency.
But assuming that, you draw the mutants on the line based on the distance. Now what about Benzer's map? Benzer scaled his map in the following way. He asked: how many sites do I have in each segment? Meaning nucleotides, if you want. And the more sites, the larger the segment. For instance, you can see that this one is large and this one is small. This one has a lot of sites, this one has few sites. That's how he measured it. So of course, the proportions of the two maps are not exactly congruent. But the order is congruent, completely colinear, which was an extremely important point for accepting mapping by deletion.
So once we've done all of this, we can get to the actual topography figure. This figure is reproduced in many, many textbooks. This is the figure of the 1,600 spontaneous mutants. Each little square in this figure represents one individual mutant. So for instance, if we take segment A1a, we find that in segment A1a we have one, two, three, four, five sites. A1a, 5 sites. That is the number that gives you the length in the previous figure. There are also one, two, three, four, five mutants. Three of the sites have one mutant. One of the sites has two mutants. And one of the sites has no mutant, but the site was identified later on with mutagens. So that's what the system means.
Now, that the distribution is non-random leaps to the eye. You probably wouldn't use this argument in a modern paper, but it's so striking. Look, of the 1,600 mutants, half of them localize to one of two sites. These sites are called hot spots. Hot, because they're regions of the DNA which are fragile. They're particularly fragile and susceptible to mutation. This site is roughly 500 times more susceptible to mutation than this site that has 1 mutant. Again, something not explained by the DNA sequence and by the double helix model. Now, today we know that these two sites, these two hot spots, have the following sequence: A A A A A A. Six As in a row.
Why would that be a hot spot for mutation? Well, because the replication machinery sometimes stutters. Stuttering means that you will remove an A or add an A. A6 can go to A5 or to A7 if you stutter. In both cases, you've changed the number of nucleotides. You've added or removed one nucleotide. There is one difference between A5 and A7. A7 is very unstable. A7 will very rapidly mutate back to A6.
Now these kinds of mutants, the stuttering mutants, are not limited to the rII genes of T4. You find these stuttering mutants in E. coli genes. And you find these stuttering mutants in human genes. There are human diseases called triplet diseases, where the stage of the disease depends on the number of triplets you have. You have this in Huntington's disease, you have this in many diseases. You have this in cancer genes. For instance, a certain form of colon cancer called Lynch syndrome shows changes in repeat length because of a defect in DNA repair.
The most beautiful example of this, well, the most telling example, occurs in [INAUDIBLE] colon cancer, but in a very special situation. There are groups of Ashkenazi Jews who are more prone to colon cancer than the rest of the population. The major gene involved in this normally has AAT AAA. That's in most of our DNA. This population has AAA AAA. Of course you change the amino acid, but that has very little effect. What you do when you do this is you create a place which is a hot spot for mutations. So the mutation per se is not deleterious, but it creates a terrain for danger. It's a pre-mutation, if you want. So this is a particularly telling story. When the people wrote about this, it was published as a letter in Nature. Of course they forgot to quote Benzer, who had discovered the hot spots and had discovered the meaning of the hot spot. And Streisinger would discover the mechanism of the hot spot by proposing the stuttering. But that's life.
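
To make the AAT AAA versus AAA AAA point concrete, here is a minimal sketch in Python that measures the longest uninterrupted run of As in a sequence. The function name `longest_run` is only an illustration, not something from the lecture, and the two six-letter strings are just the triplets quoted above, not a full gene sequence.

```python
from itertools import groupby


def longest_run(seq: str, base: str = "A") -> int:
    """Return the length of the longest uninterrupted run of `base` in `seq`."""
    # groupby splits the string into stretches of identical letters
    return max((len(list(group)) for letter, group in groupby(seq) if letter == base),
               default=0)


common = "AATAAA"   # triplets quoted in the lecture for most of the population
variant = "AAAAAA"  # the same stretch after the single T -> A change

print(longest_run(common))   # 3
print(longest_run(variant))  # 6
```

A single base change turns a run of three As into a run of six, which is the same kind of A6 stretch described for the two rII hot spots.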
I have to go back to this slide, because one of the things that Benzer was interested in was: how many mutable sites are there in a gene? How many pieces in an engine can I break and prevent the car from functioning? That was his question. How many screws can I remove and block the engine? Which is a difficult question, by the way, but he wanted to see whether, having made this enormous effort, he was in a position to predict the number of sites.
So what did he do? Well, he knew that most of the sites here have one mutant. Some have two, like this one has two, this one has two, this one has two, this one has two, this one has three, I believe. So he counted how many sites have one mutant, two mutants, three mutants, four mutants, and so on up to 517 mutants. Of course there is only one site that has 517 mutants, that hot spot. So he counted, and he found that there are 116 sites with a single mutant, 52 sites with 2 mutants, approximately 23 sites with 3 mutants, etc., etc. So he looked at the distribution, and it occurred to him that by counting the sites with one mutant and the sites with two mutants, he could use a Poisson distribution to estimate the zero [INAUDIBLE].
Now, the way you want to do this is you have to calculate the mean. So what I suggest you do, if you're not inclined to use a math formula, is you take 64 coins, and you go to a place that has a chessboard. And then you throw the coins, and let's assume the coins can only land on the chessboard. If you throw the 64 coins onto the 64-square chessboard, what you will have at the end are empty squares, squares with one coin, squares with two coins, squares with three coins, etc., etc. Now we can do the same experiment but use 32 coins on the 64 squares. Or 100 coins on the 64 squares. In all these cases you will find a distribution that obeys the Poisson rule.
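
If you have no chessboard at hand, the coin experiment can also be run on a computer. Here is a minimal sketch in Python, assuming every coin lands on one of the 64 squares with equal probability. The function name `throw_coins` and the `seed` parameter are illustrative choices, not part of the lecture. Strictly speaking, the counts you get follow the Poisson rule only approximately, because the number of coins and squares is finite.

```python
import random
from collections import Counter


def throw_coins(n_coins: int, n_squares: int = 64, seed=None) -> dict:
    """Throw n_coins at random onto n_squares squares.

    Returns a dict mapping k -> number of squares that hold exactly k coins.
    """
    rng = random.Random(seed)
    # How many coins landed on each occupied square
    coins_per_square = Counter(rng.randrange(n_squares) for _ in range(n_coins))
    # Turn that into a histogram: how many squares hold 1, 2, 3, ... coins
    histogram = Counter(coins_per_square.values())
    # Squares that were never hit are the empty ones
    histogram[0] = n_squares - len(coins_per_square)
    return dict(sorted(histogram.items()))


# The three versions of the experiment mentioned in the lecture
for n_coins in (32, 64, 100):
    print(n_coins, "coins:", throw_coins(n_coins, seed=1))
```

Each printed dictionary lists, for one throw, how many squares are empty, how many hold one coin, how many hold two, and so on. Changing the seed gives a different throw of the same experiment.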
And if you have the number of squares with one coin and the number of squares with two coins, you can estimate the mean. And from the mean, you can estimate the number of squares that have no coin. The sites that have no mutants. Why? Because this is called the P0 fraction. It's the mean to the power 0, which is 1, times e to the minus m, over 0 factorial, which is 1. So basically this fraction is e^(-m). Once you know the mean, you know the P0. And in this case, he predicted 129 sites. So after all his work, Benzer has the notion that the rII locus has to contain at least 380 nucleotides. It's a little bit disappointing, all this work, because we now know that the locus has about 3,000 nucleotides, so he's far from saturation.
But he knew that there are mutagens that can induce mutants. And maybe the mutagens will touch portions of the DNA that are not prone to spontaneous mutations. And so he looked at a number of mutagens to see whether he could identify new sites. I already told you he did identify new sites. So I'm just going to illustrate the mutagen part by one section. He had 1,600 spontaneous and about 800 mutagen-induced mutants. Now he mapped all of these. And you see, for instance, that there is this site, EM 50-something, I can't read the number, or this site, EN 136, or this site, N21, that were not visible without the mutagen. That means the normal way the DNA is hurt preferentially touches certain sites. When you use a mutagen, you can see other base pairs.
And that's very important, because you can now use this, for instance, to determine whether people who have liver cancer were exposed to a toxin from a fungus that contaminates beans, which is aflatoxin. Or you can see whether, in people who have had aniline poisoning and have developed a certain cancer, the cancer is due to the aniline poisoning or not. These are signatures left by mutagens. So you get new sites. Okay.
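
Coming back to the Poisson estimate of the sites with no mutant, the arithmetic can be written out in a few lines. This is a minimal sketch in Python using the two counts given in the lecture, 116 sites with one mutant and 52 sites with two. The function name `estimate_unseen_sites` is hypothetical. It assumes all sites follow a single Poisson distribution with mean m, so the expected counts are n1 = N·m·e^(-m) and n2 = N·m²·e^(-m)/2, which gives m = 2·n2/n1 and n0 = N·e^(-m) = n1/m. The hot spots clearly break that assumption, which fits the lecture treating the result as a minimum.

```python
def estimate_unseen_sites(n1: int, n2: int) -> tuple[float, float]:
    """Estimate the Poisson mean and the number of sites with zero mutants.

    n1: number of sites observed with exactly one mutant
    n2: number of sites observed with exactly two mutants
    """
    # Ratio of the two classes: n2 / n1 = m / 2
    m = 2 * n2 / n1
    # Ratio of the zero class to the one-mutant class: n0 / n1 = 1 / m
    n0 = n1 / m
    return m, n0


m, n0 = estimate_unseen_sites(n1=116, n2=52)
print(round(m, 3))  # 0.897
print(round(n0))    # 129
```

With these inputs the estimate comes out at about 129 unseen sites, matching the number quoted in the lecture.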
Now, of course, when you use a mutagen, there's one thing that you always have to remember, which is that the spontaneous mutants will not disappear. The spontaneous mutants will always be there. There will be a background, a noise of spontaneous mutants, on top of which you add your mutagen-induced mutants. And one of the nice things about this is that you can look at the major hot spot, this hot spot with six As, and you can look at the prevalence of this hot spot in the mutagen-induced series. If the prevalence is low, you know the mutagen is effective. You can imagine this as being the sea that rises and falls. If the sea rises and only the mountain stays above the sea, you will pick the mountain. But that doesn't mean that the other mountains and little places don't exist anymore. They're just under the water.
This notion of mutagens and efficiency of mutagens is very important. It was actually already discussed by Benzer, who was the first one to see that mutagens do things that the general replication machine doesn't do. And with this he got about 253 more sites, so you arrive at a grand total of about 300 sites. And now, at the end of this, he has probably driven the mutation analysis of a gene almost to the ground.
What, of course, Benzer didn't know was that the amount of rII protein that is required to grow on K lambda is only about 3% of what is made during a wild-type infection. The wild-type infection is like a factory that produces a lot of things. We don't care whether we're going to sell them all or not, but we produce them. We don't care whether we're going to need all the tools or not to produce them. And so a lot of the mutants cannot be isolated in this system, just because they don't damage the gene enough. Only the very strong mutations will show up as mutants. And if you don't see the mutants, you don't isolate them, you don't study them.
So, at the end of this, in his summary: a small portion of the genetic map of phage T4, the rII region, has been dissected by overlapping deletions into 47 segments. He comes back to the branches, because Delbrück is nagging him with the possibility that branches exist. If any branch exists, it cannot be larger than one of these segments. And then Delbrück is happy. The overlapping deletions are used to map the mutants, and the map order established by this method is consistent with the order established by the conventional method that makes use of recombination frequency. There are 308 distinct sites of widely varied spontaneous and induced mutability. The DNA is not uniform. The distributions throughout this region for spontaneous and mutagen-induced mutations are compared. And he uses different mutagens; some are used to a large extent, some to a limited extent. And we're going to come back to the mutagens, because they're going to be very useful in the next two classes. And the end is: the characteristic hot spots reveal a striking topography, which is probably the take-home message of this paper.