Hi everybody. Last time we started our tree job and I have some great news for you: it is finished. I know that you are as excited as I am to see how these endosymbiont genomes shape up on a tree, so let us go find the job. How do we find the job in PATRIC? Well, you could go into your workspaces and, if you can remember where you saved it and what files you created, you could go there. I am not going to go there because my workspace is a mess. I have had my workspace for a long time, I didn't make good choices, and it is filled with stuff. I hope you will learn from my example and be nicer and neater about your workspace. But if you can do it, go there. Another thing you could do is go up to Workspaces and click on My Jobs. That is another way to get access to your jobs. The way I like to do it is to go down to the jobs monitor here, which shows me queued, running, and completed jobs. If I double-click on this, it is going to open up my jobs page, where I can see all the jobs that I have running. When we last left off, this row here was the job we were running with ten genes on this endosymbiont tree. So how do I see this job? First, I have to highlight it, and then you will notice that this populates the vertical green bar with two icons for potential downstream actions. You also have some more information about the job here, if you are interested in it. If you had a problem with the job, you could click on the Report Issue icon, but we just want to see this job, so let us click on the View icon. This will reorient the page and show you the information for your particular job. Across the top here is the breadcrumb that shows you where that job is in your workspace. It is in my home directory, in the file called endosymbionts trees, and there is the job itself. Right underneath it, it shows me some information about the job: the job ID, how long it took to run, and the different parameters.
If I click on this arrow here, it will show me a lot of boring stuff that generally I do not want to see, but that you may be interested in: it tells me the parameters that I selected when I submitted the job. Now, I am going to do two different instructional videos showing the different job results that come with any tree job. The way we generally do things is that we try to anticipate all the different files that you would want to see and give them all to you. It looks a little small here, but if I open up this one, the detail files, you will see there is a lot of data here. But what to me is the most exciting thing about the Codon Trees pipeline is this report.html; that is the really awesome thing that I want to show you today. Let us highlight the row that has that report by clicking on it. Then you see you get these particular downstream functions that you can do. I want to view it. I may want to download it. I never want to delete it. It is so important, do not ever do that. If you want to share it with other people, you can copy it or you can move it. But let us click the View button, and this opens it up. Here at last is our glorious tree. The bootstrap values are pretty large. I will give you more details later about exactly how the tree is run. At the very end of this series, I will tell you exactly how you should cite the programs used to build the trees. I will show you exactly how to find that information, how the bootstraps were done, all of that. I do not want to go into that now. I just want to show you the glory of this report. You go down here and you can see all the Buchnera. Remember, we just did this with ten genes. Then we have got Blochmannia, the endosymbionts from ants; Wigglesworthia, the endosymbionts from tsetse flies; Riesia, the endosymbionts from lice; and then even more Buchnera. This is not what I expected. The Buchnera are split, and the support values look pretty good, but you have got some 75s there. This tree is midpoint rooted.
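If you would rather scan the support values with a script than by eye, here is a minimal sketch. It assumes you have the tree as a Newick string in which each support value is written as a number right after a closing parenthesis. The toy tree, the function names, and the cutoff of 80 are all made up for illustration; they are not taken from the report.

```python
import re


def support_values(newick: str) -> list[float]:
    """Return support values found in a Newick string.

    Assumption: supports are numeric labels placed directly after a
    closing parenthesis, e.g. "(A:0.1,B:0.2)75:0.05". Branch lengths
    follow a colon, so they are not picked up.
    """
    return [float(m) for m in re.findall(r"\)(\d+(?:\.\d+)?)", newick)]


def weak_branches(newick: str, threshold: float = 80.0) -> list[float]:
    """Return the support values that fall below the chosen threshold."""
    return [s for s in support_values(newick) if s < threshold]


# Hypothetical toy tree, NOT the tree from the report.
toy = (
    "((Buchnera_1:0.1,Buchnera_2:0.1)100:0.2,"
    "(Riesia:0.3,(Wigglesworthia:0.2,Blochmannia:0.2)75:0.1)98:0.2);"
)

print(support_values(toy))
print(weak_branches(toy))
```

A regular expression is enough for a quick look like this. For anything more, such as rerooting or pruning, a real tree library would be the safer choice.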
Let us look more at the statistics for this tree. The tree analysis statistics tell me that there were 82 genomes, zero deletions, and zero duplications. Remember, we talked about that when looking at the protein families: those settings allow you to use genes that were duplicated in some genomes, or to include protein families that a particular genome might be missing. We did not allow any of that in this one. One of the things I will be doing, though, is showing you how to improve your tree, so we will talk more about that in a later video. We requested ten genes. We found five, only five that they all shared. That gave us five protein alignments. These were the number of amino acids that were aligned. These were the number of CDS alignments, and these were the number of nucleotides that were aligned. We did not even get ten. I'll show you some other things too about gaming the system and how to improve your tree. RAxML, which is the program that builds the tree, gave us some warnings: it tells us that some of these genomes are identical, and it tells us how long it took to run. Here's the RAxML command line, for those of you who are into that stuff. These are the genome statistics: there's the genome ID, it tells me how many total genes each genome had, how many of those were single copy, and how many of those single-copy genes were used, and here are the names of the genomes. That is a good number of genomes; we have to scroll through to see all of those. But it went from a pretty small, teeny, tiny little genome, which is this guy, up to, well, none of the endosymbionts are big genomes, but when you go to Blochmannia, there are considerably more genes. Then it tells me the statistics of the gene families that were used: what their PGFam ID was, the different parameters of the alignment, and what their name was. This you could put right into a supplementary file that you submit with your paper, and reviewers would love to see that.
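To make the deletions and duplications settings described above a little more concrete, here is a rough sketch of how a gene-selection step like this could work. It is an assumption about the logic, not the pipeline's actual code: a family is kept if the number of genomes missing it is within the allowed deletions and the number of genomes carrying more than one copy is within the allowed duplications. The genome IDs and family names are invented.

```python
from collections import Counter


def select_families(genome_families, max_deletions=0, max_duplications=0):
    """Pick gene families usable for a tree.

    genome_families: dict mapping a genome ID to a list of family IDs,
    one entry per gene (so a duplicated family appears more than once).
    This is an illustrative guess at the selection rule, not the
    Codon Trees implementation.
    """
    counts = {g: Counter(fams) for g, fams in genome_families.items()}
    all_families = set().union(*(c.keys() for c in counts.values()))
    selected = []
    for fam in sorted(all_families):
        # Genomes that lack the family entirely.
        missing = sum(1 for c in counts.values() if c[fam] == 0)
        # Genomes that carry the family more than once.
        duplicated = sum(1 for c in counts.values() if c[fam] > 1)
        if missing <= max_deletions and duplicated <= max_duplications:
            selected.append(fam)
    return selected


# Hypothetical toy data.
genomes = {
    "g1": ["PGF_A", "PGF_B", "PGF_C"],
    "g2": ["PGF_A", "PGF_B", "PGF_B", "PGF_C"],
    "g3": ["PGF_A", "PGF_C", "PGF_D"],
}

strict = select_families(genomes)  # zero deletions, zero duplications
relaxed = select_families(genomes, max_deletions=1, max_duplications=1)
print(strict)
print(relaxed)
```

With the strict settings only the families that every genome has exactly once survive, which is why a single odd genome can drag the count down from ten requested genes to five.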
Although if it's only five genes, reviewers aren't going to like to see that; they're going to want to see more. Then here at the bottom, this is the most important part: strategies to increase the single-copy gene number. This is telling me there are certain genomes that I could eliminate. Remember, I said I wanted to look at ten genes; it says that if I eliminate these two genomes, I can go up to 23 genes. If I want to see which genome this is, I copy the ID, use the find function, and start looking for it, and then I'll be able to find out who it is. There we go. I'll let you know right away, a lot of our problem genomes are here with these louse isolates. It's telling me to get rid of this guy, and let me scroll down and see who this other guy is. So I copy it, I do Control-F, paste it, and then I have to march through to find it again. Oops, another one, that's too bad. This is the one I like, because it's from the pubic lice and the other ones are from body lice, but nonetheless, it's telling me that if I eliminate those, I'll make the tree stronger. Let's go back up to look at the genomes that are identical. It's saying this genome and this genome are identical; the same first genome and this one are identical; the same first genome and this one are identical. It's giving me some hints on ways I could possibly improve my tree. If these are all public genomes and you're just trying to make a nicer tree, well, this one has a lot of genomes in it. Generally I find that maybe 75, maybe 60 or 50 genomes make a good-looking tree for publication. This would be too many, so I could go in, cut some of these duplicates, and improve the quality of my tree. In the next webinar, I'm going to show you how to look at the other files that come with a tree job, and then after that, we're going to go into the nuances of how to improve your tree and make it look better by removing some of these from your genome group.
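The idea behind those strategies can be sketched as a leave-one-out check: drop each genome in turn and count how many single-copy families the remaining genomes share. This is only an illustration under my own assumptions, with invented genome IDs and family names; it is not how the report actually computes its suggestions.

```python
from collections import Counter


def shared_single_copy(genome_families):
    """Return the families present exactly once in every genome."""
    shared = None
    for fams in genome_families.values():
        single = {f for f, n in Counter(fams).items() if n == 1}
        shared = single if shared is None else shared & single
    return shared or set()


def gain_from_removal(genome_families):
    """Rank genomes by how many shared families are gained by removing each.

    Returns (genome_id, gain) pairs, largest gain first.
    """
    baseline = len(shared_single_copy(genome_families))
    gains = {}
    for gid in genome_families:
        rest = {g: f for g, f in genome_families.items() if g != gid}
        gains[gid] = len(shared_single_copy(rest)) - baseline
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: g3 is missing two families and duplicates one.
genomes = {
    "g1": ["PGF_A", "PGF_B", "PGF_C", "PGF_D"],
    "g2": ["PGF_A", "PGF_B", "PGF_C", "PGF_D"],
    "g3": ["PGF_A", "PGF_B", "PGF_B"],
}

ranking = gain_from_removal(genomes)
print(ranking)
```

In the toy data, removing g3 raises the shared count from one family to four, which is the same kind of hint the report gives when it says that eliminating two genomes takes you up to 23 genes. Whether to drop a genome is still your call, as with the pubic louse isolate you may want to keep.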
We're going to start marching you through things like removing problem genomes from the genome group, and then determining how many genes you can request and how many deletions and duplications you can allow. I hope that you'll join me. I love this pipeline, and I'm really anxious to show you how you can use it to best advantage, to make the best tree that you can for your publication. Here's your third assignment. Of course, you'll have to wait until those tree jobs have completed, but once they have, I want you to open the Codon Tree reports for the five submitted jobs. You could have five tabs open, with one report in each. Look at those trees. How did the support values change with the Brucellaceae tree? How did they change when that was run? What happens to Pseudochrobactrum when the tree that includes Bartonella is compared to the tree including Mesorhizobium? What hints do you have about who Pseudochrobactrum really belongs with? Which of those trees fell short when you requested 100 single-copy genes? Which of those trees could the Codon Trees pipeline not give you 100 genes for? Good luck.