Hi, I want to welcome you to our MOOC on the mathematics behind Moneyball. My name is Wayne Winston. I'm a clinical professor of decision and information sciences at the Bauer College of Business at the University of Houston in Houston, Texas. It's been a fairly good year for Houston sports, with the Astros doing very well and the Rockets going to the Western Conference Finals. And a lot of that's due to Moneyball, with Daryl Morey, who's an MIT Sloan School MBA, running the Rockets, and the front office at the Astros really being touted as perhaps the most analytics-savvy front office in all of sports. Sports Illustrated had a nice cover on the Astros saying "World Series Champions 2017," mainly because of analytics, and the Astros may be ahead of schedule, who knows?
Okay, so we've got a file, sports syllabus, that will give you a list of the topics we're going to cover, so let me give a brief overview of these topics. We'll start with the famous Pythagorean theorem, not the one from geometry but the one that comes from baseball. Basically, the idea behind the Pythagorean theorem is: how many games will a team win if you know how many runs they've scored and given up in baseball, or points in football or basketball, or goals in hockey? We'll spend some time on that. You'll see it's important because it helps you understand how valuable a player is if you know that scoring ten more runs will win you one more game.
Then we'll talk about the math of baseball, because that's where this sort of all started, largely thanks to the famous Bill James. We'll talk about how you evaluate hitters, and how you evaluate pitchers and fielders. We'll learn a bunch of statistics, including multiple regression, which will be a key tool in our tool kit. And you know, I'm like Schultz on Hogan's Heroes in this course. Probably a lot of you younger people don't know Schultz from Hogan's Heroes, but he said, "I know nothing."
So I assume you guys don't know anything about the topic, and I'll try and teach you everything. So if you're intelligent and you work hard, you should learn a lot here. Some of you may know a lot of this stuff, I'm not sure. Okay, while we're studying baseball we'll also learn a lot about Excel: functions in Excel, conditional formatting, and pivot tables, which we'll need to analyze a lot of sports data. So you'll learn a lot of Excel, as well as learning a lot about sports and math.
Then we'll talk about an important tool called Monte Carlo simulation, which means playing out a certain situation over and over. That's a pretty useful way to evaluate a baseball team: if you're going to add a player to the lineup, how will your team do? And then we'll even talk about Deflategate in the context of simulation. And yes, I sort of do think the Patriots were doing something to these footballs for a while.
Then we'll talk about baseball decision-making. If you saw the Moneyball movie or read the Moneyball book by Michael Lewis, you know that most aficionados of Moneyball think you shouldn't bunt much in baseball. We'll talk about why that's true. Other things in decision-making: we'll talk about evaluating fielders, as I said, evaluating pitchers, and the famous concept of wins above replacement, which is actually used fairly often now to determine sportswriters' MVP votes. I mean, Mike Trout should have won one year over Miguel Cabrera even though Cabrera won the Triple Crown, and we'll discuss that.
Okay. And then we'll talk about some newer developments in baseball, like the shifts that you've been seeing, okay? There's a great book, Big Data Baseball, about the Pirates sort of starting this trend towards shifting. And how do you evaluate a catcher's defense? With catcher framing, a catcher can sort of make a pitch look like a strike when it's not. And how does the park a player plays in affect your evaluation of his hitting ability or pitching ability?
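To make "playing out a situation over and over" a bit more concrete, here is a minimal sketch in Python (the course itself works in Excel, so this is only an illustration). It assumes, purely for simplicity, that a team wins each game independently with the same fixed probability. The function names, the 0.55 win probability, the 162-game season, and the 90-win threshold are all assumptions made up for this example, not anything taken from the course files.

```python
import random


def simulate_season_wins(win_prob, games=162, trials=10_000, seed=0):
    """Play out a whole season `trials` times and return each season's win total.

    Assumption: every game is an independent coin flip with the same
    win probability, which is a big simplification of real baseball.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    totals = []
    for _ in range(trials):
        wins = sum(1 for _ in range(games) if rng.random() < win_prob)
        totals.append(wins)
    return totals


def summarize(totals, threshold):
    """Return (average wins, share of simulated seasons with >= threshold wins)."""
    mean_wins = sum(totals) / len(totals)
    share = sum(1 for w in totals if w >= threshold) / len(totals)
    return mean_wins, share


if __name__ == "__main__":
    # Hypothetical team that wins 55% of its games.
    totals = simulate_season_wins(win_prob=0.55)
    mean_wins, share_90 = summarize(totals, threshold=90)
    print(f"Average wins over simulated seasons: {mean_wins:.1f}")
    print(f"Share of seasons with 90+ wins: {share_90:.1%}")
```

Evaluating a lineup change would mean running the same kind of loop twice, once with and once without the new player, and comparing the summaries. How the player changes the inputs is the hard part, and this sketch does not try to model it.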
Then we need to learn a little bit of statistics. What's a random variable? What's a normal random variable? We'll talk about the famous hot hand fallacy. Do players get red hot? Do teams get red hot?
And then we'll turn our attention to, I guess, America's game, football. What makes NFL teams win? We'll find out it's mainly yards per pass attempt. We'll try and understand the famous NFL quarterback rating formula using regression. And then we'll get into something extremely important, football decision-making. For instance, do football teams go for it often enough on fourth down? Do they have the right run-pass mix, and things like that? There's the famous Bill Belichick decision against the Colts to go for it on fourth and two when the Colts were trailing the Patriots by six points. It was the right decision, even though a lot of announcers really did not think it was.
Okay, then we'll talk about how you can use the great stuff on pro-football-reference.com. We'll be using all their sites, basketball reference, baseball reference; they're just invaluable to sports analysts. We're really talking about sports analytics when we talk about the math behind Moneyball; that's the term that's often used. And we'll talk about how you can use text functions and the play-by-play data on pro-football-reference.com to evaluate teams' play selection. The Texans, for instance, are really horrible at running plays on first and 10, but they ran a lot of them.
Then we'll talk about two-person zero-sum game theory, which explains why, even if your passing offense is better than your running offense, you shouldn't always pass. And if your running offense gets better, you should actually maybe run less than you did before, surprisingly. Then we'll get to basketball, which I feel like I know the most about.
Because, with my colleague and friend Jeff Sagarin of USA Today, we worked with Mark Cuban in 1999 and 2000 to develop a system to rate players and lineups, and a lot of work has been done, sort of building, I think, off our work, to really advance basketball analytics. We'll talk about shot selection: the Lakers' and Knicks' shot selection in 2014-2015 was atrocious, and the Rockets' shot selection was great. We'll talk about the four factors that make a basketball team win and some of John Hollinger's great work on team metrics, like points per possession, which I think you've heard of.
Then we'll talk about how to figure out how good a basketball player is. You should look at the box score, and you should look at how the player moves the score of the game. We'll talk about how to evaluate lineups, and a concept Jeff and I developed, adjusted plus-minus, which has now been revamped into ESPN's pretty good regularized Real Plus-Minus system. We'll talk about how analytics helped the Mavs beat the Heat, and about some of the new data that's available in basketball, the SportVU data, which really fits the definition of big data.
And then we'll talk about using Excel and optimization to set point spreads. How do you rate NFL teams and find the strength of schedule? How do you figure out a prediction for the total score of a game? How can you predict the score of a soccer game? We'll talk about rating teams just based on wins and losses, because the BCS required that, which was sort of silly, though nothing was done about it.
How do you simulate the NCAA tournament to figure out the odds of a team winning? I think there were some bets Vegas gave out on Kentucky that were pretty strange; they're called prop bets. We'll talk about that. How do you simulate the NFL playoffs at the beginning of the playoffs and try to figure out the odds of each team winning? The last two years, basically, I think we had the Seahawks and the Patriots to win going in.
How can you rate NASCAR drivers if you have place finishes in each race? Then we'll talk a bit about gambling. You may know the money line is how you bet on who wins a game, without worrying about the point spread, and the two are related. If you know the point spread, you can figure out the money line, and vice versa. How can you tell whether somebody has a successful betting system? How do prop bets work? You know that you could bet on the first score in the Broncos-Seahawks game being a safety. What were the odds on that? How did Vegas figure them out? Were they correct?
Then, if you do have a successful method for betting, what percentage of your money should you bet on each bet? Even if you're picking 80% against the point spread, you shouldn't bet 100% of your money each time, because then one loss will wipe you out. What is sports arbitrage? Sometimes there can be opportunities to lock in a small profit before the money line or the point spread changes.
And then the big trend is toward these daily fantasy sports games you hear advertised, Skill Zone, FanDuel, DraftKings, etc. On ESPN radio and TV, you probably see ads for those 20 times a day. We'll use FanDuel as an example: how do daily fantasy sports work? And how can you use optimization to come up with your best daily fantasy lineup? And we'll have test questions throughout the course, of course, and other problems.
So how should you study for this course? Start with the suggested readings here. They're both by me, so I guess I think they're okay. You really don't need to buy these. I mean, I think if you just watch the videos you'll be able to learn everything you need. So I wrote a book, Mathletics. The paperback is with Princeton University Press, 2009, and a lot of what we'll do is in that book, although I think every day that book becomes more outdated because the field is changing by the day. But there's a lot of stuff we'll do that is explained in more detail in Mathletics.
And then I have a book with Microsoft Press, Data Analysis and Business Modeling with Excel 2013. And we'll be using Excel 2013. Mac users, you need to run a Windows version of Excel to do some of the things that we're going to do. I can't do anything about that, I'm sorry.
Okay, so how should you study? For each video, and you can see there's a list of the videos here, though I haven't finished all of the videos at this point, you should open up the starting file that goes with that video. See, there's the starting file in column I. Then you should watch the video, pause it if needed, follow along with the starting file, and try and do what I do. At Microsoft they would say everybody likes to drive in order to learn how to do things, and that means basically follow along, just be an active learner, okay? And then once you finish and think you understand what's in the video, there will be a homework problem. Try the homework problem on your own; you'll have the answer. And then you can try the test question that goes with that homework problem.
And basically your grade will be totally based on the test questions. So 90% and above will be a pass with distinction, 70%, I guess, to 89.9% will be a pass, and below 70% will be a fail, in which case, of course, you will not get credit for the course, okay?
Or another way you can approach the class is to watch the video in its entirety and concentrate on what I do, and then go back to the starting file, do it from scratch yourself, and see if you can duplicate what I did. Then try the homework problem and attempt the test question. That should get you going on how to learn about this subject, which I'm sure you're interested in, okay? You like math, and we all love sports, and a lot of us are fascinated by the way math can help a team win, okay, make a team better.
And that's what we're here to learn about, and we'll learn a lot of interesting math along the way, like regression, simulation, resampling, and optimization. So hopefully you'll get started on the course, and I hope you'll enjoy it.
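As a small preview of the first topic in this overview, the baseball Pythagorean theorem can be sketched in a few lines of Python. This uses Bill James's original form, in which a team's winning percentage is estimated as runs scored squared divided by the sum of runs scored squared and runs allowed squared. The exponent of 2, the function names, and the run totals below are assumptions for illustration only; the course may fit a different exponent and works in Excel rather than Python.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage from runs scored and runs allowed.

    Assumption: exponent=2.0 is Bill James's original choice; other
    exponents are sometimes fitted to data.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Estimated wins over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Hypothetical team: 700 runs scored and 700 allowed, then 10 more runs scored.
    base = expected_wins(700, 700)
    plus_ten = expected_wins(710, 700)
    print(f"Expected wins at 700 scored, 700 allowed: {base:.1f}")
    print(f"Expected wins at 710 scored, 700 allowed: {plus_ten:.1f}")
    print(f"Extra wins from ten more runs: {plus_ten - base:.2f}")
```

For an average team like this hypothetical one, the difference comes out to roughly one win, which is the "ten more runs will win you one more game" rule of thumb mentioned at the start.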