Hello and welcome to Data and Statistics Foundation for Investment Professionals. This is the first of three courses that together make up the specialization Data Science for Investment Professionals. This course and the others in the specialization have been created by the CFA Institute and are designed to build a bridge of understanding for all investment professionals, so that you have a deeper understanding of, and a heightened ability to discuss, the topics of data science and machine learning. As these topics continue to play an increasing role in investment management and in creating that elusive alpha, we became aware that there was a need to level the playing field so that non-data scientists, for example, would be able to have an informed dialogue around these topics, whether as a team member, a team leader, the person responsible for implementing change, or perhaps the person explaining an algorithm-based investment approach to a client.
My name is Neil Govier, a director at CFA Institute and a member of the Professional Learning Team, and I have pleasure in welcoming you to this course, a course that is effectively an introduction to data: getting it, organizing it, presenting it, and analyzing it. It is a course that provides you with the foundation required to develop a greater understanding of data science and machine learning.
Remembering the foundational nature of this course, let me ask you three questions. Question 1: you have to decide between two investments. A offers you an expected return of eight percent; B offers you an expected return of six percent. Which one will you choose? Question 2: you find that your salary is below the industry average. How sad does this make you? Question 3: you note that a sample of 10 funds with the same investment style outperform their benchmark by two percent. Will you invest in this style of fund?
OK, I've not given you long to ponder the questions, but if you answered A, very, and yes, I'm not saying you are wrong, but I am saying that you may not have all the information you need to make the best-informed decision. As they say, it's all about the data. Using, presenting, and analyzing data is what this course examines.
Before I say anything else, I must mention Python. Although it is referenced and demonstrated where appropriate, we are not trying to create coders. Rather, it is our intention to demonstrate the value that Python has as a tool, similar to Excel, and the role it can play in data management, analysis, and visualization. I will say more on Python later.
During this course, we will look at types of data and ways of organizing and presenting it, and introduce some analytical techniques such as hypothesis testing. We will examine measures of central tendency and dispersion, which in practice are the classic measures of return and risk, respectively, that every investor focuses on. That is why the correct answer to the first question I asked is, "I don't know until I know the risk of each investment." We will calculate and rationalize the use of sample statistics such as the mean and standard deviation, and understand their practical use. That is why a better answer to question 2 is that it is probably better to compare your salary to the median rather than the mean. We will demonstrate sampling theory and hypothesis testing, where we use, for example, the mean of a sample to help estimate the mean of a population, or test a hypothesis we have by taking a sample. That is why, when answering question 3, you should have hesitated and thought about the impact of the sample size before rushing to an investment conclusion.
As this course is designed as a foundational course, it does not require prior knowledge.
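As a rough illustration of the reasoning behind the three answers above, here is a minimal Python sketch using only the standard library. Apart from the eight and six percent expected returns, the sample size of 10, and the two percent average outperformance, every figure is made up for illustration: the standard deviations, the salaries, and the individual fund returns are all assumptions. Dividing return by standard deviation is just one simple way to bring risk into the comparison, not necessarily the approach the course takes.

```python
import math
import statistics

# Question 1: expected return alone is not enough; risk matters too.
# The standard deviations below are assumed purely for illustration.
investments = {
    "A": {"expected_return": 0.08, "std_dev": 0.20},
    "B": {"expected_return": 0.06, "std_dev": 0.10},
}
return_per_unit_risk = {
    name: inv["expected_return"] / inv["std_dev"]
    for name, inv in investments.items()
}

# Question 2: a few very high salaries pull the mean above the median.
# Hypothetical industry salaries, including one extreme value.
salaries = [40_000, 45_000, 50_000, 55_000, 60_000, 400_000]
mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)

# Question 3: with only 10 funds, the sample mean is an imprecise estimate.
# Hypothetical excess returns over the benchmark, averaging two percent.
excess_returns = [0.09, -0.05, 0.04, 0.07, -0.03, 0.06, -0.04, 0.05, 0.02, -0.01]
sample_mean = statistics.mean(excess_returns)
standard_error = statistics.stdev(excess_returns) / math.sqrt(len(excess_returns))

print("Return per unit of risk:", return_per_unit_risk)
print("Mean salary:", round(mean_salary), "Median salary:", median_salary)
print("Sample mean:", round(sample_mean, 4), "Standard error:", round(standard_error, 4))
```

With these made-up numbers, B offers more return per unit of risk than A, the mean salary sits well above the median, and the two percent average outperformance is less than two standard errors from zero, so a sample of 10 funds on its own is weak evidence.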
Although some formulas are presented in the course, they are included only to bring understanding to what a particular statistic, e.g., the standard deviation, is actually measuring. There'll be no need to memorize any formulas to succeed in this course.
Before we get started, a few words on the structure of this and other courses in the specialization. Each course is broken down into modules, and each module consists of lessons. Interwoven into these lessons are examples and assignments; please complete them all. There is a logic to the sequence of the modules in the course and of the lessons within the modules, and I advise that you do all elements in order, as they will build to a greater understanding of the course topic. At the end of each module, there is a graded assessment; please ensure you attempt these, as they contribute towards your course performance and your certificate of completion. If, however, you believe that you are already familiar with the content of a module, including Python, you can skip to the end of that module and attempt the end-of-module quiz. Do remember, though, that your score on these quizzes counts towards your overall course performance.
Having completed this course, you'll be able to: explain basic statistical measures and their application to real-life datasets; understand the use and appropriateness of different distributions; calculate and interpret measures of dispersion and explain deviations from a normal distribution; compare and contrast different ways of visualizing data and create them using Python; explain sampling theory and draw inferences about population parameters from sample statistics; and formulate hypotheses on investment problems.
Lastly, there is a discussion forum for you to post any queries you may have or reply to any queries you see posted. To get you started in the forum, perhaps you'd like to introduce yourself and say why you are doing the course. If that sounds too intense, just say hi. OK, enough talk from me.
Let's get started with the first module.