No one can predict the future. However, there is one kind of data set in which we can predict the future with greater accuracy: time series data. In this lesson, we will learn what a time series is and understand the basic concepts of time series modeling. We will also learn some basic terminology and then use some time-tested, no pun intended, techniques to predict the future.

We will first try to answer the question: what is a time series, and why is it important in finance? We will then discuss how to analyze a time series data set, what the different terms used in time series analysis are, including ARIMA, and how we can use it to make predictions. In this lesson, our goals are to understand what a time series is and which basic concepts we need to know about. Then we will learn how to analyze time series data and build a model to predict a future value from past values.

First, let's understand what a time series is. A time series is a series of data points indexed in time order. Most commonly, a time series is a sequence of snapshots of a process taken at successive, equally spaced points in time. Thus, it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.

Now, let's look at the basic terminology that we use in analyzing time series data. First, we need to understand the concept known as stationarity. In time series data such as the chart to the right, which shows US GDP over the last 200 years, we see that summary statistics such as mean and variance change over time. This is because US GDP has expanded over time, and the average in one 10-year period is not the same as the average a century later. We call such data non-stationary. So what is stationary data, then? Any data for which the statistical structure of the series is independent of time is said to be stationary.
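To make the idea of summary statistics changing over time concrete, here is a minimal sketch. It uses made-up numbers rather than real GDP data, and the helper `window_stats` is a hypothetical name introduced only for this illustration. It computes the mean and standard deviation over consecutive windows for a steadily growing series and for a series with no growth.

```python
from statistics import mean, pstdev

def window_stats(values, window):
    """Split values into consecutive non-overlapping windows and return
    a list of (mean, standard deviation) pairs, one per full window."""
    stats = []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        stats.append((mean(chunk), pstdev(chunk)))
    return stats

# Hypothetical series: grows 3% per year for 40 years (not real GDP data).
trending = [100.0 * 1.03 ** year for year in range(40)]
# Hypothetical series: repeats the same four-value pattern, no growth.
stable = [10.0, 12.0, 9.0, 11.0] * 10

trending_stats = window_stats(trending, window=10)
stable_stats = window_stats(stable, window=8)

for decade, (avg, sd) in enumerate(trending_stats, start=1):
    print(f"trending decade {decade}: mean={avg:.1f}, std={sd:.1f}")
for block, (avg, sd) in enumerate(stable_stats, start=1):
    print(f"stable block {block}: mean={avg:.1f}, std={sd:.1f}")
```

For the growing series, each decade's average is higher than the one before, which is the behavior described for the GDP chart. For the repeating series, every window reports the same mean and standard deviation.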
In simple terms, stationarity implies that the mean and standard deviation of the series don't change over time. How can we find out whether a time series is stationary? One way is simply to look at the plot. As you can see, the series in the chart here is non-stationary: it has a definite trend. Second, you can measure summary statistics such as the average and standard deviation at various points in time and check for obvious or significant differences. Third, you can run statistical tests to check whether the expectations of stationarity are met or have been violated.

Suppose your data is trending, like the US GDP chart on the screen. How do you make it stationary? Statistical time series methods, and even modern machine learning methods, benefit from the clearer signal in the data that we obtain when we stationarize a time series. One way to make non-stationary time series data stationary is to identify and remove trends and seasonal effects. An easy way to do this is to difference one time period from another. That is, we take the difference between consecutive data points and plot it, as we see on screen to the right. How does it look? It still looks like it has some trend, or at least higher averages over time. Let's try differencing it one more time. After differencing it once more, it appears to be stationary.

If you want to confirm that the mean and variance of this series do not depend on time, you can run a statistical test known as the augmented Dickey-Fuller test. Without going into details about how the test works, we will give you a hint on how to read the test output. The null hypothesis of the test is that the series is non-stationary. If the p-value of the test is less than a chosen significance level, let's say 0.05, we reject that hypothesis and treat the given time series as stationary. If you need more details, check out the two links on this slide.

Next, let's look at why stationarity is important in a time series model. There are two reasons. Let's say we want to build a model in which averaging is used.
What mean and standard deviation of your data will you use? If your data is non-stationary, should you choose the mean from the beginning, the middle, or the end? They are all different. Hence, stationarity allows you to build a stable model that uses stable parameters that don't change over time.

In the air passenger traffic chart that we see to the right, we notice that there are some interesting components in time series data. The first component is called trend. A trend is a long-run increase or decrease in a time series. You can see that the chart on screen has a slight upward trend.

Second, when data is affected by the time of the year, it is said to be seasonal. In this case, we can see that almost every year the chart tends to peak during the middle of the year and decrease slightly afterwards. Seasonality is most pronounced in retail sales of items such as snow shovels or lawnmowers. Snow shovels tend to sell well in fall and winter and then decline afterward.

Third is a cyclical component. A cyclical component is measured over a long time horizon, typically one year or longer. For example, sales at fast food chains may rise during recessions, when consumers are more cost-conscious, and then fall during recoveries. This is tied to the business cycle.

Finally, there is an irregular component. Irregular effects are the impacts of random events such as crashes, earthquakes, or sudden changes in the weather. By their very nature, these effects are completely unpredictable. Putting it all together, we can see that a time series is an amalgam of all these components.

The definition of stationarity implies that the mean and variance of a process remain constant, that is, they should not change over time. However, looking at a stock chart, we can tell that it is not stationary. Stock prices are typically trending up or down, and in this case they have a rising average price over time. So how do we make it stationary?
One way to do that is to difference the stock prices to get daily, monthly, or annual returns. That should make them stationary, as you can see on the bottom right.
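As a rough illustration of turning prices into returns, here is a minimal sketch on a made-up price series, not real market data. It assumes that "returns" means simple percentage changes, that is, the difference between consecutive prices divided by the earlier price. The function names `simple_returns` and `half_means` are hypothetical and introduced only for this example.

```python
from statistics import mean

def simple_returns(prices):
    """Return period-over-period percentage changes: (p[t] - p[t-1]) / p[t-1]."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]

def half_means(values):
    """Mean of the first half and mean of the second half of a sequence."""
    mid = len(values) // 2
    return mean(values[:mid]), mean(values[mid:])

# Hypothetical prices: alternate +2% and -1% moves, so the level drifts upward.
prices = [100.0]
for day in range(1, 201):
    growth = 1.02 if day % 2 else 0.99
    prices.append(prices[-1] * growth)

returns = simple_returns(prices)
price_first, price_second = half_means(prices)
ret_first, ret_second = half_means(returns)

print(f"mean price:  first half {price_first:.2f}, second half {price_second:.2f}")
print(f"mean return: first half {ret_first:.4f}, second half {ret_second:.4f}")
```

In this toy series the average price is clearly higher in the second half than in the first, while the average return is the same in both halves. Real returns are noisier than this, so a statistical test is still the safer way to confirm stationarity.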