Hi. In this module, I'm going to be talking about what a study is. To test a theory, we need to conduct a study. As you may remember from an earlier lecture, conducting a study involves several steps. The first of these steps is to specify a hypothesis. The next step is to actually design the study, which involves a plan for collecting data. Once we've collected the data, we have to analyze it. The analysis of the data will reveal whether our theory is confirmed or refuted.
I'd like to go back and review the definition of a theory. A theory specifies a relationship between concepts, for example, a relationship between education and health. What makes a theory a theory, as opposed to a simple description of a relationship, is that it should also offer an explanation for the relationship. For example, the association between education and health works through income. Without an effort at explanation, a theory is simply a description of an empirical regularity.
To turn our theory into a hypothesis that we can go out and test, one of the first steps is to operationalize the abstract concepts in the theory as indicators that we can go out and measure in the real world. So we have to translate each concept into an indicator. We might measure education as an individual's years of completed education. We might measure income as annual earnings. And we might measure health as a response on a questionnaire that asks people for their self-rated health.
A hypothesis specifies the relationships that should be observed between the indicators if the theory is indeed correct. In the example of the role of income in accounting for the relationship between education and health, the relationships between years of education, annual earnings, and self-rated health should be as predicted by the theory. If they are, then we consider the theory confirmed.
One very important attribute of a hypothesis, and indeed of a theory, is that it should be what we call falsifiable: our study should be capable of finding that it is not correct. So, for example, our study should be able to reveal that the relationships are not as expected from the theory, in which case we reject the theory.
Now let's talk about the next step after we specify the hypothesis. A properly designed study is the key to successfully testing a hypothesis, and therefore to supporting or refuting our theory. The design should include a plan for the collection and the analysis of data. It should also reflect practical considerations such as the cost involved and the time required, both of the participants in the study and of the investigators. Crucially, the design should be such that the hypothesized relationships appear in the analysis only if the theory is indeed true and the hypothesis is correct. One important element of a design is that if there are alternative, competing theories, the ideal design will also allow for them to be ruled out. So, for example, if there is an alternative theory that education influences behavior, which then influences health, the ideal study design will also allow for that theory and its related hypotheses to be ruled out.
Now I'd like to make a couple of points related to the collection of data. A study design must include a plan for the collection of data. Data provide the measures of the indicators of the concepts in the theory that we have formed our hypothesis about. When these measures appear in our data, we typically refer to them as variables. You'll be hearing the term variables quite frequently in later lectures. If we're lucky, the data that we need in order to test the hypothesis already exist and are accessible, and then we can conduct secondary analysis.
Certainly, as a graduate student or as a junior researcher, it's very likely that much of your research will involve secondary analysis of existing data. However, if the data are not available, we'll have to collect them ourselves. That is, we'll have to engage in primary data collection.
When it comes to the measures that we will collect in our data, two important criteria are their reliability and their validity. The validity of a measure is the extent to which it accurately reflects the concept of interest. So, for example, if we have a measure of income in our data, perhaps annual earnings, we hope that it is not biased and does not misrepresent income in some systematic way. Reliability is the consistency of a measure in our data. It refers to the likelihood that when we measure the same thing at different points in time, the measure remains the same. If the measures vary, we consider that problematic.
Now we'll make a couple of brief remarks about the analysis of data. As I mentioned earlier, a study design must include a plan for the analysis of the data. It's through the analysis that a hypothesis is confirmed or rejected. Analysis may be quantitative or qualitative. We'll come back to that later in this lecture. Within each of these, there are many choices to make about which specific method to use. One thing to keep in mind is that if the data are problematic, no amount of analysis will save the study.
The most important goal for our analysis is to confirm or refute our hypothesis. However, in addition to this basic goal, we would typically like to rule out other possible explanations. So, in our example of the relationship between education and health, we might want to rule out the possibility that behavior is playing the key role, in the sense that education affects diet and exercise, and it's really diet and exercise that are shaping health, not income.
Another important goal for our analysis is to provide insight into mechanisms. In our example, if we do find that annual earnings play a role in linking schooling to health, it might be useful to be able to show that diet and exercise link earnings to self-rated health.
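To make the idea of testing whether earnings link schooling to health a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data are simulated rather than real, the coefficients, units, and sample size are made up, and the function names are hypothetical. The simulated world is one in which the theory is true by construction, so education affects self-rated health only through annual earnings. The sketch compares the association between education and health before and after holding earnings constant. It does not include diet and exercise, so it says nothing about the behavioral explanation.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on a single predictor x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def residuals(x, y):
    """The part of y that is left after removing its linear association with x."""
    b = slope(x, y)
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]


def simulate(n=2000, seed=0):
    """Simulated (not real) data in which education affects health only via earnings."""
    rng = random.Random(seed)
    # Years of completed education (assumed mean 12, standard deviation 3).
    education = [rng.gauss(12, 3) for _ in range(n)]
    # Annual earnings in arbitrary units; assumed to rise with education.
    earnings = [2.0 * e + rng.gauss(0, 5) for e in education]
    # Self-rated health on an arbitrary scale; depends on earnings only.
    health = [0.1 * w + rng.gauss(0, 1) for w in earnings]
    return education, earnings, health


education, earnings, health = simulate()

# Association between education and health, ignoring earnings.
total = slope(education, health)

# Association that remains once earnings are held constant: remove the
# linear association with earnings from both variables, then relate
# what is left of health to what is left of education.
direct = slope(residuals(earnings, education), residuals(earnings, health))

print(f"education-health slope, ignoring earnings: {total:.3f}")
print(f"education-health slope, holding earnings constant: {direct:.3f}")
```

If the theory is correct, the slope for education should shrink toward zero once earnings are held constant, which is what happens in this simulated example. If it did not shrink, that would suggest some other pathway, such as behavior, is at work. In a real study you would use a statistical package and also take sampling uncertainty into account.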