Analysis of variance (ANOVA) serves the same purpose as the t tests we learned in Unit 2: it tests for differences in group means. ANOVA is more flexible in that it can handle any number of groups, unlike t tests, which are limited to two groups (independent samples) or two time points (dependent samples). Thus, the purpose and interpretation of ANOVA will be the same as it was for t tests, as will the hypothesis-testing procedure. However, ANOVA will, at first glance, look much different from a mathematical perspective, although as we will see, the basic logic behind the test statistic for ANOVA is actually the same. This chapter will describe the general design of ANOVA, with a focus on calculating the independent samples one-way ANOVA, which is an extension of the independent samples t test, where three or more different groups are compared on a single independent (or grouping) variable.

We have seen time and again that scores, be they individual data or group means, will differ naturally. Sometimes this is due to random chance, and other times it is due to actual differences. Our job as scientists, researchers, and data analysts is to determine if the observed differences are systematic and meaningful (via a hypothesis test) and, if so, what is causing those differences. Through this, it becomes clear that, although we are usually interested in the mean or average score, it is the variability in the scores that is key.

Take a look at Figure 11.1, which shows scores for many people on a test of skill used as part of a job application. The x-axis has each individual person, in no particular order, and the y-axis contains the score each person received on the test. As we can see, the job applicants differed quite a bit in their performance, and understanding why that is the case would be extremely useful information.

[Figure 11.1: Scores on the job skills test. x-axis: individual applicants, in no particular order; y-axis: test score.]

However, there’s no interpretable pattern in the data, especially because we only have information on the test, not on any other variable (remember that the x-axis here only shows individual people and is not ordered or interpretable). Our goal is to explain this variability that we are seeing in the dataset.

Let’s assume that as part of the job application procedure we also collected data on the highest degree each applicant earned. With knowledge of what the job requires, we could sort our applicants into three groups: applicants who have a college degree related to the job, applicants who have a college degree that is not related to the job, and applicants who did not earn a college degree. This is a common way that job applicants are sorted, and we can use ANOVA to test if these groups are actually different.

Figure 11.2 presents the same job applicant scores, but now they are color coded by group membership (i.e., which group they belong in). Now that we can differentiate between applicants this way, a pattern starts to emerge: applicants with a relevant degree (coded red) tend to be near the top, applicants with no college degree (coded black) tend to be near the bottom, and applicants with an unrelated degree (coded green) tend to fall into the middle. However, even within these groups, there is still some variability, as shown in Figure 11.2.

[Figure 11.2: The same applicant scores, color coded by degree group (red: relevant degree; green: unrelated degree; black: no college degree).]

Now that we have our data visualized into an easily interpretable format, we can clearly see that our applicants’ scores differ largely along group lines. Those applicants who do not have a college degree received the lowest scores, those who had a degree relevant to the job received the highest scores, and those who did have a degree but one that is not related to the job tended to fall somewhere in the middle. Thus, we have systematic variability between our groups.

We can also clearly see that within each group, our applicants’ scores differed from one another. Those applicants without a degree tended to score very similarly, since the scores are clustered close together. Our group of applicants with relevant degrees varied a little bit more than that, and our group of applicants with unrelated degrees varied quite a bit. It may be that there are other factors that cause the observed score differences within each group, or they could just be due to random chance.
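
The contrast between variability between groups and variability within groups can be made concrete with a small numerical sketch. The scores below are hypothetical, invented only to mimic the pattern described for the three degree groups; they are not the data behind the figures. The function names `describe_groups` and `partition_variability` are likewise illustrative. The sketch computes each group's mean and spread, then splits the total sum of squared deviations into a between-groups part and a within-groups part, which always add up to the total.

```python
from statistics import mean, stdev

# Hypothetical test scores, invented for illustration only.
scores = {
    "no degree": [22, 24, 23, 25, 21],
    "unrelated degree": [40, 55, 33, 48, 61],
    "relevant degree": [70, 78, 66, 74, 82],
}


def describe_groups(groups):
    """Return {group name: (mean, sample standard deviation)}."""
    return {name: (mean(vals), stdev(vals)) for name, vals in groups.items()}


def partition_variability(groups):
    """Split the total sum of squares into between- and within-group parts."""
    all_scores = [x for vals in groups.values() for x in vals]
    grand_mean = mean(all_scores)

    # Between groups: how far each group mean sits from the grand mean,
    # weighted by the number of people in that group.
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )

    # Within groups: how far each person sits from their own group mean.
    ss_within = 0.0
    for vals in groups.values():
        group_mean = mean(vals)
        ss_within += sum((x - group_mean) ** 2 for x in vals)

    # Total: how far each person sits from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    return ss_between, ss_within, ss_total


if __name__ == "__main__":
    for name, (m, sd) in describe_groups(scores).items():
        print(f"{name}: mean = {m:.1f}, sd = {sd:.2f}")
    ss_b, ss_w, ss_t = partition_variability(scores)
    print(f"between = {ss_b:.1f}, within = {ss_w:.1f}, total = {ss_t:.1f}")
```

With data shaped like this, most of the total variability falls between the groups rather than within them, which is the kind of pattern that suggests group membership explains the differences in scores.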