Thursday, March 26, 2009

Assessment Validity

When gathering evidence to measure student achievement, classroom teachers, course designers, and administrators can support the validity of their assessments in a variety of ways. Popham (2008) describes three types of validity evidence, related to “content, criterion, and construct.”

Content-related validity is especially important for classroom teachers in ensuring that formative and summative assessments are aligned to curricular aims. Once the assessment evidence has been determined, instruction should be aligned with assessment through “backward design” (Wiggins and McTighe, 2005) so that all classroom activities remain on target. In addition to classroom teachers, instructional leaders can use “walkthroughs” (Downey, Steffy, English, Frase, and Poston, 2004) to bring the alignment of curriculum, assessment, and instruction to the forefront. That is, content-related validity involves all stakeholders working together as a community of practice to ensure that the taught curriculum aligns with the written curriculum.

Criterion-related validity deals with using assessment to predict future behaviors. Aptitude exams are a good example. When students take the ACT or SAT exam, their scores are used to predict how likely they are to succeed academically in the future. Although assessment experts carefully consider criterion-related validity in these exams, “only about 25% of academic success in college is associated with a high school student’s performance on [these exams]” (Popham, 2008, p. 301). Perhaps one reason it is difficult to use assessment measures to predict future behavior is the uncertainty of how people will apply themselves under new circumstances (e.g., attending college or a new school). Criterion-related validity typically involves assessment outside the classroom setting.
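To make the 25% figure above more concrete, here is a minimal sketch in Python. It assumes that "associated with" is read as shared variance, that is, the square of the correlation between exam scores and later college performance. That reading is my assumption, not something stated in the quotation. The function names `pearson_r` and `shared_variance` are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def shared_variance(r):
    """Proportion of variance two measures share (r squared)."""
    return r ** 2


# Assumption: a predictor-criterion correlation of 0.5.
# Squaring it gives 0.25, i.e. about 25% shared variance.
print(shared_variance(0.5))
```

Under that assumption, a correlation of 0.5 between an aptitude exam and college grades would leave roughly three quarters of the variation in college performance unaccounted for by the exam.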

Like content-related validity, construct-related validity involves all stakeholders, with classroom teachers taking on a particularly important role. Intervention, differential-population, and related-measures studies (Popham, 2008, pp. 63-65) are three types of construct-related validity evidence that assessment designers use to hypothesize, test, and infer information and behaviors. For example, assessments should measure a progression of improved student understandings, knowledge, skills, and dispositions (i.e., intervention study). Assessments should be free of bias based on the student’s socioeconomic status, background, etc. (i.e., differential-population study). And assessments should be consistent between teachers teaching the same level of content (i.e., related-measures study). As with content-related validity, the efforts of all stakeholders are needed to address construct-related validity within a school and to create assessments that are as accurate and fair as possible.