Student Outcomes

ISAT Studies


Background on Student Assessments

Beginning in 1999 for reading and mathematics and in 2000 for science, the Illinois State Board of Education (ISBE) implemented a new yearly state assessment, the Illinois Standards Achievement Test (ISAT). The test consisted mainly of single-right-answer, multiple-choice items, although a small number of open-ended questions (approximately two per grade level) were included in the mathematics section. The overall score was based on approximately 80 items and was reported on a 120-200 scale. Each year's results were placed onto the reporting scale using item response theory (IRT) true score equating under a one-parameter Rasch model.
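As a rough illustration of what Rasch true score equating involves, the sketch below maps a number-correct score on one test form to the equivalent score on a reference form. It is a minimal example under stated assumptions, not the ISBE procedure: the item difficulties are hypothetical, the function names are invented for this illustration, and the final conversion to the 120-200 reporting scale is omitted because the source does not describe it.

```python
import math

def rasch_prob(theta, difficulty):
    """Probability of a correct answer under the one-parameter Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def true_score(theta, difficulties):
    """Expected number-correct score at ability theta (test characteristic curve)."""
    return sum(rasch_prob(theta, b) for b in difficulties)

def theta_for_true_score(score, difficulties, lo=-10.0, hi=10.0, tol=1e-9):
    """Invert the test characteristic curve by bisection.

    The curve never reaches 0 or the item count, so only scores strictly
    between those bounds can be inverted.
    """
    if not 0 < score < len(difficulties):
        raise ValueError("score must lie strictly between 0 and the item count")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if true_score(mid, difficulties) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def equate(score_on_new_form, new_form, base_form):
    """Map a true score on new_form to the equivalent true score on base_form."""
    theta = theta_for_true_score(score_on_new_form, new_form)
    return true_score(theta, base_form)

# Hypothetical item difficulties (in logits); the new form is uniformly harder.
base_form = [-1.0, 0.0, 1.0]
new_form = [-0.5, 0.5, 1.5]
print(round(equate(1.5, new_form, base_form), 3))
```

Because the hypothetical new form is harder, a given raw score on it corresponds to a higher score on the base form, which is the adjustment equating is meant to make.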

Initially, the ISAT was given to 3rd, 5th, and 8th graders in mathematics and to 4th and 7th graders in science. In the spring of 2006, in response to the federal No Child Left Behind Act of 2001, the ISBE began administering a newly developed ISAT assessment in reading and mathematics in every grade from 3 through 8, along with the ISAT science exam in the 4th and 7th grades. Scores from the previous ISAT versions were transformed onto a new longitudinal scale using standard equating procedures [cite]. A variety of standard methods [cite] were used to establish content validity and test reliability. The internal consistency of the test (alpha coefficient) is between 0.91 and 0.95, depending on the subject and grade [cite]. To date, there are not enough data points to permit long-term longitudinal analyses of test results across all grades; the science data are particularly thin in this regard.
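For readers unfamiliar with the alpha coefficient mentioned above, the following minimal sketch computes Cronbach's alpha from a students-by-items matrix of item scores. The response data are invented for illustration and the function name is an assumption of this example; the values of 0.91 to 0.95 reported for the ISAT come from the state's own analyses, not from this code.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-student lists of item scores.

    Assumes every student answered the same k >= 2 items and that total
    scores are not all identical (otherwise the total variance is zero).
    """
    k = len(responses[0])
    item_columns = list(zip(*responses))          # one tuple of scores per item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_var / total_var)

# Hypothetical responses: 5 students, 4 items scored 0 (wrong) or 1 (right).
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(responses), 2))
```

Alpha approaches 1 when students who do well on one item tend to do well on the others, which is why it is read as a measure of internal consistency.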

The testing schedule prior to 2007 presented a number of research challenges. Because grade levels were tested on a staggered schedule from 1999 through 2005, ISAT data could not be used to assess long-term effects over any interval shorter than two years. The staggered schedule also limited outcome analyses to students who were tested in one year, remained in the same school for two additional years, and were then retested. This changed in 2006, when ISBE began annual mathematics testing of students in all grades 3-8, which made it possible to apply statistical methods that rely on longitudinal scales.
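The restriction described above can be pictured as a record-matching step. The sketch below is illustrative only: the record layout, field order, and function name are assumptions, and it checks the school only at the two test dates, since test records alone cannot show where a student was enrolled in the untested year between them.

```python
def matched_gains(records, gap=2):
    """Return (student_id, first_year, second_year, gain) for matched students.

    records: iterable of (student_id, year, school_id, scale_score).
    A student is matched when tested in two years that are `gap` years
    apart at the same school.
    """
    by_student = {}
    for student_id, year, school_id, score in records:
        by_student.setdefault(student_id, {})[year] = (school_id, score)

    gains = []
    for student_id, tests in by_student.items():
        for year, (school_id, score) in tests.items():
            later = tests.get(year + gap)
            if later is not None and later[0] == school_id:
                gains.append((student_id, year, year + gap, later[1] - score))
    return sorted(gains)

# Hypothetical records: student "a" stays put, "b" changes schools, "c" is not retested.
records = [
    ("a", 2001, "S1", 150), ("a", 2003, "S1", 162),
    ("b", 2001, "S1", 155), ("b", 2003, "S2", 160),
    ("c", 2001, "S1", 149),
]
print(matched_gains(records))
```

Only student "a" survives the match in this toy data, which shows how quickly the analyzable sample can shrink under a staggered schedule.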

ISAT Outcome Measures

The ISAT provides two key outcome measures. The first measure is the percentage of students meeting or exceeding state performance requirements on the exam (“%ME”), while the second measure is the “scale score”. Over the years, CMSI research studies have used various combinations of both measures as outcome variables. Experience with these outcome measures indicates there are advantages and disadvantages to each measure.

Scale scores (or derivations thereof) permit more in-depth, controlled statistical assessments because they provide a mathematically continuous measure [cite] and because they can be calculated for individual students. Thus, individual student performance can be tracked over time using scale scores. In contrast, the %ME indicator is based on a categorical variable [cite] and is therefore less robust (i.e., more easily affected by outliers) than scale scores. One drawback of the %ME measure is that it cannot be generated at the student level. Another is that the ISBE has changed the scale score cutoff between the "meeting standards" and "below standards" categories over time, which makes longitudinal analyses using this metric challenging [cite]. Still, the %ME is not without merit.

Historically, the %ME has been the primary outcome measure for assessing school-level performance. It is the primary measure used under NCLB for rewards and sanctions, and CPS has used it for years to measure and assess program outputs and school performance. Consequently, teachers and school administrators are more familiar with this outcome measure than with scale scores.
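The relationship between the two measures, and the effect of a changed cutoff, can be seen in a small example. This is a sketch under stated assumptions: the scores and cut values are hypothetical, the function name is invented here, and treating a score at or above the cut as "meets" is an assumption. It does not reproduce the statistical correction used in the analyses, which the source does not detail.

```python
def percent_meets_exceeds(scale_scores, meets_cut):
    """Percentage of a group scoring at or above the 'meets standards' cut score."""
    if not scale_scores:
        raise ValueError("at least one scale score is required")
    n_meeting = sum(1 for score in scale_scores if score >= meets_cut)
    return 100.0 * n_meeting / len(scale_scores)

# Hypothetical scale scores for one classroom on the 120-200 scale.
scores = [148, 152, 155, 158, 161, 170]

# The same students yield different %ME values under two hypothetical cuts.
print(percent_meets_exceeds(scores, meets_cut=156))
print(percent_meets_exceeds(scores, meets_cut=153))
```

The scores themselves do not change between the two calls, only the category boundary does, which is why a shift in the cutoff complicates year-to-year comparisons of %ME while leaving scale score comparisons unaffected.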


See ISBE, “ISAT Technical Manuals 1999-2005” (1999, 2000, 2001, 2002, 2003, 2004, 2005).
A continuous measure can include all points between whole numbers, such as 1.0, 1.1, 1.2, etc. In contrast, a categorical measure includes only discrete categories, such as A, B, C, etc.
Specifically, the %ME score represents the percentage of students in a school or classroom who fall into the discrete categories of “meets” or “exceeds” standards, each of which corresponds to a range of scale scores.
It should be noted that, where appropriate in the following analyses, we have corrected statistically for changes in performance category cut scores.