What test scores fail to measure

Like the unexplained monoliths in the classic movie 2001: A Space Odyssey, our standardized test scores float untethered in space - free of the very things they are supposed to measure, yet having great power.

These scores claim to measure “college and career readiness.” Yet it takes no particular insight to know that being ready for the forestry program at the community college is not the same as being ready for astrophysics at MIT.

Likewise, “career ready” means many different things depending upon whether you are a health-care provider, a convenience store clerk, or a road supervisor.

The fundamental flaw is pretending that we can measure an educated person with one narrow set of tests. There is no one universal knowledge base for all colleges and careers. This mistake is fatal to the test-based reform theory.

* * *

When the two test batteries - Partnership for Assessment of Readiness for College and Careers (PARCC) and Smarter Balanced Assessment Consortium (SBAC) - are put to the test, they don't score very well.

Princeton-based Mathematica Policy Research compared PARCC test scores with freshman grade-point averages and found that, in the best case, the math test predicted only 16 percent of the variation in those grades, and the English language arts score less than 1 percent.

The SBAC doesn't have such a validity study, but Smarter Balanced says that such a study “appears in [its] crystal ball.” Since the future of schools and children is in the balance, this is no place for murky crystal balls.

Building a test is conceptually simple. You assemble an elaborate web of subject-matter specialists to outline the content based on what they think is important. For tests that have a pass-fail point, that cut-score is likewise based on expert opinion.

Aided and abetted by advocates and politicians seeking to create a scientific “proof” of the failure of American education, the cut scores are knowingly set to have a majority of students fail.

The irony is the tests have a major predictive validity problem. They can't tell you whether they are measuring what they claim, but they know how many will fail.

Like our monolith, they float untethered in space yet have immense but ungrounded power.

* * *

Now: why do we have such a state of affairs?

As former American Educational Research Association President and Stanford Professor Richard Shavelson has pointed out, test-makers get caught up in the latest testing fad, resulting in the tail wagging the dog.

In the current latent traits fad, here's how the tail has to wag: Knowledge can have only one line, from easiest to hardest. Children within a grade are equally distributed within and across all classrooms, and all children learn the same things in the same way, in the same order, and at the same time.

As any parent of two or more children can tell you, that scenario is not reality.

Another fatal wag of the tail: No matter how important an item is, if it doesn't fit the latest test fad, it is tossed out. The result is that the test drifts off in space. This problem is made worse, says Shavelson, when politicians dangle money in front of test experts to do things with tests that cannot and should not be done.

If we redesigned our measures to address what our state constitutions and citizens tell us is important, we would concentrate on the skills that define success as a citizen, worker, and human being. These skills include clear and effective communication, creative and practical problem solving, informed and integrative thinking, responsible and involved citizenship, and self-direction.

* * *

This is not to say that standardized testing should be eliminated. It is the single uniform measure across schools. But the very standardization that makes these tests valuable causes harm to those things that are truly important for our children and our communities.

Since the “recommended” SBAC tests' standards are currently set to fail about two-thirds of students, the data will wrongly and dishonestly provide fodder for school critics.

In high-scoring states, a mere half of students will be declared failures even though they would rank in the top 10 percent of the world.

The test scores measure neither college nor careers nor success in life. They simply float free in monolithic space, radiating glossy ignorance, but as far as informing us about our schools, they are a cold, silent, and misleading void.

William J. Mathis

Wednesday, August 23, 2017 — Issue 422
