This is the second in a two-part piece about the current state of educational testing in our K-12 schools. The first half covered the reliability of standardized testing, whether we should be using standardized tests for younger children, and digital educational design. Click here to read Part 1.
The tests don’t test what we think they test
My informant pointed out a huge problem with her third graders taking the test: much of the test had no audio component, and assumed that they could all read and write well enough. But as anyone who has taught children to read will tell you, some kids just learn later. They don’t learn worse and it has nothing to do with their intelligence overall. Late readers are not less successful in life.
But here’s what she had to say about her group of kids: “There was no audio component to the math, so a lot of the test was really a reading test. If they couldn’t read the paragraphs, they couldn’t answer the questions. And they sure as heck couldn’t write a paragraph.” The Common Core assumes that if you understand something, you should be able to write about it. (I won’t get into the question of why any reasonable 8-year-old would actually want to write about math!) But clearly, the less able readers were not being tested on their understanding of math—they were being tested on reading, which depressed their math scores.
On top of that, this test is also a test of a skill most kids don’t learn until middle school: typing. “And OMG they have no typing skills. I’m not sure a 3rd or 4th grader needs typing skills in general, but they were not ready to type for a grade. It was painful to watch.” Again, their math skills took a backseat to something the test designers didn’t even take into account. If we really wanted to find out their mastery of math, we’d let the teachers read the instructions out loud and type for the kids, or install voice recognition software so they could dictate.
Unreasonable expectations
Standardized tests have been around for a long time, and over those long years, we have learned a lot about them. Here are some things we know about the tests themselves:
- They are inherently biased. They can be made better and better through tinkering, but they can never reach the stated goal of being instruments that find out “what a child knows” because some children, for a variety of reasons, will never do well on them regardless of their mastery of a subject.
- They are not good predictors of much of anything except how well a child will do on his next standardized test. The SAT, a much better test than any ever designed by a state government, is retooling itself because of the much-publicized research that shows, conclusively, that a good SAT score predicts absolutely nothing. Except, maybe, a good GRE or LSAT score!
- They do not measure the worth of a teacher. Great teachers have all sorts of effects on their students’ lives, but improving their students’ standardized test scores is not a given effect. A great teacher can do amazing things with kids without bringing up their test scores.
- They do not measure the effectiveness of a school. There are so many other factors that are as important as, or even more important than, test scores. Test scores are one tiny factor that administrators can use to judge schools, but they are far from the most important one.
Yet our unreasonable expectations of this test are that it will somehow:
- Be better at testing all children at their own level. See the point above about the inevitable bias. These tests won’t do any better than other tests. Sure, the kids who have trouble tracking from a test booklet to the correct bubble to fill in might do better, but these tests will inevitably end up biased against some other group of kids.
- Predict a child’s success outside of test-taking. No, these tests will not predict any such thing. They will merely predict how well the child will do on the next standardized test. Period.
- Show how well a teacher is teaching. This is absolute idiocy and any idea that teachers should be punished or rewarded based on test scores is rooted in a deep cultural distrust of teachers, not in any sound educational theory. Some teachers may indeed bring up their students’ test scores, but I sure hope those teachers are also doing something useful for their students.
- Give us a way to “rate” schools. I have my own personal way to rate a school. I walk into the school and watch. In a great school, the students will be happy and relaxed. Yes, they may also be deeply focused on what they are doing, but that doesn’t mean they’re not also happy and relaxed. The parents will enjoy the school and feel welcome there. The teachers will feel energized to come to work; they will feel a partnership with the school administrators, other teachers, their students, and the parents. None of these important factors is represented in a composite test score. Yes, the score is a useful piece of information, but it alone does not rate a school.
Until we as a culture deal with our unreasonable expectations, it doesn’t matter how “good” the test is. A standardized test is a measure of how well students take standardized tests. In other words, it’s a measure of how much vocabulary they have heard in their few years on this earth. It’s a measure of what their parents discuss at the dinner table, assuming they have parents, a dinner table, and food to put on it. It’s a measure of how often the people they spend the most time with (and this is not teachers) talk about numbers in real life so that they become comfortable with number sense before being required to learn other skills that build on number sense.
A standardized test is also a measure of a child’s personality—nervous, anxious children don’t test as well regardless of their background. A child who didn’t have protein with breakfast won’t test as well. A child in the first day or two of coming down with the flu won’t test as well as she would otherwise. A child who lives daily with the fear that his older brother will be shot by his friends won’t test as well as he should. A child who is told he is too stupid to learn won’t do well on tests, and a child who has been overpraised about her intelligence (ironically enough) won’t test as well.
In conclusion, there are simply too many factors within the messiness of one person’s little life to put such weight on the results of a test. Sure, let’s make a better test, because we always need to improve the information we gather. But let’s not think that this test is going to solve any educational problems we have. It’s just a test, imperfect, limited in scope, and vulnerable to bias and technical problems. Education is just too important and complex to be judged by such a narrow, flawed instrument.
Well said, and thank you!
We take ASL. This year’s test allows the directions to be viewed in ASL, but the signing goes faster than the professional interpreters were comfortable with, much less a grade school child.
I fear this test will instill test anxiety in those who had none. The practice test was dreadful. The variety of computer skills required, poor directions, lack of clarity, and unfamiliar format all contributed to frustration. When an answer was in the wrong format, the program would not advance. That’s enough to crush the spirit of an eager, aim-to-please test taker.
Interesting that they are attempting to make accommodations for disabilities. This is one area in which computer-based testing should be superior. As I mentioned above, there are a number of kids (usually undiagnosed) who have visual tracking problems and who had real trouble tracking from the problem number in the booklet to the problem number on the worksheet, then across to the correct bubble. A computer should be able to accommodate pretty much any disability, but your example points out the weakness: what if we introduce an “accommodation” that doesn’t actually help the child with the disability? We could even let kids hear audio in their native language, for example, but because they aren’t being instructed in that language, they’d be unlikely to know the math terms in it, however fluent they are. The range of problems we could create through inadequate accommodations is scary. My informant said that there was no audio for the math section, so a dyslexic child being tested on math is actually being tested on reading, not math.
I hope that all parents are clear with their children that no test can measure their worth. A test is only an imprecise measure of one child’s ability to do well on that test on a certain day. Given how much pressure the schools are under, it’s worth asking our kids to do their best (I’m not a big fan of the approach that some parents take where they tell their kids just to answer randomly). But we need to make sure that our children know that the test is not a measure of their worth or even their potential to succeed.
You said “The SAT, a much better test than any ever designed by a state government, is retooling itself because of the much-publicized research that shows, conclusively, that a good SAT score predicts absolutely nothing. Except, maybe, a good GRE or LSAT score!” This is incorrect. The SAT is a pretty good predictor of first-year and all-four-years GPA for college students. Surprisingly good for a 3.5-hour test. Anti-test people have been propagating misinformation about the SAT for some time, but there is pretty solid research that shows it is a decent predictor (comparable to high-school GPA, and the combination of SAT and HSGPA is better than either alone).
The SAT is being revamped for marketing reasons, not because of problems with its validity.
Sorry, I just saw this comment. (For some reason, my email thought you were a spammer!) I see I stated this too strongly. A good SAT score may correlate, but it’s hardly a predictor of success in a broader sense. It’s just a test, in other words, and we shouldn’t change anyone’s life based on just a test. Luckily, most universities understand this and take a variety of aspects of a student’s performance into account. But my point was more about the high-stakes testing push for K-12, and how these tests, clearly inferior to the SAT, are being used to determine the fates of individual teachers and schools. That is absolutely not a good use of a standardized test. Again, a test score as part of how we judge success is fine, but a test score should not be weighed more heavily than real-world assessments, which are slow, expensive, and complex.