Progress, Policy, and Protests: Teacher Evaluation Laws Evolving Faster Than Underlying Research That Proves Their Worth, Experts Say
Correction appended Jan. 4
This article also appeared on the Education Writers Association blog.
If there’s been one constant in teacher evaluation policy in the United States over the last decade, it’s been change.
First, performance reviews incorporating student test scores became — mostly — the law of the land. Then, the academic standards that educators and their pupils are measured against — mostly — changed. And then, in many places, those standards changed again.
So, has the implementation of the federal Every Student Succeeds Act, which did away with mandates on how states measure teacher quality, calmed the roiling waters?
“No” was the resounding answer from a trio of experts interviewed by Chalkbeat reporter Matt Barnum as part of a recent Education Writers Association seminar on the state of the teaching profession.
In the two years that followed ESSA’s enactment, some 150 teacher evaluation bills were introduced in state legislatures, Stephanie Aragon, a policy analyst with Education Commission of the States, said at the gathering. The bulk of lawmakers’ efforts have concentrated on the use of student assessment data in the evaluations, according to Aragon.
The idea of factoring growth in student test scores into teacher evaluations gained widespread adoption shortly after President Barack Obama took office in 2009. As part of a federal stimulus program, the U.S. Department of Education launched the Race to the Top initiative with about $4 billion. To qualify, states had to meet several requirements, including revamping their teacher and principal evaluation systems. A key element was to incorporate test scores as one component of teacher evaluations.
But state policy on teacher evaluation has moved faster than researchers’ efforts to understand what works and why, Aragon said.
The risk? In many places, teachers remain skeptical.
“States are tacking on things to evaluation systems without making sure they are trusted,” Aragon said. “I don’t think the research base exists yet to say, ‘This is the best practice.’”
John Papay, an assistant professor of education at Brown University, and Gina Caneva, a teacher-librarian in the Chicago Public Schools, joined Aragon on the panel.
While there is general agreement that the evaluations ought to be used to improve teaching, the three panelists said researchers have yet to figure out which promising evaluation systems might lead to better professional development and career growth for teachers.
Papay described lessons gleaned from several efforts to implement new teacher evaluation systems. In particular, he praised the Tennessee Educator Acceleration Model. The state’s Instructional Partnership Initiative has guided teachers and principals in the use of evaluation data to design better and more personalized teacher professional development.
Papay tracked the initiative in a randomized controlled trial that paired teachers who had strong classroom-observation ratings on certain teaching skills with colleagues who received low ratings on those same skills. After the mentorships, students taught by the initially lower-rated teachers showed gains equivalent to being taught by an average-rated teacher rather than by one in the lowest 25 percent. The program has now expanded statewide, and Papay continues to track the impact.
Teacher and student outcomes have improved, according to the Tennessee Department of Education. And principals, Papay said, now do a better job of retaining teachers with high evaluation scores and dismissing those with low ones.
But lots of teachers still mistrust the evaluations, which too often are followed by scant feedback, Papay said.
“A lot of thought went into making [the evaluations] accurate,” he said. “And not much on how to use it to give teachers useful feedback.”
Caneva agreed. During the panel, she discussed the shortcomings of the system used to evaluate her — in particular, as a mechanism for helping teachers improve their practice.
“When I get evaluated by my principal, that’s about one day,” she said. “When you talk about how we’re going to improve education in America, I don’t think this is where we are going to get the biggest bang for our buck.”
As a teacher-librarian and writing center director, Caneva teaches in content areas that are not subject to the annual student assessments required under federal law. She described how she creates tests for her students at the beginning and the end of each academic year. She scores the exams and enters the data into a software program.
Looking back, Aragon said blowback was intense after many states adopted new teacher evaluation policies in response to the Race to the Top initiative. The ensuing decade was characterized by continual upheaval, with states, districts, and teachers scrambling to adjust to the new normal.
Yet when the ink dried on ESSA, states found a combination of flexibility and funding designed to help them refine their evaluations. Some states, Aragon noted, have waited to fully implement their systems and may benefit from lessons gleaned by early adopters.
Churn notwithstanding, data being collected in some places highlights promising models. In addition to Tennessee, Dallas, Denver, and Newark have all put considerable energy into refining their evaluations, Aragon said. New Mexico has done a good job of differentiating, or adjusting evaluations and professional development to different communities’ needs, she said.
“In many cases, states are struggling to get to those good evaluations,” Aragon said. “I think that’s the exciting part, if you have a meaningful evaluation system and it can lead to teacher growth.”
Correction: Story was updated to cite Dallas as one of the examples of states/districts getting teacher evaluations right.