
Location, segregation, and accountability: the quest for better school ratings

Parents, homebuyers, and ed reformers all love simple, clear school ratings. Can we have them without reinforcing neighborhood segregation?

Image from Zillow.com

In the old real estate adage “location, location, location,” at least one of those locations represents proximity to a good school. It seems natural that parents would want this, and real estate investment experts confirm their hunch. A 2018 Forbes article recommends first-time homebuyers consider four priorities, and the first is location. Describing a good location, the author writes, “Look into things like crime rates and the quality of the school district.” 

At some point in their hunt most homebuyers will visit at least one of the most popular search sites — Zillow, Realtor.com and Trulia — and encounter a seemingly simple measure of school quality: a color-coded GreatSchools.org ranking on a scale of 1 to 10, and a “parent rating” of 1 to 5 stars. States have followed suit with their own user-friendly report cards under the Every Student Succeeds Act (ESSA), but none have the market presence of Great Schools. 

“We literally chose the house that we did based on proximity to and ratings of the schools,” said Clint Ochoa, a recent homebuyer in Texas. The ratings were clear and helpful for Ochoa and his wife while he was stationed in South Korea and she was in Atlanta with their three children. The whole home search had to happen online, and they knew very little about the area. Though they would go on to make school visits and talk to a real estate agent to discuss some of their family’s unique needs, Ochoa said, “Zillow was a starting place.” He’s in good company. In 2018, Zillow reported 157.2 million unique users per month. If those users clicked on a property, they saw a Great Schools rating.

Given the power to steer homebuyers and shape neighborhoods, it’s not surprising that the ratings used by these sites are under constant scrutiny. 

Critics, both parents and academics, have taken aim at what goes into the 1-10 rating. The formula has grown far more sophisticated since 2003 when Great Schools launched nationwide, but it still relies in part on test scores (state assessments, ACT, and SAT), which favor schools with wealthier students.

Even assuming that the summary ratings could somehow fairly and accurately measure school quality, Jack Schneider still voices concern about their effect, whether they come from private or state agencies. Schneider is an assistant professor at the University of Massachusetts-Lowell and director of research for the Massachusetts Consortium for Innovative Education Assessment. Though state ratings are intended to spur school improvement, he said, “I think we should have some very serious questions about the theory of change that putting such information out there for the public is somehow going to guarantee better education experiences for young people.”

Great Schools is not in the accountability business, but instead aims to help parents make informed choices. This function is equally problematic for Schneider. Having a facile quantifier for how “good” and “bad” schools are, he says, drives neighborhood and school segregation, two of the culprits behind the achievement gap in the first place. 

Image from Zillow.com

In search of a better measure

The accountability and rating craze began in the No Child Left Behind era and has done little more than reinforce what many already believed, Brookings Institution fellow Jon Valant said: that schools serving low-income populations or large populations of black and brown students were bad and should be penalized and avoided.

“That was preventable,” Valant said. Once it was clear that the data showed little more than centuries of social and economic disparity, something schools can do very little about, state and federal governments could have tried to find a different way to measure what schools were actually doing, and get that information to parents.

“If you are going to show a label for the school,” Valant said, “it should be about what they are learning in the school … not how much they knew when they got there.”

Image from txschools.org

There have been some steps in that direction. Under the Every Student Succeeds Act, many states incorporated “growth scores” as a component of overall scores to demonstrate the value of what is happening inside the school. Each calculates and weights them a bit differently, but in general growth scores describe how much knowledge a student gains in one year, regardless of where they started. With wealthy students starting school already ahead of their lower-income peers, growth scores, in theory, should keep schools from being penalized for where their students started.

Parents care about more than tests as well. In the beginning, as No Child Left Behind took effect, the ratings were simple reflections of state assessment performance, something Great Schools no longer sees as an adequate measure of school quality. Since 2007 Great Schools has continually expanded and honed the criteria that go into its ratings, including growth scores and other equity metrics.

“Our goal as an organization is to provide the broadest picture of school quality,” Great Schools CEO Jon Deane said. “We’re generally looking at, how can we tell a complete story?”

Stanford University professor Sean Reardon is one of the nation’s leading proponents of using growth measures over proficiency scores to determine school quality and measure the opportunities afforded to students. “If parents were to use the learning rate data to help inform their decisions about where to live, they might make very different choices in many cases,” he said in a news article on the Stanford website.

Image from Zillow.com

A Brown University study by David Houston and Jeffrey Henig seems to back that claim. In an experiment in which parents were asked to select a school using demographic information and growth scores, they chose districts that were less white and less wealthy than when they were given demographic information with achievement scores.

However, a new report and data tool from Reardon’s team showed that segregated schools with concentrated poverty do not move students forward academically as well as socioeconomically diverse schools and wealthier schools. Their growth scores would be lower. The report would not say definitively what accounted for the difference in growth, but did suggest some possibilities including limited resources and teacher experience. 

So prioritizing growth scores might make a parent choose a different school…but probably not a highly segregated, low-income school. Changing the way ratings are calculated hasn’t made disparities disappear, because disparities still exist in schools. The fundamental truth learned from No Child Left Behind still stands: not all kids are getting the same quality of education. We know more about the gaps (they show up in almost any measure we invent), but we haven’t been as successful in addressing them.

Image from txschools.org

Families know this, but for many, the neighborhoods around the most highly rated schools are out of their price range. “A large percentage of our users are low-income parents,” Deane said. 

For them, finding the right school is an art. They need the more complex data available on GreatSchools.org to figure out if their children can thrive in more middling-rated schools–those in the yellow 4-7 range. Then, they can go back to Zillow, and find an affordable home to match.

Did you want a good school, or a white school? Or are those the same thing?

While deeper exploration is possible on the Great Schools site itself, the real estate websites only show the summary score. Seeing the summary dark orange “2” instead of a green “9” can have an immediate effect on a family wanting to purchase or rent a house close to the best possible school. As long as ratings act as market drivers for certain neighborhoods, they will increase segregation, which will increase the disparities reflected in the ratings, said Schneider.

“Good schools” sell houses. Alleviating inequality does not, Schneider explained. “The market doesn’t create integration.”

However, it’s not clear whether the ratings themselves are actually changing homebuyer habits, as much as reinforcing them, and making the process more efficient. While both state agencies and Great Schools continue to make more information available to parents, that data still generally tracks with income and racial demographics. Ratings are, whether the shopper intends them to be or not, a shortcut to finding schools with more affluent, whiter students.

Some researchers and most integration advocates acknowledge that parents may explicitly prefer whiter schools. What bothers Florida mom Stephanie Ilderton is that this desire can be laundered of its racism by the seemingly simple science of ratings. Meanwhile, families who are looking for diversity may think they are getting a more racially and economically agnostic measure, but they aren’t, she explained. “The ratings are really incredibly biased.”

When Ilderton looked around her son’s Orlando metro-area school, she saw a lot of things she liked: strong academics, engaged teachers, and classrooms with books and materials for every child. What she didn’t see were black adults. As white parents with adopted black sons, Ilderton said, she and her husband needed to find “racial mirrors” for their children.

“We’re always on the hunt for role models for our sons,” she said.

She knew that finding a new school meant moving, so she consulted Zillow. From there, she quickly realized that the school ratings were leading her away from where she wanted to be. It was easy to find schools rated 9 and 10 in suburban Orlando near the university where her husband is a professor. But if she limited herself to 9s and 10s, she said, it ruled out the most diverse schools and the neighborhoods they served. 

It bothered her, she explained, that the ratings confirmed what people in her area already associated with “good schools” (whiter, wealthier students), thus rewarding decades of racial and economic segregation. “I don’t even think it’s intentional,” she said.

Ilderton is also acutely aware that disparities can continue within a highly rated school. She needed more than a simple rating to know if her sons would be well served academically and socially.

To find the supplemental information she wanted, Ilderton combed the Office for Civil Rights database, visited schools, and eventually settled on a school that was rated lower than their current school, but still an 8 on the Great Schools rating system. It received a “B” on the Florida state report card, lower than the district as a whole, but the academic performance for black students was proportional to their representation in the student body, information she gleaned from Great Schools and the state report card. Both allow users to explore performance for different groups of students. Upon visiting the school, Ilderton was pleased to see black teachers around.

She knew which school she wanted, and only then did Ilderton return to Zillow. She used a tool on the website that allows users to search for homes within the attendance zone of their desired school.

A map is focused on a geographical area: a city, a zip code, a suburb. Alongside the map, area schools are listed next to their ratings. Location.

Users click on the school name, and a map shows up, highlighting the attendance zone. Location.

The user then clicks “houses for sale inside this attendance zone” and another map shows up, empty except for qualifying homes. Location.

The tool does not tell parents what they should be looking for in a school. It leaves the moral quandary of balancing quality and equity to the individual parent. Ilderton used the tool to find a more diverse school, but it chilled her how easily it could have been used to continue the racial and economic sorting process that troubles her so much. It was, in a sense, a metaphor for how people already thought about certain areas. As Schneider put it, “All they have to do is click a button and all these neighborhoods go away.”

More metal detectors and less STAAR seemed like a good idea at the time

The 2019 legislative session is off and running, with new leadership cut from old cloth at the helm in the House of Representatives and no bathroom bill on the agenda. 

Promising signs range from Styrofoam cups embossed with the message “School Finance Reform: The time is now” to a speaker pro tem wearing a Notorious B.I.G. tie at his swearing-in.

Not that all signs are good. We’ve already seen a strong effort to disassociate poverty and standardized test scores, an age-old trick to try to prove that funding makes no difference in education.

In addition to the serious buzz around school finance reform, we’ve also seen a wave of other education-related bills filed. Two so far address major concerns of ordinary parents, those who mainly experience the school system via their children rather than through think tanks, policy analysts, and budget reviews.

HB 797, filed by Shawn Thierry (D-Alief), would put metal detectors in every single public school facility.

HB 736, filed by Brooks Landgraf (R-Odessa), would lower the stakes of the STAAR test.

On the surface, both of these simple, seemingly common-sense bills draw a near-universal “yay.” Safe kids who are less stressed out about a single test that determines whether or not they advance to the next grade.

However, sweeping bills like these should be thoughtfully considered. They often come with unintended consequences. 

Metal detectors won renewed attention in the wake of the mass shooting at Santa Fe High School, but they’ve been a fixture in some schools for decades. Those schools, as you can predict, are not suburban, wealthy schools. They are schools in urban areas, where many kids live in poverty and where gangs are highly visible. These schools are disproportionately attended by children of color. 

Thierry’s district includes Alief ISD, which fits that description pretty well. Gang activity and incidents of violence are high. The white population is small. It isn’t in the heart of urban Houston by any means, but represents the sprawl-meets-gentrification phenomenon of increasing poverty in what were once suburbs.

In 2017-2018 the district reported more incidents of gang activity in school than did neighboring Houston ISD, despite the latter being 4.5 times the size of Alief. 

Mass school shootings, however, seem to be a uniquely non-urban phenomenon, hence, I suppose, the metal detector bill being extended to all schools and stadiums. All. Charters too.

The question to be asked, however, is whether metal detectors have been effective at preventing either kind of violence—mass shootings or person-to-person violence. 

The answer, unsurprisingly, is that we don’t know. 

So far all school shootings have begun outside the school building, so there’s no indication that metal detectors would have prevented anything.

The American School Health Association conducted 15 years of research on whether metal detectors decreased school violence in any way. They could not show that violence decreased. Their report suggested that the detectors did make students feel like their school was a dangerous place to be.

Which reminds me of Maddisyn, the junior at South San High School who told me that gang members were the most stressed out people she knows. She also told me that the increased police presence in South San ISD—a response to the Newtown mass shooting—made students feel as though their fellow students were a constant threat, and even that they themselves were somehow in need of policing. 

Basically, when you fill a person’s world with danger cues, they respond to danger cues. If those danger cues (like seatbelts, caution tape, and bicycle helmets) are making them safer, then we consider that appropriate safety education. There’s a danger, and our precautions remind us to be careful. People should have the appropriate adrenal response to the situation: increased alertness, circumspection, etc.

However, if the danger cues are not actually keeping them safer…what’s the point? To have them living in a state of constant adrenal stimulation? 

And how much would we pay to make schools, stadiums, and other public school facilities feel less safe? 

At $4,000-$5,000 per metal detector, we’re looking at roughly $40 million just to put one machine in each school. But to make that math work, you also have to subscribe to the Dan Patrick one-entry-one-exit solution to school shootings. It would take about an hour to get into the school building. Even airports have more than one detector, and not all planes take off at once.
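For anyone who wants to check that math, here is a back-of-the-envelope sketch. It assumes the roughly 9,000 schools cited later in this post and the $4,000-$5,000 unit price above. The two- and four-detector scenarios are hypothetical, and the totals cover hardware only (no stadiums, staffing, installation, or maintenance).

```python
# Back-of-the-envelope hardware cost for school metal detectors.
# Assumptions: ~9,000 schools statewide (the figure cited later in this post)
# and the $4,000-$5,000 unit price quoted above. Detectors-per-school counts
# beyond 1 are hypothetical scenarios, not numbers from the bill.
SCHOOLS = 9_000
UNIT_PRICE_LOW = 4_000
UNIT_PRICE_HIGH = 5_000


def hardware_cost(detectors_per_school: int) -> tuple[int, int]:
    """Return the (low, high) statewide hardware cost in dollars."""
    units = SCHOOLS * detectors_per_school
    return units * UNIT_PRICE_LOW, units * UNIT_PRICE_HIGH


for n in (1, 2, 4):
    low, high = hardware_cost(n)
    print(f"{n} per school: ${low / 1e6:.0f}M to ${high / 1e6:.0f}M")
```

One machine per school lands between $36 million and $45 million under these assumptions, and the bill scales linearly with every extra entrance you equip.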

So really we’d need a ton more. I’m not usually a budget hawk, but I don’t like paying for things that are counterproductive.

Speaking of that, we are paying for STAAR tests. The state pays $90 million to Educational Testing Service. I have never heard anyone say anything positive about STAAR tests. No teacher, administrator, parent, or student.

With that in mind, Landgraf’s declawing of the STAAR test makes a ton of sense. Pre-No Child Left Behind, we had standardized tests…we just didn’t worry about them. A note went home saying, “Eat a big breakfast! The test is long.” And that was it. A far, far cry from the madness we currently have.

So, as Texas House of Representatives Public Education chair Dan Huberty asked Commissioner Mike Morath…can’t we just get rid of STAAR altogether?

Only if we want to get rid of accountability and federal funds altogether, Morath replied (that’s a broad paraphrase).

I think we all agree that a more well-rounded evaluation mechanism would be ideal. The Every Student Succeeds Act (which is tied to our federal funding) gave states the chance to consider other criteria than test scores…Texas has a number of outcome-based components to its ESSA plan, as well as a consideration of discipline data. But very little in the qualitative categories.

There’s not much of that high-touch, observation-based assessment in Texas’s ESSA plan at all. Probably because there are 5 million children, 9,000 schools, and 1,200 districts in the state.

But that’s really what people instinctively want. They want their kids to be evaluated by someone who knows them. So why not just leave it up to teachers?

And we all understand why teachers and even district administrators can’t be the only ones responsible for determining whether their kids progress to the next level…right? 

Making a child’s mastery of a topic subject to his teacher’s assessment is a recipe for inequity. Think discipline statistics and G/T referrals, both of which overwhelmingly favor highly verbal white girls from professional homes. There will always be borderline kids for whom their relationship with their teacher will be a determining factor in their success unless there’s an agnostic evaluation tool. Implicit bias is real, even with the best of intentions.

Districts can’t be the last word on assessment, because regional politics and economics also make it possible that kids in one region might not be held to the same standards as kids from another. But if kids from San Augustine are going to compete against kids from Highland Park for admission to UT-Austin, they’re going to need to be held to the same standards from day one.

Educators are awesome (anyone who chooses to spend all day trying to fill young minds and hearts is a hero), but they also need accountability to make sure that their skills match their good intentions. That’s nothing to run from. Every profession should embrace evaluation, whether it’s by an industry standards board or consumer feedback or sales conversion rates. The question is, are we using the right evaluation tool?

Probably not, but before we go scrapping it, we need to consider what could and should take its place to better accomplish its goals.  

And that, dear reader, is the challenge before all of us watching bills pop up into the headlines as the Legislature progresses. Education, when done equitably, is complicated, and our gut reactions to things should always be balanced by the boring, wonky details, and the question, “who might we hurt?”