In 2013, the state of Wisconsin charged Eric Loomis with five criminal counts in connection with a drive-by shooting. Loomis eventually accepted a plea deal and pled guilty to two lesser charges: attempting to flee a traffic officer and operating a motor vehicle without the owner’s consent. Before Loomis’ sentencing, a Wisconsin Department of Corrections officer produced a presentence investigation report that included a risk-assessment score to help predict Loomis’ recidivism — his likelihood of reoffending in the future. The introduction of this algorithm would dramatically change Loomis’ case.

This risk-assessment score was computed by Correctional Offender Management Profiling for Alternative Sanctions (COMPAS)—a privately owned algorithmic system, designed by the company Equivant, that produces recidivism predictions based on public data and answers to a lengthy questionnaire. Loomis’ COMPAS score identified him as a high risk for violence, a high risk for recidivism, and a high pretrial flight risk.

Before a COMPAS score was introduced into Loomis’ case, the prosecution and defense had agreed upon a plea deal of one year in county jail with probation. At sentencing, however, the trial court cited the COMPAS-generated risk-assessment score as one tool informing its sentencing determination. Relying in part on that score, the court classified Loomis as high risk of reoffending and sentenced him to six years of imprisonment and five years of extended supervision. Loomis filed a motion for post-conviction relief on the grounds that the court’s reliance on the COMPAS score violated his due process rights. While the Wisconsin Supreme Court ultimately denied Loomis’ motion, its closing remarks reflected the growing skepticism surrounding COMPAS and the role of risk-assessment technologies in the American legal system. “While our holding today permits a sentencing court to consider COMPAS,” the court said, “we do not conclude that a sentencing court may rely on COMPAS for the sentence it imposes.”

The essence of these closing remarks is simple: according to the court, risk-assessment algorithms like COMPAS are not a replacement for human judgment.

***

COMPAS is one of the most widely used algorithms in the U.S. criminal justice system; it has been applied or adapted by many states, including New York, Wisconsin, Florida, and California. COMPAS uses public criminal-record data and answers from a 137-question interview to generate a risk score. The questionnaire gathers information on past criminal involvement, relationships, lifestyle, personality, familial background, and education level. The resulting scores fall on a 1–10 scale grouped by risk level: 1–4 is low risk, 5–7 is medium risk, and 8–10 is high risk.
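
To make the banding concrete, here is a minimal sketch in Python, purely illustrative and not Equivant’s code, of how a 1–10 decile score maps onto the published risk bands; how the underlying score itself is computed remains proprietary.

```python
def risk_band(decile_score: int) -> str:
    """Map a 1-10 COMPAS-style decile score to its published risk band.

    Illustrative only: the 1-4 / 5-7 / 8-10 grouping mirrors the public
    banding described above; the proprietary model that produces the
    underlying score is not reproduced here.
    """
    if not 1 <= decile_score <= 10:
        raise ValueError("decile score must be between 1 and 10")
    if decile_score <= 4:
        return "Low Risk"
    if decile_score <= 7:
        return "Medium Risk"
    return "High Risk"


print(risk_band(3))  # Low Risk
print(risk_band(6))  # Medium Risk
print(risk_band(9))  # High Risk
```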

First developed in 1998, the algorithm has now assessed over one million offenders. Its recidivism-prediction scales, which include the Violent Recidivism Risk Score (VRRS), have been in use since 2000. However, despite the algorithm’s widespread use in courtrooms across the country, it is largely considered a black box: though its basic inputs are known, the weighting of those inputs within the algorithm is proprietary, and thus not available to the public.

Risk-assessment algorithms like COMPAS have the potential to be beneficial tools of justice for the American criminal justice system. Much of this potential stems from their technical, data-based foundation—many scholars contend that formal, actuarial, algorithmic methods of prediction outperform the intuitive methods used by judges and other experts. According to Adam Neufeld, a senior fellow at Georgetown Law’s Institute for Technology Law and Policy, the systemic benefits these tools offer are abundant: implemented in the criminal justice system, risk assessments can efficiently and effectively reduce costs, reduce crime, and avoid wasting human potential.

Bail offers one example of this technology’s possible benefits. Data from the Prison Policy Initiative shows that releasing a low-risk defendant on bail can reduce their recidivism rate, while detaining them, or individuals with similar profiles, contributes to an estimated $13.7 billion in annual pretrial detention costs. Neufeld argues that releasing such offenders on bail avoids wasting their human potential, both by keeping them from spending (perhaps undeserved) time in jail while awaiting trial and by interrupting a recidivism cycle that might return them to jail in the future.

One study in particular—the National Bureau of Economic Research’s 2017 publication, “Human Decisions and Machine Predictions”—found that in New York City’s pretrial decisions, an algorithm’s assessment of risk would far outperform judges’ track records. The study further concluded that if New York relied on an algorithm to aid bail decisions, an “estimated 42 percent of detainees could be set free without any increase in people skipping trial or committing crimes pretrial.” This finding supports the assertion that risk-assessment tools in New York would indeed reduce costs, reduce crime, and avoid wasting human potential when used as an aid to judicial decision-making. For a judge who might encounter any one of the roughly 30,000 people arrested daily in America, such a tool offers a cost-effective and efficient aid in adjudication.

Relying on an actuarial tool to provide a risk score with profound, life-altering implications, however, might make many people justifiably uncomfortable. Yet, Neufeld insists that though it may seem “weird to rely on an impersonal algorithm to predict a person’s behavior given the enormous stakes… the gravity of the outcome—in cost, crime, and wasted human potential—is exactly why we should use an algorithm.” 

Neufeld offers a convincing case for the benefits of using these algorithms—saved money, saved time, and some needed support for an already strained criminal justice system. But as with any emerging technology, problems arise when these tools become a depended-upon component of a system that, according to the Prison Policy Initiative, jails 443,000 people pretrial each year. What is at stake here is not simply a systemic effort to reduce costs and time; it is each individual’s access to a fair, equitable trial and sentence in an American courtroom.

Did Eric Loomis receive this treatment? The Wisconsin court ruled that he did. However, pressing concerns remain about whether the COMPAS algorithm is capable of providing the fair and unbiased judicial aid the court claimed it could.

***

In 2016, the nonprofit newsroom ProPublica published an analysis of Florida’s use of COMPAS and found that the formula was especially likely to mark black defendants as future criminals, mislabeling them at almost twice the rate of white defendants; white defendants, on the other hand, were identified as low risk more often than black defendants. ProPublica also found that the risk scores were unreliable in forecasting violent crime: 80 percent of the people predicted to commit violent crimes did not go on to do so.
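
The disparity ProPublica describes boils down to a simple calculation: among defendants who did not reoffend, what share were nonetheless labeled high risk? The sketch below, using hypothetical counts rather than ProPublica’s actual dataset, shows how that false positive rate is computed for each group.

```python
def false_positive_rate(labeled_high_risk: int, did_not_reoffend: int) -> float:
    """Share of non-reoffending defendants who were nonetheless labeled high risk."""
    return labeled_high_risk / did_not_reoffend

# Hypothetical counts, for illustration only (not ProPublica's actual figures):
# out of 1,000 defendants in each group who did NOT reoffend, how many had been
# labeled high risk by the algorithm?
fpr_group_a = false_positive_rate(450, 1000)
fpr_group_b = false_positive_rate(230, 1000)

print(f"Group A false positive rate: {fpr_group_a:.0%}")  # 45%
print(f"Group B false positive rate: {fpr_group_b:.0%}")  # 23%
# A roughly two-to-one gap of this kind is the sort of disparity ProPublica
# reported between black and white defendants.
```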

Part of what makes COMPAS’ susceptibility to bias so difficult to evaluate is a lack of transparency, owed in part to Equivant’s right to protect its own intellectual property (a point the Wisconsin Supreme Court accepted in rejecting Loomis’ appeal claims). Unless taken to court on charges such as impropriety—if, for example, there is suspicion that an ethics or standard-of-conduct violation has occurred in the company’s product or process—Equivant is unlikely to relinquish details of the system’s internal workings.

University of Maryland Law Professor Frank Pasquale finds this secrecy concerning because it denies a courtroom actor an answer to a basic question: “How is the algorithm weighting different data points, and why?” Each aspect of this inquiry is crucial to two core legal principles: due process and the ability to meaningfully appeal an adverse decision. Judicial processes are generally open and explicable to the public. Even after juries have deliberated, judges are required to give an explanation for their rulings, particularly at sentencing. When an algorithmic risk-scoring process like COMPAS is kept secret, it becomes impossible to challenge key aspects of the resulting score because the system’s internal workings remain protected. Consequently, all parties to a case—from the judge to the defendant—are largely unable to question, challenge, or request a reassessment of an algorithmically generated score.

As a result, a judge seeking to use COMPAS as a judicial aid cannot, at this moment, fully understand how a COMPAS risk-assessment score is developed, nor how factors in the defendant’s profile were weighted to arrive at the given score. Not only does this prevent judges from properly understanding a tool meant to assist their process, but it may also deny the defendant the ability to assess whether the outcome was fair, should the judge base the final ruling in any way upon a score that neither of them can fully comprehend. This problem of incomprehension follows the defendant into the appellate arena, where higher courts conceivably face the same confusion.

The secrecy surrounding COMPAS also produces a failure of ‘narrative intelligibility’: the confusion that arises when the parties involved cannot tell what was decided, where a decision came from, or how it was reached. For a jury’s verdict to be narratively intelligible, the judge, defense, and prosecution must clearly understand the procedure the jury followed to reach the verdict and what that verdict means. When this process is transparent and easily explainable, a defendant’s due process rights are not plausibly at risk; if the defendant can understand how a verdict was reached, they can choose to appeal on those grounds.

Pasquale argues that COMPAS fails to be narratively intelligible. COMPAS is neither transparent nor easily explicable to a judge or defendant, let alone to the public. A defendant may be unable to understand clearly how they have been “risk-assessed,” and so may struggle to challenge the score should they disagree with it. A risk score can thus function like an algorithmic brand: no matter the effort, a defendant is largely unable to change it or remove it from their profile for the offense at hand.

Risk-assessment scores create problems for judges as well, who encounter difficulty when trying to question a score’s authenticity, origin, or algorithmic weighting. Without that ability to investigate, judicial dependence on algorithmic scores can lead to automation bias—judges, like all humans, may defer to the technology without questioning its validity, accuracy, or possible biases. Automation bias is an issue both inside and outside the courtroom. Research on algorithmic decision-making systems reveals that human decision-makers frequently give automated recommendations more weight than they deserve, even when aware that those recommendations may be subject to inaccuracy or error. Left unchecked, automation bias is difficult for a human actor to shake, creating scenarios in which people struggle to refute automated recommendations. At sentencing, it can lead a judge to rely heavily upon a risk score without questioning its legitimacy.

Along with the potential for automation bias, it is unclear whether COMPAS scores themselves are even constitutional. In particular, the potential for COMPAS to arrive at results by disproportionately or inaccurately weighing factors like socioeconomic status, gender, or race cannot be ignored.

COMPAS factors into its risk-assessment score a holistic view of the defendant’s life that extends beyond the specific criminal incident, considering personal details ranging from gender and age to education level, familial background, and social capability. University of Michigan Law Professor Sonja Starr writes in her paper, “The New Profiling,” that as a result of this holistic practice “judges and parole boards are told to consider risk scores that are based not only on criminal history, but also on socioeconomic and family-related disadvantages as well as demographic traits like gender and age.” Notably, sentencing shaped by such scores can intensify the incarceration rate of young, poor men of color. Indeed, risk-assessment instruments like COMPAS seem to deem defendants riskier based on indicators of socioeconomic disadvantage, deem men riskier than women, count crime victimization and living in a high-crime neighborhood as risk factors, and include assessments of the defendant’s attitude or mental health as risk factors. Often, Starr points out, the familial and neighborhood-related factors these instruments consider are highly race-correlated.

Sentencing decisions that take race or gender into account are explicitly prohibited under the U.S. Sentencing Guidelines. Starr argues that by allowing judges to ground a ruling even partly in such factors, the state itself might be endorsing a practice that deems certain groups of people “high-risk,” or more likely to engage in violent crime, because of factors over which they have no control. These are judgments based upon who a person is, not what they have done. In this respect, when a judge uses a COMPAS risk-assessment score that labels defendants as higher risk based on identity characteristics like socioeconomic status, race, and gender, that judge is allowing discriminatory factors into the legal arena.

Such a practice is, at its root, unconstitutional. When the technical language of a risk-assessment score obscures discrimination in this manner, judgment practices that would be unacceptable if stated outright are able to slip into legal rulings. COMPAS’ potential to produce biased and discriminatory results, and then to carry those results into adjudication as a formal risk score a judge may rely upon, is not only ethically concerning; it is also outside the bounds of constitutional practice.

COMPAS, and risk-assessments like it, have the potential to be beneficial tools of justice. Yet, it is still unclear whether the COMPAS algorithm is ready to serve American courtrooms.

A judicial aid should be transparent, narratively intelligible, and constitutionally sound. These qualities improve a judicial aid’s chance of supporting a judge in reaching a morally and legally defensible decision; they also boost the public’s confidence in the ethicality of a judge’s sentencing decisions. Without certainty that COMPAS adheres to these principles, a judge’s ability to reach a fair and just decision could be in jeopardy. Until that certainty arrives, the use of COMPAS, and risk-assessments like it, should be heavily scrutinized by courtroom actors.

***

In its opinion in Eric Loomis’ case, the Wisconsin Supreme Court wrote that “it is incumbent upon the criminal justice system to recognize that in the coming months and years…the system must keep up with the research and continuously assess” the use of tools like COMPAS.

Yet, while these technologies are scrutinized and debated, much is at stake—the future lives of individual defendants; unbiased procedure in criminal justice sentencing; public confidence in the criminal justice system at large; and the integrity of a judge’s discretion in sentencing. Whether a judge is able to make a sound decision about another person’s life depends on that discretion remaining intact.

Should an algorithm designed to help alleviate a strained system do so at the potential cost of individual justice? The question remains unresolved and will only face greater scrutiny in the years ahead. Until COMPAS and similar risk-assessments are proven to consistently advance justice for the individual, however, courtroom actors must exercise strong caution when engaging with them. Risk-assessment tools may well hold an important place in the future of automated justice efforts. Yet using these tools before they are ready could hinder the proper delivery of justice in American sentencing.

Alexandra “Mac” Taylor ’20 is a recent graduate of Stanford University.