Algorithms in the justice system started off as a noble solution to a serious problem: the bias of judges. There are two distinct ways that judges can be biased: targeted bias, such as sexist and racist beliefs, and cognitive bias, the ways in which our mental circuitry fails to work logically (such as how judges hand down harsher sentences right before lunch). The first may be so successfully hidden that it is impossible to rectify through legal appeals, given the variety of legitimate alternative ways to reach the same decision, and the second cannot be removed so long as a human is involved, as it is a core component of our mental circuitry. However, the development of new computer technologies in the ’80s, ’90s and early 2000s presented a novel solution to both problems: what if we got rid of the human entirely, killing two birds with one stone? For advocates of judicial reform, algorithms seemed like the perfect answer to judicial inconsistency and bias. They cannot fall for the cognitive biases that are, no matter how we structure our institutions, inherent to humans.
But the cure may be worse than the disease in this case. The technology’s potential drawbacks need to be considered before we upend centuries of judicial practice. Risk assessment algorithms are ultimately dependent on the data entered into them, and as such are vulnerable to two types of bias. The first results from tainted data, as when arrest records reflect racist officers who arrest a certain group at a higher rate. The second results from socially biased data, where the data accurately reflects crime rates and other factors, but those trends result from larger systemic and institutional biases that the algorithm treats as natural and immutable.
Tainted data is an easy problem to understand. The data for these algorithms is primarily a combination of demographic data and crime records. If a particular group, as has been the case for black Americans for centuries, is arrested at a higher rate, charged at a higher rate, found guilty at a higher rate, and given longer sentences than the general population for reasons of racial bias, then the data will reflect those biases. If a group is over-policed, the algorithm will make sure it is over-sentenced. The best way to think of these algorithms is as status quo enhancers. By definition, algorithms look for trends in existing data and make predictions that match that data. Thus, if an algorithm is running sentencing, any biases in the data collected will be replicated in its decision-making, because the algorithm cannot weigh the moral significance of the data it is fed. It treats all data equally.
The more widely the algorithm is used, the greater the scale at which those biases are replicated. All it takes is a few racist officers or judges; if their data is fed into an algorithm that is then applied across the entire country, they effectively infect the entire system. Tainted data leads to tainted results. If we want to remove all racial bias from our algorithms, we need to make sure that none of the data we feed them is artificially skewed by racism. If we could manage that, we would have little need for the algorithms in the first place.
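A toy sketch makes the feedback loop concrete. The neighborhood names, offense rates and patrol intensities below are entirely hypothetical, chosen only for illustration: two neighborhoods offend at the same underlying rate, but one is patrolled twice as heavily, so its recorded crime rate doubles, and a model that simply learns from the records rates it as twice as risky.

```python
# Hypothetical illustration: equal underlying offense rates,
# unequal policing, and a "predictor" trained on the records.

true_offense_rate = {"neighborhood_a": 0.05, "neighborhood_b": 0.05}
patrol_intensity = {"neighborhood_a": 1.0, "neighborhood_b": 2.0}

# Recorded arrests scale with both actual offending and patrol presence.
recorded_rate = {n: true_offense_rate[n] * patrol_intensity[n]
                 for n in true_offense_rate}

# A naive model that takes the records at face value -- the status quo
# enhancer: its "risk score" is just the recorded rate.
predicted_risk = recorded_rate

print(predicted_risk["neighborhood_a"])  # 0.05
print(predicted_risk["neighborhood_b"])  # 0.1
```

Neighborhood B now looks twice as risky despite identical offending, and if that elevated score is used to direct still more patrols there, the loop compounds.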
However, algorithms are a problem even if the data isn’t skewed by over-policing or other data collection errors. What about circumstances where groups genuinely make up different proportions of crime statistics, separate from targeted over-policing and fabrication, or are otherwise correlated with criminal activity? For example, men commit significantly more murders than women. If an algorithm analyzes that data and decides to sentence men across the board to longer sentences, is that unbiased?
Not necessarily. On a basic level it violates a moral standard that most of us profess: group membership is not a sign of guilt. You should not be found guilty for the actions of people with whom you share nothing but a group category. But that moral stance is not the end of the issue. We also need to consider false positives (labeling someone high risk when they are not) and false negatives (labeling someone low risk when they are not). If two groups genuinely differ in their crime statistics (think of men and women, where men commit more murders), the algorithm will flag the higher-rate group more often. Even if the algorithm is equally accurate about the people it flags in each group, meaning that the same share of flagged individuals in each group actually reoffend, a larger share of the innocent members of the higher-rate group will be falsely flagged, so a member of that group faces a higher chance of being falsely accused.
So even if the data is completely unbiased, the results will be biased: some groups will face a higher false accusation rate than others. The smaller the group and the stronger its correlation with criminal activity, the more of its members will be falsely accused. And if inequities in crime rates do result from systemic economic and social factors, as the data indicates, it needs to be remembered that algorithms reflect the status quo: all they do is reflect and then entrench existing patterns. Even when the correlations are not artifacts of over-policing, an algorithm given the task of running sentencing will magnify and repeat the trends it finds.
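The arithmetic behind this can be sketched with a small example. The group sizes, base rates and flag counts below are hypothetical, chosen only for illustration: the tool is equally calibrated for both groups (60% of the people it flags in each group actually reoffend), yet it falsely flags a far larger share of the innocent members of the higher-rate group.

```python
# Hypothetical numbers: two groups of 1,000 people scored by a risk tool
# that is equally *calibrated* for both (60% of flagged people reoffend
# in each group), but whose false positive rates diverge sharply.

def false_positive_rate(population, reoffenders, flagged, flagged_reoffenders):
    """Share of non-reoffenders who are wrongly flagged as high risk."""
    innocents = population - reoffenders
    false_positives = flagged - flagged_reoffenders
    return false_positives / innocents

# Group A: higher base rate (40% reoffend); the tool flags 500 people,
# of whom 300 reoffend (precision = 0.6).
fpr_a = false_positive_rate(population=1000, reoffenders=400,
                            flagged=500, flagged_reoffenders=300)

# Group B: lower base rate (10% reoffend); the tool flags 100 people,
# of whom 60 reoffend (precision = 0.6, the same as group A).
fpr_b = false_positive_rate(population=1000, reoffenders=100,
                            flagged=100, flagged_reoffenders=60)

print(f"Group A false positive rate: {fpr_a:.1%}")  # 33.3%
print(f"Group B false positive rate: {fpr_b:.1%}")  # 4.4%
```

With identical precision for both groups, an innocent member of group A is roughly seven times more likely to be wrongly flagged than an innocent member of group B, purely because the base rates differ.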
How do we rectify this? One way is to picture what an equitable distribution of criminality across society would look like, and skew the algorithm in that direction. This idea has been routinely demolished by both supporters and detractors of algorithms, as it by definition requires us to ignore substantial numbers of crimes on the basis of race. Another approach might be altering the algorithm so it doesn’t focus simply on sentencing, but instead attempts to isolate the patterns that led to the criminal action or other courtroom issues, in order to help rectify them. Machine learning algorithms are masters of pattern identification, and that skill set does not only apply to crime; it matters what we choose to point the algorithm at. If the crime rates resulted from some systemic disadvantage, the algorithm could help overcome those barriers on a person-by-person basis; instead of denying bail because an individual repeatedly arrived late to court, it could help them navigate the public bus routes. In other words, these programs could be used for targeted interventions, altering the path of the individual instead of condemning them.
One of the key problems with these algorithms is their appearance of neutrality. The real risk is that mass adoption could lead people to believe that, since the biased humans were removed, the results were inevitable. If people believe that algorithms cannot be biased, then algorithms can make inequitable outcomes look like facts of nature rather than products of social bias, making those biases more difficult to overcome. Removing the human element gives existing inequities a veneer of objectivity, handing a justification to those who seek to block reform. That alone could make current social dilemmas politically immovable. Equality in the algorithm is not the same as equity, and freezing the system as it is may only benefit those currently in power; after all, these algorithms simply reflect and enhance.
Ultimately, a decision will have to be made, because two equally valid considerations are in tension: accuracy and equity. The fundamental dilemma of algorithmic justice is that the two cannot be fully achieved at once.
Algorithms will be more accurate than human judges in predicting recidivism. That means fewer crimes committed, which means more lives saved and fewer lives broken. It means safer cities, and families that are not torn apart by random acts of violence. But as you increase accuracy, you decrease equity. You increase the number of false positives, false accusations leading to years of incarceration, for the groups the algorithm associates with risk. You entrench existing inequities by reflecting and enhancing existing trends, removing the court system’s ability to take them into account and attempt to alleviate the problem. You reduce crime by more effectively targeting its perpetrators, while making the causes of those crimes nearly impossible to change. You provide a veneer of objectivity to the results, making them seem natural and inevitable.
Using human judges allows us to avoid making a systemic decision. We don’t have to choose to focus on accuracy or equity. We can let individual judges make that decision in each case. This means accepting that the system will be fundamentally inconsistent as a baseline feature, not a bug. It means giving up control and the idea of perfection. Judges will never be as accurate as the algorithm in any particular case, but they have the ability to be much more equitable by considering their own moral judgements, the emotional appeals of the individuals, the social circumstances around the case and within society, and their view of what is just. They can be persuaded by things that matter to us but would never be considered by the machine.
At the same time, those considerations may be morally abhorrent. Judges are able to use their own prejudices to influence their decisions, regardless of what the facts say. They can ignore legal reasoning and evidence, in a way the algorithm never can. They can be bribed and corrupted. They will continue to make cognitive errors that will restrict freedom. Their own mental architecture will create mistakes that will lead to people being locked away for years more than they should, including innocent people who shouldn’t have been locked up at all.
As mathematician Hannah Fry notes, “The choice isn’t between a flawed algorithm and some imaginary perfect system. The only fair comparison to make is between the algorithm and what we’d be left with in its absence.”
Featured Image Source: The Appeal