Turning Lemons into Lemon…Reflux: When AI Makes Things Worse

Sometimes the biggest challenges emerge when AI does exactly what it is programmed to do! An AI doesn’t recognize social contexts or constructs, and this section examines some of the unwanted impacts that can result from the divergence between technical and social outcomes. The fails in this category explore different components of the AI: the training data fed into the model, the objective of the AI and the metrics chosen to measure its success, and the AI’s interactions with its environment.

Explore the Four Fails in This Category:

Irrelevant Data, Irresponsible Outcomes

A lack of understanding about the training data, its properties, or the conditions under which the data was collected can result in flawed outcomes for the AI application.

 

Examples

In 2008, early webcam face-tracking algorithms could not detect the faces of darker-skinned individuals because the training data consisted almost entirely of light-skinned faces (and most of the developers were themselves light-skinned).1 One particularly illuminating demonstration of this fail occurred in 2018, when Amazon’s facial recognition system falsely matched pictures of 28 members of Congress (a disproportionate share of them dark-skinned) with mugshots.2 The ten-year persistence of these fails highlights the systemic and cultural barriers to fixing the problem, despite it being well acknowledged.

Roughly 40,000 Michigan residents were wrongly accused of fraud by a state-operated computer system with an error rate as high as 93%. Why? The system could not convert some data from legacy sources, and documentation and records were missing, so it often issued fraud determinations without access to all the information it needed. A lack of human supervision meant the problem went unaddressed for over a year, but more oversight alone would not have fixed the underlying issue: the data may simply not have been usable for this application.3

An AI for allocating healthcare services offered more care to white patients than to equally sick black patients. Why? The AI was trained on real spending data, and unequal access to care means less money is traditionally spent on black patients than on white patients with the same level of need. Since the AI’s goal was to drive down costs, it focused on the historically more expensive group and therefore offered more care to white patients.4,5 This example shows the danger of relying on existing data that carries a history of systemic injustice, as well as the importance of choosing carefully between a purely mathematical measure and a human-centric one when deciding how to promote the desired outcome.
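The healthcare example boils down to a proxy problem: the system optimizes spending, which is not the same thing as need. The following sketch is a hypothetical simulation (the group labels, the spending gap, the noise level, and the enroll-the-top-20% rule are all invented for illustration) of how ranking patients by a cost proxy can shut out an equally sick group whose care has historically been underfunded.

# Hypothetical illustration: two groups with identical distributions of need,
# but group "B" has historically received less spending for the same need.
import random

random.seed(0)

def simulate_patient(group):
    need = random.uniform(0, 1)                  # true severity, same for both groups
    spend_factor = 1.0 if group == "A" else 0.6  # historical underspending on group B
    cost = need * spend_factor + random.gauss(0, 0.05)
    return {"group": group, "need": need, "cost": cost}

patients = [simulate_patient(g) for g in ("A", "B") for _ in range(5000)]

# Proxy-driven policy: enroll the top 20% of patients ranked by cost.
threshold = sorted((p["cost"] for p in patients), reverse=True)[len(patients) // 5]
enrolled = [p for p in patients if p["cost"] >= threshold]

for g in ("A", "B"):
    members = [p for p in enrolled if p["group"] == g]
    share = len(members) / len(enrolled)
    avg_need = sum(p["need"] for p in members) / len(members) if members else 0.0
    print(f"group {g}: {share:.0%} of enrollments, average need of enrolled {avg_need:.2f}")

Running this, group A takes the overwhelming majority of enrollment slots even though both groups were generated with identical need, because the ranking rewards the group that was historically spent on.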

Why is this a fail?

Many AI approaches reflect the patterns in the data they are fed. Unfortunately, data can be inaccurate, incomplete, unavailable, outdated, irrelevant, or systematically problematic. Even relevant and accurate data may be unrepresentative and unsuitable for the new AI task. Because data is highly contextual, the original purposes for collecting it may be unknown or inappropriate for the new task, and the data may reflect historical and societal imbalances and prejudices that are now recognized as illegal or harmful to segments of society.6

 

What happens when things fail?

When an AI system is trained on data with flawed patterns, the system doesn’t just replicate them; it can encode and amplify them.7 Without qualitative and quantitative scientific methods for understanding the data and how it was collected, the quality of the data and its impacts are difficult to appreciate. Even when we apply these methods, data introduces unknown nuances and patterns (which are sometimes incorrectly lumped together with human influences and jointly categorized as ‘biases’) that are very hard to detect, let alone fix.8,9
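To make the “encode and amplify” point concrete, here is a small synthetic experiment (the data, the 70/30 class split, and the choice of logistic regression are all assumptions made for illustration): a group that appears in 30% of the training labels can appear far less often in the model’s predictions, because an accuracy-driven decision rule leans toward the majority whenever the features are noisy.

# Synthetic illustration of amplification: the minority class shrinks further
# in the model's predictions than it was in the training data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
y = (rng.random(n) < 0.3).astype(int)            # minority class: 30% of the labels
x = rng.normal(loc=y, scale=1.0).reshape(-1, 1)  # feature only weakly separates classes

model = LogisticRegression().fit(x, y)
preds = model.predict(x)

print(f"minority share in the data:        {y.mean():.2f}")      # about 0.30
print(f"minority share in the predictions: {preds.mean():.2f}")  # noticeably lower

The skew in the data comes out stronger on the other side of the model, which is exactly the kind of pattern that is hard to notice without deliberately measuring it.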

Statistics can help us address some of these pitfalls, but we have to be careful to collect enough, and appropriate, statistical data. The larger issue is that statistics don’t capture social and political contexts and histories. We must remember that these contexts and histories have too often resulted in comparatively greater harm to minority groups (gender, sexuality, race, ethnicity, religion, etc.).10
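As one narrow, concrete example of the kind of statistical check this paragraph has in mind, the sketch below compares the group composition of a training set against the population the system will serve, using a chi-square goodness-of-fit test; the counts, group labels, and reference proportions are all hypothetical.

# Hypothetical representativeness check: does the training data's demographic
# mix match the deployment population's?
from scipy.stats import chisquare

observed = {"group_1": 8200, "group_2": 1100, "group_3": 700}    # counts in training data
reference = {"group_1": 0.60, "group_2": 0.25, "group_3": 0.15}  # deployment population

total = sum(observed.values())
f_obs = [observed[g] for g in reference]
f_exp = [reference[g] * total for g in reference]

stat, p_value = chisquare(f_obs, f_exp)
print(f"chi-square = {stat:.1f}, p = {p_value:.3g}")
for g in reference:
    print(f"{g}: {observed[g] / total:.1%} of training data vs {reference[g]:.0%} expected")

A tiny p-value flags a mismatch worth investigating, but the test says nothing about why the mismatch exists or whom it will harm; that still requires the social and historical context the numbers leave out.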


Documentation about the data, including why the data was collected, the method of collection, and how it was analyzed, goes a long way toward helping us understand the data’s impact.
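One lightweight way to capture that documentation is to attach a structured provenance record to the dataset itself. The sketch below is only one possible shape for such a record (every field name and example value is hypothetical); the point is that the answers to “why was this collected, how, and what has been done to it” travel with the data rather than living in someone’s memory.

# Hypothetical provenance record that ships alongside a dataset.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class DatasetRecord:
    name: str
    purpose: str                      # why the data was originally collected
    collection_method: str            # how the data was gathered
    collection_period: str
    known_gaps: list[str] = field(default_factory=list)          # who or what is missing
    analyses_performed: list[str] = field(default_factory=list)  # how it has been examined

record = DatasetRecord(
    name="example_claims_data",
    purpose="billing and reimbursement, not care allocation",
    collection_method="insurance claims submitted by providers",
    collection_period="2015-2019",
    known_gaps=["patients with limited access to care are under-recorded"],
    analyses_performed=["per-group spending comparison", "missing-field audit"],
)

print(json.dumps(asdict(record), indent=2))   # store or publish with the dataset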

Add Your Experience! This site should be a community resource and would benefit from your examples and voices. You can write to us.