The Cult of AI: Perceiving AI to Be More Mature Than It Is

AI is all about boundaries: the AI works well if we as developers and deployers define the task and genuinely understand the environment in which the AI will be used. New AI applications are exciting in part because they break through previous technical boundaries, like AI winning at chess, then Jeopardy, then Go, then StarCraft. But what happens when we assume that AI is ready to break those barriers before the technology or the environment is truly ready? This section presents examples where AI was pushed past its technical or environmental limits, whether because it was placed in roles it wasn't suited for, users' expectations didn't align with its abilities, or the world was assumed to be simpler than it really is.

 

 
Explore the Three Fails in This Category:

No Human Needed: The AI’s Got This

We often intend to design AIs that assist their human partners, but what we create can end up replacing those partners instead. When the AI isn't ready to perform the task completely without human help, this can lead to significant problems.

 

Examples

Microsoft released Tay, an AI chatbot designed “to engage and entertain” and learn from the communication patterns of the 18-to-24-year-olds with whom it interacted. Within hours, Tay started repeating some users’ sexist, anti-Semitic, racist, and other inflammatory statements. Although the chatbot met its learning objective, the way it did so required individuals within Microsoft to modify the AI and address the public fallout from the experiment.1

Because Amazon employs so many warehouse workers, the company has used a heavily automated process that tracks employee productivity and is authorized to fire people without the intervention of a human supervisor. As a result, some employees have said they avoid using the bathroom for fear of being fired on the spot. Implementing this system has led to legal and public relations challenges, even if it did reduce the workload for the company’s human resources employees or remaining supervisors.2

Why is this a fail?

Perceptions of what AI is suited for may not always align with the research. Attempts to decide which tasks are better suited to humans and which to machines trace back to Fitts's 'machines are better at' (MABA) list from 1951.3 A modern-day interpretation of that list might allocate tasks that involve judgment, creativity, and intuition to humans, and tasks that involve responding quickly or storing and sifting through large amounts of data to the AI.4,5 More advanced AI applications can be designed to blur those lines, but even in those cases the AI will likely need to interact with humans in some capacity.

Like any technology, AI may not work as intended or may have undesirable consequences. If the AI is intended to work entirely by itself, however, design considerations meant to foster partnership will be overlooked, imposing additional burdens on the human partners when they are eventually called upon.6,7

 

What happens when things fail?

Semi-autonomous cars provide a great example of how the same burdens that have been studied and addressed over decades in the aviation industry are re-emerging in a new technology and marketplace.

Lost context – As more inputs and decisions are automated, human partners risk losing the context they rely on to make informed decisions. They can also be surprised by decisions their AI partner makes, because they do not fully understand how those decisions were made,8 and the information they would normally draw on is often obscured from them by the AI's processes. For example, when a semi-autonomous car passes control back to the human driver, the driver may have to decide quickly what to do without knowing why the AI handed over control, which increases the likelihood of errors.

Cognitive drain – As AIs get better at handling tasks that humans find dull and routine, humans can be left with only the hardest and most cognitively demanding tasks. For example, traveling in a semi-autonomous car might require the human driver to monitor both the vehicle, to see whether it is acting reliably, and the road, to see whether conditions require human intervention. Because the humans are then engaged almost entirely in cognitively demanding work, they are at higher risk of the negative effects of cognitive overload, such as decreased vigilance or an increased likelihood of errors.

Human error traded for new kinds of error – Human-AI coordination can lead to new sets of challenges and learning curves. For example, researchers have documented that drivers believe they will be able to respond to rare events more quickly and effectively than they actually can.9 If that mistaken belief is unintentionally built into the AI's design, for instance by assuming the driver can retake control within a couple of seconds, it can create a dangerously false sense of security for both developers and drivers (see the sketch below).

Reduced human skills or abilities – If the AI becomes responsible for doing everything, humans have less opportunity to practice the skills that built their knowledge and expertise in the first place (i.e., the experiences that enable them to perform more complex or nuanced activities). Driving studies have indicated that human attentiveness and monitoring of traffic and road conditions decrease as automation increases. Thus, at the moments when experience and attention are needed most, they may already have atrophied because of reliance on the AI.
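To make the "new kinds of error" point concrete, here is a minimal, purely hypothetical sketch in Python. It is not drawn from any real vehicle's code; the two-second figure and the function name are illustrative assumptions, meant only to show how an optimistic belief about driver takeover time can end up hard-coded into handoff logic.

```python
# Hypothetical sketch: an optimistic human-performance assumption baked into
# handoff logic. The 2-second figure is an illustrative design assumption, not
# a measured value; as noted above, drivers tend to overestimate how quickly
# they can respond to rare events.
ASSUMED_TAKEOVER_SECONDS = 2.0  # the designer's belief about driver response time

def should_alert_driver(seconds_until_hazard: float) -> bool:
    """Alert the driver only once the hazard is within the assumed takeover window."""
    return seconds_until_hazard <= ASSUMED_TAKEOVER_SECONDS

# If a distracted driver actually needs several more seconds to re-engage, an
# alert issued 2 seconds before the hazard is far too late -- yet the system
# "worked as designed," because the mistaken belief lives in the constant above.
print(should_alert_driver(5.0))  # False: no alert yet, even if 5 s is not enough time
```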


Perfectionists and “Pixie Dusters”

There is a temptation to overestimate the range and scale of problems that can be solved by technology. This can contribute to two mindsets: “perfectionists” who expect performance beyond what the AI can achieve, and “pixie dusters” who believe AI to be more broadly applicable than it is. Both groups could then reject current or future technical solutions (AI or not) that are more appropriate to a particular task.

 

Examples

In 2015, Amazon used an AI to find the top talent from stacks of resumes. One person involved with the trial run said, “Everyone wanted this holy grail… give[n] 100 resumes, it will spit out the top five, and we’ll hire those.” But because the AI was trained on data from previous hires, its selections reflected those existing patterns and strongly preferred male candidates to female ones.1 Even after adjusting the AI and its hiring process, Amazon abandoned the project in 2017. The original holy grail expectation may have diverted the firm from designing a more balanced hiring process.

The 2012 Defense Science Board Study titled “The Role of Autonomy in DoD Systems” concluded that “Most [Defense Department] deployments of unmanned systems were motivated by the pressing needs of conflict, so systems were rushed to theater with inadequate support, resources, training and concepts of operation.” This push to deploy first and understand later likely had an impact on warfighters’ general opinions and future adoption of autonomous systems.2

Why is this a fail?
Non-AI experts can have inflated expectations of AI's abilities. When AI is presented as having superhuman abilities grounded in proven mathematical principles, the urge to try it out is tremendously compelling.

Turn on the radio, ride the bus, watch a TV ad, and someone is talking about AI. AI hype has never been higher,3 which means more people and organizations are asking, ‘How can I have AI solve my problems?’

AI becomes even more appealing because of the belief that algorithms are “objective and true and scientific,” since they are based on math. In reality, as mathematician and author Cathy O’Neil puts it, “algorithms are opinions embedded in code,” and some vendors ask buyers to “put blind faith in big data.”4 Even AI experts can fall victim to this mentality, convinced that complex problems can be solved by purely technical solutions if the algorithm and its developer are brilliant enough.5
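As a toy illustration of "opinions embedded in code," here is a minimal synthetic sketch, assuming Python with NumPy and scikit-learn. The data and features are fabricated for illustration (this is not Amazon's system), but it shows how a model trained on past decisions simply restates the pattern in those decisions, much like the resume-screening example above.

```python
# Synthetic sketch: a screening model trained on historical decisions learns
# whatever pattern those decisions contain, even through an indirect proxy.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
qualification = rng.normal(size=n)        # a legitimate signal
group_proxy = rng.integers(0, 2, size=n)  # e.g., a word appearing mostly on one group's resumes

# Historical labels: past hires favored the proxy group regardless of qualification.
hired = (qualification + 2.0 * group_proxy + rng.normal(scale=0.5, size=n) > 1.5).astype(int)

X = np.column_stack([qualification, group_proxy])
model = LogisticRegression().fit(X, hired)

# The learned weights encode the historical preference: the proxy feature gets a
# large coefficient, so the "objective" math simply restates the old opinion.
print(dict(zip(["qualification", "group_proxy"], model.coef_[0].round(2))))
```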


The result can be false hope in a seemingly magical technology, which leads people to want to apply it to everything, regardless of whether it's appropriate.

 

What happens when things fail?

Misaligned expectations can contribute to the rejection of relevant technical solutions. The two mentalities that emerge – "perfectionists" and "pixie dusters" (as in "AI is a magical bit of pixie dust that can be used to solve anything") – can both lead to disappointment and skepticism once expectations confront reality.

Perfectionist deployers and users may expect perfect autonomy and a perfect understanding of autonomy, which could (rightly or wrongly) delay the adoption of AI until it meets those impossible standards. Perfectionists may prevent technologies from being explored and tested even in carefully monitored target environments, because they set too high a bar for acceptability.

In contrast, AI pixie-dusters may want to employ AI as soon and as widely as possible, even if an AI solution isn’t appropriate to the problem. One common manifestation of this belief occurs when people want to take an excellent AI model and replicate it for a different problem. This technique is referred to as “transfer learning,” where “a model developed for one task is reused as the starting point for a model on a second task.”6 While this approach can expedite the operationalization of a second AI model, problems arise when people are overly eager to attempt it. The new application must have the right data, equipment, environment, governance structures, and training in place for transfer learning to be successful.
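For readers unfamiliar with the mechanics, here is a minimal sketch of transfer learning, assuming PyTorch and torchvision and a hypothetical five-class image task. It illustrates the technique itself, not whether reusing a model is appropriate for any particular problem.

```python
# Minimal transfer-learning sketch (hypothetical task): reuse a model developed
# for one task (ImageNet classification) as the starting point for another.
import torch
import torch.nn as nn
from torchvision import models

# 1. Load a model trained on the first task, including its pretrained weights.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# 2. Freeze the pretrained layers so their learned features are reused as-is.
for param in model.parameters():
    param.requires_grad = False

# 3. Replace the final layer for the second task (here, a hypothetical 5 classes).
model.fc = nn.Linear(model.fc.in_features, 5)

# 4. Train only the new layer on the second task's data (data loading omitted).
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

Note that everything the paragraph above warns about, the right data, equipment, environment, governance structures, and training for the second task, lives entirely outside this snippet.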

Perhaps counterintuitively, an eagerness to adopt autonomy too early can backfire if the immature system behaves in unexpected, unpredictable, or dangerous ways. When pixie dusters have overinflated expectations of AI outcomes and the AI fails to meet them, they can be dissuaded from trying other AI applications, even appropriate and helpful ones (as happened in the "AI Winter" of the 1980s7).8

In the end, it’s about balance. AI has its limits and intended and appropriate uses. We have to identify the individual applications and environments for which AI is well suited, and better align non-experts’ expectations to the way the AI will actually perform.

 

 


Developers Are Wizards and Operators Are Muggles

When we as AI developers think we know how to solve a problem, we may neglect to seek input from the users of that AI or from the communities the AI will affect. Without consulting these groups, we may develop something that doesn't match, or even conflicts with, what they want.

Note: “Muggle” is a term used in the Harry Potter books to derogatorily refer to an individual who has no magical abilities, yet lives in a magical world.

 

Examples

After one of the Boeing 737 MAX crashes, pilots were furious that they had not been told the aircraft had new software that would override pilot commands in some rare but dangerous situations, and that the software was not even mentioned in the pilot manual.1,2

Uber's self-driving car was not programmed to recognize jaywalking pedestrians, only pedestrians crossing in or near a crosswalk.3 That assumption might hold in some areas of the country, but it runs counter to the norms in others, putting those pedestrians in danger.

Why is this a fail?
It's a natural inclination to assume that end users will act the same way we do or will want the same results we want. Unless we include the individuals who will use the AI, and the communities affected by it, in the design and testing process, we unintentionally limit the AI's success and its adoption, and we lose the value of perspectives that would improve its effectiveness.

Despite our long-standing recognition of how important it is to include those affected by what we’re designing, we don’t always follow through. Even if we do consult users, a single interview is not enough to discover how user behaviors and goals change in different environments or in response to different levels of pressure or emotional states, or how those goals and behaviors might shift over time.

 

What happens when things fail?

At best, working in a vacuum results in irritating system behavior, like a driver's seat that vibrates every time the car wants the driver's attention.4 Users may respond to misaligned goals by working around the AI, turning it off, or not adopting it at all. At worst, the solution's objectives don't match users' goals, or it does the opposite of what users want. And given AI's scope and scale, the stakes can get much higher.


Let's look at a relevant yet controversial AI topic to see how a different design perspective can produce drastically different outcomes. All over the country, federal, state, and local law enforcement agencies want to use facial recognition AI systems to identify criminals. As AI developers, we may want to make the technology as accurate as possible, with as few false positives as possible, in order to correctly identify criminals. However, communities that have been heavily policed understand the deep historical patterns of abuse and profiling that result, regardless of the technology involved. As investigative reporter Betty Medsger writes, "being Black was enough [to justify surveillance]."5 So if accuracy and false positives are the only considerations, we create an adoption challenge: communities may push back against the technology, perhaps leading to its not being deployed at all, even in situations where it would be beneficial. If we bridge this gap by involving these communities, we may learn about their tolerances for the technology and identify appropriate use cases for it.

If we start thinking about the ‘customer’ not only as the purchaser or user of the technology, but also as the community the deployed technology will affect, our perspective changes.6

 


Add Your Experience! This site should be a community resource and would benefit from your examples and voices.