Last week, Google introduced Allo, a new chat app with a built-in “Google Assistant” — a chatbot that lives in the app, answering questions you pose it in regular, everyday speech. It’s the latest in a wave of natural-language chatbots and “assistants” that have been rolled out by big tech companies over the last few years — think Apple’s Siri, Amazon’s Alexa, Microsoft’s Cortana, and Facebook’s M. To hear futurists, developers, and investors talk, the chatbot (and its associated technologies, voice recognition and natural-language processing) is the future of computing, and the way we’ll be interacting with our phones, computers, and even our homes.
So far, though, the chatbots we use have only been successful in a limited way. Actual conversational bots trip up, get words wrong, misunderstand, and often offend. Siri has an incredibly hard time understanding accents; “clever” lines from the always-listening Alexa become inadvertently creepy when their timing is off. Conversation and common sense, which run on experience, empathy, intuition, and emotional understanding, are hard to teach and encode. Humanity is hard to program.
Nevertheless, the companies that make these bots want you to think of them as human. They’re named and gendered; they have personality traits and serve up pithy prewritten responses. This kind of anthropomorphization is a specific design decision intended to create an implied interaction pattern: conversation. But you’re not really talking to the bot, are you? Anything that Cortana or Siri says in response to a person has been vetted by lots of people at Microsoft and Apple; Facebook’s M similarly has someone behind the curtain.
As a designer who works with bots and AI, I wonder why we insist on interaction paradigms that require so much maintenance and still seem so likely to fail. When we give bots freedom but still demand they follow the rules of conversation, the results can be disastrous. Microsoft’s attempt to create a Twitter bot that could respond and interact like a teenage girl was a great example of trying to leap forward toward human conversation and utterly failing. “Tay,” as the bot was called, was extremely “smart.” It could learn and expand its vocabulary very quickly through interacting with other Twitter users. It could understand basic grammar: a sentence consists of a noun and a verb and maybe some other words. But Tay’s intelligence only went so far. It didn’t understand what words meant. “I love food” was, as far as Tay was concerned, identical to “I love Hitler.” People — 4channers — were able to use Tay’s intelligence against it, and in less than 24 hours Tay was spewing genocidal, misogynistic bile.
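To make Tay’s failure mode concrete, here is a minimal, hypothetical sketch (in Python; this is not Microsoft’s actual code) of pattern-level “learning”: a toy model that tracks which words follow which, with no notion of what any of those words mean. To a model like this, poisoned input is structurally indistinguishable from benign input.

```python
import random
from collections import defaultdict

class NaiveChatterBot:
    """A Tay-like toy: learns word-adjacency patterns from whatever
    users feed it. It acquires grammar-shaped structure but zero
    semantics; "food" and "hitler" are interchangeable tokens in the
    slot after "i love"."""

    def __init__(self):
        # Maps each word to the list of words seen following it.
        self.transitions = defaultdict(list)

    def learn(self, sentence):
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            self.transitions[current].append(following)

    def reply(self, seed, max_words=5):
        word, output = seed.lower(), [seed.lower()]
        for _ in range(max_words):
            followers = self.transitions.get(word)
            if not followers:
                break
            word = random.choice(followers)
            output.append(word)
        return " ".join(output)

bot = NaiveChatterBot()
bot.learn("i love food")
bot.learn("i love hitler")  # poisoned input looks exactly like benign input
print(bot.reply("i"))       # may produce either sentence; the bot cannot tell them apart
```

The grammar comes for free, but the semantics never arrive, and that gap is exactly what the 4channers exploited.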
Contrast Tay with the now-defunct BBC Weather bot. Users could tweet a location and a date — something like “Westhampton tomorrow” — and the bot would respond with the weather. The bot did what bots do best, which is to quickly and effectively parse information and return a relevant response. Instead of trying to be “human,” it was just “botty” — and it worked well, and never once advocated for genocide.
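A “botty” bot in that mold can be almost trivially simple. The sketch below is hypothetical (the forecast table stands in for a real weather API): it accepts a tightly constrained query, returns structured data, and refuses everything else rather than improvising a humanlike reply.

```python
# Hypothetical sketch of a BBC-Weather-style bot. Only the interaction
# pattern is the point: parse a constrained query, return data, refuse the rest.

FORECASTS = {
    ("westhampton", "today"): "Sunny, high of 24C.",
    ("westhampton", "tomorrow"): "Partly cloudy, high of 21C, 10% chance of rain.",
}

def handle_tweet(text):
    tokens = text.strip().lower().split()
    if len(tokens) != 2:
        # No improvised small talk: anything outside the grammar gets a usage hint.
        return "Try: '<location> <today|tomorrow>'"
    location, date = tokens
    forecast = FORECASTS.get((location, date))
    return forecast or f"No forecast found for {location} {date}."

print(handle_tweet("Westhampton tomorrow"))
# -> Partly cloudy, high of 21C, 10% chance of rain.
```

Constraining the grammar this way is the design choice that keeps the bot safe: there is no open-ended learning loop for bad actors to poison.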
Artificial intelligence doesn’t mean higher intelligence, and it certainly doesn’t mean human intelligence. I think we have trouble creating great bots that work on their own because we continue to insist that they act more like cheery, witty humans than like the mechanical automatons they are. Why should I be obligated to speak with Google’s Allo as though it’s another person? Why does Alexa need zingers, and why does Microsoft want a bot that sounds like a teenager? What’s wrong with images, menu options, or even adorable R2-D2 noises? Why not create a botlike version of something as fantastic as Terrapattern, a website that uses machine learning to find visual patterns in satellite imagery? Bots could be extensions of ourselves, the human creators, built to augment our processes, instead of personified service AIs. Computers are really great at processing information and at holding a lot of data, but really bad at understanding intention and empathizing with users. Those qualities are what humans do really well.
The design researcher Alexis Lloyd has spoken about this idea of “bottiness,” most recently at the Eyeo Festival. “Bottiness,” she says, is “about a bot expressing itself in a way that is keeping with its processes and functions […] when I think about how bots express themselves, they don’t speak English, but they present themselves in a way that sets them apart from the expectations we bring to human conversation.” Lloyd brings up R2-D2, who “speaks this language everyone can interpret even if you don’t necessarily literally understand it.” Actual conversations involve a lot of nuances — on the linguistic level, on the cultural level, even on the physical level. Human conversations are hard enough for humans! So why are we so eager to see bots excel at something that many people are themselves barely competent at?