Cortana awakens: The evolution of Microsoft's smart assistant

Microsoft CEO Satya Nadella walks in front of the new Cortana logo as he delivers a keynote address during the 2014 Microsoft Build developer conference.

Image: Justin Sullivan/Getty Images

It’s fun to imagine all the other names Microsoft’s voice assistant, Cortana, could have had: “Bill,” “Explorer,” “Pathfinder.” None of these were considered, as far as we know, but Cortana Group Product Manager Marcus Ash did tell me recently, “Early on we thought about some names that are more generic. I won’t share any specific names, but think of a typical Microsoft name circa 1998 and you get the idea.”

Giving Microsoft’s consumer-facing intelligence platform a name was a strategic decision, one Ash and his team thought necessary to help people connect with the idea of Cortana and understand how they might engage with it.

“Giving a name helps orient people to the kind of communications we’re going to encourage them to use with Cortana. If it doesn’t have this characteristic of human interaction… it’s harder for someone to visualize, how am I supposed to interact?” said Deborah Harrison, one of the writers on the Cortana team.

Deborah Harrison

Deborah Harrison, a writer with the Cortana editorial team at Microsoft Corp., helps craft some of Cortana’s chit-chat responses.

Image: Lucas Westcoat

Harrison’s job was a bit of a revelation to me, since I always thought of Cortana as a pure artificial intelligence platform, crafting spoken answers out of data and millions of parts of speech (more akin to the way Apple’s Siri reportedly works). Instead, the real Cortana is a lot of intelligence, as well as full sentences and canned responses triggered by specific queries that, when spoken, start with “Hey Cortana.”

“Some is compiled [out of speech parts],” said Harrison, “but [we] also do a good amount of recording exactly as it is.”

‘If it doesn’t have this characteristic of human interaction…it’s harder for someone to visualize, how am I supposed to interact?’

In addition to the spoken engagement, Cortana is a next-gen text-based query engine that rapidly disambiguates whatever you’re typing — as you’re typing — to give you the most relevant info and answers on the fly. It can pull from your system, the personal info you give it access to, and the web (via the Bing search engine). The upcoming Windows 10 Anniversary Update, scheduled to launch August 2, will enhance these and other features, making Cortana more of an active, digital participant in your life.

Understanding how Cortana is built, what it can do today and tomorrow, and what it’s really all about is, to my mind, a way of understanding Microsoft’s broader strategy for connecting leading-edge technology to consumers.

Pieces of Cortana

Cortana, which started on Windows Phone but migrated to the desktop almost a year ago with the launch of Windows 10, is one of many voice/digital assistants currently in the marketplace, including Amazon’s Alexa, Google’s Assistant and Apple’s Siri.

For now, Cortana is the only one on the desktop (though Alexa can work in a web browser). That monopoly will be shattered with the release of macOS Sierra later this year, just a few months after Windows 10 Anniversary Update arrives.

However, the Windows 10 update could broaden the reach of Cortana. It will extend the assistant beyond its current home next to the Windows Start button, letting users access it via the lock screen. The update also brings Cortana integration with the new Windows 10 Sticky Notes, which puts intelligence to work on typed and written reminders (write or type a flight number on a virtual sticky note and Cortana will automatically track it as a flight number) and, for the first time, the ability to work with the Xbox One gaming console.

Cortana Editorial Team

The Cortana Editorial team meets once a week to discuss new interactions.

Image: Lucas Westcoat

If you’re a Windows Insider (someone who has signed up for access to preview versions of the software), you’ve already been trying out some of these features. Ash considers this invaluable telemetry and told me that, through the insider program, they get an early look at the struggles users have with Cortana (and other features) and then work to minimize or fix the issues in time for launch.

Productivity has been at the heart of Microsoft CEO Satya Nadella’s mission to reboot Microsoft, so, naturally, part of Cortana’s mission is productivity. “What are the things that we can do proactively that let you make the next step?” said Ash. If you let it, Cortana can watch your email and schedules to keep you on top of promises you’ve made and protect you from yourself by letting you know when you’ve overbooked yourself or, perhaps, agreed to an appointment at a later time than you normally do.

“You get the biggest engagement when we can proactively do something for someone,” said Ash.

Who are you, Cortana?

There is no understanding Cortana, though, without understanding what she (or it) is supposed to be. 

In Microsoft’s view, Cortana is voice, artificial intelligence and proactive assistance. It’s quietly doing things for you behind the scenes and engaging with you verbally, if you let it.

When you do engage with the vocal part of Cortana, you encounter those word parts and sentences, often recorded by Jen Taylor (in the U.S. version), but you also meet an intelligence that knows a few things about itself: Cortana’s core self, the core tenets of its personality that do not change, no matter the task, no matter the interaction.

Harrison explained that Cortana is a loyal, seasoned personal assistant and has the wisdom of experience (or at least she appears to). “The role is useful for those of us working on Cortana. It gives a set point for what is appropriate interaction,” said Harrison.

Microsoft Cortana Team

Chris O’Connor, Harrison (second from left), Ron Owens and August Niehaus.

Image: Lucas Westcoat

Cortana is an artificial intelligence and she knows it. “She does not think of herself [as] human, does not aspire to be human,” Harrison said. Fortunately, Cortana likes humans.

‘Cortana might like waffles, but is clear that she can’t eat’

Finally, Cortana is transparent and authentic. So, like most other digital voice assistants on the market, Cortana is relentlessly chipper and upbeat.

Even though Cortana is designed to deal with reality, Harrison’s writing team has built some whimsy into the “chit-chat,” which is what they call the chatty mode Cortana adopts when you ask her more conversational questions. In it, there can be “a bit of an imaginary universe. Cortana might like waffles but is clear that she can’t eat,” said Harrison. So you get Cortana musing about waffles, but not wishing she could have some.

The whimsy side of Cortana has led to the creation of a handful of Easter Eggs, though you don’t need a special code to access them. “There are things that we write with the expectation that very few people will ever discover them because they make us giggle or that we wanted to address.”

“Recently, we did, for the Shakespeare anniversary, a bunch of Shakespeare quotes or insults and we sort of just did them as a lark, but people found them all,” said Harrison.

Different places, different people

As Cortana spreads globally to other countries and languages, Ash and Harrison face new challenges. In addition to new languages and new voices for Cortana, Harrison’s team had to adjust the chit-chat to cultural norms. 

In Germany, one of the first countries they expanded into, the Germans’ efficient form of communication can come across as blunt to Americans. However, Harrison explained that even when speaking to Germans, they couldn’t simply have Cortana replicate this crisp style.

“Because they were interacting with a computer instead of a person, [visual] cues were missing… they had to inject more humor, warmth and politeness to maintain correct balance with blunt German efficiency,” she said. 

Further expansion has required Cortana to do everything from celebrating or avoiding national pride to singing for Italians.

Socially aware

In this political season, Harrison’s team, which is made up of an eclectic collection of humanities-leaning people (novelists, poets, filmmakers), has loaded Cortana up with some political responses.

“We spent a lot of time on politics recently, knowing that the political climate is going to be something people want to talk about. We added hours to our day to answer stuff about something we consider really important, but can be incredibly polarizing.”

Cortana Edit Team

The Cortana Editorial Team at Microsoft (from left to right): Chris O’Connor, Writer; Jonathan Foster, Editorial Manager; August Niehaus, Writer; Deborah Harrison, Writer; Jon Douglas, Content; Ron Owens, Writer; Renan Leahy, Content. Who knew writing for Cortana is this much fun?

Image: Lucas Westcoat

They also made sure the responses would not represent any kind of political point of view from the team, but would be true to Cortana’s core tenets. When I asked Cortana if she would vote for Donald Trump, she replied, “If I had all the answers, it would be a REALLY long document.” When I asked if she would vote for Hillary Clinton, Cortana replied, “I honestly can’t tell if that’s a trick question.” On the question of party affiliation, Cortana sidestepped again: “You know, I’m really not much of a party person.”

‘We added hours to our day to answer stuff about something we consider really important, but can be incredibly polarizing’

It’s not unusual for someone to ask a digital voice assistant a knottier question or make a statement that may require a much more nuanced or thoughtful answer.

If you tell Amazon Alexa, for instance, “I’m gay,” it’s been programmed to answer, “Thanks for telling me. Happy Pride month.”

Harrison recalled when her team was trying to figure out how it would handle that statement.

“We wanted to be sensitive, but not make a big deal about it,” she said. So they had Cortana reply simply, “I’m AI.” However, when a group of high school students visited the Microsoft campus, one actually suggested an alternative. “Instead of saying ‘I’m AI.’ He felt that was a little harsh to the ear, he suggested adding ‘Cool,’” Harrison said. So the response phrase became, “Cool; I’m AI.”

High hurdles

One challenge Microsoft will face in the fall is a Siri that not only works on the desktop, but is engineered to handle multi-turn conversations. A question about when a movie is playing might lead Siri to answer and then respond to a follow-up question about buying tickets that doesn’t restate any part of the original query. It’s a more natural, back-and-forth conversational approach that users might come to expect from their digital assistants. Cortana, however, is really an asked-and-answered kind of AI.

“What we found is that, on the desktop in particular, you want to help people get the answer they want as quickly as possible,” said Ash. There’s a concern that a follow-up query could impede users’ access to a result.

The presence of a large screen on the desktop (or laptop) means that the full answer to the query will be staring the user in the face, contends Ash. He can see multi-turn responses working in places where you don’t have ready access to the screen or keyboard, maybe in a hands-free situation with a mobile phone.

Windows 10 on a phone

Microsoft Windows 10 operating system on a Windows Phone.

Image: TOBIAS SCHWARZ/AFP/Getty Images

Of course, Cortana’s footprint on the mobile side is, compared to Siri, minuscule. Yes, Cortana is an integrated part of Windows 10 Mobile (formerly Windows Phone), but that platform has roughly 3% of the handset market (in its most recent quarterly earnings, Microsoft reported a 70% decline in Windows mobile revenue). Microsoft has expanded Cortana to Android and iOS, though Microsoft offers no usage numbers on those platforms.

Cortana is actually more functional on Android, where it can, if you let it, see incoming texts and WhatsApp messages and forward them to the desktop. “Android is just more open and there’s more things we can do with it at this particular point in time,” said Ash.

As for iOS, “We’re still working on figuring out how to get some of these features working based on what’s available in that platform,” he said.

Capture the flag

In lieu of mobile domination, Cortana is heading to another hardware platform: Xbox. But instead of a straight port, Ash and Harrison have adjusted many of Cortana’s responses to align with the gaming community.

In Cortana on the desktop, “we don’t want to feel exclusive so we will jettison clever responses if we think they’ll feel exclusionary, but if you have an Xbox, we can assume you play games,” Harrison said.

Detail of the buttons on a Microsoft Xbox One wireless controller, taken on January 22, 2016. (Photo by Olly Curtis/Future Publishing via Getty Images)

Image: Future Publishing

If you ask Cortana on the desktop, “Do you miss Master Chief?” she replies, “No, he gets along fine on his own.” Ask the same question on Xbox and Cortana will reply, “No, but I suspect he lives with Cortana… the other one.”

Gamers, who often wear headsets when playing games, may actually use Cortana more often than desktop users who, as Microsoft SVP Yusuf Mehdi told me earlier this year, haven’t quite engaged with the voice component of Cortana on the desktop. 

“Cortana is an assistant about you and not necessarily about a particular device,” said Ash, which is a nice tagline, but what does it really mean?

Perhaps it means reaching beyond the hardware and out toward consumers. 

Windows 10

Windows 10 operating on a Microsoft Surface computer.

Image: Richard Drew/AP

When the Windows 10 Anniversary Update ships in August, it will give Cortana a new lock-screen function that will allow the system to respond to spoken queries and deliver answers before users unlock and log into the system. In this way, said Ash, they’re “hoping and expecting to get more engagement on voice.”

In practice, the ability to initiate a voice-driven query like “Hey Cortana, what’s the weather?” without unlocking your Windows PC makes the system a little more like the always-listening Amazon Echo (and upcoming Google Home), which responds to any question that starts with “Alexa.” By default, Cortana will only be able to answer general questions above the lock, unless you decide to give it full access to your info and aren’t worried about one of your kids changing your appointments and setting reminders.

In the end, Ash knows that you can’t force people to learn new skills. Instead, Cortana will be successful if you use it for current habits and to build new ones, like speech on the desktop and accepting a more proactive Cortana. 

“Voice on Cortana is just part of a continuing promise, just making it easier,” said Ash.  
