Mark Zuckerberg's Jarvis AI comes a long way – but still has a long way to go

At the beginning of the year, Facebook Inc. Chief Executive Mark Zuckerberg said that one of his goals for 2016 would be to develop a general-purpose AI that could run his life “kind of like Jarvis in Iron Man.” Today, he revealed his progress on his Jarvis clone, saying that he has both succeeded and failed in his quest.

“My goal was to learn about the state of artificial intelligence — where we’re further along than people realize and where we’re still a long ways off,” Zuckerberg said. “These challenges always lead me to learn more than I expected, and this one also gave me a better sense of all the internal technology Facebook engineers get to use, as well as a thorough overview of home automation.”

AI for the home

One of the first features Zuckerberg pursued for his AI was the ability to control his connected home: turning lights off and on, changing the temperature, managing his security system and so on. Zuckerberg explained that teaching the AI to control his home was actually easier than he expected, but getting the agent to communicate with all of the different devices proved to be more challenging.

Diagram of the systems used in Jarvis. (Image courtesy of Mark Zuckerberg/Facebook)

“We use a Crestron system with our lights, thermostat and doors, a Sonos system with Spotify for music, a Samsung TV, a Nest cam for Max, and of course my work is connected to Facebook’s systems,” Zuckerberg said. “I had to reverse engineer APIs for some of these to even get to the point where I could issue a command from my computer to turn the lights on or get a song to play.”

As Zuckerberg quickly realized, the Internet of Things is still a bit of a Wild West when it comes to APIs and programming standards, and getting devices to talk to one another is not always a straightforward process. Zuckerberg noted that unless manufacturers adopt some sort of industry standards or common APIs, this diversity will continue to pose a problem for end users.

Talking to your AI

Another important feature for Zuckerberg’s Jarvis clone was the ability to understand natural human speech. In Iron Man and other Marvel movies, Tony Stark communicates with Jarvis as he would with a real person. But while natural language processing has seen major improvements in the last few years, Zuckerberg noted that the technology is still not perfect.

“Understanding context is important for any AI,” Zuckerberg said. “For example, when I tell it to turn the AC up in ‘my office,’ that means something completely different from when [my wife] Priscilla tells it the exact same thing. That one caused some issues!”

Zuckerberg noted that open-ended requests work better when the AI has more context, so it is important for the AI to understand who is making the request. The more the AI understands about a specific person, the better it will be able to understand vague commands like “play music like Adele.”
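The “my office” problem can be pictured as a lookup keyed on the speaker as well as the phrase. The following is a minimal sketch under that assumption, not a description of how Jarvis works: the `ALIASES` table, the room identifiers and the `resolve_location` function are all invented for illustration.

```python
from typing import Dict

# Hypothetical per-speaker aliases; names and room IDs are illustrative only.
ALIASES: Dict[str, Dict[str, str]] = {
    "mark": {"my office": "office_mark"},
    "priscilla": {"my office": "office_priscilla"},
}


def resolve_location(speaker: str, phrase: str) -> str:
    """Map a spoken location to a concrete room, using who is speaking."""
    key = phrase.strip().lower()
    personal = ALIASES.get(speaker.lower(), {})
    # Fall back to the phrase itself for rooms that are not speaker-specific.
    return personal.get(key, key)


print(resolve_location("Mark", "my office"))
print(resolve_location("Priscilla", "my office"))
```

The same phrase resolves to a different room for each speaker, which is why the assistant has to identify the speaker before it can act on the request.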

Seeing the world with computer vision

In the Iron Man movies, Jarvis is essentially an AI butler, so in addition to answering basic requests, Zuckerberg’s AI also needed to understand the world much as a human does. One of the biggest requirements for this feature is the ability for the AI to more or less “see” the world using computer vision.

An example of using face recognition to let a friend inside (Photo: Mark Zuckerberg/Facebook)

This is an area of AI research in which Facebook excels, as the social network already uses computer vision for a wide range of tasks, including offering recommendations for which friends to tag in pictures. For his home use, Zuckerberg installed a number of cameras that could capture multiple angles of a location, giving the AI more information to work with.

“AI systems today cannot identify people from the back of their heads, so having a few angles ensures we see the person’s face,” Zuckerberg said. “I built a simple server that continuously watches the cameras and runs a two step process: first, it runs face detection to see if any person has come into view, and second, if it finds a face, then it runs face recognition to identify who the person is. Once it identifies the person, it checks a list to confirm I’m expecting that person, and if I am then it will let them in and tell me they’re here.”
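The two-step process Zuckerberg describes (detect faces, then recognize them, then check the guest list) can be sketched roughly as follows. This is a toy illustration, not his server: `check_frame`, `toy_detect` and `toy_recognize` are hypothetical names, and the "frames" are lists of strings standing in for camera images so the example runs without any vision library.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def check_frame(
    frame: List[str],
    detect_faces: Callable[[List[str]], List[str]],
    recognize: Callable[[str], Optional[str]],
    expected: Set[str],
) -> List[Tuple[Optional[str], bool]]:
    """Return (name, admit) for every face found in one camera frame."""
    results = []
    # Step 1: face detection. Is anyone in view at all?
    for face in detect_faces(frame):
        # Step 2: face recognition. Who is it? None means unknown.
        name = recognize(face)
        # Admit only recognized people who are on the expected list.
        admit = name is not None and name in expected
        results.append((name, admit))
    return results


# Toy stand-ins for real detection and recognition models (assumptions).
def toy_detect(frame: List[str]) -> List[str]:
    return [item for item in frame if item.startswith("face:")]


KNOWN_FACES: Dict[str, str] = {"face:alice": "Alice", "face:bob": "Bob"}


def toy_recognize(face: str) -> Optional[str]:
    return KNOWN_FACES.get(face)


frame = ["tree", "face:alice", "face:stranger", "face:bob"]
print(check_frame(frame, toy_detect, toy_recognize, expected={"Alice"}))
```

Splitting detection from recognition means the more expensive identification step runs only when a face is actually in view, and an unrecognized or unexpected visitor is never admitted by default.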

Filling the gaps with Facebook Messenger

Not all of Jarvis’ skills had to be built from scratch, and a few actually drew on advancements in Facebook Messenger. Zuckerberg explained that his AI was able to take advantage of recent improvements to Messenger, including the ability to use bots to execute commands.

“I started off building a Messenger bot to communicate with Jarvis because it was so much easier than building a separate app,” Zuckerberg said. “Messenger has a simple framework for building bots, and it automatically handles many things for you — working across both iOS and Android, supporting text, image and audio content, reliably delivering push notifications, managing identity and permissions for different people, and more.”

This may sound like an ad for Messenger – and it probably is – but it shows how far Facebook Messenger has progressed in the two years since it was officially split off from the core Facebook app.

Long way to go

While Zuckerberg’s answer to Jarvis has certainly shown the potential of AI, the Facebook CEO pointed out that we still have a long way to go.

“We are still far off from understanding how learning works,” Zuckerberg said. “Everything I did this year — natural language, face recognition, speech recognition and so on — are all variants of the same fundamental pattern recognition techniques. We know how to show a computer many examples of something so it can recognize it accurately, but we still do not know how to take an idea from one domain and apply it to something completely different.”

To put that in perspective, he said, he spent about 100 hours building Jarvis, giving him a “pretty good system that understands me and can do lots of things.” But, he added, “even if I spent 1,000 more hours, I probably wouldn’t be able to build a system that could learn completely new skills on its own — unless I made some fundamental breakthrough in the state of AI along the way.”

Image courtesy of Facebook
