When Apple announced the iPhone 4S on October 4, 2011, the headlines were not about its speedy A5 chip or improved camera. Instead they focused on an unusual new feature: an intelligent assistant, dubbed Siri. At first Siri, endowed with a female voice, seemed almost human in the way she understood what you said to her and responded, an advance in artificial intelligence that seemed to place us on a fast track to the Singularity. She was brilliant at fulfilling certain requests, like “Can you set the alarm for 6:30?” or “Call Diane’s mobile phone.” And she had a personality: If you asked her if there was a God, she would demur with deft wisdom. “My policy is the separation of spirit and silicon,” she’d say.
Over the next few months, however, Siri’s limitations became apparent. Ask her to book a plane trip and she would point to travel websites—but she wouldn’t give flight options, let alone secure you a seat. Ask her to buy a copy of Lee Child’s new book and she would draw a blank, despite the fact that Apple sells it. Though Apple has since extended Siri’s powers—to make an OpenTable restaurant reservation, for example—she still can’t do something as simple as booking a table on the next available night in your schedule. She knows how to check your calendar and she knows how to use OpenTable. But putting those things together is, at the moment, beyond her.
Now a small team of engineers at a stealth startup called Viv Labs claims to be on the verge of realizing an advanced form of AI that removes those limitations. Whereas Siri can only perform tasks that Apple engineers explicitly implement, this new program, they say, will be able to teach itself, giving it almost limitless capabilities. In time, they assert, their creation will be able to use your personal preferences and a near-infinite web of connections to answer almost any query and perform almost any function.
“Siri is chapter one of a much longer, bigger story,” says Dag Kittlaus, one of Viv’s cofounders. He should know. Before working on Viv, he helped create Siri. So did his fellow cofounders, Adam Cheyer and Chris Brigham.
For the past two years, the team has been working on Viv Labs’ product, also named Viv, after the Latin root meaning “live.” Their project has been draped in secrecy, but the few outsiders who have gotten a look speak about it in rapturous terms. “The vision is very significant,” says Oren Etzioni, a renowned AI expert who heads the Allen Institute for Artificial Intelligence. “If this team is successful, we are looking at the future of intelligent agents and a multibillion-dollar industry.”
Viv is not the only company competing for a share of those billions. The field of artificial intelligence has become the scene of a frantic corporate arms race, with Internet giants snapping up AI startups and talent. Google recently paid a reported $500 million for the UK deep-learning company DeepMind and has lured AI legends Geoffrey Hinton and Ray Kurzweil to its headquarters in Mountain View, California. Facebook has its own deep-learning group, led by prize hire Yann LeCun from New York University. Their goal is to build a new generation of AI that can process massive troves of data to predict and fulfill our desires.
Viv strives to be the first consumer-friendly assistant that truly achieves that promise. It wants to be not only blindingly smart and infinitely flexible but omnipresent. Viv’s creators hope that some day soon it will be embedded in a plethora of Internet-connected everyday objects. Viv founders say you’ll access its artificial intelligence as a utility, the way you draw on electricity. Simply by speaking, you will connect to what they are calling “a global brain.” And that brain can help power a million different apps and devices.
“I’m extremely proud of Siri and the impact it’s had on the world, but in many ways it could have been more,” Cheyer says. “Now I want to do something bigger than mobile, bigger than consumer, bigger than desktop or enterprise. I want to do something that could fundamentally change the way software is built.”
Viv Labs is tucked behind an unmarked door on a middle floor of a generic glass office building in downtown San Jose. Visitors enter into a small suite and walk past a pool table to get to the single conference room, glimpsing on the way a handful of engineers staring into monitors on trestle tables. Once in the meeting room, Kittlaus, a product-whisperer whose career includes stints at Motorola and Apple, is usually the one to start things off.
He acknowledges that an abundance of voice-navigated systems already exists. In addition to Siri, there is Google Now, which can anticipate some of your needs, alerting you, for example, that you should leave 15 minutes sooner for the airport because of traffic delays. Microsoft, which has been pursuing machine-learning techniques for decades, recently came out with a Siri-like system called Cortana. Amazon uses voice technology in its Fire TV product.
But Kittlaus points out that all of these services are strictly limited. Cheyer elaborates: “Google Now has a huge knowledge graph—you can ask questions like ‘Where was Abraham Lincoln born?’ And it can name the city. You can also say, ‘What is the population?’ of a city and it’ll bring up a chart and answer. But you cannot say, ‘What is the population of the city where Abraham Lincoln was born?’” The system may have the data for both these components, but it has no ability to put them together, either to answer a query or to make a smart suggestion. Like Siri, it can’t do anything that coders haven’t explicitly programmed it to do.
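The gap Cheyer describes is one of composition: each lookup works alone, but nothing feeds the answer of the first into the second. A minimal sketch of that chaining follows, assuming a toy knowledge base. The table names, function names, and the population figure are all hypothetical placeholders, not anything taken from Google Now or Viv.

```python
# Toy lookup tables standing in for a knowledge graph (illustrative data only).
BIRTHPLACE = {"Abraham Lincoln": "Hodgenville, Kentucky"}
POPULATION = {"Hodgenville, Kentucky": 3000}  # placeholder figure, not a census value


def birthplace_of(person: str) -> str:
    """Answer 'Where was X born?'"""
    return BIRTHPLACE[person]


def population_of(city: str) -> int:
    """Answer 'What is the population of X?'"""
    return POPULATION[city]


def population_of_birthplace(person: str) -> int:
    """Answer the combined question by feeding one answer into the other."""
    return population_of(birthplace_of(person))


print(population_of_birthplace("Abraham Lincoln"))  # prints the placeholder, 3000
```

In this sketch a programmer still has to write `population_of_birthplace` by hand, which is exactly the limitation the founders say they want to remove.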
Viv breaks through those constraints by generating its own code on the fly, no programmers required. Take a complicated command like “Give me a flight to Dallas with a seat that Shaq could fit in.” Viv will parse the sentence and then it will perform its best trick: automatically generating a quick, efficient program to link third-party sources of information together—say, Kayak, SeatGuru, and the NBA media guide—so it can identify available flights with lots of legroom. And it can do all of this in a fraction of a second.
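The article does not say how Viv assembles such a program. One way to picture the idea is a planner that searches a registry of services, each described by the type of data it accepts and the type it returns, for a chain linking what the user supplied to what the user wants. The sketch below makes that assumption. Every service name, the seat data, and the legroom rule are invented for illustration and are not the real Kayak, SeatGuru, or NBA interfaces.

```python
from collections import deque

# Made-up seat data: seat label -> legroom in centimeters.
SEATS = {"12A": 79, "21C": 91, "30F": 76}

# Hypothetical registry: name -> (input type, output type, function).
SERVICES = {
    # Shaquille O'Neal is listed at 7 ft 1 in, roughly 216 cm.
    "player_height": ("person", "height_cm", lambda person: {"Shaq": 216}[person]),
    # Invented rule of thumb: needed legroom is 40 percent of height.
    "min_legroom": ("height_cm", "legroom_cm", lambda height: round(height * 0.4)),
    "find_seats": (
        "legroom_cm",
        "seats",
        lambda need: [seat for seat, room in SEATS.items() if room >= need],
    ),
}


def plan(start_type, goal_type):
    """Breadth-first search for the shortest chain of services linking two types."""
    queue = deque([(start_type, [])])
    seen = {start_type}
    while queue:
        current, path = queue.popleft()
        if current == goal_type:
            return path
        for name, (src, dst, _) in SERVICES.items():
            if src == current and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [name]))
    return None  # no chain of registered services connects the two types


def run(path, value):
    """Execute the planned chain, passing each result to the next service."""
    for name in path:
        value = SERVICES[name][2](value)
    return value


steps = plan("person", "seats")
print(steps)              # ['player_height', 'min_legroom', 'find_seats']
print(run(steps, "Shaq"))  # ['21C']
```

The point of the design is that adding a new service to the registry extends what the planner can answer without anyone writing a new end-to-end routine.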
Viv is an open system that will let innumerable businesses and applications become part of its boundless brain. The technical barriers are minimal, requiring brief “training” (in some cases, minutes) for Viv to understand the jargon of the specific topic. As Viv’s knowledge grows, so will its understanding; its creators have designed it based on three principles they call its “pillars”: It will be taught by the world, it will know more than it is taught, and it will learn something every day. As with other AI products, that teaching involves using sophisticated algorithms to interpret the language and behavior of people using the system—the more people use it, the smarter it gets. By knowing who its users are and which services they interact with, Viv can sift through that vast trove of data and find new ways to connect and manipulate the information.
Kittlaus says the end result will be a digital assistant who knows what you want before you ask for it. He envisions someone unsteadily holding a phone to his mouth outside a dive bar at 2 am and saying, “I’m drunk.” Without any elaboration, Viv would contact the user’s preferred car service, dispatch it to the address where he’s half passed out, and direct the driver to take him home. No further consciousness required.
If Kittlaus is in some ways the Steve Jobs of Viv (he is the only non-engineer on the 10-person team and its main voice on strategy and marketing), Cheyer is the company’s Steve Wozniak, the project’s key scientific mind. Unlike the whimsical creator of the Apple II, though, Cheyer is aggressively analytical in every facet of his life, even beyond the workbench. As a kid, he was a Rubik’s Cube champion, averaging 26 seconds a solution. When he encountered programming, he dove in headfirst. “I felt that computers were invented for me,” he says. And while in high school he discovered a regimen to force the world to bend to his will. “I live my life by what I call verbally stated goals,” he says. “I crystallize a feeling, a need, into words. I think about the words, and I tell everyone I meet, ‘This is what I’m doing.’ I say it, and then I believe it. By telling people, you’re committed to it, and they help you. And it works.”
He says he used the technique to land his early computing jobs, including the most significant—at SRI International, a Menlo Park think tank that invented the concept of computer windows and the mouse. It was there, in the early 2000s, that Cheyer led the engineering of a Darpa-backed AI effort to build “a humanlike system that could sense the world, understand it, reason about it, plan, communicate, and act.” The SRI-led team built what it called a Cognitive Assistant that Learns and Organizes, or CALO. They set some AI high-water marks, not least being the system’s ability to understand natural language. As the five-year program wound down, it was unclear what would happen next.
That was when Kittlaus, who had quit his job at Motorola, showed up at SRI as an entrepreneur in residence. When he saw a CALO-related prototype, he told Cheyer he could definitely build a business from it, calling it the perfect complement to the just-released iPhone. In 2007, with SRI’s blessing, they licensed the technology for a startup, taking on a third cofounder, an AI expert named Tom Gruber, and eventually renaming the system Siri.
The small team, which grew to include Chris Brigham, an engineer who had impressed Cheyer on CALO, moved to San Jose and worked for two years to get things right. “One of the hardest parts was the natural language understanding,” Cheyer says. Ultimately they had an iPhone app that could perform a host of interesting tasks—call a cab, book a table, get movie tickets—and carry on a conversation with brio. They released it publicly to users in February 2010. Three weeks later, Steve Jobs called. He wanted to buy the company.
“I was shocked at how well he knew our app,” Cheyer says. At first they declined to sell, but Jobs persisted. His winning argument was that Apple could expose Siri to a far wider audience than a startup could reach. He promised to promote it as a key element on every iPhone. Apple bought the company in April 2010 for a reported $200 million.
The core Siri team came to Apple with the project. But as Siri was honed into a product that millions could use in multiple languages, some members of the original team reportedly had difficulties with executives who were less respectful of their vision than Jobs was. Kittlaus left Apple the day after the launch—the day Steve Jobs died. Cheyer departed several months later. “I do feel if Steve were alive, I would still be at Apple,” Cheyer says. “I’ll leave it at that.” (Gruber, the third Siri cofounder, remains at Apple.)
After several months, Kittlaus got back in touch with Cheyer and Brigham. They asked one another what they thought the world would be like in five years. As they drew ideas on a whiteboard in Kittlaus’ house, Brigham brought up the idea of a program that could put the things it knows together in new ways. As talks continued, they lit on the concept of a cloud-based intelligence, a global brain. “The only way to make this ubiquitous conversational assistant is to open it up to third parties to allow everyone to plug into it,” Brigham says.
In retrospect, they were re-creating Siri as it might have evolved had Apple never bought it. Before the sale, Siri had partnered with around 45 services, from AllMenus.com to Yahoo; Apple had rolled Siri out with fewer than half a dozen. “Siri in 2014 is less capable than it was in 2010,” says Gary Morgenthaler, one of the funders of the original app.
Cheyer and Brigham tapped experts in various AI and coding niches to fill out their small group. To produce some of the toughest parts (the architecture to allow Viv to understand language and write its own programs), they brought in Mark Gabel from the University of Texas at Dallas. Another key hire was David Gondek, one of the creators of IBM’s Watson.
Funding came from Solina Chau, the partner (in business and otherwise) of the richest man in China, Li Ka-shing. Chau runs the venture firm Horizons Ventures. In addition to investing in Facebook, DeepMind, and Summly (bought by Yahoo), it helped fund the original Siri. When Viv’s founders asked Chau for $10 million, she said, “I’m in. Do you want me to wire it now?”
It’s early May, and Kittlaus is addressing the team at its weekly engineering meeting. “You can see the progress,” he tells the group, “see it get closer to the point where it just works.” Each engineer delineates the advances they’ve made and next steps. One explains how he has been refining Viv’s response to “Get me a ticket to the cheapest flight from SFO to Charles de Gaulle on July 2, with a return flight the following Monday.” In the past week, the engineer added an airplane-seating database. Using a laptop-based prototype of Viv that displays a virtual phone screen, he speaks into the microphone. Lufthansa Flight 455 fits the bill. “Seat 61G is available according to your preferences,” Viv replies, then purchases the seat using a credit card.
Viv’s founders don’t see it as just one product tied to a hardware manufacturer. They see it as a service that can be licensed. They imagine that everyone from TV manufacturers and car companies to app developers will want to incorporate Viv’s AI, just as PC manufacturers once clamored to boast of their Intel microprocessors. They envision its icon joining the pantheon of familiar symbols like Power On, Wi-Fi, and Bluetooth.
“Intelligence becomes a utility,” Kittlaus says. “Boy, wouldn’t it be nice if you could talk to everything, and it knew you, and it knew everything about you, and it could do everything?”
That would also be nice because it just might provide Viv with a business model. Kittlaus thinks Viv could be instrumental in what he calls “the referral economy.” He cites a factoid about Match.com that he learned from its CEO: The company arranges 50,000 dates a day. “What Match.com isn’t able to do is say, ‘Let me get you tickets for something. Would you like me to book a table? Do you want me to send Uber to pick her up? Do you want me to have flowers sent to the table?’” Viv could provide all those services—in exchange for a cut of the transactions that resulted.
Building that ecosystem will be a difficult task, one that Viv Labs could hasten considerably by selling out to one of the Internet giants. “Let me just cut through all the usual founder bullshit,” Kittlaus says. “What we’re really after is ubiquity. We want this to be everywhere, and we’re going to consider all paths along those lines.” To some associated with Viv Labs, selling the company would seem like a tired rerun. “I’m deeply hoping they build it,” says Bart Swanson, a Horizons adviser on Viv Labs’ board. “They will be able to control it only if they do it themselves.”
Whether they will succeed, of course, is not certain. “Viv is potentially very big, but it’s all still potential,” says Morgenthaler, the original Siri funder. A big challenge, he says, will be whether the thousands of third-party components work together—or whether they clash, leading to a confused Viv that makes boneheaded errors. Can Viv get it right? “The jury is out, but I have very high confidence,” he says. “I only have doubt as to when and how.”
Most of the carefully chosen outsiders who have seen early demos are similarly confident. One is Vishal Sharma, who until recently was VP of product for Google Now. When Cheyer showed him how Viv located the closest bottle of wine that paired well with a dish, he was blown away. “I don’t know any system in the world that could answer a question like that,” he says. “Many things can go wrong, but I would like to see something like this exist.”
Indeed, many things have to go right for Viv to make good on its founders’ promises. It has to prove that its code-making skills can scale to include petabytes of data. It has to continually get smarter through omnivorous learning. It has to win users despite not having a preexisting base like Google and Apple have. It has to lure developers who are already stressed adapting their wares to multiple platforms. And it has to be as seductive as Scarlett Johansson in Her so that people are comfortable sharing their personal information with a robot that might become one of the most important forces in their lives.
The inventors of Siri are confident that their next creation will eclipse the first. But whether and when that will happen is a question that even Viv herself cannot answer. Yet.