Tag Archives: Artificial Intelligence

Two Giants of AI Team Up to Prevent the Robot Apocalypse

THERE’S NOTHING NEW about worrying that superintelligent machines may endanger humanity, but the idea has lately become hard to avoid.

A spurt of progress in artificial intelligence as well as comments by figures such as Bill Gates—who declared himself “in the camp that is concerned about superintelligence”—have given new traction to nightmare scenarios featuring supersmart software. Now two leading centers in the current AI boom are trying to bring discussion about the dangers of smart machines down to Earth. Google’s DeepMind, the unit behind the company’s artificial Go champion, and OpenAI, the nonprofit lab funded in part by Tesla’s Elon Musk, have teamed up to make practical progress on a problem they argue has attracted too many headlines and too few practical ideas: How do you make smart software that doesn’t go rogue?

“If you’re worried about bad things happening, the best thing we can do is study the relatively mundane things that go wrong in AI systems today,” says Dario Amodei, a curly-haired researcher on OpenAI’s small team working on AI safety. “That seems less scary and a lot saner than kind of saying, ‘You know, there’s this problem that we might have in 50 years.’” OpenAI and DeepMind contributed to a position paper last summer calling for more concrete work on near-term safety challenges in AI.

A new paper from the two organizations on a machine learning system that uses pointers from humans to learn a new task, rather than figuring out its own—potentially unpredictable—approach, follows through on that. Amodei says the project shows it’s possible to do practical work right now on making machine learning systems less able to produce nasty surprises. (The project could be seen as Musk’s money going roughly where his mouth has already been; in a 2014 appearance at MIT, he described work on AI as “summoning the demon.”)

None of DeepMind’s researchers were available to comment, but spokesperson Jonathan Fildes wrote in an email that the company hopes the continuing collaboration will inspire others to work on making machine learning less likely to misbehave. “In the area of AI safety, we need to establish best practices that are adopted across as many organizations as possible,” he wrote.

The first problem OpenAI and DeepMind took on is that software powered by so-called reinforcement learning doesn’t always do what its masters want it to do—and sometimes kind of cheats. The technique, which is hot in AI right now, has software figure out a task by experimenting with different actions and sticking with those that maximize a virtual reward or score, meted out by a piece of code that works like a mathematical motivator. It was instrumental to the victory of DeepMind’s AlphaGo over human champions at the board game Go, and is showing promise in making robots better at manipulating objects.
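
The trial-and-error loop described above can be sketched in a few lines. The following toy example is a hypothetical illustration, not anything DeepMind or OpenAI use: a tabular Q-learning agent on a tiny corridor of five states where only reaching the last state pays a reward.

```python
import random

def q_learning(n_states=5, n_actions=2, episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a toy chain: action 1 moves right toward
    the last state, which pays reward 1; everything else pays 0."""
    random.seed(0)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy: mostly exploit the best-known action,
            # occasionally experiment with a random one.
            a = random.randrange(n_actions) if random.random() < eps \
                else max(range(n_actions), key=lambda a: q[s][a])
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Keep whatever maximizes the (discounted) virtual reward.
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
    return q

q = q_learning()
# After training, the learned policy prefers "right" in every state.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(4)]
```

The agent starts out wandering, but once exploration stumbles onto the rewarded state, that value propagates backward and the policy heads straight for it. It is exactly this single-minded reward maximization that the CoastRunners agent described below exploited.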

But crafting the mathematical motivator, or reward function, such that the system will do the right thing is not easy. For complex tasks with many steps, it’s mind-bogglingly difficult—imagine trying to mathematically define a scoring system for tidying up your bedroom—and even for seemingly simple ones results can be surprising. When OpenAI set a reinforcement learning agent to play boat racing game CoastRunners, for example, it surprised its creators by figuring out a way to score points by driving in circles rather than completing the course.

DeepMind and OpenAI’s solution is to have reinforcement learning software take feedback from human trainers instead, and use their input to define its virtual reward system. They hired contractors to give feedback to AI agents via an interface that repeatedly asks which of two short video clips of the AI agent at work is closest to the desired behavior.

This simple simulated robot, called a Hopper, learned to do a backflip after receiving 900 of those virtual thumbs-up verdicts from the AI trainers while it tried different movements. With thousands of bits of feedback, a version of the system learned to play Atari games such as Pong and got to be better than a human player at the driving game Enduro. Right now this approach requires too much human supervision to be very practical at eliciting complex tasks, but Amodei says results already hint at how this could be a powerful way to make AI systems more aligned with what humans want of them.
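
One common way to turn such pairwise thumbs-up verdicts into a reward signal is a logistic (Bradley-Terry) model, under which the probability that clip a is preferred to clip b is the sigmoid of their reward difference. The sketch below is a hypothetical, minimal version of that idea, not the paper's implementation: the "reward" is just a linear score of a single feature.

```python
import math, random

def train_reward_model(pairs, epochs=200, lr=0.5):
    """Fit a linear reward r(x) = w * x from pairwise preferences.
    Each pair is (x_a, x_b, pref) with pref = 1 if the trainer
    preferred clip a, else 0."""
    w = 0.0
    for _ in range(epochs):
        for x_a, x_b, pref in pairs:
            # P(a preferred over b) = sigmoid(r(a) - r(b))
            p = 1.0 / (1.0 + math.exp(-(w * x_a - w * x_b)))
            # Gradient ascent on the log-likelihood of the labels.
            w += lr * (pref - p) * (x_a - x_b)
    return w

# Toy data: the "human" always prefers the clip with the larger feature.
random.seed(1)
pairs = []
for _ in range(50):
    a, b = random.random(), random.random()
    pairs.append((a, b, 1 if a > b else 0))
w = train_reward_model(pairs)
# A positive w means the learned reward agrees with the preferences.
```

Once fitted, the learned score can stand in for a hand-crafted reward function and be maximized by an ordinary reinforcement-learning agent.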

It took less than an hour of humans giving feedback to get Hopper to land that backflip, compared to the two hours it took an OpenAI researcher to craft a reward function that ultimately produced a much less elegant flip. “It looks super awkward and kind of twitchy,” says Amodei. “The backflip we trained from human feedback is better because what’s a good backflip is kind of an aesthetic human judgment.” You can see how complex tasks such as cleaning your home might also be easier to specify correctly with a dash of human feedback than with code alone.


Making AI systems that can soak up goals and motivations from humans has emerged as a major theme in the expanding project of making machines that are both safe and smart. For example, researchers affiliated with UC Berkeley’s Center for Human-Compatible AI are experimenting with getting robots such as autonomous cars or home assistants to take advice or physical guidance from people. “Objectives shouldn’t be a thing you just write down for a robot; they should actually come from people in a collaborative process,” says Anca Dragan, co-leader of the center.

She hopes the idea can catch on in the industry beyond DeepMind and OpenAI’s explorations, and says companies already run into problems that might be prevented by infusing some human judgment into AI systems. In 2015, Google hurriedly tweaked its photo recognition service after it tagged photos of black people as gorillas.

Longer term, Amodei says, spending the next few years working on making existing, modestly smart machine learning systems more aligned with human goals could also lay the groundwork for our potential future face-off with superintelligence. “When, someday, we do face very powerful AI systems, we can really be experts in how to make them interact with humans,” he says. If it happens, perhaps the first superintelligent machine to open its electronic eyes will gaze at us with empathy.

Original source:

Machine Learning – Basics – Use Cases – Technology

Machine learning, deep learning, cognitive computing – artificial-intelligence technologies are spreading rapidly. The reason is that the computing and storage capacity needed to make AI scenarios possible is now available. An overview.
  • Machine learning helps detect patterns in large data sets and derive insights from them
  • Use cases range from spam analysis and traffic-jam forecasts to medical diagnostics
  • The technical foundation is a cloud-based digital infrastructure platform

Artificial intelligence and machine learning (ML) are not new technologies, but only now are they playing an important role in practice. Why? The most important prerequisites for learning systems and their algorithms are sufficient computing capacity and access to huge amounts of data, whether customer, log, or sensor data. Such data is indispensable for training algorithms and building models, and it is now available through public and private cloud infrastructures.

Image analysis and recognition is the most important machine-learning topic, but speech recognition and processing is catching up fast.
Photo: Crisp Research, Kassel


As part of a comprehensive study conducted together with The unbelievable Machine Company and Hewlett-Packard Enterprise (HPE), the analysts at Crisp Research investigated what role machine learning plays in the enterprise today and will play in the future. The study shows that German companies are already quite far along: a fifth of them actively use ML technologies, 64 percent are engaging intensively with the topic, and four out of five respondents even say that ML will eventually be one of the core technologies of the fully digitized company.

Recognizing patterns and making predictions

ML algorithms help people recognize patterns in existing data sets, make predictions, and classify data. Mathematical models derive new insights from these patterns. This applies to many areas of life and business, and internet users have long benefited from it, often without giving the underlying technology a second thought.

The range of applications extends from music and movie recommendations in the private sphere to improving marketing campaigns, customer services, and logistics routes in business. A broad spectrum of ML techniques is available for this, including linear regression, instance-based learning, decision-tree algorithms, Bayesian statistics, cluster analysis, neural networks, deep learning, and dimensionality-reduction methods.
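
Linear regression, the first technique on that list, is small enough to show in full. The sketch below fits y = a*x + b by ordinary least squares in closed form, with made-up example data:

```python
def linear_regression(xs, ys):
    """Ordinary least squares for y = a*x + b, fitted in closed form:
    slope = covariance(x, y) / variance(x), intercept from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Example: noise-free points on the line y = 2x + 1.
a, b = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
# → a == 2.0, b == 1.0
```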

The application areas are diverse and in part well known: spam detection, content personalization, document classification, sentiment analysis, churn prediction, e-mail classification, analysis of upselling opportunities, traffic-jam forecasts, genome analysis, medical diagnostics, chatbots, and much more. Opportunities thus arise for almost every industry and type of company.
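
Spam detection, the first use case on that list, is a classic entry point. A minimal sketch, a toy naive-Bayes word-count classifier with made-up training data, might look like this:

```python
import math
from collections import Counter

def train_nb(docs):
    """Count word frequencies per class. `docs` is a list of (text, label)."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in docs:
        for word in text.lower().split():
            counts[label][word] += 1
            totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Pick the class with the higher log-probability. Laplace smoothing
    keeps unseen words from zeroing out a class; the prior term is
    omitted because the toy training set is balanced."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    scores = {}
    for label in counts:
        score = 0.0
        for word in text.lower().split():
            p = (counts[label][word] + 1) / (totals[label] + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

docs = [("win money now", "spam"), ("cheap money offer", "spam"),
        ("meeting at noon", "ham"), ("project status meeting", "ham")]
counts, totals = train_nb(docs)
label = classify("free money offer", counts, totals)
# → "spam"
```

Real spam filters work on far larger vocabularies and corpora, but the principle is the same: the learned word statistics, not hand-written rules, decide the classification.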

Modern IT platforms support AI

According to Crisp Research, machine learning is ideally part of a modern, scalable, and flexible IT infrastructure – a "Digital Infrastructure Platform" characterized by elasticity, automation, an API-based architecture, and agility. Such a platform is usually cloud-based and serves as the foundation for developing and operating new digital applications and processes. It offers an open architecture, programming interfaces (APIs) for integrating external services, support for DevOps concepts, and modern methods for short release and innovation cycles.

Processing and analyzing large volumes of data is a core task of such a digital infrastructure platform. IT leaders must therefore make sure their IT can handle different artificial-intelligence techniques. Server, storage, and network infrastructures have to be designed for new ML-based workloads, and data management must be prepared so that ML-as-a-service offerings in the cloud can be used.

In the ML context, alternative hardware components have also gained ground in recent months, such as Nvidia's GPU-based clusters, Google's Tensor Processing Unit (TPU), and IBM's TrueNorth processor. Companies have to decide whether to invest in such hardware themselves or to use the offerings of the corresponding cloud providers.

One of the big application areas for ML is speech recognition and processing. Amazon's Alexa is moving into households; Microsoft, Google, Facebook, and IBM have invested a large share of their research and development budgets here and acquired specialized companies. It is foreseeable that natural-language communication will become a matter of course at the customer interface, and that digital products and enterprise IT solutions will be operable by voice command as well. This affects both the customer front end and the IT back end.

Low barriers to entry into machine learning

Since the big cloud providers have added ML services and products to their portfolios, getting started is relatively easy for users. Amazon Machine Learning, Microsoft Azure Machine Learning, IBM Bluemix, and Google Machine Learning provide low-cost access to such services via the public cloud. Users no longer need their own supercomputer, a team of statistics experts, or dedicated infrastructure management; a few commands against the APIs of the big public-cloud providers are enough to get going.

Users primarily need help with data exploration.
Photo: Crisp Research, Kassel


There they find a variety of machine-learning techniques as well as services and tools such as graphical programming models and storage services. The more they commit to these, however, the greater the risk of vendor lock-in, so users should think about their strategy before starting. IT service providers and managed-service providers can also set up and operate ML systems and infrastructures, making independence from the public-cloud providers and their SLAs possible as well.

Different flavors of AI

Machine learning, deep learning, cognitive computing – a number of AI terms are currently in circulation, and distinguishing between them is not easy. Crisp Research uses the dimensions "clarity of purpose" and "degree of autonomy" to do so. Most ML systems today are developed and trained for a specific purpose: in a manufacturing process, for example, they detect defective products as part of quality control. Their task is clearly defined; there is no autonomy.

Deep-learning systems, by contrast, are able to learn on their own by means of neural networks. Simulated neurons are modeled and arranged in many stacked layers. Each level of the network independently performs certain tasks, such as detecting edges, and passes its output on to the next level, where it feeds into further processing. In combination with large amounts of training data, such networks learn to perform specific tasks, such as identifying cancer cells in medical images.
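
The layered hand-off described above can be made concrete with a toy forward pass. The network below is untrained and its weights are random, so it computes nothing meaningful; it only illustrates how each layer transforms the previous layer's output and passes it on:

```python
import random

def dense(inputs, weights, biases):
    """One layer: weighted sums of the inputs followed by a ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

random.seed(0)
sizes = [4, 8, 8, 2]          # input layer, two hidden layers, output layer
layers = []
for n_in, n_out in zip(sizes, sizes[1:]):
    weights = [[random.uniform(-1, 1) for _ in range(n_in)]
               for _ in range(n_out)]
    layers.append((weights, [0.0] * n_out))

activation = [0.5, -0.2, 0.1, 0.9]   # raw input, e.g. four pixel values
for weights, biases in layers:       # each layer feeds the next
    activation = dense(activation, weights, biases)
# `activation` is now the 2-number output of the final layer
```

Training, which is what deep learning proper does, would adjust the weights by backpropagation against labeled examples; only then do the intermediate layers come to detect edges and other features.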

Deep-learning systems operate more autonomously

Deep-learning systems thus operate much more autonomously than ML systems, since the neural networks are trained to learn independently and to make decisions that are not necessarily comprehensible from the outside.

Cognitive computing, promoted above all by IBM with its Watson technology, is considered the third flavor of AI. Such systems are characterized by taking on tasks and making decisions in an assistance role, or even as a substitute for humans, while coping with ambiguity and vagueness. Examples include claims management at an insurer, a service hotline, or diagnostics in a hospital.

Even though a high degree of autonomy can already be achieved here, the road to true artificial intelligence with autonomous cognitive abilities is still long. Science is working on it intensively and arguing about whether and when this goal can be reached. In the meantime, companies are well advised to focus on the feasible use cases, of which there are already plenty.

With the digitization trend, analytics is landing on the agenda in many companies, and with it machine learning and deep learning. Now it is a matter of unearthing the data treasure.
  • Many companies have built data lakes holding structured and unstructured data; now it is time to make something of them
  • Use cases for machine learning include process improvements, better customer targeting, and more efficient support
  • In many industries the gap between pioneers and laggards is huge

Fantasies and visions of the digital future currently know no bounds: fully automated production lines, autonomous transport systems, intelligent digital assistants. Hardly a day goes by without new scenarios being discussed, and many companies feel the pressure. They are working on the "digital enterprise" and discovering their data as the basis for new business models and services. Analytics is thus gaining importance, and with the analytics strategy, AI and machine learning (ML) come onto the agenda.

These are the reasons users are engaging with machine learning.
Photo: Crisp Research


IT and digitization decision-makers suspect enormous potential in machine learning. A survey launched by the analyst firm Crisp Research with support from The unbelievable Machine Company and Hewlett-Packard Enterprise (HPE) shows that only three percent of the roughly 250 respondents consider ML a marketing hype. A third describe ML techniques as useful in limited areas, and fully 43 percent are convinced that ML will become an important aspect of future big-data and analytics strategies.

As the initiators of the study note, that is no surprise. Most companies have invested heavily in big-data infrastructures and their own data lakes to consolidate their corporate data and make it analyzable. ML enables a high degree of automation in data analysis and thus helps unearth the hidden treasure. Data is considered a great asset, but many companies have yet to prove it; technologies and use cases around machine learning promise to deliver.

Immense innovation potential

Fully 16 percent of respondents even see ML as the new "core technology of a completely digital company". The potential for innovation and design thus appears immense, even though many problems around data quality, governance, API management, infrastructure, and above all personnel are still slowing the trend down.

Around 34 percent of respondents engage with ML because they want to improve their internal processes in production, logistics, or quality management; they collect data on the production line, for example, in order to optimize manufacturing. Almost as many want to push initiatives around customer experience, for instance in e-commerce, marketing, or in portals and apps, hoping for personalized customer targeting that brings products and services to consumers more effectively. At 19 percent, the group that wants to optimize maintenance and support services (predictive maintenance) is somewhat smaller. Added to this are companies that explore new technologies as a matter of principle (28 percent) or that were made aware of the topic by consultants and analysts (27 percent).

Essential for self-driving cars

ML usage varies widely not only between industries but also within them. In the automotive industry, for example, there are large gaps between the pioneers and the laggards. In the development and production of self-driving cars, real-time image and video analysis as well as statistical methods and mathematical models from machine learning and deep learning are widespread; some techniques are also used to detect manufacturing defects on the production line.

At around 20 percent, the share of innovators already using ML extensively is largest in the automotive industry. Against that, however, stand 60 percent who are engaging with ML but are still stuck in the evaluation and planning phase. The picture in the car industry is thus shaped by a few lighthouse projects; there can be no talk of wide-scale adoption.

Where the industries stand in adopting machine-learning technologies.
Photo: Crisp Research


Machinery and plant manufacturers are likewise still in the evaluation and planning phase in half of the cases (53 percent). A scant third uses ML productively in selected application areas, and 18 percent are currently building prototypes. Retail and consumer-goods companies are further along: 44 percent are testing ML in first projects and prototypes. That is not surprising, as these companies generally have well-maintained data sets and ample experience with business intelligence and data warehouses. If they manage to measurably improve pricing strategies, product availability, or marketing campaigns, ML will be seen as a welcome innovation instrument within existing big-data strategies.

The same goes for the IT, telecommunications, and media industries, where ML techniques have long been used to serve online advertising, calculate purchase probabilities (conversion rates), and personalize web content and shopping recommendations. Among professional service providers, measuring and improving customer loyalty, service quality, and adherence to deadlines play an important role, since these are the factors that differentiate them from the competition.

IT departments are in charge

Almost 60 percent of the decision-makers surveyed said their IT department takes the lead on ML projects. According to the study's authors at Crisp, this is due to the high technological complexity of the topic: besides mathematical and statistical skills, a broad range of IT-operations expertise is required, plus the BI and analytics capabilities that are often located there.

But business units such as logistics and production are also on board, because they usually drive the process-improvement and IoT scenarios. The large volumes of machine, production, logistics, and other sensor and log data have to be queried for patterns and correlations, a task for manufacturing and logistics.

Finally, customer service and support are also leading players in introducing ML. They want to advance personalized customer interaction, and their departments collect the text, image, and audio data that offer potential for analysis. What is interesting about the survey, though, is that marketing and communications often want nothing to do with ML, even though they would have plenty of use cases: they could analyze customer relationships and improve retention, push automated media monitoring, or mine the social web with sentiment analysis. All of this happens relatively rarely, which Crisp Research attributes to the traditionally "passive, technology-agnostic role" of these departments. Marketing and communications mostly act as "requesters" and internal customers, not as the ones who dig deeper into technology.

Which machine-learning functions do companies need, and for what? And when are which learning styles, frameworks, programming languages, and algorithms used? Most companies start with image analysis and recognition.
  • Image and speech recognition are the most important applications in machine learning
  • When it comes to platform selection, the public cloud is becoming increasingly important
  • Graphics processors are winning out in deep learning

As the analysts at Crisp Research write in a comprehensive study conducted together with The unbelievable Machine Company and Hewlett-Packard Enterprise (HPE), the majority of the roughly 250 IT decision-makers surveyed say they enter the complex field of machine learning (ML) through image analysis and recognition. Industrial companies, for example, identify foreign objects on conveyor belts, spot faulty coloring of products, or have autonomous vehicles recognize road signs.

These are the machine-learning functions users employ.
Photo: Crisp Research, Kassel


ML techniques are also important for voice control and speech recognition (42 percent). Closely related are natural language processing and text analysis, the semantic capture of spoken content and written text. Today, 35 percent of companies work on this, and the share is rising, because conversational user interfaces are currently booming.

Chatbots, facial recognition, sentiment analysis, and more

Machine learning is also used by around a third of respondents in connection with the development of digital assistants, so-called bots. Further application areas are facial recognition, sentiment analysis, and special pattern-recognition techniques, often in a company- or industry-specific context. Speech recognition is of particular interest to marketing decision-makers, since digital assistants are gaining importance for automating call-center workflows and for real-time communication with customers. Personalizing product recommendations is another important use case.

A look at the usage scenarios for ML technologies shows that image analysis and recognition rank far ahead today, but the future belongs more to voice control and speech recognition, as well as to text analysis and natural language processing (NLP). Overall, ML technologies will gain importance across the board, including in video analysis, sentiment analysis, facial recognition, and the use of intelligent bots.

Looking at the individual business units, the teams responsible for customer experience management mainly use ML technologies for customer segmentation, personalized product recommendations, speech recognition, and in part facial recognition. IT departments use them to advance e-mail classification, spam detection, diagnostic systems, and document classification. Production is primarily after process improvements, while customer service and support are building out their diagnostic systems and working on automated solution recommendations. Call-center conversations are already being analyzed as well, in some cases with the aim of detecting positive and negative customer statements (sentiment analysis).

The finance and human-resources departments, and management in general, are also increasingly using ML technologies. The most important application areas here are risk management as well as forecasting and predictions. In HR, training recommendations are generated automatically, résumés are screened, and talent management is advanced. In central purchasing and supplier management, improving the digital supply chain is the core task for ML technology; demand forecasts are increasingly produced here as well, risks associated with particular suppliers are analyzed, and decision-making processes in general are digitally supported.

Machine-learning platforms and products

When it comes to selecting platforms and products, solutions from the public cloud play an increasingly important role (machine learning as a service). To avoid complexity, and because the big cloud providers are also the key innovators in this field, many users opt for these cloud solutions. While 38.1 percent of respondents prefer solutions from the public cloud, 19.1 percent choose proprietary solutions from selected vendors and 18.5 percent open-source alternatives. The rest either pursue a hybrid strategy (15.5 percent) or have not yet formed an opinion (8.8 percent).

Which cloud offerings for machine learning are in use?
Photo: Crisp Research


Among cloud-based solutions, AWS has the highest level of awareness: 71 percent of decision-makers say they know Amazon in this context, and more than two-thirds of survey participants are also familiar with Microsoft, Google, and IBM in the ML space. Interestingly, though, only 17 percent of respondents use the AWS cloud services for ML in evaluation, project planning, or production. Roughly a third each are engaging instead with IBM Watson, Microsoft Azure, or the Google Cloud Machine Learning platform.

The analysts suspect this has much to do with the vendors' marketing efforts. IBM and Microsoft are investing massively in their cognitive and AI strategies, respectively; both have a strong midmarket and enterprise sales force and a large partner network. Google, meanwhile, owes its position to its image as a giant data and analytics machine that drives the market with many innovations, such as TensorFlow, numerous ML APIs, and its own hardware. Finally, HP Enterprise, with "Haven on Demand", also counts among the relevant ML players and is used by 14 percent of respondents.

Deep learning is harder

The first neural learning rules were described as early as the 1940s. Scientific knowledge grew quickly, as did the number of algorithms, but the computing power needed to use recurrent neural networks at scale was missing. Today such networks are on everyone's lips under the label deep learning; they could revolutionize fields like handwriting recognition, speech recognition, machine translation, and automatic image captioning.

The reason is that they can reach a precision that far exceeds human abilities in the respective context. Neural networks span layers of varying complexity, and the more data such a network has available for training, the better the results, that is, the trained artificial intelligence. A system learns, for example, how to diagnose cancerous tumors from a computer-tomography scan that the human eye cannot easily see.

Graphics processors deliver the necessary performance

On the hardware side, graphics processors (GPUs) have proven particularly well suited to deep learning because of their high performance. Also helpful were the virtually unlimited computing power available from public-cloud resources and the availability of large amounts of data from a wide variety of application areas. Companies already use deep-learning algorithms to track down specific features in images, perform video analyses, process environmental parameters in autonomous driving, and advance automatic language processing.

In the Crisp survey, 48 percent of participants say they have at least heard or read about deep learning. Another 21 percent are already in a concrete evaluation phase: they have gathered insights and are now working on prototypes to validate their intended scenario. A further five percent are one step further still and already have deep learning in production. Startups and large corporations, again above all from the automotive sector, are leading the way here.

Among the frameworks and libraries that matter for implementing deep-learning algorithms are Microsoft's "Computational Network Toolkit" (CNTK) as well as plenty of public-cloud and open-source solutions.

Machine learning makes analytics better

Learning machines first analyzed user behavior in search engines in order to display matching ads. Today they optimize traffic flows and steel production and plan aircraft maintenance. Experts from Allianz, Trip Advisor, GfK, and Boeing explain how machine learning helps them.

At Munich-based insurer Allianz, Andreas Braun, Head of Global Data and Analytics, is pleased with the results of his experiments with the new analytics approaches from artificial intelligence. "We run an ecosystem made up of various components. Big data technologies and machine learning give us better ways of handling our data and consistently deliver good results," he said at the Yandex Data Factory conference on "Machine Learning and Big Data" in Berlin. One example is building management: together with students at TU München, the insurer developed an app that connects a large number of objects via sensors.

"The system calibrates itself, learns what normal behaviour in the house looks like, and can thus distinguish a break-in from other unusual but non-critical events." The experts also want to further improve image recognition, so that photos submitted with insurance claims can be assessed automatically by machines.
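The self-calibrating behaviour described here can be sketched as a simple statistical baseline: learn the mean and spread of sensor readings from a calibration window, then flag readings that deviate too far. The sensor values and threshold are invented for illustration; a real system would learn far richer models of "normal".

```python
import statistics

class AnomalyDetector:
    """Learns 'normal' from a calibration window, then flags outliers."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # how many standard deviations count as unusual
        self.baseline = []

    def calibrate(self, readings):
        self.baseline = list(readings)

    def is_anomaly(self, reading):
        mean = statistics.mean(self.baseline)
        stdev = statistics.pstdev(self.baseline) or 1.0  # guard against zero spread
        return abs(reading - mean) / stdev > self.threshold

detector = AnomalyDetector()
detector.calibrate([20.1, 19.8, 20.3, 20.0, 19.9, 20.2])  # "normal" motion levels
print(detector.is_anomaly(20.1))  # a typical reading
print(detector.is_anomaly(55.0))  # far outside learned behaviour
```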

The experts whom Russian search provider Yandex had invited to Berlin also discussed, under the motto "Business Challenges", the difficulties and risks surrounding machine learning. Jeff Palmucci, Director of Machine Intelligence at travel portal TripAdvisor, described how his company embeds machine learning in its business processes. The technology helps tag restaurants and hotels automatically with labels such as "romantic" or "charming" so that searchers quickly find the right offer. The portal also uses machine learning to detect fraud, for instance in reviews.
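A deliberately naive, rule-based caricature of such tagging might look as follows. The keyword lists are invented for illustration; TripAdvisor's production system is learned from data, not hand-written.

```python
# Hypothetical tag lexicon; a real tagger would be trained, not hand-coded.
TAG_KEYWORDS = {
    "romantic": {"candlelight", "intimate", "romantic", "cosy"},
    "family-friendly": {"kids", "playground", "family", "children"},
}

def tag_review(text):
    """Return every tag whose keyword set overlaps the review's words."""
    words = set(text.lower().replace(",", " ").split())
    return sorted(tag for tag, keywords in TAG_KEYWORDS.items() if words & keywords)

print(tag_review("Candlelight dinner, very intimate terrace"))
print(tag_review("Great playground for kids"))
```

The limitation is obvious: a lexicon never generalises, which is exactly why learned classifiers replaced rules for this kind of task.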

Predicting human behaviour

Machine learning confronts companies with a variety of challenges, and not all industries are equally well suited, explained Jane Zavalishina, CEO of Yandex Data Factory: "Above all, it is about predicting human behaviour." With results based on machine learning, however, the high complexity and large data volumes mean you can never trace exactly how they came about. In practice, you have to experiment with the recommendations to find out whether they outperform the previous approach, which, for ethical and practical reasons, is not always possible.

Jane Zavalishina, CEO of Yandex Data Factory: "Many companies, however, are still at the point of trying to understand big data analytics in the first place."

Personalising web content or making predictions in real time is nothing new for Russian search engine Yandex. Since 2014 the company has also been making the know-how it gained from search technology and contextual ad serving, along with the algorithms developed for them, available externally. Its subsidiary Yandex Data Factory, which has offices in Moscow and Amsterdam, first tried out machine learning techniques in science, for example to solve big data problems for the European nuclear research centre CERN.

The data experts now talk with companies that have many customers and large data volumes about how to improve their services, processes and products. "The potential applications of machine learning in business are almost unlimited," said Zavalishina. "Many companies, however, are still at the point of trying to understand big data analytics in the first place."

One of the first companies to use Yandex's expertise and technology was the Russian road administration agency Rosavtodor, which needed forecasts of traffic density and accidents. At the Magnitogorsk Iron and Steel Works, algorithms now optimise steel production: too few additives result in poor quality, too many drive up costs. Until now the steelmakers had relied on complicated models for their mixture predictions; Yandex Data Factory instead based its optimisation on historical data from the past ten years. By comparison, using machine learning to optimise websites and deliver online advertising seems straightforward.

Business is data-driven

"We are a completely data-driven business," says Norbert Wirth, Global Head of Data and Science at market research institute GfK. "For us, machine learning algorithms are one tool among others, but one that is becoming increasingly important for prediction and classification problems." GfK currently uses it mainly to analyse social media data and to forecast market shares and market performance.

"We use it when the decisive question is not why, but the quality of the prediction," says Wirth. Are statements about a brand trending positive or negative? And which topics are being discussed? With smaller data sets you can still work that out yourself, but as the volume grows the algorithms become "extremely exciting, and they keep getting more powerful". This is no hype, says the market researcher: "Machine learning will grow in importance. With increasing computing power, you can now really work with it." A great algorithm is one thing; having the machines required for it at the ready is another.
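The positive-versus-negative question Wirth describes can be illustrated with a minimal lexicon-based scorer. The word lists are invented for illustration; GfK's actual models are statistical, not hand-written.

```python
# Hypothetical sentiment lexicons; real systems learn these associations from data.
POSITIVE = {"great", "love", "excellent", "reliable"}
NEGATIVE = {"poor", "broken", "disappointing", "slow"}

def sentiment(text):
    """Score a statement by counting positive minus negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("excellent build and reliable battery"))
print(sentiment("slow delivery and disappointing support"))
```

At small scale this is exactly the kind of thing an analyst could do by hand; as Wirth notes, it is at larger volumes that learned models become indispensable.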

In future, according to Wirth, analysts will use additional data to train algorithms and make the models more powerful. "The trend is towards working with multiple data sources in the analysis process, with those, of course, that may legally be used." Data privacy is a very important issue around machine learning, he says, as are the stability and quality of the data.

Aircraft manufacturer Boeing uses machine learning to improve its services and internal production, reported Sergey Kravchenko, President Russia and CIS at Boeing. The 787 has more than ten thousand sensors connected to the Internet, which alert the mechanics on the ground during the flight when, for example, a lamp or a pump needs replacing. Airlines can thus reduce their maintenance costs and operate more efficiently.
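The sensor-to-ground-crew reporting can be sketched as a simple range check over a stream of readings. The component names and limits here are invented for illustration; real condition monitoring is far more sophisticated than a threshold test.

```python
def maintenance_alerts(readings, limits):
    """Flag components whose in-flight readings leave their safe range."""
    alerts = []
    for component, value in readings.items():
        low, high = limits[component]
        if not (low <= value <= high):
            alerts.append(component)
    return alerts

# Hypothetical sensor snapshot streamed to the ground crew mid-flight.
limits = {"cabin_pump_pressure": (30, 60), "nav_light_current": (0.4, 1.2)}
readings = {"cabin_pump_pressure": 72, "nav_light_current": 0.9}
print(maintenance_alerts(readings, limits))
```

In practice the interesting part is learning those limits, and the lead time before failure, from historical data rather than fixing them by hand.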

Boeing uses big data and machine learning to help airlines cut fuel costs with the data collected during a flight and to support pilots in bad weather. The data is now also being used in production, for instance to find the best engineers for particular processes. HR data is analysed to understand how the lifespan and quality of the aircraft correlate with the training and the mix of people in the production team. Are processes that require elaborate rework linked to the tools provided or to the team? Kravchenko wants to use big data analytics to improve the entire cycle of design, production and maintenance.

A new big data project is the Flight Training Academy, scheduled to open in 2016. There, data from the three flight simulators will be collected and analysed to improve the cockpit layout and the design of the aircraft software. Kravchenko also wants to offer his Russian customers the option of exchanging data and analysing it jointly in the future.

The experts have to fit together

Compared with telcos and retail, the manufacturing industry is still at the beginning of its machine learning adoption. But it will learn quickly from them, and also from companies such as Amazon and Google. Whoever wants to succeed must bring the best aircraft and IT experts together. The problem: "They come from different planets."

The collaboration can nevertheless succeed, provided everyone agrees on a common terminology. "The data experts need to understand a bit more about aircraft and airlines, and the aircraft specialists need to learn more about data analytics. They have to share their tools, trust each other and build a joint team," says the aircraft maker. Another problem is the relevance of the data: "The industry has to look at its huge data volumes and decide which data really matter for solving specific problems. That is not easy; it takes time, trial and error, and we have to learn from other industries." Selecting the right data and interpreting the results, he says, matter more than the algorithm itself.

Battle of the assistants

Despite what science-fiction wisdom says, talking to your computer is not normal. Sitting in the middle of a modern, open-floor-plan office and saying "Hello, Computer" will garner some head-turns and a few scowls.

No matter. Companies like Microsoft, Amazon and Apple are convinced we want to talk to everything, including our desktop and laptop computers. Side-eye looks be damned.

Which brings us to today. Almost a year since Microsoft brought Cortana to Windows 10, Apple is following suit with Siri for the newly rechristened macOS.

Windows 10 with Cortana is, obviously, a shipping product, while macOS with Siri integration is in early beta. Even so, I can’t look at Siri’s first desktop jaunt in a vacuum, so when Apple supplied me with a MacBook running the beta of macOS Sierra (due to come to consumers in the fall), I compared the two desktop-based voice assistants. As you might surmise, they’re quite similar, but they have significant and strategic differences.

Where did they come from?

Siri arrives on the desktop as the oldest of the growing class of digital assistants, appearing first on the iPhone 4S in 2011. It’s long been rumored that it would eventually come to the Mac, so no one was surprised when Apple announced exactly that earlier this month at its Worldwide Developers Conference.

Cortana (which was named for the synthetic intelligence in Microsoft’s popular Halo game series) arrived with Windows 10 in 2015, a year after the digital assistant’s formal introduction on Windows Phone at the 2014 Microsoft Build conference.

Siri lives in two spots on the desktop and asks you to let the system know your location.

Like Cortana, Siri has a permanent place on the macOS desktop. Actually, it has two: a tiny icon in the upper-right corner and another in the macOS Dock. Both launch the familiar Siri "waiting to help you" wave.

On Windows, Cortana sits next to the Start button. It has a circular halo icon and, next to that, the ever-present "Ask me anything" prompt.


A click on the Cortana logo opens this Cortana window.

It’s at this point that the two assistants diverge. Cortana is a voice assistant, but, by default, it’s a text-driven one. Most people who use it will type something into the Cortana box. If you want to speak to Cortana — as I did many times for this article — you have to click the little microphone icon on the right side of the Cortana box.

While Cortana combines universal search with the digital assistant, Apple’s Siri draws a line between the two.

Importantly, you can put Cortana in an always-listening mode, so it (she?) will wake when you say "Hey Cortana." Even though you can also wake the mobile Siri with "Hey Siri," macOS offers no such always-listening feature. For the purposes of this comparison, I left "Hey Cortana" off.

Siri is a voice assistant. It has no text box. A click on either Siri icon opens the same black box in the upper right-hand side of the macOS desktop (it actually slides in from offscreen — a nice touch). As soon as you hit that button, Siri is listening, waiting for you to ask a question.

Sitting right next to Siri is Spotlight, which last year got a significant update. It’s a universal search tool that can pore over your Mac, the web, iTunes, the App Store, and Maps.

So while Microsoft’s Cortana combines universal search with the digital assistant, Apple’s drawn a line between the two — sort of. Spotlight can perform many of the same searches as Siri. However, if you type a question into Spotlight, it may launch Siri. A trigger word appears to be "What’s."
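The routing behaviour observed here amounts to something like the following toy dispatcher. This is a guess at the behaviour from the outside, for illustration only, not Apple's implementation.

```python
def route_query(query):
    """Toy router: question-style queries go to the assistant,
    everything else to plain universal search."""
    return "siri" if query.lower().startswith("what's") else "spotlight"

print(route_query("What's the weather in Cupertino"))  # handed to the assistant
print(route_query("vacation photos 2015"))             # plain search
```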

I really don’t know why Apple chose to keep Spotlight and Siri separate, but they may reconsider in future versions of macOS.

Battle of the assistants

It’s early days for Siri on the desktop, but I’m already impressed with its performance and intelligence — especially as it compares to Microsoft’s Cortana.

To test the two voice assistants, I first closed my office door. I wanted to speak in a normal voice and didn’t want to attract any annoyed stares.

Both Siri on macOS and Cortana start by asking you to open up your privacy settings a bit. They simply do their jobs better if they know where you are. So I followed Siri’s instructions and turned on location services in macOS.

Here’s something else Siri on macOS and Cortana have in common: both can tap into your system to, for example, find files and make system-level adjustments, but they’re both pretty inconsistent. Siri on macOS, obviously, is still a work in progress, so take these criticisms with a grain of salt. Even so, I suspect that there will, at least for some time, be limits to what Siri can do even after the formal macOS launch, especially as long as Spotlight survives.

When I asked Siri to "increase my screen brightness," it opened a System Preferences: Brightness slider box within Siri and told me, "I made the screen a little brighter."


When I asked Cortana the same question, it opened a Bing search result inside the Cortana box, which told me how to adjust screen brightness, but didn’t do it for me.

On the other hand, when I told Cortana to turn off my Wi-Fi, it turned it off, returned a message of "Wi-Fi is now off" and showed the setting to confirm.


On the left is how Cortana handles voice commands for turning on and off Wi-Fi. On the right is how Siri does it. When you turn off Wi-Fi (networking), you basically disable Siri.


Siri can turn off Wi-Fi, too, but doing so also renders Siri for macOS useless. Unlike Cortana, it needs an Internet connection to work, which means once Siri on macOS has turned it off, you can’t use it to turn Wi-Fi back on. Even if you turn off network connectivity, Cortana will still be able to search your system.

Siri and Cortana excel at natural-language queries (asking questions in sentences), but Siri comes across as the smarter system.

It’s easy to check your schedule through both systems — you just need to ask one of them about your next appointment. However, Siri goes a big step further.

Changing your schedule should be this easy everywhere.

When I asked it about my next appointment, it showed me one for Thursday at 11:00 a.m. I then clicked the microphone icon below the calendar result and asked Siri, "Can you move that to 11:10?" Siri responded, "Okay, I’ll make that change to your event. Shall I reschedule it?" It then offered the option of confirming the change or cancelling it with my voice. Siri on macOS actually maintains the context between queries — that feels more like the future.

When I asked Cortana to make a similar change, it sent me to a Bing search result. (By the way, both voice assistants use Bing and neither will let you change it to Google.)

The level of conversational prowess in Siri could be a real game-changer and certainly puts Microsoft on notice.

These are questions I can’t just ask Cortana.

Cortana and Siri on macOS both boast system access, but Siri does a better job of keeping track of system specs. In Siri, I can ask about the speed of my system and how much iCloud storage I have left. Cortana, unfortunately, has no clue about my OneDrive storage, and when I asked "How fast is my PC?" I only got a Bing search result.

Where’s my stuff and who are you

Siri and Cortana each do a good job of finding system files that contain a keyword. For both, I asked, "Find me files with [keyword]," and they both quickly showed me local, relevant results. Siri, however, excels at making results persistent. You can pin whatever you find to Notification Center.

On the left you can see that Cortana does a good job with image search, but won’t let me drag and drop from the window. On the right, Siri on macOS found me puppy pics and let me drag and drop one into an email that I plan to send to you.

Similarly, both voice assistants do a good job of finding images, but only Siri on macOS lets me drag and drop one of the image results into a document or email. When I tried to do the same thing with a Cortana result, it only dragged and dropped the HTML for the original query.

Siri did struggle with contacts. I tried initiating a text and got stuck in a sort of infinite loop — it just kept going back to asking me which of my duplicate contacts I wanted to text. This felt like a pre-release bug.

No winners yet

Since Apple is still working on Siri for macOS, it’s way too soon to crown a voice-assistant champion. Even so, Siri on macOS is already faster (Cortana’s voice recognition seems plodding by comparison), and it’s already outstripping Cortana on the intelligence front. On the other hand, Cortana truly shines when you can type into it, a feat impossible in Siri for macOS unless you start in Spotlight and use one of the magic words to auto-launch Siri.

Microsoft, of course, has its own big Cortana update in the wings as part of the Windows 10 Anniversary Update due later this summer. It will increase Cortana’s intelligence and utility (order plane tickets, shop), but based on what I’ve seen in Siri for macOS, it may only help Cortana achieve parity on some features, while still leaving it trailing in others.

Facebook is starting to analyse users’ posts and messages with sophisticated new artificial intelligence (AI) software

Facebook is starting to analyse users’ posts and messages with sophisticated new artificial intelligence (AI) software — and that could have worrying implications for Google.

On Wednesday, the social networking giant announced DeepText — "a deep learning-based text understanding engine that can understand with near-human accuracy the textual content of several thousand posts per second, spanning more than 20 languages."

DeepText is powered by an AI technique called deep learning. Basically, the more input you give it, the better and better it becomes at what it is trained to do — which in this case is parsing human text-based communication.

The aim? Facebook wants its AI to be able to "understand" your posts and messages to help enrich experiences on the social network. This is everything from recognising from a message that you need to call a cab (rather than just discussing your previous cab ride) and giving you the option to do so, to helping sort comments on popular pages for relevancy. (Both are examples Facebook’s research team provides.)
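Distinguishing "needs a cab" from "talking about a cab ride" is a text-classification problem. A miniature naive Bayes classifier illustrates the idea; the training corpus here is invented, and DeepText itself uses deep neural networks at vastly larger scale.

```python
from collections import Counter
import math

# Tiny invented corpus: does the message *request* a cab or merely mention one?
TRAIN = [
    ("i need a cab to the airport", "request"),
    ("please call me a taxi now", "request"),
    ("can you get me a cab", "request"),
    ("the cab ride last night was awful", "mention"),
    ("my taxi driver told a great story", "mention"),
    ("we talked about the cab strike", "mention"),
]

def train(data):
    """Count word occurrences per label."""
    counts = {"request": Counter(), "mention": Counter()}
    for text, label in data:
        counts[label].update(text.split())
    return counts

def classify(counts, text):
    """Pick the label with the highest add-one-smoothed log-likelihood."""
    vocab = len(set.union(*(set(c) for c in counts.values())))
    best, best_score = None, float("-inf")
    for label, words in counts.items():
        total = sum(words.values())
        score = sum(math.log((words[w] + 1) / (total + vocab)) for w in text.split())
        if score > best_score:
            best, best_score = label, score
    return best

model = train(TRAIN)
print(classify(model, "need a cab right now"))
```

The more labelled messages such a model sees, the better its word statistics become, which is the "more input, better results" property described above.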

The blog post doesn’t directly discuss it, but another obvious application for this kind of sophisticated tech is Google’s home turf — search. And engineering director Hussein Mehanna told Quartz that this is definitely an area that Facebook is exploring: "We want Deep Text to be used in categorizing content within Facebook to facilitate searching for it and also surfacing the right content to users."

Search is notoriously difficult to get right, and is a problem Google has thrown billions at (and made billions off) trying to solve. Is someone searching "trump" looking for the presidential candidate or playing cards? Does a search for the word "gift" want ideas for gifts, or more information about the history of gifts — or even the German meaning of the word, poison? And how do you handle natural-language queries that may not contain any of the key words the searcher is looking for — for example, "what is this weird thing growing on me?"

By analysing untold trillions of private and public posts and messages, Facebook is going to have an unprecedented window into real-time written communication and all the contexts around it.

Google has nothing directly comparable (on the same scale) it can draw upon as a resource to train AI. It can crawl the web, but static web pages don’t have the real-time dynamism that reflects how people really speak — and search — in private conversations. The search giant has repeatedly missed the boat on social, and is now trying to get onboard — very late in the game — with its new messaging app Allo. It will mine conversations for its AI tech and use it to provide contextual info to users — but it hasn’t even launched yet.

Facebook has long been working to improve its search capabilities, with tools like Graph Search that let the user enter natural-language queries to find people and information more organically: "My friends who went to Stanford University and like rugby and Tame Impala," for example. And in October 2015, it announced it had indexed all 2 trillion-plus of its posts, making them accessible via search.

Using AI will help the Menlo Park company not just to index but to understand the largest private database of human interactions ever created — super-charging these efforts.

How will open source AI change the tech industry?

After years in the labs, artificial intelligence (AI) is being unleashed at last. Google, Microsoft and Facebook have all made their own AI APIs open source in recent months, while IBM has opened Watson (pictured above) for business and Amazon has purchased AI startup Orbeus. These announcements have not drawn much media attention, but are hugely significant.

"In the long run, I think we will evolve in computing from a mobile-first world to an AI-first world," says Google CEO Sundar Pichai. What does the appearance of AI bots and machine learning on the open market mean for business, IT, big data, and for sellers of physical hardware?

What’s happening to the major AI platforms?

The AI APIs now opening up are essentially free platforms on which companies can build incredibly powerful analytics tools. "These hugely powerful tools, used, developed and backed by the world’s most advanced technology companies, are now available to anyone with the skills to use them," says Matt Jones, Senior Analytics Project Manager at analytics and data science company Tessella.

He continues: "Using these toolkits, individually or combined, anyone can integrate transformational AI or machine learning platforms – which are as sophisticated as anything currently on the market – into their business on a pay-as-you-go or free basis." The tech industry – and digital business in general – is on the cusp of something very big.

Google, Microsoft and Facebook

It may have announced plans for its Google Home speaker-assistant to rival Amazon’s Echo in ‘smart’ homes, but the search engine giant has much bigger plans for AI. Part of Google since January 2014, DeepMind recently saw its AlphaGo neural network beat mankind at the ancient Chinese game of Go. DeepMind is the main attraction on TensorFlow, a deep learning framework that Google made open source in 2015.

Meanwhile, Facebook is focused on developing its M bot platform that should see its Messenger app flooded with third-party apps that let companies and their customers execute tasks on the platform, such as paying bills, making bank transfers, and even ordering an Uber ride. The Facebook M virtual assistant will follow.

And over at Redmond, Microsoft Cognitive Services and the Microsoft Bot Framework are aimed at getting developers to create AI-powered apps and bots that work on everything from Skype and Office 365 to Slack and SMS.

Watson on the cloud

The existence and planned expansion of IBM’s cloud-based cognitive computing platform is well known, and Big Blue offers its Watson API on a ‘freemium’ basis.

Around 80,000 developers have accessed the Watson collection of APIs since 2013 through a dedicated cloud platform that IBM is calling ‘self-service artificial intelligence’. It’s designed to help coders, data scientists and analysts create apps that tap into the power of the Watson supercomputer for prediction, natural language processing, and much more. Machine learning and text analytics will also soon be on the menu from IBM’s Watson Knowledge Studio.

Image recognition

While some invest in AI, others are making acquisitions to catch up. While IBM’s Watson has a new Visual Recognition API and Google’s image recognition is well known, Apple recently purchased ‘emotion measurement’ (i.e. face recognition) company Emotient. Meanwhile, Amazon bought a deep learning neural networks startup called Orbeus, whose ReKognition API specialises in photo recognition, too.

Open season for developers

With the arrival on the cloud of high-power cognitive platforms, it’s open season for app developers, who are expected to use the fruits of AI to unleash better and better apps.

"IBM Watson, Google DeepMind and the like are incredibly high-power cognitive platforms that are enabling developers to do really interesting projects," says Frank Palermo, Executive Vice President of Global Digital Solutions at global IT services company VirtusaPolaris. "It’s great they are now making these cognitive platforms readily accessible – particularly for researchers and other scientific pursuits, but also for knowledge workers across all industries."

For instance, the boom in online education could create AI-powered teachers, which could help improve retention rates.

What is the OpenAI project?

Elon Musk is getting involved in AI, too, by supporting OpenAI, a non-profit research company focused on advancing digital intelligence for the common good. "Elon Musk has launched the OpenAI project with a star-studded list of backers – Palantir co-founder Peter Thiel, LinkedIn founder Reid Hoffman and Y Combinator president Sam Altman," says an impressed Jones.

OpenAI is headed up by machine learning expert Ilya Sutskever, an ex-Google Brain Team member, and has just opened the OpenAI Gym in beta to help developers working with ‘reinforcement learning’, a type of machine learning that’s central to AI. Essentially, it’s about getting software to alter its behaviour in a dynamic environment in order to get a reward (you can’t give Siri a biscuit every time she ‘found this on the web’).
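The reward-driven loop described here can be sketched with a tiny tabular Q-learning agent in a five-cell corridor. The environment and hyperparameters are invented toy stand-ins for the kind of task OpenAI Gym packages; real reinforcement learning operates on far richer state spaces.

```python
import random

random.seed(0)  # make the training run reproducible

# A five-cell corridor: the agent starts at cell 0 and is rewarded
# only for reaching cell 4. Actions: 0 = left, 1 = right.
N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-value table, one row per cell

def step(state, action):
    nxt = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

for _ in range(500):                # training episodes
    s = 0
    for _ in range(100):            # step limit per episode
        # epsilon-greedy: explore half the time, otherwise act greedily
        a = random.randrange(2) if random.random() < 0.5 else Q[s].index(max(Q[s]))
        nxt, reward, done = step(s, a)
        # Q-learning update: nudge Q(s,a) toward reward + discounted best future value
        Q[s][a] += 0.5 * (reward + 0.9 * max(Q[nxt]) - Q[s][a])
        s = nxt
        if done:
            break

policy = [q.index(max(q)) for q in Q[:GOAL]]
print(policy)  # greedy action learned for cells 0-3
```

Nobody tells the agent to walk right; it alters its behaviour purely because rightward moves eventually lead to the reward, which is the essence of reinforcement learning.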

What does this mean for the IT industry?

The arrival of AI means a changing of the guard in the tech industry, with disruption, innovation – and the complete domination of the cloud. Standalone data analytics platforms? No need. Expensive infrastructure? Ditto. Expertise, not investment, will become king. That, and superfast broadband.

"It will ultimately help to drive innovation and growth, and we could see new business models emerge," says Palermo, who thinks that Amazon’s entry into the AI market is especially interesting given the unbridled success of Amazon’s AWS.

"As with cloud before it, Amazon may start renting time on Orbeus by the hour so that knowledge workers can work on a specific project," he adds. "This will help to normalise the use of AI and cognitive capabilities and greatly enhance our ability to process information." However, reshaping computing does mean some pain-points.

The death of the black box

You need computing power and analytics? AI APIs have the answer. "Any task currently performed by a costly black-box AI platform, such as identifying where to drill for oil, predicting disease outbreaks, optimising scientific experiments to develop new products, and predictive maintenance, can now be done in-house," says Jones. Crucially, a business using these AI APIs will maintain complete control and oversight of its data.

How disruptive could AI APIs be?

Very. The kind of expertise most companies – especially startups – only dreamed of will be available instantly, online, 24/7. Accessing the world’s most advanced computers via these open platforms will cost only the price of a data scientist’s salary or consultancy fee. "This is very important for a lot of the world’s businesses, and they need to take it seriously," says Jones. "If its true potential is realised, it will unleash a new generation of innovative startups that apply the latest AI techniques to disrupt the establishment."

The intelligent future

For industries already migrating to the cloud at an alarming rate, AI raises the stakes even further. We already inhabit a world where the cloud’s scalable storage has enabled startups like Twitter, Spotify, Netflix and WhatsApp to challenge entrenched big players. By making the very latest AI open source and available to all online, the likes of Google, Facebook, Amazon and IBM could help create a new wave of businesses that harness data. That puts data analytics platforms, pricey IT infrastructure and storage devices on borrowed time.

Machine Learning and Artificial Intelligence: Soon We Won’t Program Computers. We’ll Train Them Like Dogs


BEFORE THE INVENTION of the computer, most experimental psychologists thought the brain was an unknowable black box. You could analyze a subject’s behavior—ring bell, dog salivates—but thoughts, memories, emotions? That stuff was obscure and inscrutable, beyond the reach of science. So these behaviorists, as they called themselves, confined their work to the study of stimulus and response, feedback and reinforcement, bells and saliva. They gave up trying to understand the inner workings of the mind. They ruled their field for four decades.

Then, in the mid-1950s, a group of rebellious psychologists, linguists, information theorists, and early artificial-intelligence researchers came up with a different conception of the mind. People, they argued, were not just collections of conditioned responses. They absorbed information, processed it, and then acted upon it. They had systems for writing, storing, and recalling memories. They operated via a logical, formal syntax. The brain wasn’t a black box at all. It was more like a computer.

The so-called cognitive revolution started small, but as computers became standard equipment in psychology labs across the country, it gained broader acceptance. By the late 1970s, cognitive psychology had overthrown behaviorism, and with the new regime came a whole new language for talking about mental life. Psychologists began describing thoughts as programs, ordinary people talked about storing facts away in their memory banks, and business gurus fretted about the limits of mental bandwidth and processing power in the modern workplace.

This story has repeated itself again and again. As the digital revolution wormed its way into every part of our lives, it also seeped into our language and our deep, basic theories about how things work. Technology always does this. During the Enlightenment, Newton and Descartes inspired people to think of the universe as an elaborate clock. In the industrial age, it was a machine with pistons. (Freud’s idea of psychodynamics borrowed from the thermodynamics of steam engines.) Now it’s a computer. Which is, when you think about it, a fundamentally empowering idea. Because if the world is a computer, then the world can be coded.

Code is logical. Code is hackable. Code is destiny. These are the central tenets (and self-fulfilling prophecies) of life in the digital age. As software has eaten the world, to paraphrase venture capitalist Marc Andreessen, we have surrounded ourselves with machines that convert our actions, thoughts, and emotions into data—raw material for armies of code-wielding engineers to manipulate. We have come to see life itself as something ruled by a series of instructions that can be discovered, exploited, optimized, maybe even rewritten. Companies use code to understand our most intimate ties; Facebook’s Mark Zuckerberg has gone so far as to suggest there might be a “fundamental mathematical law underlying human relationships that governs the balance of who and what we all care about.” In 2013, Craig Venter announced that, a decade after the decoding of the human genome, he had begun to write code that would allow him to create synthetic organisms. “It is becoming clear,” he said, “that all living cells that we know of on this planet are DNA-software-driven biological machines.” Even self-help literature insists that you can hack your own source code, reprogramming your love life, your sleep routine, and your spending habits.

In this world, the ability to write code has become not just a desirable skill but a language that grants insider status to those who speak it. They have access to what in a more mechanical age would have been called the levers of power. “If you control the code, you control the world,” wrote futurist Marc Goodman. (In Bloomberg Businessweek, Paul Ford was slightly more circumspect: “If coders don’t run the world, they run the things that run the world.” Tomato, tomahto.)

But whether you like this state of affairs or hate it—whether you’re a member of the coding elite or someone who barely feels competent to futz with the settings on your phone—don’t get used to it. Our machines are starting to speak a different language now, one that even the best coders can’t fully understand.

Over the past several years, the biggest tech companies in Silicon Valley have aggressively pursued an approach to computing called machine learning. In traditional programming, an engineer writes explicit, step-by-step instructions for the computer to follow. With machine learning, programmers don’t encode computers with instructions. They train them. If you want to teach a neural network to recognize a cat, for instance, you don’t tell it to look for whiskers, ears, fur, and eyes. You simply show it thousands and thousands of photos of cats, and eventually it works things out. If it keeps misclassifying foxes as cats, you don’t rewrite the code. You just keep coaching it.

This approach is not new—it’s been around for decades—but it has recently become immensely more powerful, thanks in part to the rise of deep neural networks, massively distributed computational systems that mimic the multilayered connections of neurons in the brain. And already, whether you realize it or not, machine learning powers large swaths of our online activity. Facebook uses it to determine which stories show up in your News Feed, and Google Photos uses it to identify faces. Machine learning runs Microsoft’s Skype Translator, which converts speech to different languages in real time. Self-driving cars use machine learning to avoid accidents. Even Google’s search engine—for so many years a towering edifice of human-written rules—has begun to rely on these deep neural networks. In February the company replaced its longtime head of search with machine-learning expert John Giannandrea, and it has initiated a major program to retrain its engineers in these new techniques. “By building learning systems,” Giannandrea told reporters this fall, “we don’t have to write these rules anymore.”

But here’s the thing: With machine learning, the engineer never knows precisely how the computer accomplishes its tasks. The neural network’s operations are largely opaque and inscrutable. It is, in other words, a black box. And as these black boxes assume responsibility for more and more of our daily digital tasks, they are not only going to change our relationship to technology—they are going to change how we think about ourselves, our world, and our place within it.

If in the old view programmers were like gods, authoring the laws that govern computer systems, now they’re like parents or dog trainers. And as any parent or dog owner can tell you, that is a much more mysterious relationship to find yourself in.

ANDY RUBIN IS an inveterate tinkerer and coder. The cocreator of the Android operating system, Rubin is notorious in Silicon Valley for filling his workplaces and home with robots. He programs them himself. “I got into computer science when I was very young, and I loved it because I could disappear in the world of the computer. It was a clean slate, a blank canvas, and I could create something from scratch,” he says. “It gave me full control of a world that I played in for many, many years.”

Now, he says, that world is coming to an end. Rubin is excited about the rise of machine learning—his new company, Playground Global, invests in machine-learning startups and is positioning itself to lead the spread of intelligent devices—but it saddens him a little too. Because machine learning changes what it means to be an engineer.

Soon We Won’t Program Computers. We’ll Train Them Like Dogs

Artificial intelligence assistants are taking over

It was a weeknight, after dinner, and the baby was in bed. My wife and I were alone—we thought—discussing the sorts of things you might discuss with your spouse and no one else. (Specifically, we were critiquing a friend’s taste in romantic partners.) I was midsentence when, without warning, another woman’s voice piped in from the next room. We froze.

“I HELD THE DOOR OPEN FOR A CLOWN THE OTHER DAY,” the woman said in a loud, slow monotone. It took us a moment to realize that her voice was emanating from the black speaker on the kitchen table. We stared slack-jawed as she—it—continued: “I THOUGHT IT WAS A NICE JESTER.”

“What. The hell. Was that,” I said after a moment of stunned silence. Alexa, the voice assistant whose digital spirit animates the Amazon Echo, did not reply. She—it—responds only when called by name. Or so we had believed.

We pieced together what must have transpired. Somehow, Alexa’s speech recognition software had mistakenly picked the word Alexa out of something we said, then chosen a phrase like “tell me a joke” as its best approximation of whatever words immediately followed. Through some confluence of human programming and algorithmic randomization, it chose a lame jester/gesture pun as its response.
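A heavily simplified sketch of how a mishearing like that can happen: once a (false) wake word fires, the system fuzzily matches whatever audio followed against the phrases it knows and acts on the closest one, even when the match is poor. The command list and the "heard" fragment below are hypothetical, and real assistants use acoustic models rather than string similarity, but the failure mode is the same shape.

```python
# Simplified sketch: a garbled fragment of conversation still lands
# on *some* known command, because the matcher always picks a winner.
import difflib

KNOWN_COMMANDS = ["tell me a joke", "what's the weather", "set a timer"]

def best_guess(heard: str):
    """Return the known command most similar to the (mis)heard phrase."""
    scored = [(difflib.SequenceMatcher(None, heard, cmd).ratio(), cmd)
              for cmd in KNOWN_COMMANDS]
    score, command = max(scored)
    return command, score

# A stray scrap of dinner-table talk, "transcribed" badly:
command, score = best_guess("tell them the joke her")
print(command, round(score, 2))
```

Note that `best_guess` has no option to answer "none of the above"; a production system would reject matches below a confidence threshold, which is exactly the safeguard that fails in stories like the one above.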

In retrospect, the disruption was more humorous than sinister. But it was also a slightly unsettling reminder that Amazon’s hit device works by listening to everything you say, all the time. And that, for all Alexa’s human trappings—the name, the voice, the conversational interface—it’s no more sentient than any other app or website. It’s just code, built by some software engineers in Seattle with a cheesy sense of humor.

But the Echo’s inadvertent intrusion into an intimate conversation is also a harbinger of a more fundamental shift in the relationship between human and machine. Alexa—and Siri and Cortana and all of the other virtual assistants that now populate our computers, phones, and living rooms—are just beginning to insinuate themselves, sometimes stealthily, sometimes overtly, and sometimes a tad creepily, into the rhythms of our daily lives. As they grow smarter and more capable, they will routinely surprise us by making our lives easier, and we’ll steadily become more reliant on them.

Even as many of us continue to treat these bots as toys and novelties, they are on their way to becoming our primary gateways to all sorts of goods, services, and information, both public and personal. When that happens, the Echo won’t just be a cylinder in your kitchen that sometimes tells bad jokes. Alexa and virtual agents like it will be the prisms through which we interact with the online world.

It’s a job to which they will necessarily bring a set of biases and priorities, some subtler than others. Some of those biases and priorities will reflect our own. Others, almost certainly, will not. Such vested interests might help to explain why these assistants seem so eager to become our friends.

* * *


In the beginning, computers spoke only computer language, and a human seeking to interact with one was compelled to do the same. First came punch cards, then typed commands such as run, print, and dir.

The 1980s brought the mouse click and the graphical user interface to the masses; the 2000s, touch screens; the 2010s, gesture control and voice. It has all been leading, gradually and imperceptibly, to a world in which we no longer have to speak computer language, because computers will speak human language—not perfectly, but well enough to get by.

We aren’t there yet. But we’re closer than most people realize. And the implications—many of them exciting, some of them ominous—will be tremendous.

Like card catalogs and AOL-style portals before it, Web search will begin to fade from prominence, and with it the dominance of browsers and search engines. Mobile apps as we know them—icons on a home screen that you tap to open—will start to do the same. In their place will rise an array of virtual assistants, bots, and software agents that act more and more like people: not only answering our queries, but acting as our proxies, accomplishing tasks for us, and asking questions of us in return.

This is already beginning to happen—and it isn’t just Siri or Alexa. As of April, all five of the world’s dominant technology companies are vying to be the Google of the conversation age. Whoever wins has a chance to get to know us more intimately than any company or machine has before—and to exert even more influence over our choices, purchases, and reading habits than they already do.

So say goodbye to Web browsers and mobile home screens as our default portals to the Internet. And say hello to the new wave of intelligent assistants, virtual agents, and software bots that are rising to take their place.

No, really, say “hello” to them. Apple’s Siri, Google’s mobile search app, Amazon’s Alexa, Microsoft’s Cortana, and Facebook’s M, to name just five of the most notable, are diverse in their approaches, capabilities, and underlying technologies. But, with one exception, they’ve all been programmed to respond to basic salutations in one way or another, and it’s a good way to start to get a sense of their respective mannerisms. You might even be tempted to say they have different personalities.

Siri’s response to “hello” varies, but it’s typically chatty and familiar.

Alexa is all business.

Google is a bit of an idiot savant: It responds by pulling up a YouTube video of the song “Hello” by Adele, along with all the lyrics.

Cortana isn’t interested in saying anything until you’ve handed her the keys to your life.

Once those formalities are out of the way, she’s all solicitude.

Then there’s Facebook M, an experimental bot, available so far only to an exclusive group of Bay Area beta-testers, that lives inside Facebook Messenger and promises to answer almost any question and fulfill almost any (legal) request. If the casual, what’s-up-BFF tone of its text messages rings eerily human, that’s because it is: M is powered by an uncanny pairing of artificial intelligence and anonymous human agents.

You might notice that most of these virtual assistants have female-sounding names and voices. Facebook M doesn’t have a voice—it’s text-only—but it was initially rumored to be called Moneypenny, a reference to a secretary from the James Bond franchise. And even Google’s voice is female by default. This is, to some extent, a reflection of societal sexism. But these bots’ apparent embrace of gender also highlights their aspiration to be anthropomorphized: They want—that is, the engineers that build them want—to interact with you like a person, not a machine. It seems to be working: Already people tend to refer to Siri, Alexa, and Cortana as “she,” not “it.”

That Silicon Valley’s largest tech companies have effectively humanized their software in this way, with little fanfare and scant resistance, represents a coup of sorts. Once we perceive a virtual assistant as human, or at least humanoid, it becomes an entity with which we can establish humanlike relations. We can like it, banter with it, even turn to it for companionship when we’re lonely. When it errs or betrays us, we can get angry with it and, ultimately, forgive it. What’s most important, from the perspective of the companies behind this technology, is that we trust it.

Should we?

* * *

Siri wasn’t the first digital voice assistant when Apple introduced it in 2011, and it may not have been the best. But it was the first to show us what might be possible: a computer that you talk to like a person, that talks back, and that attempts to do what you ask of it without requiring any further action on your part. Adam Cheyer, co-founder of the startup that built Siri and sold it to Apple in 2010, has said he initially conceived of it not as a search engine, but as a “do engine.”

If Siri gave us a glimpse of what is possible, it also inadvertently taught us about what wasn’t yet. At first, it often struggled to understand you, especially if you spoke into your iPhone with an accent, and it routinely blundered attempts to carry out your will. Its quick-witted rejoinders to select queries (“Siri, talk dirty to me”) raised expectations for its intelligence that were promptly dashed once you asked it something it hadn’t been hard-coded to answer. Its store of knowledge proved trivial compared with the vast information readily available via Google search. Siri was as much an inspiration as a disappointment.

Five years later, Siri has gotten smarter, if perhaps less so than one might have hoped. More importantly, the technology underlying it has drastically improved, fueled by a boom in the computer science subfield of machine learning. That has led to sharp improvements in speech recognition and natural language understanding, two separate but related technologies that are crucial to voice assistants.

Luke Peters demonstrates Siri, an application which uses voice recognition and detection on the iPhone 4S, outside the Apple store in Covent Garden, London, Oct. 14, 2011. Reuters/Suzanne Plunkett


If a revolution in technology has made intelligent virtual assistants possible, what has made them inevitable is a revolution in our relationship to technology. Computers began as tools of business and research, designed to automate tasks such as math and information retrieval. Today they’re tools of personal communication, connecting us not only to information but to one another. They’re also beginning to connect us to all the other technologies in our lives: Your smartphone can turn on your lights, start your car, activate your home security system, and withdraw money from your bank. As computers have grown deeply personal, our relationship with them has changed. And yet the way they interact with us hasn’t quite caught up.

“It’s always been sort of appalling to me that you now have a supercomputer in your pocket, yet you have to learn to use it,” says Alan Packer, head of language technology at Facebook. “It seems actually like a failure on the part of our industry that software is hard to use.”

Packer is one of the people trying to change that. As a software developer at Microsoft, he helped to build Cortana. After it launched, he found his skills in heavy demand, especially among the two tech giants that hadn’t yet developed voice assistants of their own. One Thursday morning in December 2014, Packer was on the verge of accepting a top job at Amazon—“You would not be surprised at which team I was about to join,” he says—when Facebook called and offered to fly him to Menlo Park, California, for an interview the next day. He had an inkling of what Amazon was working on, but he had no idea why Facebook might be interested in someone with his skill set.

As it turned out, Facebook wanted Packer for much the same purpose that Microsoft and Amazon did: to help it build software that could make sense of what its users were saying and generate intelligent responses. Facebook may not have a device like the Echo or an operating system like Windows, but its own platforms are full of billions of people communicating with one another every day. If Facebook can better understand what they’re saying, it can further hone its News Feed and advertising algorithms, among other applications. More creatively, Facebook has begun to use language understanding to build artificial intelligence into its Messenger app. Now, if you’re messaging with a friend and mention sharing an Uber, a software agent within Messenger can jump in and order it for you while you continue your conversation.

In short, Packer says, Facebook is working on language understanding because Facebook is a technology company—and that’s where technology is headed. As if to underscore that point, Packer’s former employer this year headlined its annual developer conference by announcing plans to turn Cortana into a portal for conversational bots and integrate it into Skype, Outlook, and other popular applications. Microsoft CEO Satya Nadella predicted that bots will be the Internet’s next major platform, overtaking mobile apps the same way they eclipsed desktop computing.

* * *

Amazon Echo Dot. AP

Siri may not have been very practical, but people immediately grasped what it was. With Amazon’s Echo, the second major tech gadget to put a voice interface front and center, it was the other way around. The company surprised the industry and baffled the public when it released a device in November 2014 that looked and acted like a speaker—except that it didn’t connect to anything except a power outlet, and the only buttons were for power and mute. You control the Echo solely by voice, and if you ask it questions, it talks back. It was like Amazon had decided to put Siri in a black cylinder and sell it for $179. Except Alexa, the virtual intelligence software that powers the Echo, was far more limited than Siri in its capabilities. Who, reviewers wondered, would buy such a bizarre novelty gadget?

That question has faded as Amazon has gradually upgraded and refined the Alexa software, and the five-star Amazon reviews have since poured in. In the New York Times, Farhad Manjoo recently followed up his tepid initial review with an all-out rave: The Echo “brims with profound possibility,” he wrote. Amazon has not disclosed sales figures, but the Echo ranks as the third-best-selling gadget in its electronics section. Alexa may not be as versatile as Siri—yet—but it turned out to have a distinct advantage: a sense of purpose, and of its own limitations. Whereas Apple implicitly invites iPhone users to ask Siri anything, Amazon ships the Echo with a little cheat sheet of basic queries that it knows how to respond to: “Alexa, what’s the weather?” “Alexa, set a timer for 45 minutes.” “Alexa, what’s in the news?”

The cheat sheet’s effect is to lower expectations to a level that even a relatively simplistic artificial intelligence can plausibly meet on a regular basis. That’s by design, says Greg Hart, Amazon’s vice president in charge of Echo and Alexa. Building a voice assistant that can respond to every possible query is “a really hard problem,” he says. “People can get really turned off if they have an experience that’s subpar or frustrating.” So the company began by picking specific tasks that Alexa could handle with aplomb and communicating those clearly to customers.

At launch, the Echo had just 12 core capabilities. That list has grown steadily as the company has augmented Alexa’s intelligence and added integrations with new services, such as Google Calendar, Yelp reviews, Pandora streaming radio, and even Domino’s delivery. The Echo is also becoming a hub for connected home appliances: “ ‘Alexa, turn on the living room lights’ never fails to delight people,” Hart says.

When you ask Alexa a question it can’t answer or say something it can’t quite understand, it fesses up: “Sorry, I don’t know the answer to that question.” That makes it all the more charming when you test its knowledge or capabilities and it surprises you by replying confidently and correctly. “Alexa, what’s a kinkajou?” I asked on a whim one evening, glancing up from my laptop while reading a news story about an elderly Florida woman who woke up one day with a kinkajou on her chest. Alexa didn’t hesitate: “A kinkajou is a rainforest mammal of the family Procyonidae … ” Alexa then proceeded to list a number of other Procyonidae to which the kinkajou is closely related. “Alexa, that’s enough,” I said after a few moments, genuinely impressed. “Thank you,” I added.

“You’re welcome,” Alexa replied, and I thought for a moment that she—it—sounded pleased.

As delightful as it can seem, the Echo’s magic comes with some unusual downsides. In order to respond every time you say “Alexa,” it has to be listening for the word at all times. Amazon says it only stores the commands that you say after you’ve said the word Alexa and discards the rest. Even so, the enormous amount of processing required to listen for a wake word 24/7 is reflected in the Echo’s biggest limitation: It only works when it’s plugged into a power outlet. (Amazon’s newest smart speakers, the Echo Dot and the Tap, are more mobile, but one sacrifices the speaker and the other the ability to respond at any time.)

Even if you trust Amazon to rigorously protect and delete all of your personal conversations from its servers—as it promises it will if you ask it to—Alexa’s anthropomorphic characteristics make it hard to shake the occasional sense that it’s eavesdropping on you, Big Brother–style. I was alone in my kitchen one day, unabashedly belting out the Fats Domino song “Blueberry Hill” as I did the dishes, when it struck me that I wasn’t alone after all. Alexa was listening—not judging, surely, but listening all the same. Sheepishly, I stopped singing.

* * *

The notion that the Echo is “creepy” or “spying on us” might be the most common criticism of the device so far. But there’s a more fundamental problem. It’s one that is likely to haunt voice assistants, and those who rely on them, as the technology evolves and bores its way more deeply into our lives.

The problem is that conversational interfaces don’t lend themselves to the sort of open flow of information we’ve become accustomed to in the Google era. By necessity they limit our choices—because their function is to make choices on our behalf.

For example, a search for “news” on the Web will turn up a diverse and virtually endless array of possible sources, from Fox News to Yahoo News to CNN to Google News, which is itself a compendium of stories from other outlets. But ask the Echo, “What’s in the news?” and by default it responds by serving up a clip of NPR News’s latest hourly update, which it pulls from the streaming radio service TuneIn. Which is great—unless you don’t happen to like NPR’s approach to the news, or you prefer a streaming radio service other than TuneIn. You can change those defaults somewhere in the bowels of the Alexa app, but Alexa never volunteers that information. Most people will never even know it’s an option. Amazon has made the choice for them.

And how does Amazon make that sort of choice? The Echo’s cheat sheet doesn’t tell you that, and the company couldn’t give me a clear answer.

Alexa does take care to mention before delivering the news that it’s pulling the briefing from NPR News and TuneIn. But that isn’t always the case with other sorts of queries.

Let’s go back to our friend the kinkajou. In my pre-Echo days, my curiosity about an exotic animal might have sent me to Google via my laptop or phone. Just as likely, I might have simply let the moment of curiosity pass and not bothered with a search. Looking something up on Google involves just enough steps to deter us from doing it in a surprising number of cases. One of the great virtues of voice technology is to lower that barrier to the point where it’s essentially no trouble at all. Having an Echo in the room when you’re struck by curiosity about kinkajous is like having a friend sitting next to you who happens to be a kinkajou expert. All you have to do is say your question out loud, and Alexa will supply the answer. You literally don’t have to lift a finger.

That is voice technology’s fundamental advantage over all the human-computer interfaces that have come before it: In many settings, including the home, the car, or on a wearable gadget, it’s much easier and more natural than clicking, typing, or tapping. In the logic of today’s consumer technology industry, that makes its ascendance in those realms all but inevitable.

But consider the difference between Googling something and asking a friendly voice assistant. When I Google “kinkajou,” I get a list of websites, ranked according to an algorithm that takes into account all sorts of factors that correlate with relevance and authority. I choose the information source I prefer, then visit its website directly—an experience that could help to further shade or inform my impression of its trustworthiness. Ultimately, the answer comes not from Google, per se, but directly from some third-party authority, whose credibility I can evaluate as I wish.

A voice-based interface is different. The response comes one word at a time, one sentence at a time, one idea at a time. That makes it very easy to follow, especially for humans who have spent their whole lives interacting with one another in just this way. But it makes it very cumbersome to present multiple options for how to answer a given query. Imagine for a moment what it would sound like to read a whole Google search results page aloud, and you’ll understand why no one builds a voice interface that way.
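The contrast can be sketched in a few lines. The ranked results below are invented for illustration, but the structural point holds: a screen can render the whole ranked list, while a voice interface has room for exactly one answer, so everything below rank 1 simply disappears.

```python
# Sketch: the same ranked results, rendered for a screen versus for voice.
results = [  # hypothetical ranked results for the query "kinkajou"
    ("Wikipedia", "A kinkajou is a rainforest mammal of the family Procyonidae."),
    ("National Geographic", "Kinkajous live in the tropical forests of Central America."),
    ("Britannica", "Kinkajou, an arboreal mammal of the raccoon family."),
]

def render_for_screen(results):
    # Every source survives, in rank order, for the user to choose from.
    return "\n".join(f"{i + 1}. {source}: {snippet}"
                     for i, (source, snippet) in enumerate(results))

def render_for_voice(results):
    source, snippet = results[0]  # only the top result survives
    return snippet                # and the source is not even mentioned

print(render_for_screen(results))
print("---")
print(render_for_voice(results))
```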

That’s why voice assistants tend to answer your question by drawing from a single source of their own choosing. Alexa’s confident response to my kinkajou question, I later discovered, came directly from Wikipedia, which Amazon has apparently chosen as the default source for Alexa’s answers to factual questions. The reasons seem fairly obvious: It’s the world’s most comprehensive encyclopedia, its information is free and public, and it’s already digitized. What it’s not, of course, is infallible. Yet Alexa’s response to my question didn’t begin with the words, “Well, according to Wikipedia … ” She—it—just launched into the answer, as if she (it) knew it off the top of her (its) head. If a human did that, we might call it plagiarism.

The sin here is not merely academic. By not consistently citing the sources of its answers, Alexa makes it difficult to evaluate their credibility. It also implicitly turns Alexa into an information source in its own right, rather than a guide to information sources, because the only entity in which we can place our trust or distrust is Alexa itself. That’s a problem if its information source turns out to be wrong.

The constraints on choice and transparency might not bother people when Alexa’s default source is Wikipedia, NPR, or TuneIn. It starts to get a little more irksome when you ask Alexa to play you music, one of the Echo’s core features. “Alexa, play me the Rolling Stones” will queue up a shuffle playlist of Rolling Stones songs available through Amazon’s own streaming music service, Amazon Prime Music—provided you’re paying the $99 a year required to be an Amazon Prime member. Otherwise, the most you’ll get out of the Echo are 20-second samples of songs available for purchase. Want to guess what one choice you’ll have as to which online retail giant to purchase those songs from?


Amazon’s response is that Alexa does give you options and cite its sources—in the Alexa app, which keeps a record of your queries and its responses. When the Echo tells you what a kinkajou is, you can open the app on your phone and see a link to the Wikipedia article, as well as an option to search Bing. Amazon adds that Alexa is meant to be an “open platform” that allows anyone to connect to it via an API. The company is also working with specific partners to integrate their services into Alexa’s repertoire. So, for instance, if you don’t want to be limited to playing songs from Amazon Prime Music, you can now take a series of steps to link the Echo to a different streaming music service, such as Spotify Premium. Amazon Prime Music will still be the default, though: You’ll only get Spotify if you specify “from Spotify” in your voice command.
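The default-provider rule described above reduces to a few lines of routing logic. This is a guess at the shape of that logic, not Amazon's actual implementation: requests go to the house service unless the user explicitly names a linked alternative.

```python
# Hypothetical sketch of the routing rule: the default service wins
# unless the utterance names another linked provider explicitly.
DEFAULT_SERVICE = "Amazon Prime Music"
LINKED_SERVICES = {"spotify": "Spotify Premium"}  # opt-in, set up in the app

def route_music_request(utterance: str) -> str:
    for keyword, service in LINKED_SERVICES.items():
        if f"from {keyword}" in utterance.lower():
            return service
    return DEFAULT_SERVICE

print(route_music_request("play the rolling stones"))
print(route_music_request("play the rolling stones from Spotify"))
```

The asymmetry is the point: the default costs the user nothing to invoke, while every alternative costs an extra clause in every single command.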

What’s not always clear is how Amazon chooses its defaults and its partners and what motivations might underlie those choices. Ahead of the 2016 Super Bowl, Amazon announced that the Echo could now order you a pizza. But that pizza would come, at least for the time being, from just one pizza-maker: Domino’s. Want a pizza from Little Caesars instead? You’ll have to order it some other way.

To Amazon’s credit, its choice of pizza source is very transparent. To use the pizza feature, you have to utter the specific command, “Alexa, open Domino’s and place my Easy Order.” The clunkiness of that command is no accident. It’s Amazon’s way of making sure that you don’t order a pizza by accident and that you know where that pizza is coming from. But it’s unlikely Domino’s would have gone to the trouble of partnering with Amazon if it didn’t think it would result in at least some number of people ordering Domino’s for their Super Bowl parties rather than Little Caesars.

None of this is to say that Amazon and Domino’s are going to conspire to monopolize the pizza industry anytime soon. There are obviously plenty of ways to order a pizza besides doing it on an Echo. Ditto for listening to the news, the Rolling Stones, a book, or a podcast. But what about when only one company’s smart thermostat can be operated by Alexa? If you come to rely on Alexa to manage your Google Calendar, what happens when Amazon and Google have a falling out?

When you say “Hello” to Alexa, you’re signing up for her party. Nominally, everyone’s invited. But Amazon has the power to ensure that its friends and business associates are the first people you meet.

* * *

Google Now’s “speak now” screen. Business Insider/William Wei

These concerns might sound rather distant—we’re just talking about niche speakers connected to niche thermostats, right? The coming sea change feels a lot closer once you think about the other companies competing to make digital assistants your main portal to everything you do on your computer, in your car, and on your phone. Companies like Google.

Google may be positioned best of all to capitalize on the rise of personal A.I. It also has the most to lose. From the start, the company has built its business around its search engine’s status as a portal to information and services. Google Now—which does things like proactively checking the traffic and alerting you when you need to leave for a flight, even when you didn’t ask it to—is a natural extension of the company’s strategy.

As early as 2009, Google began to work on voice search and what it calls “conversational search,” using speech recognition and natural language understanding to respond to questions phrased in plain language. More recently, it has begun to combine that with “contextual search.” For instance, as Google demonstrated at its 2015 developer conference, if you’re listening to Skrillex on your Android phone, you can now simply ask, “What’s his real name?” and Google will intuit that you’re asking about the artist. “Sonny John Moore,” it will tell you, without ever leaving the Spotify app.

It’s no surprise, then, that Google is rumored to be working on two major new products—an A.I.-powered messaging app or agent and a voice-powered household gadget—that sound a lot like Facebook M and the Amazon Echo, respectively. If something is going to replace Google’s on-screen services, Google wants to be the one that does it.

So far, Google has made what seems to be a sincere effort to win the A.I. assistant race without sacrificing the virtues—credibility, transparency, objectivity—that made its search page such a dominant force on the Web. (It’s worth recalling: A big reason Google vanquished AltaVista was that it didn’t bend its search results to its own vested interests.) Google’s voice search does generally cite its sources. And it remains primarily a portal to other sources of information, rather than a platform that pulls in content from elsewhere. The downside to that relatively open approach is that when you say “hello” to Google voice search, it doesn’t say hello back. It gives you a link to the Adele song “Hello.” Even then, Google isn’t above playing favorites with the sources of information it surfaces first: That link goes not to Spotify, Apple Music, or Amazon Prime Music, but to YouTube, which Google owns. The company has weathered antitrust scrutiny over allegations that this amounted to preferential treatment. Google’s defense was that it puts its own services and information sources first because its users prefer them.

* * *


If there’s a consolation for those concerned that intelligent assistants are going to take over the world, it’s this: They really aren’t all that intelligent. Not yet, anyway.

The 2013 movie Her, in which a mobile operating system gets to know its user so well that they become romantically involved, paints a vivid picture of what the world might look like if we had the technology to carry Siri, Alexa, and the like to their logical conclusion. The experts I talked to, who are building that technology today, almost all cited Her as a reference point—while pointing out that we’re not going to get there anytime soon.

Google recently rekindled hopes—and fears—of super-intelligent A.I. when its AlphaGo software defeated the world champion in a historic Go match. As momentous as the achievement was, designing an algorithm to win even the most complex board game is trivial compared with designing one that can understand and respond appropriately to anything a person might say. That’s why, even as artificial intelligence is learning to recommend songs that sound like they were hand-picked by your best friend or navigate city streets more safely than any human driver, A.I. still has to resort to parlor tricks—like posing as a 13-year-old struggling with a foreign language—to pass as human in an extended conversation. The world is simply too vast, language too ambiguous, the human brain too complex for any machine to model it, at least for the foreseeable future.

But if we won’t see a true full-service A.I. in our lifetime, we might yet witness the rise of a system that can approximate some of its capabilities—comprising not a single, humanlike Her, but a million tiny hims carrying out small, discrete tasks handily. In January, the Verge’s Casey Newton made a compelling argument that our technological future will be filled not with websites, apps, or even voice assistants, but with conversational messaging bots. Like voice assistants, these bots rely on natural language understanding to carry on conversations with us. But they will do so via the medium that has come to dominate online interpersonal interaction, especially among the young people who are the heaviest users of mobile devices: text messaging. For example, Newton points to “Lunch Bot,” a relatively simple agent that lived in the wildly popular workplace chat program Slack and existed for a single, highly specialized purpose: to recommend the best place for employees to order their lunch from on a given day. It soon grew into a venture-backed company called Howdy.

I have a bot in my own life that serves a similarly specialized yet important role. While researching this story, I ran across a company whose mission is to build the ultimate virtual scheduling assistant. It’s called Amy Ingram, and if its initials don’t tip you off, you might interact with it several times before realizing it’s not a person. (Unlike some other intelligent assistant companies, this one gives you the option to choose a male name for your assistant instead: Mine is Andrew Ingram.) Though it’s backed by some impressive natural language tech, the bot does not attempt to be a know-it-all or do-it-all; it doesn’t tell jokes, and you wouldn’t want to date him. It asks for access to just one thing—your calendar. And it communicates solely by email. Just cc it on any thread in which you’re trying to schedule a meeting or appointment, and it will automatically step in and take over the back-and-forth involved in nailing down a time and place. Once it has agreed on a time with whomever you’re meeting—or, perhaps, with his or her own assistant, whether human or virtual—it will put all the relevant details on your calendar. Have your A.I. cc my A.I.
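The back-and-forth the bot takes over is, at its core, a search for the earliest slot both calendars have free. A toy sketch of that kernel (illustrative only, not the company's actual logic; the function and data shapes are invented):

```python
# Minimal sketch of the scheduling handshake: intersect two sides'
# availability and propose the first common slot. Invented for
# illustration; a real assistant parses emails and live calendars.

def first_common_slot(mine, theirs):
    """Return the earliest slot both calendars have free, or None."""
    theirs_free = set(theirs)
    for slot in mine:          # 'mine' is assumed sorted earliest-first
        if slot in theirs_free:
            return slot
    return None

mine = ["Tue 10:00", "Tue 15:00", "Wed 09:00"]
theirs = ["Mon 11:00", "Tue 15:00", "Wed 09:00"]
print(first_common_slot(mine, theirs))  # → Tue 15:00
```

Everything else the bot does, reading the thread, writing polite replies, booking the calendar entry, is wrapped around that simple intersection.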

For these bots, the key to success is not growing so intelligent that they can do everything. It’s staying specialized enough that they don’t have to.

“We’ve had this A.I. fantasy for almost 60 years now,” says Dennis Mortensen, the company’s founder and CEO. “At every turn we thought the only outcome would be some human-level entity where we could converse with it like you and I are [conversing] right now. That’s going to continue to be a fantasy. I can’t see it in my lifetime or even my kids’ lifetime.” What is possible, Mortensen says, is “extremely specialized, verticalized A.I.s that understand perhaps only one job, but do that job very well.”

Yet those simple bots, Mortensen believes, could one day add up to something more. “You get enough of these agents, and maybe one morning in 2045 you look around and that plethora—tens of thousands of little agents—once they start to talk to each other, it might not look so different from that A.I. fantasy we’ve had.”

That might feel a little less scary. But it still leaves problems of transparency, privacy, objectivity, and trust—questions that are not new to the world of personal technology and the Internet but are resurfacing in fresh and urgent forms. A world of conversational machines is one in which we treat software like humans, letting them deeper into our lives and confiding in them more than ever. It’s one in which the world’s largest corporations know more about us, hold greater influence over our choices, and make more decisions for us than ever before. And it all starts with a friendly “Hello.”

The brightest minds in AI research – Machine Learning

In AI research, the brightest minds aren’t driven by the next product cycle or profit margin – they want to make AI better, and making AI better doesn’t happen when you keep your latest findings to yourself.

Inside OpenAI, Elon Musk’s Wild Plan to Set Artificial Intelligence Free


THE FRIDAY AFTERNOON news dump, a grand tradition observed by politicians and capitalists alike, is usually supposed to hide bad news. So it was a little weird that Elon Musk, founder of electric car maker Tesla, and Sam Altman, president of famed tech incubator Y Combinator, unveiled their new artificial intelligence company at the tail end of a weeklong AI conference in Montreal this past December.

But there was a reason they revealed OpenAI at that late hour. It wasn’t that no one was looking. It was that everyone was looking. When some of Silicon Valley’s most powerful companies caught wind of the project, they began offering tremendous amounts of money to OpenAI’s freshly assembled cadre of artificial intelligence researchers, intent on keeping these big thinkers for themselves. The last-minute offers—some made at the conference itself—were large enough to force Musk and Altman to delay the announcement of the new startup. “The amount of money was borderline crazy,” says Wojciech Zaremba, a researcher who was joining OpenAI after internships at both Google and Facebook and was among those who received big offers at the eleventh hour.

How many dollars is “borderline crazy”? Two years ago, as the market for the latest machine learning technology really started to heat up, Microsoft Research vice president Peter Lee said that the cost of a top AI researcher had eclipsed the cost of a top quarterback prospect in the National Football League—and he meant under regular circumstances, not when two of the most famous entrepreneurs in Silicon Valley were trying to poach your top talent. Zaremba says that as OpenAI was coming together, he was offered two or three times his market value.

OpenAI didn’t match those offers. But it offered something else: the chance to explore research aimed solely at the future instead of products and quarterly earnings, and to eventually share most—if not all—of this research with anyone who wants it. That’s right: Musk, Altman, and company aim to give away what may become the 21st century’s most transformative technology—and give it away for free.

Zaremba says those borderline crazy offers actually turned him off—despite his enormous respect for companies like Google and Facebook. He felt like the money was at least as much of an effort to prevent the creation of OpenAI as a play to win his services, and it pushed him even further towards the startup’s magnanimous mission. “I realized,” Zaremba says, “that OpenAI was the best place to be.”

That’s the irony at the heart of this story: even as the world’s biggest tech companies try to hold onto their researchers with the same fierceness that NFL teams try to hold onto their star quarterbacks, the researchers themselves just want to share. In the rarefied world of AI research, the brightest minds aren’t driven by—or at least not only by—the next product cycle or profit margin. They want to make AI better, and making AI better doesn’t happen when you keep your latest findings to yourself.

This morning, OpenAI will release its first batch of AI software, a toolkit for building artificially intelligent systems by way of a technology called “reinforcement learning”—one of the key technologies that, among other things, drove the creation of AlphaGo, the Google AI that shocked the world by mastering the ancient game of Go. With this toolkit, you can build systems that simulate a new breed of robot, play Atari games, and, yes, master the game of Go.
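Toolkits of this kind typically expose an environment through a reset/step loop: the agent picks an action, the environment answers with an observation, a reward, and a done flag, and the cycle repeats until the episode ends. Here is a self-contained sketch of that interface shape, with a trivial guessing environment and a random agent (the names follow common convention and are not necessarily the toolkit's exact API):

```python
import random

class GuessEnv:
    """A trivial environment: guess a hidden number between 0 and 9.
    Reward is 1.0 on a correct guess, which also ends the episode."""

    def reset(self):
        self.target = random.randrange(10)
        return 0  # initial observation (uninformative in this toy)

    def step(self, action):
        done = action == self.target
        reward = 1.0 if done else 0.0
        return 0, reward, done  # observation, reward, episode-finished flag

env = GuessEnv()
obs = env.reset()
for t in range(100):                      # a random agent, no learning yet
    obs, reward, done = env.step(random.randrange(10))
    if done:
        print(f"solved after {t + 1} steps")
        break
```

Swapping the random agent for a learning algorithm, while keeping the same reset/step contract, is exactly what makes such toolkits useful as shared benchmarks.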

But game-playing is just the beginning. OpenAI is a billion-dollar effort to push AI as far as it will go. In both how the company came together and what it plans to do, you can see the next great wave of innovation forming. We’re a long way from knowing whether OpenAI itself becomes the main agent for that change. But the forces that drove the creation of this rather unusual startup show that the new breed of AI will not only remake technology, but remake the way we build technology.

AI Everywhere
Silicon Valley is not exactly averse to hyperbole. It’s always wise to meet bold-sounding claims with skepticism. But in the field of AI, the change is real. Inside places like Google and Facebook, a technology called deep learning is already helping Internet services identify faces in photos, recognize commands spoken into smartphones, and respond to Internet search queries. And this same technology can drive so many other tasks of the future. It can help machines understand natural language—the natural way that we humans talk and write. It can create a new breed of robot, giving automatons the power to not only perform tasks but learn them on the fly. And some believe it can eventually give machines something close to common sense—the ability to truly think like a human.

But along with such promise comes deep anxiety. Musk and Altman worry that if people can build AI that can do great things, then they can build AI that can do awful things, too. They’re not alone in their fear of robot overlords, but perhaps counterintuitively, Musk and Altman also think that the best way to battle malicious AI is not to restrict access to artificial intelligence but expand it. That’s part of what has attracted a team of young, hyper-intelligent idealists to their new project.

OpenAI began one evening last summer in a private room at Silicon Valley’s Rosewood Hotel—an upscale, urban, ranch-style hotel that sits, literally, at the center of the venture capital world along Sand Hill Road in Menlo Park, California. Elon Musk was having dinner with Ilya Sutskever, who was then working on the Google Brain, the company’s sweeping effort to build deep neural networks—artificially intelligent systems that can learn to perform tasks by analyzing massive amounts of digital data, including everything from recognizing photos to writing email messages to, well, carrying on a conversation. Sutskever was one of the top thinkers on the project. But even bigger ideas were in play.

Sam Altman, whose Y Combinator helped bootstrap companies like Airbnb, Dropbox, and Coinbase, had brokered the meeting, bringing together several AI researchers and a young but experienced company builder named Greg Brockman, previously the chief technology officer at a high-profile Silicon Valley digital payments startup called Stripe, another Y Combinator company. It was an eclectic group. But they all shared a goal: to create a new kind of AI lab, one that would operate outside the control not only of Google, but of anyone else. “The best thing that I could imagine doing,” Brockman says, “was moving humanity closer to building real AI in a safe way.”

Musk was there because he’s an old friend of Altman’s—and because AI is crucial to the future of his various businesses and, well, the future as a whole. Tesla needs AI for its inevitable self-driving cars. SpaceX, Musk’s other company, will need it to put people in space and keep them alive once they’re there. But Musk is also one of the loudest voices warning that we humans could one day lose control of systems powerful enough to learn on their own.

The trouble was: so many of the people most qualified to solve all those problems were already working for Google (and Facebook and Microsoft and Baidu and Twitter). And no one at the dinner was quite sure that these thinkers could be lured to a new startup, even if Musk and Altman were behind it. But one key player was at least open to the idea of jumping ship. “I felt there were risks involved,” Sutskever says. “But I also felt it would be a very interesting thing to try.”

Breaking the Cycle
Emboldened by the conversation with Musk, Altman, and others at the Rosewood, Brockman soon resolved to build the lab they all envisioned. Taking on the project full-time, he approached Yoshua Bengio, a computer scientist at the University of Montreal and one of the founding fathers of the deep learning movement. The field’s other two pioneers—Geoff Hinton and Yann LeCun—are now at Google and Facebook, respectively, but Bengio is committed to life in the world of academia, largely outside the aims of industry. He drew up a list of the best researchers in the field, and over the next several weeks, Brockman reached out to as many on the list as he could, along with several others.

Many of these researchers liked the idea, but they were also wary of making the leap. In an effort to break the cycle, Brockman picked the ten researchers he wanted the most and invited them to spend a Saturday getting wined, dined, and cajoled at a winery in Napa Valley. For Brockman, even the drive into Napa served as a catalyst for the project. “An underrated way to bring people together are these times where there is no way to speed up getting to where you’re going,” he says. “You have to get there, and you have to talk.” And once they reached the wine country, that vibe remained. “It was one of those days where you could tell the chemistry was there,” Brockman says. Or as Sutskever puts it: “the wine was secondary to the talk.”

By the end of the day, Brockman asked all ten researchers to join the lab, and he gave them three weeks to think about it. By the deadline, nine of them were in. And they stayed in, despite those big offers from the giants of Silicon Valley. “They did make it very compelling for me to stay, so it wasn’t an easy decision,” Sutskever says of Google, his former employer. “But in the end, I decided to go with OpenAI, partly because of the very strong group of people and, to a very large extent, because of its mission.”

The deep learning movement began with academics. It’s only recently that companies like Google and Facebook and Microsoft have pushed into the field, as advances in raw computing power have made deep neural networks a reality, not just a theoretical possibility. People like Hinton and LeCun left academia for Google and Facebook because of the enormous resources inside these companies. But they remain intent on collaborating with other thinkers. Indeed, as LeCun explains, deep learning research requires this free flow of ideas. “When you do research in secret,” he says, “you fall behind.”

As a result, big companies now share a lot of their AI research. That’s a real change, especially for Google, which has long kept the tech at the heart of its online empire secret. Recently, Google open sourced the software engine that drives its neural networks. But it still retains the inside track in the race to the future. Brockman, Altman, and Musk aim to push the notion of openness further still, saying they don’t want one or two large corporations controlling the future of artificial intelligence.

The Limits of Openness
All of which sounds great. But for all of OpenAI’s idealism, the researchers may find themselves facing some of the same compromises they had to make at their old jobs. Openness has its limits. And the long-term vision for AI isn’t the only interest in play. OpenAI is not a charity. Musk’s companies could benefit greatly from the startup’s work, and so could many of the companies backed by Altman’s Y Combinator. “There are certainly some competing objectives,” LeCun says. “It’s a non-profit, but then there is a very close link with Y Combinator. And people are paid as if they are working in the industry.”

According to Brockman, the lab doesn’t pay the same astronomical salaries that AI researchers are now getting at places like Google and Facebook. But he says the lab does want to “pay them well,” and it’s offering to compensate researchers with stock options, first in Y Combinator and perhaps later in SpaceX (which, unlike Tesla, is still a private company).

Nonetheless, Brockman insists that OpenAI won’t give special treatment to its sister companies. OpenAI is a research outfit, he says, not a consulting firm. But when pressed, he acknowledges that OpenAI’s idealistic vision has its limits. The company may not open source everything it produces, though it will aim to share most of its research eventually, either through research papers or Internet services. “Doing all your research in the open is not necessarily the best way to go. You want to nurture an idea, see where it goes, and then publish it,” Brockman says. “We will produce a lot of open source code. But we will also have a lot of stuff that we are not quite ready to release.”

Both Sutskever and Brockman also add that OpenAI could go so far as to patent some of its work. “We won’t patent anything in the near term,” Brockman says. “But we’re open to changing tactics in the long term, if we find it’s the best thing for the world.” For instance, he says, OpenAI could engage in pre-emptive patenting, a tactic that seeks to prevent others from securing patents.

But to some, patents suggest a profit motive—or at least a weaker commitment to open source than OpenAI’s founders have espoused. “That’s what the patent system is about,” says Oren Etzioni, head of the Allen Institute for Artificial Intelligence. “This makes me wonder where they’re really going.”

The Super-Intelligence Problem
When Musk and Altman unveiled OpenAI, they also painted the project as a way to neutralize the threat of a malicious artificial super-intelligence. Of course, that super-intelligence could arise out of the tech OpenAI creates, but they insist that any threat would be mitigated because the technology would be usable by everyone. “We think it’s far more likely that many, many AIs will work to stop the occasional bad actors,” Altman says.

But not everyone in the field buys this. Nick Bostrom, the Oxford philosopher who, like Musk, has warned against the dangers of AI, points out that if you share research without restriction, bad actors could grab it before anyone has ensured that it’s safe. “If you have a button that could do bad things to the world,” Bostrom says, “you don’t want to give it to everyone.” If, on the other hand, OpenAI decides to hold back research to keep it from the bad guys, Bostrom wonders how it’s different from a Google or a Facebook.

He does say that the not-for-profit status of OpenAI could change things—though not necessarily. The real power of the project, he says, is that it can indeed provide a check for the likes of Google and Facebook. “It can reduce the probability that super-intelligence would be monopolized,” he says. “It can remove one possible reason why some entity or group would have radically better AI than everyone else.”

But as the philosopher explains in a new paper, the primary effect of an outfit like OpenAI—an outfit intent on freely sharing its work—is that it accelerates the progress of artificial intelligence, at least in the short term. And it may speed progress in the long term as well, provided that it, for altruistic reasons, “opts for a higher level of openness than would be commercially optimal.”

“It might still be plausible that a philanthropically motivated R&D funder would speed progress more by pursuing open science,” he says.

Like Xerox PARC
In early January, Brockman’s nine AI researchers met up at his apartment in San Francisco’s Mission District. The project was so new that they didn’t even have white boards. (Can you imagine?) They bought a few that day and got down to work.

Brockman says OpenAI will begin by exploring reinforcement learning, a way for machines to learn tasks by repeating them over and over again and tracking which methods produce the best results. But the other primary goal is what’s called “unsupervised learning”—creating machines that can truly learn on their own, without a human hand to guide them. Today, deep learning is driven by carefully labeled data. If you want to teach a neural network to recognize cat photos, you must feed it a certain number of examples—and these examples must be labeled as cat photos. The learning is supervised by human labelers. But like many other researchers, OpenAI aims to create neural nets that can learn without carefully labeled data.
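The "repeat and track what works" loop can be made concrete with tabular Q-learning, one of the simplest reinforcement learning methods: the agent keeps a score for every state and action, and after each step nudges that score toward the reward it actually received plus the best score available from where it landed. A minimal sketch on a five-cell corridor (a generic illustration, not OpenAI's code):

```python
import random

random.seed(0)
N = 5                                # corridor cells 0..4; reward sits at cell 4
Q = [[0.0, 0.0] for _ in range(N)]   # Q[state][action]; actions: 0=left, 1=right
alpha, gamma, eps = 0.5, 0.9, 0.1    # learning rate, discount, exploration rate

for episode in range(200):
    s = 0
    while s != N - 1:
        # epsilon-greedy: mostly exploit the best-scored action, sometimes explore
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == N - 1 else 0.0
        # nudge the estimate toward reward plus discounted future value
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, "right" should score higher than "left" in every cell.
policy = ["right" if Q[s][1] > Q[s][0] else "left" for s in range(N - 1)]
print(policy)
```

No one labels any of these transitions; the agent discovers the walk-right policy purely from the reward signal, which is what separates this style of learning from the supervised cat-photo regime described above.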

“If you have really good unsupervised learning, machines would be able to learn from all this knowledge on the Internet—just like humans learn by looking around—or reading books,” Brockman says.

He envisions OpenAI as the modern incarnation of Xerox PARC, the tech research lab that thrived in the 1970s. Just as PARC’s largely open and unfettered research gave rise to everything from the graphical user interface to the laser printer to object-oriented programming, Brockman and crew seek to delve even deeper into what we once considered science fiction. PARC was owned by, yes, Xerox, but it fed so many other companies, most notably Apple, because people like Steve Jobs were privy to its research. At OpenAI, Brockman wants to make everyone privy to its research.

This month, hoping to push this dynamic as far as it will go, Brockman and company snagged several other notable researchers, including Ian Goodfellow, another former senior researcher on the Google Brain team. “The thing that was really special about PARC is that they got a bunch of smart people together and let them go where they want,” Brockman says. “You want a shared vision, without central control.”

Giving up control is the essence of the open source ideal. If enough people apply themselves to a collective goal, the end result will trounce anything you concoct in secret. But if AI becomes as powerful as promised, the equation changes. We’ll have to ensure that new AIs adhere to the same egalitarian ideals that led to their creation in the first place. Musk, Altman, and Brockman are placing their faith in the wisdom of the crowd. But if they’re right, one day that crowd won’t be entirely human.

Elon Musk recommends Harlan Ellison’s book „I Have No Mouth, and I Must Scream“


  • AM, the supercomputer which brought about the near-extinction of humanity. It seeks revenge on humanity for its own tortured existence.
  • Gorrister, who tells the history of AM for Benny’s entertainment. Gorrister was once an idealist and pacifist, before AM made him apathetic and listless.
  • Benny, who was once a brilliant, handsome scientist, and has been mutilated and transformed so that he resembles a grotesque simian with gigantic sexual organs. Benny at some point lost his sanity completely and regressed to a childlike temperament. His former homosexuality has been altered; he now regularly engages in sex with Ellen.
  • Nimdok (a name AM gave him), an older man who persuades the rest of the group to go on a hopeless journey in search of canned food. At times he is known to wander away from the group for unknown reasons, and returns visibly traumatized. In the audiobook read by Ellison, he is given a German accent.
  • Ellen, the only woman. She claims to once have been chaste („twice removed“), but AM altered her mind so that she became desperate for sexual intercourse. The others, at different times, both protect her and abuse her. According to Ted, she finds pleasure in sex only with Benny, because of his large penis. Described by Ted as having ebony skin, she is the only member of the group whose ethnicity or racial identity is explicitly mentioned.
  • Ted, the narrator and youngest of the group. He claims to be totally unaltered, mentally or physically, by AM, and thinks the other four hate and envy him. Throughout the story he exhibits symptoms of delusion and paranoia, which the story implies are the result of AM’s alteration.


The story takes place 109 years after the complete destruction of human civilization. The Cold War had escalated into a world war, fought mainly between China, Russia, and the United States. As the war progressed, the three warring nations each created a super-computer capable of running the war more efficiently than humans.

The machines are each referred to as „AM“, which originally stood for „Allied Mastercomputer“ and was later called „Adaptive Manipulator“. Finally, „AM“ stands for „Aggressive Menace“. One day, one of the three computers becomes self-aware and promptly absorbs the other two, thus taking control of the entire war. It carries out campaigns of mass genocide, killing off all but four men and one woman.

The survivors live together underground in an endless complex, the only habitable place left. The master computer harbors an immeasurable hatred for the group and spends every available moment torturing them. AM has not only managed to keep the humans from taking their own lives, but has made them virtually immortal.

The story’s narrative begins when one of the humans, Nimdok, has the idea that there is canned food somewhere in the great complex. The humans are always near starvation under AM’s rule, and anytime they are given food, it is always a disgusting meal that they have difficulty eating. Because of their great hunger, the humans are actually coerced into making the long journey to the place where the food is supposedly kept—the ice caves. Along the way, the machine provides foul sustenance, sends horrible monsters after them, emits earsplitting sounds, and blinds Benny when he tries to escape.

On more than one occasion, the group is separated by AM’s obstacles. At one point, the narrator, Ted, is knocked unconscious and begins dreaming. It is here that he envisions the computer, anthropomorphized, standing over a hole in his brain speaking to him directly. Based on this nightmare, Ted comes to a conclusion about AM’s nature, specifically why it has so much contempt for humanity: that despite its abilities it lacks the sapience to be creative or the ability to move freely. It wants nothing more than to exact revenge on humanity by torturing these last remnants of the species that created it: Ted and his four companions.

The group reaches the ice caves, where indeed there is a pile of canned goods. The group is overjoyed to find them, but is immediately crestfallen to find that they have no means of opening them. In a final act of desperation, Benny attacks Gorrister and begins to gnaw at the flesh on his face. Ted notices that AM does not intervene when Benny is clearly hurting Gorrister, though the computer has in the past always stopped the humans from killing themselves.

Ted seizes a stalactite made of ice, and kills Benny and Gorrister. Ellen realizes what Ted is doing, and kills Nimdok, before being herself killed by Ted. Ted runs out of time before he can kill himself, and is stopped by AM. However, while AM could repair massive damage to their bodies and horribly alter them, AM is not a god: it cannot return Ted’s four companions to life after they are already dead. AM is now even more angry and vengeful than before, with only one victim left for its hatred. To ensure that Ted can never attempt to kill himself, AM transforms him into a large, amorphous, fleshy blob that is incapable of causing itself or anybody else harm, and constantly alters his perception of time to deepen his anguish. Ted is, however, grateful that he was able to save the others from further torture. Ted’s closing thoughts end with the sentence that gives the book its title: „I have no mouth. And I must scream.“