Robust.AI’s Rodney Brooks offers valuable insights on robotics and AI, from early beginnings to future prospects in human-machine interaction.
Last week’s Automate Forward event featured numerous keynote sessions around robotics, AI, manufacturing and automation. One speech I thoroughly enjoyed was delivered by Rodney Brooks, co-founder and CTO of Robust.AI. Brooks—previously co-founder and CTO of iRobot and Rethink Robotics—is the Panasonic Professor of Robotics (emeritus) at MIT, where he also served as director of the Computer Science and Artificial Intelligence Laboratory (CSAIL).
Since the original version of his session was quite long (over 8000 words!), the transcript below has been edited for clarity and brevity.
Very little has actually been done to date in artificial intelligence and robots. I’ve been involved in both fields for a long time. In 1977, I joined the robotics group at the Stanford Artificial Intelligence Lab. Over the years, I’ve delivered tens of millions of robots in one domain and many thousands of robots in two other domains—all with AI on them.
When we look back from the future, robotics and AI will have a pantheon of important contributors, just as computing does today. Let me go over my favorites from the computer pantheon to set up the analogy.
We start with Countess Ada Lovelace, who wrote a paper in the 1840s describing how to make a computer do things. It was a mechanical computer and she wrote the first computer program. (By the way, she also wrote the first bug.) She was never able to test the code at the time, but she speculated that computers could be programmed to do much more than just numerical computation. So she sort of opened up the idea that there could be artificial intelligence.
Then in the 1950s, Admiral Grace Hopper came along at a time when computers were all unique in their architecture and machine language. She drove the creation of COBOL, the common business-oriented language: a computer language for business that was independent of the underlying computer. Everything changed; coding changed. It was a tremendous advance in two ways: you could write a program and run it on different manufacturers’ computers, and it could be for business. This transformed computers from machines for calculating artillery tables into machines for business.
Another great member of the pantheon of computer science, Douglas Engelbart, came along in the late 1960s. He realized we had to have better ways of interacting with computers, and invented the first mouse, along with all sorts of other stuff that we use today.
And that was commercialized by a guy named Steve Jobs, who got the technology out into the world.
These four are my pantheon of computer science’s important people—but in robotics and AI, the roles are still open. I want to encourage every person out there who’s getting into robots and AI to step up and fill the roles. There’s plenty of room to do great things, and most of the important work hasn’t been done yet.
Every machine in the future is going to be intelligent. All machines are going to be robots with AI on them. That’s a long way from where we are today, but over the next few decades we’re going to see old machines become robots and new machines start as robots. We already see the precursors happening.
Let’s look at automobiles as an example. When they first started out as a new invention, we didn’t even have roads. And then over the next 50 or 60 years, cars turned into really reliable machines. Over the last couple of decades, we’ve seen the turn toward the digitalized machine. If you buy a new car from almost any manufacturer, you get a lot of digital technology in your car. And once they’re digital machines, they’re ripe to become autonomous machines. Tens of billions of dollars are being fed into making automobiles autonomous machines. We see the trend here: new invention, reliable machine, digitalized machine, autonomous machine.
The lesson of Moore’s Law has been that if there’s a digital way to do something that was previously done mechanically, then eventually digital wins. You still need motors, but everything else that was done with things like mechanical regulators eventually becomes digital. We’re also all about IoT these days: analytics and predictive modeling. That’s pushing more digitalization out into machines. So the days of non-digitalized machines are in the rearview mirror—or slightly more accurately, they’re in the backward-looking camera.
Robotification follows soon after that. For example, turning underground mining machines into robots gets people out of working in dangerous conditions 10,000 feet underground. Then there are small vehicles around town doing dull jobs like cleaning; those can be turned into robots. Bigger machines for farming have been digitalized and robotified to some extent, but small machines are not there yet, so there’s a lot of room there. Loading luggage canisters onto planes is becoming more digitalized as part of tracking. Machines for gardening, groundskeeping, and construction are all becoming digitalized at the moment, so they’re ripe for robotification.
As we look forward, things can also change. Maybe we’ll go straight to robotification because the new machines of today are already digitalized, so there’s no need to wait for that digitalization step. New classes of machines will be born robotic.
There are places where we currently have no machines, but there’s a real need for more labor because of the way the population and our lifestyles are changing. In elder care, there’s a demographic inversion: there used to be many more younger people than older people, but the balance is now shifting toward the old, and we’re running out of people across the world to help look after the elderly. There’ll be real demand for robots that can address, not necessarily the human-to-human touch parts, but the other aspects of daily life for the elderly.
As climate change comes along and we move more and more towards controlled environments, there’s going to be a tremendous need for new machines for indoor farming, and they’re going to be born robotic.
And then of course, we’ve all seen how robotics has transformed fulfillment of goods during COVID-19. At the moment, there are still very few machines for packing—which is almost completely manual—so that’s going to change.
It’s really hard to build robot software for fulfilling needs in these new niches, and companies are going to have to adapt. It’s going to be a big, big business. To make this work, where we don’t program everything as we have in the past for industrial robots, we’re going to need our robots to have common sense.
One example of common sense could be a robotic aide working in elder care. An elder asks for help walking to the sunroom. In this situation, the elder has just been handed a full cup of coffee—maybe the aide had given them the coffee—and then the elder says, “I’d like to go to the sunroom.” The aide would naturally take the cup back and help them walk, because they understand intuitively that a person who needs help walking is going to need help balancing a full cup of coffee as they walk.
On the other hand, if the elder were holding a book, the aide probably wouldn’t need to take it. But the robot may think, “Oh, they’re going to need their reading glasses,” and ask the person about them. That sort of common-sense interaction happens between caregivers and their clients all the time. As we robotify tasks, we’ll need that sort of common sense to come naturally.
The common-sense knowledge a robot needs is shared across different sorts of machines. The world is made up of objects, each with an appearance, roles, uses and behavior. Currently, our robots mostly look at the world as point clouds; we have to attach semantics to those point clouds. Even a child has an understanding that predicts a lot about what’s going to happen in a world of people and objects, so even child-like understanding is going to make our robots so much better. Our machines are going to be naive for quite a while, but they will get better over time. Human-robot interaction is really important.
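To make “attaching semantics to point clouds” a little more concrete, here is a minimal sketch of a semantic object layered on top of raw perception. The class and field names are hypothetical, chosen to mirror the talk’s “appearance, roles, use and behavior”; this is not Robust.AI’s actual representation.

```python
from dataclasses import dataclass

# Hypothetical sketch: a semantic object layered on top of a raw point cloud.
# Field names are illustrative only, not an actual Robust.AI data structure.
@dataclass
class SemanticObject:
    label: str                  # e.g. "coffee cup", "shelf", "person"
    appearance: dict            # features extracted from the point cloud
    roles: list                 # what the object is for, e.g. ["container"]
    typical_use: str            # how people normally use it
    expected_behavior: str      # "static", "movable", or "self-moving"

cup = SemanticObject(
    label="coffee cup",
    appearance={"shape": "cylinder", "height_m": 0.10},
    roles=["container"],
    typical_use="holds liquid; spills if tilted while the holder walks",
    expected_behavior="movable",
)
```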
Many deployed robots in hospitals and malls sort of barrel along shared paths and make humans second-class citizens, which is annoying. We’re going to see more of this as we get autonomous vehicles on our roads. When is it appropriate to interrupt a person with a request? Showing enough deference to people doing what they’re doing, while at the same time being assertive enough to do the task the robot is supposed to do, is going to be a real tradeoff.
Then there’s mutual understanding. A robot needs to understand what a human is saying or gesturing, especially in elder care. So the robots need to be able to, in a sense, put themselves in the place of the human they’re interacting with to figure out their intent. They also need to be predictable, otherwise it throws people off and they want to avoid them. And so, the robots may have to do explicit signaling. Even the Roomba does explicit signaling with the little tunes it plays when it’s backing out of the recharging station, or when its dustbin is full.
We also couple with people. We slow our speech to match an elderly person who speaks more slowly. We couple with a child who doesn’t understand our sophisticated language, so we change the way we speak with the child. We’ll probably want our robots to be able to invoke this unconscious coupling, because that’s going to put us together with the machines and we’ll be able to control them better.
A company building robot software in 2021 has to take care of all these things. That requires an understanding of scale and deployment from teams across different technical disciplines. It’s slow, costly, and often outside the company’s core expertise. That makes it high risk, and we’ve seen failures in the autonomous vehicle domain through exactly this set of steps.
What happens is, a company will spend a year figuring out what to do with hardware, then a couple of years getting the software running on the machine and doing application testing. Then comes a small pilot with initial scaling, which can take one or two more years and can be a sort of valley of death. Large-scale deployment is many years out, so this is a long-term project: the longer the timeline and the more money spent, the higher the risk.
Another place where things have changed greatly over the last two or three decades is video game authoring. In the old days you had just a load of C++ programs: someone wrote character animation, interactions, personalities, all in C++. Over the last 10 years, that has changed drastically. Now video games are largely built through interaction with a platform—Unity and Unreal are the two most popular ones—with the platform providing the code.
At Robust.AI, we’re building what we call the first code-less platform for robot software: a way to get robot software out there without having to write lots of code in lots of languages. By using AI tools and putting them in the hands of people who understand the application, we’re trying to get robots operating more quickly. Importantly, application testing begins in week one instead of waiting until the end. So in our case, even before the robot moves, you’ve got a lot of testing of the idea in the application.
So how do we do this? We use artificial intelligence powered by common sense. Suppose you’re a machine manufacturer. A platform with all sorts of user interfaces extracts knowledge from people about how the machine should function in a particular environment. The people working at the equipment manufacturer could have all sorts of titles: behavior author, task author, interaction designer. They tell the platform what they want their machine to be like. It compiles and reasons over all the constraints, and adds in its own understanding of the world. Out of that comes a bunch of things: the zeros and ones of the computer code for controlling the machine, along with analytics and diagnostics connecting the machine to the cloud. There’s scheduling, reporting, and integration with existing software services such as BIM, WMS, ERP, and MES. All that happens automatically. The machine is placed in its environment and now has the intelligence to just work.
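Robust.AI hasn’t published its authoring format, so purely as an illustration, here is an invented sketch of what “telling the platform what you want the machine to be like” and the compile step might look like. Every name and field below is hypothetical.

```python
# Invented example: a behavior author declares tasks and constraints, and a
# stand-in "compile" step turns them into a trivial execution plan. The real
# platform would also generate control code, analytics, diagnostics,
# scheduling, and integrations, as described above.
machine_spec = {
    "machine": "floor-care robot",
    "environment": "retail store",
    "tasks": [
        {"name": "clean_aisles", "schedule": "nightly", "priority": 1},
        {"name": "spot_clean_spills", "trigger": "spill_detected", "priority": 0},
    ],
    "constraints": {
        "defer_to_people": True,
        "max_speed_near_people_mps": 0.5,
    },
    "integrations": ["WMS", "ERP"],  # existing services to report into
}

def compile_spec(spec: dict) -> dict:
    """Stand-in for the platform's compile-and-reason step."""
    ordered = sorted(spec["tasks"], key=lambda t: t["priority"])
    return {"plan": [t["name"] for t in ordered], "constraints": spec["constraints"]}

print(compile_spec(machine_spec))
```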
For now, all this is not completely automatic. But as with game development, where what started out as completely manual has become largely automatic, that’s where we’re headed. This works through AI in the platform, the development environment, which allows the machine to plan and optimize. It also puts AI in the machine so it can interact with people, so there’s AI in multiple places: some in the final machine, but a big chunk in the authoring tools.
As an example, consider a precision UV disinfecting robot. We’ve seen a lot of UV disinfecting robots, but essentially they blast out an incredible amount of UV and can’t be around people. In our case, we have a precision robot that goes to selected objects in a human-occupied environment, using precision control rather than blasting. It’s safe because it’s aware of people and changes its behavior so that it never poses a danger. For example, if it sees a person, it cautions them to move away; meanwhile, it has shut off the UV. It goes back to the task after they have left. If they had stayed around too long, it would eventually have given up on that task and shown deference to the person as someone who wants to be there. It would have gone off to disinfect the doorknob, maybe, or something else on its to-do list.
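That deference logic could be sketched roughly as follows. This is a minimal illustration, assuming a hypothetical robot interface (detect_person, uv_on, say, and so on); it is not Robust.AI’s code.

```python
import time

PATIENCE_S = 30  # how long to wait for a person to leave before giving up

def disinfect(target, robot):
    """Go to a target and disinfect it, deferring to any people nearby."""
    robot.uv_on(False)
    robot.go_to(target)
    waited = 0.0
    while waited < PATIENCE_S:
        if robot.detect_person():
            robot.uv_on(False)  # never run UV with a person nearby
            robot.say("Please step away while I disinfect this area.")
            time.sleep(2.0)
            waited += 2.0
        else:
            robot.uv_on(True)   # precise, targeted disinfection
            robot.disinfect(target)
            robot.uv_on(False)
            return True
    # The person stayed: defer to them and move on to the next item on the list.
    return False
```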
Another example is a different mobile robot navigating an ever-changing retail space. Products on the shelves move all the time, so the robot actually has to be aware of shelves and build its maps from point clouds—all of which requires semantics. The robot has a model it has built up of the environment, and it’s got various tasks in various places. It recognizes when things go wrong. In one instance, the robot comes across something blocking a corridor which its map says is empty. It realizes the thing is not a person; it’s a static object. So it reroutes. Take the same scenario where this time it’s a person that’s in the way, and the robot recognizes that the person is moving away. So it doesn’t replan—it just waits a little bit of time, and then goes on.
In another scenario where a person with a shopping cart is coming towards the robot, in deference to the person the robot gets out of the way, replans a new route, and off it goes. Pretty soon, if another person comes towards it, it’s got to replan yet again. So it just moves around as a child would do naturally, avoiding things or waiting rather than trying to barrel along. That semantic understanding is really important here.
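Those three cases (static object, person moving away, person approaching) amount to a simple policy. Here is a minimal sketch, again with hypothetical helper names standing in for real perception and planning calls.

```python
def handle_blocked_path(robot, obstacle):
    """Decide what to do when something blocks a corridor the map says is empty."""
    if not obstacle.is_person:
        # A static object: just reroute around it.
        robot.replan_route(avoid=obstacle)
    elif obstacle.is_moving_away:
        # A person already clearing the path: waiting is cheaper than replanning.
        robot.wait(seconds=3)
    else:
        # A person (say, with a shopping cart) coming toward the robot:
        # defer, step aside, and plan a new route; repeat if someone else appears.
        robot.step_aside()
        robot.replan_route(avoid=obstacle)
```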
In the platform where we build the software, the system knows how kinematics impact action, and how that turns into control for robots with different kinematics. It uses folk psychology about people’s behavior to model what people typically do in an environment.
A golf course, say, has certain behaviors you want from the machine. You have to understand certain objects—golf carts, for instance—and how those ontologies couple in with the task. How aggressive should the machine be? How deferential should it be, and how long should it wait? How much time should it allow for people to make decisions? The machine maker gets to turn some knobs there. As with game platforms, our platform is going to evolve over time.
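As a rough illustration of those knobs, a machine maker’s tuning profile might look something like the sketch below. The parameter names are invented for the example, not the platform’s actual settings.

```python
from dataclasses import dataclass

@dataclass
class InteractionTuning:
    assertiveness: float = 0.3     # 0 = always yields, 1 = always presses on
    deference_wait_s: float = 5.0  # how long to wait for a person to move
    decision_grace_s: float = 3.0  # time allowed for people to make up their minds

# A golf-course profile might dial assertiveness down and patience up.
golf_course_profile = InteractionTuning(assertiveness=0.2, deference_wait_s=10.0)
```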
As some of you may know, I’ve been somewhat skeptical about the hype around AI taking over the world. I call a lot of people “hype-notists”, as they sort of hypnotize people into believing AI will be able to do everything tomorrow. Back in September 2017, I wrote about the seven deadly sins of predicting the future of AI and how people get it wrong. Some people were surprised when I got involved in this new company. While I believe AI and robotics are the future, I just want to be a bit more realistic about it.
It’s not all going to happen the way the hype suggests. We’re not going to be able to just throw machine learning at the problem and produce the necessary robot software; deep learning comes into it, but on its own it won’t get us there. Today, we can build semantically aware, common-sense systems for a range of practical environments. It’s not going to be magic, but situations will demand that we have more intelligent machines in the world, and we’re going to be able to build them if we take the common-sense approach.
As an analogy for how this is going to unfold: the single-lens reflex camera was a great innovation. You could take pretty good pictures, but you had to understand lighting and focus. Then the 7000-series cameras came along with autofocus lenses, and that opened up photography to lots more people without expertise. Now we’ve gone further over the generations. High-end iPhones have so much AI in them that they take photographs that can only be beaten by the best professional photographers. That’s how these platforms evolve.
Before, we had to do everything by hand. Now we can start to do a lot of things automatically, and it’s going to get better and better. Soon, these platforms will be as good as all but the very best people at writing robot-specific software. That’s what we’re trying to do, and I think it’s quite possible by using AI judiciously in the process.
In the meantime, don’t forget there’s a lot of work to do here. A lot of stuff hasn’t been invented, so please go and invent it. Please be part of this. This is where real growth is going to happen over the next few years.