Late last week, Google research scientist Fei Xia sat in the middle of a bright, open-plan kitchen and typed commands into a laptop connected to a one-armed, wheeled robot that resembles a large floor lamp. “I’m hungry,” he wrote. The robot promptly rolled over to a nearby countertop, carefully picked up a bag of multigrain chips with a pair of large plastic tongs, and wheeled over to offer Xia a snack.
In the most impressive demonstration at Google’s robotics lab in Mountain View, California, no human coder had programmed the robot to understand how to respond to Xia’s request. Its control software learned to translate a spoken phrase into a sequence of physical actions using millions of pages of text scraped from the web.
That means a person does not have to use specific, pre-approved wording to issue commands, as is necessary with virtual assistants like Alexa or Siri. Tell the robot “I’m hungry” and it should try to fetch you something to eat; tell it “Oops, I just spilled my drink” and it should come back with a sponge.
“To deal with the diversity of the real world, robots need to be able to adapt and learn from their experience,” said Karol Hausman, a senior research scientist at Google, during the demonstration, which also included the robot fetching a sponge to wipe up a spill. To interact with humans, machines must learn how words can be combined in many ways to produce different meanings. “It’s up to the robot to understand all the nuances and intricacies of language,” Hausman said.
Google’s demonstration is a step toward the longer-term goal of creating robots capable of interacting with humans in complex environments. Over the past few years, researchers have found that feeding huge amounts of text taken from books or the web into large machine learning models can yield programs with impressive language skills, including OpenAI’s text generator GPT-3. By digesting the many forms of writing online, such software gains the ability to summarize or answer questions about text, generate coherent essays on a given topic, and even hold up its end of a conversation.
Google and other big tech companies are making extensive use of these large language models for search and advertising. Several companies offer the technology through cloud APIs, and new services are springing up that apply AI language capabilities to tasks like generating code or writing ad copy. Google engineer Blake Lemoine was recently fired after publicly warning that a chatbot called LaMDA, powered by the technology, might be sentient. A Google vice president still employed by the company wrote in The Economist that chatting with the bots felt like “talking to smart people.”
Despite those advances, AI programs are still prone to getting confused or generating gibberish. Language models trained on web text also lack a firm grasp of truth and often reproduce biases or hateful language from their training data, suggesting that careful engineering may be required to reliably guide a robot without it spinning out of control.
The robot Hausman demonstrated is powered by PaLM, the most powerful language model Google has announced to date. PaLM is capable of many tricks, including explaining, in natural language, how it reaches a particular conclusion when answering a question. The same approach is used to generate a sequence of steps the robot will execute to carry out a given task.
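The idea of turning a free-form request into an ordered list of robot actions can be sketched roughly as follows. This is a minimal illustration, not Google’s actual code: the "language model" here is a hard-coded lookup table standing in for PaLM, and the skill names are invented for the example. In the real system, the model would be prompted to produce step-by-step output, which is then matched against the skills the robot can actually perform.

```python
# Hypothetical skill library: the low-level actions this robot can execute.
# Names and actions are illustrative, not from Google's system.
SKILLS = {
    "find snack": "navigate to counter",
    "pick up chips": "close gripper on bag",
    "bring to user": "navigate to user and release",
}

def plan_from_instruction(instruction: str) -> list[str]:
    """Map a free-form request to an ordered list of skill names.

    A stand-in for the language model: a real system would prompt the
    model with few-shot examples and parse its numbered step output.
    """
    fake_llm_output = {
        "I'm hungry": ["find snack", "pick up chips", "bring to user"],
        "I spilled my drink": ["find sponge", "bring to user"],
    }
    steps = fake_llm_output.get(instruction, [])
    # Filter out steps the robot has no skill for, so the plan
    # only contains actions it can actually execute.
    return [step for step in steps if step in SKILLS]

print(plan_from_instruction("I'm hungry"))
# -> ['find snack', 'pick up chips', 'bring to user']
```

Note that for “I spilled my drink,” the hypothetical “find sponge” step is dropped because it is not in the skill library, leaving only “bring to user” — a toy version of the grounding problem the real system has to solve: the model may propose steps the robot cannot perform.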