Long Live RT-X: A Global Project for an AI Megabrain for All Robotic Humankind

Like it or not, we are now in the world of generative AI. Highly complex neural networks trained on vast amounts of data can produce pictures of donkeys riding space rockets or tell us which churro coating tastes best. Large language models (LLMs) are extremely useful, of course, but there are still areas where they are not yet being used. Google, the University of California, and a number of labs around the world have launched the RT-X project, which aims to use AI to create an all-purpose "brain" for robots.

So far, no one seems to have attempted anything like this, if only because the data used to train neural networks is based almost entirely on human endeavors, such as art, music, and writing. As shocking as it may seem, the Internet is not flooded with data about robots and how well they can perform certain tasks.

That's why Google and the University of California launched the RT-X project (via Fudzilla) to work with 32 robotics labs around the world to generate the data needed to train neural networks. This means collating data from millions and millions of interactions with robots, such as pick-and-place and welding on manufacturing lines.

The goal is to have a data set large enough to create an LLM that can be used to generate the code needed to program a robot to do any task. In essence, this is a general-purpose robot brain.

Although my experience programming robotic arms when I taught engineering was primitive, I can easily see the appeal and potential of this research. Instead of manually coding everything myself, I would type into the interface, "Put the oranges in the gray box and leave the apples as they are," and the LLM would handle the generation of the code needed to do that.
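To make the idea concrete, here is a minimal sketch of the kind of interface described above: a natural-language instruction goes in, and a code-generating model returns robot commands. None of these names come from the RT-X project; `generate_robot_code` and the returned `arm`/`scene` calls are hypothetical placeholders for illustration only.

```python
# Hypothetical sketch: natural-language instruction -> generated robot code.
# All names are illustrative placeholders, not a published RT-X API.

def generate_robot_code(instruction: str) -> str:
    """Stand-in for a call to a code-generating LLM trained on robot data."""
    # A real system would send `instruction` to the model and return its output.
    # Here we return a canned response so the sketch runs on its own.
    return (
        "for item in scene.detect('orange'):\n"
        "    arm.pick(item)\n"
        "    arm.place(bins['gray'])\n"
    )

instruction = "Put the oranges in the gray box and leave the apples as they are."
print(generate_robot_code(instruction))
```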

Using specific inputs, such as a video feed from the robot's camera, the generated code is automatically adjusted to account not only for the environment the robot is in, but also for the make and model of the robot actually being used. In the first test of the RT-X model, reported in IEEE Spectrum, it was more successful than the participating institutions' own best efforts at coding.
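The sketch below extends the earlier one to show what that conditioning might look like: the hypothetical generator also takes a camera frame and a description of the specific arm. Again, `RobotSpec` and `generate_robot_code` are assumptions made for illustration, not part of any published RT-X interface.

```python
# Hypothetical sketch: conditioning the generated code on the robot's camera feed
# and on the make and model of the arm being used. Placeholder names throughout.
from dataclasses import dataclass


@dataclass
class RobotSpec:
    make: str               # e.g. "ExampleCo"
    model: str              # e.g. "Arm-7"
    gripper_width_mm: float


def generate_robot_code(instruction: str, camera_frame: bytes, robot: RobotSpec) -> str:
    """Stand-in for a model call that adapts its output to the scene and the hardware."""
    # A real system would pass the image and robot description to the model;
    # here we only echo them so the sketch runs on its own.
    return (
        f"# Plan for {robot.make} {robot.model}, "
        f"scene observed from a {len(camera_frame)}-byte camera frame\n"
        "for item in scene.detect('orange'):\n"
        "    arm.pick(item)\n"
        "    arm.place(bins['gray'])\n"
    )


spec = RobotSpec(make="ExampleCo", model="Arm-7", gripper_width_mm=80.0)
print(generate_robot_code(
    "Put the oranges in the gray box and leave the apples as they are.",
    camera_frame=b"\x00" * 1024,
    robot=spec,
))
```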

The next step was even more impressive. The human brain is very good at reasoning: if you tell someone to pick up an apple and place it between a soda can and an orange on the table, you would expect them to do it without a problem. Not so with robots: normally, every step of that has to be coded directly into the robot.

However, Google found that even though this particular task was not part of the neural network training data set, the LLM could "figure it out."

Although the RT-X project is still in its infancy, the benefits of generative AI are tangible, and the current plan is to expand the amount of training data gathered from as many robotics facilities as possible to create a fully cross-embodied LLM.

We are by nature cross-embodied (i.e., we can teach our brains to perform many complex tasks, such as playing sports, riding a bike, or driving a car), but so far robots are not.

Someday, however, we will be able to go to the drive-thru, order food, and have exactly what we ordered delivered to us! If this is not progress, I don't know what is. I, for one, welcome our new AI megabrain overlords... I mean... I can't wait to welcome these helpful robots.