Google Finally Releases CaP, Enabling Robot Programming by Natural Language Command

normalstory

On the 2nd, Google finally announced on its research blog the release of CaP, which lets robots be programmed through natural language commands.

The traditional approach to controlling a robot has been to:
1) write code that detects objects,
2) sequence commands that move the actuators, and
3) write feedback loops that specify how the robot performs its task.

Because humans wrote these programs by hand, they could be richly expressive.
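
The three hand-written steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SimRobot`, `detect_object`, `move_by`, and `reach_object` are hypothetical names invented here, and the "robot" is a one-dimensional simulation rather than anything from Google's post.

```python
from dataclasses import dataclass


@dataclass
class SimRobot:
    """Hypothetical 1-D gripper, used only for illustration."""
    position: float = 0.0

    def detect_object(self) -> float:
        # Step 1: perception. A real system would run a vision model here.
        return 0.5

    def move_by(self, delta: float) -> None:
        # Step 2: actuation. A real system would send motor commands here.
        self.position += delta


def reach_object(robot: SimRobot, gain: float = 0.5,
                 tol: float = 1e-3, max_steps: int = 100) -> int:
    """Step 3: a proportional feedback loop. Returns the steps taken."""
    target = robot.detect_object()
    for step in range(max_steps):
        error = target - robot.position
        if abs(error) < tol:
            return step
        robot.move_by(gain * error)
    return max_steps


robot = SimRobot()
steps = reach_object(robot)
print(f"reached {robot.position:.4f} after {steps} steps")
```

Every constant here (the gain, the tolerance, the detection logic) is chosen by a person for one task, which is why a new task or environment means writing such code again.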

But in an era of ever-smarter factories, this approach runs into practical limits on the shop floor. Reprogramming the policy for each new task requires people with specialized domain knowledge to be placed on-site, and every new environment requires new programming all over again.

Google's hypothesis (proposal) for this was, "What if, given human instructions, a robot could write its own code to interact with the world?"

The idea is to leverage the latest generation of language models, such as PaLM, which are known to be trained on millions of lines of code and to be capable of complex reasoning. Given natural language commands, these models can proficiently write not only general-purpose code but also code that controls robot motions. The process goes as follows.
1) Through in-context learning,
2) the language model is shown several example instructions (formatted as comments) paired with their corresponding code;
3) it then receives a new instruction and can "autonomously" generate new code that re-composes API calls, synthesizes new functions, and expresses feedback loops.
4) In addition, because each new action can be "composed" at runtime, the approach is more efficiently extensible than before.

With this, Google proposes
(i) a "generalization" of programming by modularizing steps 1–3, and
(ii) an alternative approach in which robots benefit from the abundant open-source code and data on the internet that the language model has learned from.
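
The prompt format described in steps 1 to 3 can be sketched roughly as follows. This is a minimal illustration, not the prompt Google used: the example instructions and the API names `get_pos` and `put_at` are hypothetical, and the code only builds the prompt text without calling any model.

```python
# Hypothetical few-shot examples: each pairs an instruction with robot code.
# The API names (get_pos, put_at) are invented for this sketch.
EXAMPLES = [
    (
        "move the red block 10 cm to the left of the bowl.",
        "x, y = get_pos('bowl')\nput_at('red block', (x - 0.1, y))",
    ),
    (
        "put the blue block where the red block is.",
        "put_at('blue block', get_pos('red block'))",
    ),
]


def build_prompt(examples, instruction):
    """Format each example as a '# instruction' comment followed by its code,
    then end with the new instruction for the model to complete."""
    blocks = [f"# {text}\n{code}" for text, code in examples]
    blocks.append(f"# {instruction}\n")
    return "\n\n".join(blocks)


prompt = build_prompt(EXAMPLES, "stack the blue block on the red block.")
print(prompt)
```

The prompt ends on a bare comment line, so a code-completion model would naturally continue with the code for that last instruction, following the pattern of the examples above it.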

To explore this possibility, Google developed Code as Policies (CaP), a robot-centric formulation of language-model-generated programs that run on physical systems. With it, a single system can perform a variety of complex robot tasks without task-specific training.
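
To make "generated programs as policies" more concrete, here is a rough sketch of running model-written code against a small robot API and defining a missing helper function at runtime. Everything in it is an assumption for illustration: `fake_llm` is a canned stand-in for a real language model, `get_pos` and `move_to` are hypothetical API names, and the "world" is just a dictionary of positions.

```python
import ast
import builtins


def fake_llm(prompt: str) -> str:
    """Stand-in for a language model: returns canned code for known prompts."""
    canned = {
        "# swap the red block and the bowl.": "swap('red block', 'bowl')",
        "# define function: swap": (
            "def swap(a, b):\n"
            "    pa, pb = get_pos(a), get_pos(b)\n"
            "    move_to(a, pb)\n"
            "    move_to(b, pa)"
        ),
    }
    return canned[prompt]


def undefined_calls(code: str, namespace: dict) -> list:
    """Names called in `code` that are neither in the API nor Python builtins."""
    called = {
        node.func.id
        for node in ast.walk(ast.parse(code))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    return sorted(n for n in called
                  if n not in namespace and not hasattr(builtins, n))


def run_instruction(instruction: str, api: dict) -> str:
    """Generate code for an instruction, define missing helpers, then run it."""
    namespace = dict(api)
    code = fake_llm(f"# {instruction}")
    # Runtime composition (one level deep): ask for each missing function.
    for name in undefined_calls(code, namespace):
        exec(fake_llm(f"# define function: {name}"), namespace)
    exec(code, namespace)
    return code


# Hypothetical world state and API exposed to the generated code.
positions = {"red block": (0.0, 0.0), "bowl": (1.0, 2.0)}


def get_pos(name):
    return positions[name]


def move_to(name, pos):
    positions[name] = pos


run_instruction("swap the red block and the bowl.",
                {"get_pos": get_pos, "move_to": move_to})
print(positions)
```

Because `exec` runs whatever text it is given, a real system would need to restrict and check generated code before letting it drive hardware; this sketch skips that entirely.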

Of course, AI tools that automatically generate source code existed before. The most representative is probably GitHub's Copilot. https://github.com/features/copilot/

That said, compared with the earlier GitHub (Microsoft) approach, the idea of guiding code generation through comments is the same. Google, however, goes beyond automating programming: it 1) formalizes the generated code as a policy (a standardized guideline at the robot-programming level) and 2) composes it into a "general" (universally extensible) programming model through modular building blocks, which arguably makes it far more market-friendly (practical, on-site) than the corporate offering.

Note: this post contains some paraphrasing and personal opinion, so checking the original is recommended.

Robots That Write Their Own Code - https://ai.googleblog.com/2022/11/robots-that-write-their-own-code.html?m=1

Posted by Jacky Liang, Research Intern, and Andy Zeng, Research Scientist, Robotics at Google...

This English version was translated by Claude.

Written by
친절한 찰쓰씨

Pleasant Charles — UI/UX researcher at AIT. Keeping notes on design, planning, and slow days here since 2010.
