Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.
Unique: Uses executable Python code as the ONLY action representation (vs. ReAct's text-based reasoning + tool calls, or function-calling APIs that separate action generation from execution). The LLM generates code directly, executes it in isolated environments, and receives execution feedback to refine subsequent code — creating a tight feedback loop between generation and validation.
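The generate-execute-observe loop can be sketched as below. This is an illustrative toy, not the repo's actual executor: names like `execute_action` are hypothetical, and the real system runs code in an isolated sandbox rather than calling `exec` on the host interpreter.

```python
import io
import contextlib

def execute_action(code: str, env: dict) -> str:
    """Run LLM-generated code and return its output as the observation.

    Hypothetical helper showing the code-as-action idea: stdout and
    exceptions are both fed back to the model as textual feedback.
    """
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, env)  # state in `env` persists across turns
        return buf.getvalue() or "(no output)"
    except Exception as e:
        # Errors become observations, so the LLM can self-correct
        return f"{type(e).__name__}: {e}"

# Two simulated turns: state set in turn 1 is reused in turn 2.
session = {}
obs1 = execute_action("x = [1, 2, 3]\nprint(sum(x))", session)  # "6\n"
obs2 = execute_action("print(x[0])", session)                   # "1\n"
```

Because the interpreter state (`session`) persists across turns, the model can build on earlier results instead of re-deriving them, which is what makes the feedback loop tight.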
vs others: Achieves up to 20% higher success rates on the M³ToolEval benchmark compared to text-based or JSON-based agent action spaces, because code execution provides deterministic, verifiable feedback that grounds the LLM's reasoning in actual system behavior rather than simulated tool responses.