Code-Executing Agent

Actor agents published

Also known as: Code Agent, Code Interpreter Agent, Programming Agent

Definition

An agent that can write and execute code as part of its problem-solving process. Unlike agents that merely generate code for humans to run, code-executing agents have access to a runtime environment (Python interpreter, shell, browser, etc.) and can run their generated code, observe outputs or errors, and iterate. This enables precise computation, data manipulation, and interaction with programmatic APIs.
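The "run and observe" step in this definition can be sketched as below. This is a minimal illustration, not a reference implementation: the names `Observation` and `run_python` are hypothetical, and a plain subprocess is assumed in place of a real sandbox. A subprocess provides no security isolation, so production systems typically run untrusted code in a container or a hosted sandbox instead.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass


@dataclass
class Observation:
    """What the agent sees after one execution (hypothetical structure)."""
    stdout: str
    stderr: str
    exit_code: int


def run_python(code: str, timeout_s: float = 10.0) -> Observation:
    """Run `code` in a fresh Python subprocess and capture its output.

    Assumption: a subprocess stands in for the sandbox. It limits run time
    and uses a throwaway working directory, but it is NOT a security boundary.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-c", code],
                capture_output=True,  # collect stdout and stderr
                text=True,            # decode bytes to str
                timeout=timeout_s,    # stop runaway code
                cwd=workdir,          # keep stray files out of the caller's dir
            )
        except subprocess.TimeoutExpired:
            return Observation("", f"timed out after {timeout_s}s", -1)
    return Observation(proc.stdout, proc.stderr, proc.returncode)


# Successful run: the agent can read the printed result.
ok = run_python("print(sum(range(10)))")

# Failing run: the traceback on stderr is what the agent would iterate on.
bad = run_python("print(1 / 0)")
```

The returned `Observation` is what separates this from plain code generation: the exit code and captured streams are fed back to the agent so it can decide whether to revise its code.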

What this is NOT

  • Not the same as code generation (generation without execution is not a code-executing agent)
  • Not a compiler or interpreter itself (the agent uses the interpreter as a tool)
  • Not an IDE autocomplete (that's code suggestion, not agentic execution)

Alternative Interpretations

Different communities use this term differently:

llm-practitioners

An LLM agent with access to a code interpreter (often Python in a sandbox) that can write code, execute it, observe stdout, stderr, and returned results, and iterate. The pattern was popularized by ChatGPT Code Interpreter and similar tools.

Sources: OpenAI Code Interpreter documentation, Open Interpreter project, E2B, Modal, and similar sandboxing services

Examples

  • ChatGPT with Code Interpreter analyzing a CSV and generating charts
  • Open Interpreter running shell commands to organize files
  • A coding agent that writes tests, runs them, and fixes failures
  • A data science agent that writes pandas code to answer questions about data

Counterexamples

Things that might seem like a Code-Executing Agent but are not:

  • GitHub Copilot suggesting code in an editor (no execution)
  • A chatbot that explains how to write code but doesn't run it
  • A documentation generator that produces code blocks for humans to copy

Relations

  • specializes tool-using-agent (Code execution is a powerful form of tool use)
  • specializes agent (Code-executing agents are a type of agent)
  • requires inference (Requires inference to generate code)
  • overlapsWith agent-loop (Typically operates in a write-run-observe loop)
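
The write-run-observe loop named in the last relation can be sketched as follows. Everything here is an assumption made for illustration: `fake_model` is a hypothetical stand-in for the inference step (it returns buggy code first, then a fix once it has seen an error), and `run_python` uses an unsandboxed subprocess rather than a real isolated runtime.

```python
import subprocess
import sys
from typing import List, Optional, Tuple


def run_python(code: str) -> Tuple[int, str, str]:
    """Execute code in a subprocess; return (exit_code, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return proc.returncode, proc.stdout, proc.stderr


def fake_model(task: str, last_error: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM call (the 'write' step).

    The first attempt contains an off-by-one bug; after seeing an error,
    the stub returns a corrected version.
    """
    if last_error is None:
        return "values = [3, 1, 2]\nprint(sorted(values)[3])"  # IndexError
    return "values = [3, 1, 2]\nprint(sorted(values)[-1])"


def agent_loop(task: str, max_steps: int = 3) -> Tuple[Optional[str], List[str]]:
    """Write code, run it, observe the result, and retry on failure."""
    last_error: Optional[str] = None
    errors_seen: List[str] = []
    for _ in range(max_steps):
        code = fake_model(task, last_error)        # write
        exit_code, out, err = run_python(code)     # run
        if exit_code == 0:                         # observe: success
            return out.strip(), errors_seen
        errors_seen.append(err)                    # observe: failure
        last_error = err                           # feed the error back
    return None, errors_seen                       # step budget exhausted


answer, errors = agent_loop("largest value in [3, 1, 2]")
print(answer)
```

The `max_steps` bound is a design choice in this sketch: without a step budget, an agent that never produces working code would loop indefinitely.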

Implementations

Tools and frameworks that implement this concept: