Re in Act — Specification

The working draft of the Re in Act open specification.

Specification

This is a working draft (v0.1.0, 2026-02-01). It is subject to change and should not be considered stable. Author: @yaoandyan.

1. Introduction

Think of an agent like a person walking. The conscious brain sets the direction, but it does not micromanage every muscle correction. Local adjustment happens in the cerebellum, close to the body, close to feedback, saving cognitive load and ensuring high-fidelity feedback.

Re in Act applies the same idea to AI agents. Top-level reasoning should define goals and constraints. The fast, local adjustments needed in the action phase should happen inside a Reason-able Action Space (RAS).

This makes the core idea simple: instead of sending every small disturbance back through an outer reason-act-observe loop, let action handle local problems where they occur.

For an agent, that means the top-level reasoning loop does not need to stop and rethink after every failed command, noisy log, or small branch in action. It can define a bounded piece of work inside the RAS, let that work adapt locally, and only return when there is something meaningful to report.

The practical result is straightforward: fewer round trips, less context bloat, and a larger effective action space. The deeper claim of this specification is that many agent failures come from the action layer being too weak, not only from the model being too weak. Re in Act strengthens that action layer by giving it a Reason-able Action Space.


2. Motivation

AI agents, especially coding agents, have evolved significantly since the advent of ReAct, which established the core paradigm of problem-solving by interleaving thought and action. Today's function calling and tool-use loops still operate within this paradigm. Yet fundamental limitations persist.

ReAct relies on iterative cycles of serialized reasoning and acting, where action must pause for observation before the next decision. This works for simple tasks, but it starts to break down when an agent has to deal with messy local reality: logs, retries, partial failures, branching conditions, and tool-specific details.

2.1 What Goes Wrong for Agents

In a ReAct loop, every small disturbance tends to bounce back to the top-level reasoning loop:

  • a command prints too much output
  • a build fails and needs diagnosis
  • a web page or API returns something slightly different than expected
  • a task needs a short retry loop or branch before the real work can continue

The agent can still solve these problems, but it solves them in an awkward way: stop action, push the intermediate data back into context, think again, then take one more small step.

2.2 Why That Feels Slow and Fragile

This outer-loop style creates three familiar agent problems:

  • Too many round trips: local issues that should be handled in place become extra think-act-observe cycles.
  • Too much noise in context: raw logs, raw tool output, and other intermediate artifacts crowd the top-level context window.
  • Weak control over action: branching, retries, and loops are guided by repeated model turns instead of being handled directly inside the orchestrated action phase.

As tasks get longer, the agent spends more effort managing action itself and less effort advancing the user's actual goal.

2.3 What Re in Act Changes

Re in Act introduces a Reason-able Action Space (RAS) that top-level reasoning defines and orchestrates, so local issues can be handled where they happen inside a controllable action space.

Inside the RAS, action can:

  • inspect intermediate data without sending everything back to the outer loop
  • use reason() to turn noisy local evidence into a small structured judgment
  • continue through code or shell control flow until the local job is actually done

The effect is simple: top-level reasoning defines the objective and the executable workspace around it, while action gets enough local structure to adapt without escalating every small deviation.

2.4 The Deeper Point

This is the agent-facing meaning of the control-theoretic argument behind Re in Act. A capable model is not enough if the action layer is too narrow and every correction must go back through the outer loop.

Re in Act expands what the agent can do inside the action space, not just what it can say before action starts. That is why the visible improvements are fewer round trips, cleaner context, and more reliable task completion.

For the full design rationale behind this control-theoretic framing, see Control-Theoretic View.


3. Overview

Re in Act transforms a serial "Reason and Act" loop into an atomic "Reason in Action" primitive: combining deterministic control flow with non-deterministic decisions in one Reason-able Action Space (RAS).

This specification models the architecture with three core elements:

  • Top-level reasoning: defines the RAS, its success criteria, and its high-level constraints.
  • RAS runtime: keeps intermediate data local, runs deterministic control flow, and carries work forward inside the action space.
  • reason() inside the RAS: compares goal against local reality, then turns that local deviation into structured judgments that action can use immediately.

Together these form a practical local feedback loop. This is an engineering mapping, not a claim that the whole architecture is a fully formalized control-system proof. The point is architectural: direction stays global, while sensing, comparison, and correction move closer to action.

Benefits:

  • Fewer round trips: Multiple outer-loop turns collapse into a single orchestrated action phase, reducing latency and cost.
  • Clean context window: Intermediate data (build logs, API responses, intermediate reasoning) stays inside the RAS and never enters the top-level context. Only the final observation is returned.
  • Deterministic control flow: for/while/if logic runs in code, not through the LLM — correctness is guaranteed by the runtime, not by model attention.
  • Higher effective controller variety: The RAS is a programmable, adaptive action space with more ways to sense, compare, and respond to local perturbations.

Concrete contrasts (without a RAS versus with a RAS):

  • Command-line noise: without a RAS, command output full of ANSI color codes, progress bars, and irrelevant logs gets pushed upward verbatim; with a RAS, that noisy stream can be cleaned locally and only the relevant result needs to leave the action space.
  • Web retrieval noise: without a RAS, HTML-heavy page fetches return sparse useful information buried inside markup and template noise; with a RAS, local processing can extract the relevant content before returning a bounded result.
  • Probabilistic loops: without a RAS, deterministic tasks like traversing files and renaming them can degenerate into turn-by-turn LLM reasoning, causing duplicate work or missed cases; with a RAS, the loop runs deterministically in code while reason() is used only for the bounded local judgments that actually require it.

New complexity introduced:

A Reason-able Action Space (RAS) is required: an action context that top-level reasoning can define and orchestrate through reason() and optional act() calls. It is a programmable action space for deterministic control flow, local intermediate data, and bounded reason() / act() steps. As a result, intermediate action state does not need to be exposed step by step to the outer loop.


4. Specification

4.1 Required Roles

An implementation conforming to this specification MUST provide the following logical roles, whether or not they are exposed as separate components:

  1. Top-level reasoner that defines and orchestrates a RAS.
  2. reason() capability that produces schema-bounded local judgments.
  3. A runtime inside the RAS that can execute deterministic control flow and action calls.

4.2 Reason-able Action Space (RAS)

The RAS defines the action context and interfaces for reasoning in action. It can take one of two main forms:

Bash RAS — The Unix Philosophy

The agent controls flow using Unix pipes (|) and redirection (>). Intermediate data is persisted transparently on the filesystem.

Unix pipelines matter here because they are a compositional execution mechanism: processes can be connected through standard streams while hiding internal implementation details. They do not by themselves provide global sensing or semantic understanding; those must still be supplied by prompts, schemas, and local filtering.

# List the full tool catalog; the raw listing stays inside the RAS.
act --manual | \
  reason \
    --prompt "Goal: find the tools needed to collect the most relevant API and documentation context for this task." \
    --prompt - \
    --prompt "Constraints and rules: return only a JSON array of tool names. Prefer the smallest sufficient set." \
    --structure '["tool_name"]' | \
  jq -r '.data[]' | while read -r name; do
    # Expand only the selected tool definitions.
    act --manual "$name"
  done

This pipeline keeps the raw tool catalog inside the RAS (the --prompt - flag splices the piped catalog into the prompt from stdin), uses reason() to select the minimum useful subset, and then expands only those tool definitions without bouncing the whole decision process back to the outer loop.

Code Interpreter RAS — The Programmatic Approach

The agent manages control flow using code execution (conditions, loops, branches). Python and TypeScript are recommended to leverage model pre-training on their syntax and async semantics.

General-purpose code sandboxes are useful because they are expressive enough to encode rich local policies, retries, branches, and derived checks. In abstract computability terms they are typically Turing-complete, but implementations remain bounded by real limits such as time, memory, permissions, and sandbox policy. This specification relies on their practical expressivity, not on any claim of infinite real-world capacity.

# Read the latest CI build log; the raw log stays inside the RAS.
log = open("build.log").read()
decision = await reason(
    """
    Goal: decide whether action should continue or retry.
    Observation:
    """ + log + """
    Relevant context: this is the latest CI build log and it may contain ANSI noise, progress bars, and repeated lines.
    Constraints and rules: ignore cosmetic noise and return only continue or retry plus a short reason grounded in the log.
    """,
    {"action": "continue", "reason": ""},
)
# Deterministic branch on the schema-bounded judgment.
if decision["data"]["action"] == "retry":
    await act("bash", "npm run build")
else:
    await act("deploy", "deploy to production")

This script uses Python's native control flow to keep retry logic local. reason() does the bounded judgment over noisy build output, while the runtime deterministically decides whether to rerun the build or continue.

4.3 Sandbox

Two sandbox strategies are supported:

  • Cloud Sandbox — the RAS runs in a managed cloud environment (e.g. E2B, Deno Sandbox, Modal, Daytona). Recommended for production agent deployments.
  • Client-side Sandbox — the RAS runs in a controlled local or self-hosted environment (e.g. Docker, WASM, Cloudflare Workers).

5. Interfaces

reason() MUST be supported regardless of the RAS chosen. act() is OPTIONAL and may be substituted by any user-defined action execution strategy. When both are provided, they share the same contract across the Bash RAS and the Code Interpreter RAS.

For optional capability layers that sit on top of this core contract, see Extensions Overview.

5.1 reason([prompt], [example_output])

Description: A local regulator interface that maps non-deterministic language inputs into deterministic structured JSON for action control. The prompt integrates the reference objective, local observations, task-relevant context, and governing constraints and rules. In control-theoretic terms, reason() first compares goal against local reality to estimate the relevant deviation, then turns that deviation into a bounded control signal. It is an atomic reasoning step — no tools, no memory, no side effects.

Control-theoretic contract:

  1. Goal + observation + context + constraints in, control signal out — prompts should contain the target objective, relevant local observations or feedback, task-relevant context, and governing constraints or rules.
  2. Bounded variety output — example_output defines the admissible response shape, constraining output variety into an executable channel.
  3. Isolated local regulation — reason() must not call tools or depend on hidden conversation state; otherwise the regulation boundary becomes ambiguous.
  4. Explicit failure semantics — on repeated schema failure, return structured error output rather than unconstrained text.

Algorithm:

  1. Infer a JSON schema from example_output
  2. Request the LLM for structured output using the prompt and inferred schema
  3. Validate the structured output against the schema
    • If validation fails: retry up to N−1 times with the validation error as feedback; on the N-th failure, return { "error": "<error message>" }
    • If validation passes: return { "data": <structured output> }
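
A minimal reference sketch of this algorithm follows. The names here (llm_call, infer_schema, validate) are assumptions for illustration; a real implementation would typically use a full JSON Schema library and its LLM provider's structured-output API rather than this simplified inference.

```python
import json

def infer_schema(example):
    """Infer a minimal JSON-schema-like description from example_output."""
    if isinstance(example, dict):
        return {"type": "object",
                "properties": {k: infer_schema(v) for k, v in example.items()},
                "required": list(example)}
    if isinstance(example, list):
        return {"type": "array",
                "items": infer_schema(example[0]) if example else {}}
    if isinstance(example, bool):      # check bool before int/float
        return {"type": "boolean"}
    if isinstance(example, (int, float)):
        return {"type": "number"}
    return {"type": "string"}

def validate(value, schema):
    """Check value against the inferred schema; return an error string or None."""
    t = schema["type"]
    if t == "object":
        if not isinstance(value, dict):
            return f"expected object, got {type(value).__name__}"
        for key in schema["required"]:
            if key not in value:
                return f"missing required key: {key}"
            err = validate(value[key], schema["properties"][key])
            if err:
                return err
    elif t == "array":
        if not isinstance(value, list):
            return f"expected array, got {type(value).__name__}"
        for item in value:
            err = validate(item, schema["items"]) if schema["items"] else None
            if err:
                return err
    elif t == "boolean" and not isinstance(value, bool):
        return "expected boolean"
    elif t == "number" and not isinstance(value, (int, float)):
        return "expected number"
    elif t == "string" and not isinstance(value, str):
        return "expected string"
    return None

def reason(prompt, example_output, llm_call, max_attempts=3):
    """Reference reason() loop: infer schema, request, validate, retry, fail."""
    schema = infer_schema(example_output)
    feedback = ""
    for _ in range(max_attempts):
        raw = llm_call(prompt + feedback, schema)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as e:
            feedback = f"\nPrevious output was not valid JSON: {e}"
            continue
        err = validate(data, schema)
        if err is None:
            return {"data": data}
        feedback = f"\nPrevious output failed validation: {err}"
    return {"error": f"schema validation failed after {max_attempts} attempts"}
```

The schema inference is deliberately minimal; conformance only requires that validation failures feed back into retries and that the final failure yields a structured error rather than unconstrained text.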

Implementation options:

  • New conversation context — reuse your LLM request logic but start a fresh inference context for the local call, while keeping orchestration, state, and control in the same RAS
  • MCP Sampling — where the RAS acts as an MCP Server and requests completion from the MCP client

5.2 act([name], [args]) — Optional

Description: Gets external resources or calls external tools (MCP tools, custom functions, etc.).

Definition: Given a tool name and its proposed arguments, call the tool and return its result. On failure, return the error for better feedback.

Note: Action execution is ultimately up to the runtime and user. act() is provided as a standard reference interface. Implementations may substitute any user-defined action execution strategy.
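
A minimal sketch of this contract, assuming a hypothetical TOOLS registry in place of real MCP tool discovery:

```python
import asyncio

# Assumed tool registry: names mapped to async callables. A full
# implementation would populate this from MCP tool discovery.
TOOLS = {}

async def act(name, args):
    """Reference act(): look up the tool by name, call it with the proposed
    arguments, and return its result. On failure, return the error rather
    than raising, so the RAS gets usable feedback for the next step."""
    tool = TOOLS.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"data": await tool(args)}
    except Exception as e:
        return {"error": f"{type(e).__name__}: {e}"}
```

Returning errors as values keeps failure handling inside RAS control flow instead of escalating exceptions to the outer loop.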


6. Implementation

For reference, we have open-sourced a Re in Act implementation called one-agent.

  • GitHub: RIA-Spec/one-agent
  • Supports both Bash RAS and Python RAS modes
  • Uses MCP for tool discovery and execution

7. Agent Benchmarks

Benchmark results and methodology are deferred to a future revision.


8. Extensions

Re in Act defines a small core and allows optional extensions.

Core conformance remains unchanged:

  • reason() is REQUIRED
  • act() is OPTIONAL

Extensions are opt-in capability layers that add behavior without changing the core contract.

8.1 Extension Rules

An extension:

  • MUST NOT change the required status of reason()
  • MUST define explicit input/output contract and failure semantics
  • SHOULD define resource boundaries (time, step, token, or cost budgets)
  • SHOULD preserve RAS locality and avoid hidden global-state dependencies

8.2 agent() as an Optional Extension

agent() may be provided as an optional extension interface for delegated execution inside a RAS.

If implemented, agent() is treated as an action-layer delegation primitive under runtime policy and budget constraints. A recommended interface shape is agent(prompt, config).

For the agent() extension contract, see Agent Interface Extension.

In this model, the RAS is the harness: agent() performs delegated work, reason() checks and normalizes the returned signals, and runtime control flow enforces deterministic limits such as iteration count, timeout, and escalation.

On success, agent() SHOULD return { data: { text, trajectory } }. A follow-up reason() call can validate and normalize both fields into schema-bounded data for downstream control flow.
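
This harness pattern can be sketched as follows. All names here are hypothetical: agent_fn and reason_fn stand in for the agent() extension and the core reason() interface, and the budget shape is illustrative only.

```python
def delegate_with_budget(task, agent_fn, reason_fn, max_iters=3):
    """Hypothetical harness: run delegated work under a deterministic
    iteration budget. agent_fn plays the role of agent(prompt, config);
    reason_fn plays the role of reason(prompt, example_output)."""
    for _ in range(max_iters):
        # Delegated work returns { data: { text, trajectory } } on success.
        result = agent_fn(task, {"budget": {"steps": 20}})
        if "error" in result:
            return result
        # Bounded local judgment: did the delegated result meet the goal?
        verdict = reason_fn(
            "Goal: decide whether the delegated result satisfies the task.\n"
            "Observation: " + result["data"]["text"] + "\n"
            "Constraints and rules: set done to true only if fully met.",
            {"done": True, "feedback": ""},
        )
        if verdict["data"]["done"]:
            return result
        # Feed normalized feedback into the next delegation attempt.
        task += "\nPrevious attempt feedback: " + verdict["data"]["feedback"]
    return {"error": "delegation budget exhausted"}
```

The iteration cap, escalation path, and feedback loop all live in deterministic code; only the done/feedback judgment passes through reason().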

agent() is not part of the required core interfaces.

This preserves the core Re in Act architecture: strengthen action in a bounded RAS while keeping top-level reasoning and local regulation boundaries explicit.