AI agents that execute code face a fundamental problem: code can fail, corrupt state, or cause unintended side effects.
Iris solves this with instant forks:
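import { Sandbox } from '@iris/sdk'
import { generateCode } from './your-llm'

async function safeExecute(sandbox: Sandbox, task: string) {
  // Work on a fork so the base sandbox is never modified
  const branch = await sandbox.fork()

  try {
    const code = await generateCode(task)
    const result = await branch.exec.run(code)

    if (!result.ok) {
      throw new Error(result.stderr)
    }

    // Return the branch as well: it holds the changes, and the caller kills it when done
    return { success: true, result, branch }
  } catch (error) {
    // Discard the failed fork; the base sandbox is untouched
    await branch.kill()
    return { success: false, error }
  }
}

The same pattern fits an agent loop: fork before each tool call so a failed command never touches the sandbox it was forked from.

import { Sandbox } from '@iris/sdk'
import Anthropic from '@anthropic-ai/sdk'
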
const anthropic = new Anthropic()

async function reactAgent(task: string) {
  const sandbox = await Sandbox.create()
  const messages: Anthropic.MessageParam[] = []
  // Sandbox the next command forks from; advances to the fork after each success
  // (assumes fork() returns a Sandbox)
  let current = sandbox
  // Every sandbox created here, so all of them can be killed at the end
  const live = [sandbox]

  try {
    while (true) {
      const response = await anthropic.messages.create({
        model: 'claude-opus-4-7',
        max_tokens: 4096,
        messages: [{ role: 'user', content: task }, ...messages],
        tools: [{
          name: 'execute_code',
          description: 'Execute a shell command in the sandbox',
          input_schema: {
            type: 'object' as const,
            properties: {
              code: { type: 'string', description: 'Shell command to run' },
            },
            required: ['code'],
          },
        }],
      })

      // Stop on anything other than a tool request (end_turn, max_tokens, ...)
      if (response.stop_reason !== 'tool_use') break

      const toolResults: Anthropic.ToolResultBlockParam[] = []

      for (const block of response.content) {
        if (block.type === 'tool_use') {
          const input = block.input as { code: string }

          // Fork before execution: the sandbox we forked from is preserved on failure
          const branch = await current.fork()
          const result = await branch.exec.run(input.code)

          if (result.ok) {
            // Keep the fork so later commands see this command's changes
            current = branch
            live.push(branch)
            toolResults.push({
              type: 'tool_result',
              tool_use_id: block.id,
              content: result.stdout,
            })
          } else {
            await branch.kill()
            toolResults.push({
              type: 'tool_result',
              tool_use_id: block.id,
              content: `Error (exit ${result.exit_code}): ${result.stderr}`,
              is_error: true,
            })
          }
        }
      }

      messages.push({ role: 'assistant', content: response.content })
      if (toolResults.length > 0) {
        messages.push({ role: 'user', content: toolResults })
      }
    }
  } finally {
    // Kill the forks newest-first, then the base sandbox
    for (const sb of live.reverse()) await sb.kill()
  }
}

Checkpoint before each step so you can roll back to the last good state:

const checkpoints: string[] = []

for (const step of workflow) {
  // Snapshot the state before the step runs, i.e. after the previous step succeeded
  const cp = await sandbox.checkpoint.create({ name: step.name })
  checkpoints.push(cp.checkpoint_id)

  const result = await sandbox.exec.run(step.code)

  if (!result.ok) {
    // Roll back to the last good checkpoint (taken just before this step), then retry or bail
    const lastGood = checkpoints[checkpoints.length - 1]
    if (lastGood) await sandbox.checkpoint.restore(lastGood)
    console.error(`Step ${step.name} failed:`, result.stderr)
    break
  }
}

Fork from a decision point to explore multiple approaches simultaneously:

await sandbox.exec.run('python3 setup.py')

const approaches = ['approach_a.py', 'approach_b.py', 'approach_c.py']

const results = await Promise.all(
  approaches.map(async (approach) => {
    const branch = await sandbox.fork()
    try {
      const result = await branch.exec.run(`python3 ${approach}`)
      return { approach, stdout: result.stdout, ok: result.ok }
    } finally {
      // Kill the fork even if exec.run throws
      await branch.kill()
    }
  }),
)

// First approach (in list order) that exited cleanly, or undefined if none did
const best = results.find((r) => r.ok)

Fork, don't mutate
Fork before any action the agent might want to roll back. Keep the base sandbox clean.
Use timeouts
Set timeout_ms on exec.run() to prevent runaway processes from blocking your agent.
Check result.ok
result.ok is true only when exit_code === 0. Always check before treating output as valid.
Clean up branches
Call branch.kill() on completed forks. Suspended sandboxes still consume quota.
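
The four practices above can be combined into one helper. The following is a minimal sketch, not part of the Iris SDK: `SandboxLike`, `ExecResult`, and `runInBranch` are hypothetical names, and passing `timeout_ms` as a second options argument to `exec.run()` is an assumption about how that option is supplied. Because the fork is always killed, this shape suits probing or read-only commands whose side effects should be thrown away.

```typescript
// Structural types listing only the members used in the examples above.
// Assumption: run() accepts timeout_ms in an options object.
interface ExecResult {
  ok: boolean
  exit_code: number
  stdout: string
  stderr: string
}

interface SandboxLike {
  fork(): Promise<SandboxLike>
  kill(): Promise<unknown>
  exec: {
    run(code: string, opts?: { timeout_ms?: number }): Promise<ExecResult>
  }
}

// Hypothetical helper: run one command in a throwaway fork and return its stdout.
async function runInBranch(
  base: SandboxLike,
  code: string,
  timeoutMs = 30_000,
): Promise<string> {
  // Fork, don't mutate: the base sandbox is never touched
  const branch = await base.fork()
  try {
    // Use timeouts: bound how long the command may run
    const result = await branch.exec.run(code, { timeout_ms: timeoutMs })
    // Check result.ok before treating stdout as valid
    if (!result.ok) {
      throw new Error(`exit ${result.exit_code}: ${result.stderr}`)
    }
    return result.stdout
  } finally {
    // Clean up branches: runs on success, failure, and thrown errors alike
    await branch.kill()
  }
}
```

A caller would write something like `const out = await runInBranch(sandbox, 'ls /tmp', 5_000)` and handle the thrown error as the failure case.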