Skip to content

AI Agents

Why Iris for AI Agents?

AI agents that execute code face a fundamental problem: code can fail, corrupt state, or cause unintended side effects.

Iris solves this with instant forks:

  1. Fork before risky operations
  2. Execute AI-generated code in the branch
  3. Discard the branch on failure — original is untouched

Basic Pattern

import { Sandbox } from '@iris/sdk'
import { generateCode } from './your-llm'
async function safeExecute(sandbox: Sandbox, task: string) {
const branch = await sandbox.fork()
try {
const code = await generateCode(task)
const result = await branch.exec.run(code)
if (!result.ok) {
throw new Error(result.stderr)
}
return { success: true, result }
} catch (error) {
await branch.kill()
return { success: false, error }
}
}

ReAct Agent Example

import { Sandbox } from '@iris/sdk'
import Anthropic from '@anthropic-ai/sdk'
const anthropic = new Anthropic()
async function reactAgent(task: string) {
const sandbox = await Sandbox.create()
const messages: Anthropic.MessageParam[] = []
while (true) {
const response = await anthropic.messages.create({
model: 'claude-opus-4-7',
max_tokens: 4096,
messages: [
{ role: 'user', content: task },
...messages,
],
tools: [{
name: 'execute_code',
description: 'Execute a shell command in the sandbox',
input_schema: {
type: 'object' as const,
properties: {
code: { type: 'string', description: 'Shell command to run' },
},
required: ['code'],
},
}],
})
if (response.stop_reason === 'end_turn') break
const toolResults: Anthropic.ToolResultBlockParam[] = []
for (const block of response.content) {
if (block.type === 'tool_use') {
const input = block.input as { code: string }
// Fork before execution — original sandbox is preserved on failure
const branch = await sandbox.fork()
const result = await branch.exec.run(input.code)
if (result.ok) {
toolResults.push({
type: 'tool_result',
tool_use_id: block.id,
content: result.stdout,
})
} else {
await branch.kill()
toolResults.push({
type: 'tool_result',
tool_use_id: block.id,
content: `Error (exit ${result.exit_code}): ${result.stderr}`,
is_error: true,
})
}
}
}
messages.push({ role: 'assistant', content: response.content })
if (toolResults.length > 0) {
messages.push({ role: 'user', content: toolResults })
}
}
await sandbox.kill()
}

Multi-Step Workflows

Checkpoint at each successful step so you can fork back to the last good state:

const checkpoints: string[] = []
for (const step of workflow) {
const cp = await sandbox.checkpoint.create({ name: step.name })
checkpoints.push(cp.checkpoint_id)
const result = await sandbox.exec.run(step.code)
if (!result.ok) {
// Roll back to the last good checkpoint and retry or bail
const lastGood = checkpoints[checkpoints.length - 2]
if (lastGood) await sandbox.checkpoint.restore(lastGood)
console.error(`Step ${step.name} failed:`, result.stderr)
break
}
}

Parallel Exploration

Fork from a decision point to explore multiple approaches simultaneously:

await sandbox.exec.run('python3 setup.py')
const approaches = ['approach_a.py', 'approach_b.py', 'approach_c.py']
const results = await Promise.all(
approaches.map(async (approach) => {
const branch = await sandbox.fork()
const result = await branch.exec.run(`python3 ${approach}`)
await branch.kill()
return { approach, stdout: result.stdout, ok: result.ok }
}),
)
const best = results.find((r) => r.ok)

Best Practices

Fork, don't mutate

Fork before any action the agent might want to roll back. Keep the base sandbox clean.

Use timeouts

Set timeout_ms on exec.run() to prevent runaway processes from blocking your agent.

Check result.ok

result.ok is true only when exit_code === 0. Always check before treating output as valid.

Clean up branches

Call branch.kill() on completed forks. Suspended sandboxes still consume quota.