Repository: ComposioHQ/agent-orchestrator
Analysis Date: 2026-02-22
Analyst: Claude Opus 4.6
Source Path: /tmp/ai-harness-repos/agent-orchestrator/
Report Length Target: 2000+ lines of detailed analysis
- Design Philosophy & Goals
- Core Architecture
- Harness Workflow
- Subagent Orchestration
- Multi-Agent & Parallelization Strategy
- Isolation Model
- Human-in-the-Loop Controls
- Context Handling
- Session Lifecycle
- Code Quality Gates
- Security & Compliance
- Hooks & Automation
- CLI & UX
- Cost & Usage Visibility
- Tooling & Dependencies
- External Integrations
- Operational Assumptions & Prerequisites
- Failure Modes & Recovery
- Governance & Guardrails
- Roadmap & Evolution Signals
- What to Borrow / Adapt into Maestro
- Cross-Links
Confidence: High
Agent Orchestrator (AO) positions itself as a parallel AI coding agent harness with a clear tagline from the README:
"Spawn parallel AI coding agents. Monitor from one dashboard. Merge their PRs."
This is not a general-purpose AI orchestration framework. It is laser-focused on software development workflows where multiple AI coding agents work on different issues simultaneously, each in an isolated workspace, producing pull requests that a human reviews and merges.
Source: /tmp/ai-harness-repos/agent-orchestrator/README.md (lines 1-10)
The codebase reveals several deliberate design choices:
- Plugin-Everything Architecture: Every capability is behind a plugin interface — runtime, agent, workspace, tracker, SCM, notifier, terminal, lifecycle. This allows swapping implementations without touching core logic.
- Process Isolation via tmux: Rather than embedding agents in-process, AO spawns them as independent terminal processes inside tmux sessions. This is a pragmatic choice: Claude Code, Codex, Aider, and OpenCode are all CLI tools that expect a terminal environment.
- Flat-File State Over Databases: All session state lives in the filesystem as key=value metadata files. No SQLite, no Postgres, no Redis. This trades query capability for operational simplicity — you can debug state with `cat` and `ls`.
- Polling Over Event-Driven: The lifecycle manager polls every 30 seconds. The web dashboard polls every 5 seconds via SSE. There is no event bus, no pub/sub, no WebSocket push from core. This is explicitly acknowledged as a limitation.
- Fail-Open for Enrichment, Fail-Closed for Safety: PR enrichment (CI status, reviews) has timeouts and falls back gracefully. But CI status detection for open PRs is fail-closed — if the GitHub API errors, it reports "failing" rather than "none," preventing premature merges.
- Developer-Local First: The entire system runs on a single developer's machine. There is no multi-user support, no cloud deployment story, no containerization. The "server" is a Next.js dev server on localhost.
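The flat-file state model is simple enough to illustrate with a minimal parser for key=value metadata files. This is a sketch, not AO's actual parser, and the field names in the example are illustrative:

```typescript
// Sketch of parsing an AO-style key=value metadata file.
// Field names below are illustrative, not taken from the real format.
function parseMetadata(content: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue; // skip blanks/comments
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue; // ignore malformed lines
    result[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
  }
  return result;
}

const meta = parseMetadata("session_id=ao-123\nproject_id=demo\n");
// meta.session_id === "ao-123", meta.project_id === "demo"
```

The appeal of this format is exactly what the design note claims: any line-oriented Unix tool can inspect or repair it.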
- Extremely pragmatic: Instead of building a complex IPC system, they leverage tmux — a battle-tested terminal multiplexer that already handles process management, session persistence, and output capture.
- Low barrier to entry: If you have tmux and a coding agent CLI, you can start using AO immediately. No infrastructure setup required.
- Plugin system is well-designed: Clean interfaces with manifest metadata, Zod validation, and type-safe registration.
- Single-machine constraint: No distributed execution. All agents run on one machine, sharing CPU/memory/disk.
- No persistence guarantees: If the machine reboots, tmux sessions are lost. Session restoration depends on the agent supporting `--resume`.
- Polling latency: 30-second lifecycle polling means state changes can take up to 30 seconds to be detected and reacted to.
- No cost controls: While cost is tracked (see Section 14), there are no budget limits, spending alerts, or automatic shutoff mechanisms.
The README lists support for agents: Claude Code, Codex CLI, Aider, OpenCode. However:
- Proven: Claude Code plugin is 786 lines of deeply integrated code with JSONL parsing, activity detection, cost extraction, session restoration, and workspace hooks.
- Aspirational: Codex, Aider, and OpenCode plugins exist but are significantly thinner. The plugin registry lists them (`packages/core/src/plugin-registry.ts`, lines 20-23), but several have placeholder implementations.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/plugin-registry.ts (lines 14-30)
Confidence: High
agent-orchestrator/
├── packages/
│ ├── core/ # Types, config, session manager, lifecycle, plugins
│ ├── cli/ # Commander.js CLI (ao command)
│ ├── web/ # Next.js dashboard
│ ├── plugins/
│ │ ├── agent-claude-code/
│ │ ├── runtime-tmux/
│ │ ├── workspace-worktree/
│ │ ├── scm-github/
│ │ ├── tracker-github/
│ │ ├── tracker-linear/
│ │ ├── notifier-desktop/
│ │ └── notifier-slack/
│ └── integration-tests/
├── pnpm-workspace.yaml
└── agent-orchestrator.yaml.example
Source: /tmp/ai-harness-repos/agent-orchestrator/pnpm-workspace.yaml (lines 1-3)
The monorepo uses pnpm workspaces with two package locations: packages/* and packages/plugins/*. All packages are ESM-only ("type": "module" in root package.json) with TypeScript in strict mode.
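Based on the cited three-line file, the workspace definition is presumably along these lines (a sketch consistent with the two package globs named above):

```yaml
# pnpm-workspace.yaml (reconstructed from the description; verify against the repo)
packages:
  - "packages/*"
  - "packages/plugins/*"
```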
The plugin architecture defines eight distinct capability slots:
| Slot | Purpose | Built-in Implementations |
|---|---|---|
| `runtime` | Process execution environment | tmux, process |
| `agent` | AI coding agent | claude-code, codex, aider, opencode |
| `workspace` | Code isolation | worktree, clone |
| `tracker` | Issue tracking | github, linear |
| `scm` | Source code management | github |
| `notifier` | Notifications | desktop, slack, composio, webhook |
| `terminal` | Terminal UI integration | iterm2, web |
| `lifecycle` | State machine customization | core (default) |
Each plugin implements a specific TypeScript interface and is registered with a manifest:
// From types.ts, lines 900-930
export interface PluginManifest {
name: string; // e.g., "tmux"
slot: string; // e.g., "runtime"
version: string;
description?: string;
}
export interface PluginModule<T = unknown> {
manifest: PluginManifest;
create: (ctx?: PluginContext) => T | Promise<T>;
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/types.ts (lines 900-960)
The registry is a simple Map keyed by "slot:name":
// plugin-registry.ts
const plugins = new Map<string, PluginModule>();
function register(mod: PluginModule): void {
const key = `${mod.manifest.slot}:${mod.manifest.name}`;
plugins.set(key, mod);
}
function get<T>(slot: string, name: string): T {
const key = `${slot}:${name}`;
const mod = plugins.get(key);
if (!mod) throw new Error(`Plugin not found: ${key}`);
return mod.create() as T;
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/plugin-registry.ts
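To make the pattern concrete, here is a self-contained sketch of registering and resolving a toy plugin through a Map keyed by `slot:name`. The "echo" plugin and its interface are hypothetical, invented for illustration:

```typescript
// Self-contained sketch of the slot:name registry pattern.
// The interfaces mirror the PluginManifest/PluginModule shapes quoted above.
interface PluginManifest { name: string; slot: string; version: string; }
interface PluginModule<T = unknown> {
  manifest: PluginManifest;
  create: () => T;
}

const registry = new Map<string, PluginModule>();

function register(mod: PluginModule): void {
  registry.set(`${mod.manifest.slot}:${mod.manifest.name}`, mod);
}

function get<T>(slot: string, name: string): T {
  const mod = registry.get(`${slot}:${name}`);
  if (!mod) throw new Error(`Plugin not found: ${slot}:${name}`);
  return mod.create() as T;
}

// A toy "echo" runtime plugin (hypothetical, not part of AO):
register({
  manifest: { name: "echo", slot: "runtime", version: "0.0.1" },
  create: () => ({ send: (msg: string) => `echo:${msg}` }),
});

const runtime = get<{ send: (m: string) => string }>("runtime", "echo");
// runtime.send("hi") === "echo:hi"
```

Resolution is a single Map lookup, and a missing plugin fails loudly at `get` time rather than at registration time.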
The web package cannot use dynamic import() due to webpack bundling constraints, so it imports plugins statically:
// packages/web/src/lib/services.ts, lines 25-30
import pluginRuntimeTmux from "@composio/ao-plugin-runtime-tmux";
import pluginAgentClaudeCode from "@composio/ao-plugin-agent-claude-code";
import pluginWorkspaceWorktree from "@composio/ao-plugin-workspace-worktree";
import pluginScmGithub from "@composio/ao-plugin-scm-github";
import pluginTrackerGithub from "@composio/ao-plugin-tracker-github";
import pluginTrackerLinear from "@composio/ao-plugin-tracker-linear";

This is a practical workaround but creates a maintenance burden — new plugins must be manually added to this import list.
The web package uses a globalThis-cached singleton pattern for services initialization:
// packages/web/src/lib/services.ts, lines 38-58
const globalForServices = globalThis as typeof globalThis & {
_aoServices?: Services;
_aoServicesInit?: Promise<Services>;
};
export function getServices(): Promise<Services> {
if (globalForServices._aoServices) {
return Promise.resolve(globalForServices._aoServices);
}
if (!globalForServices._aoServicesInit) {
globalForServices._aoServicesInit = initServices().catch((err) => {
globalForServices._aoServicesInit = undefined;
throw err;
});
}
return globalForServices._aoServicesInit;
}

Note the error recovery: if initialization fails, the cached promise is cleared so subsequent calls retry rather than permanently returning a rejected promise.
AO uses a SHA-256 hash of the config file's directory path to create globally unique namespaces:
// paths.ts
export function generateConfigHash(configDir: string): string {
const resolved = realpathSync(configDir);
return createHash("sha256").update(resolved).digest("hex").slice(0, 12);
}

The directory hierarchy:
~/.agent-orchestrator/
{12-char-hash}-{projectId}/
sessions/
{sessionName}/
metadata # key=value flat file
prompt.md # agent system prompt
archive/
{sessionName}_{timestamp} # archived metadata
worktrees/
{sessionName}/ # git worktree checkout
Source: /tmp/ai-harness-repos/agent-orchestrator/ARCHITECTURE.md (full document)
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/paths.ts
The hash is computed from the resolved path (symlinks followed via realpathSync), meaning /foo/bar and /foo/bar-link -> /foo/bar hash to the same value. This prevents accidental duplication.
Collision detection is also implemented — each instance directory contains an .origin file storing the original path. If two different config directories produce the same hash prefix, the system will detect and error.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/paths.ts (validateAndStoreOrigin function)
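The origin-file collision check can be sketched as follows. The helper name and exact behavior here are hypothetical; the real logic lives in `validateAndStoreOrigin` in paths.ts:

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import path from "node:path";

// Sketch of origin-based hash-collision detection: each instance directory
// stores the config path it was created for in a .origin file. A mismatch
// means two different config paths produced the same hash prefix.
function validateOrigin(instanceDir: string, configDir: string): void {
  const originFile = path.join(instanceDir, ".origin");
  if (!existsSync(originFile)) {
    mkdirSync(instanceDir, { recursive: true });
    writeFileSync(originFile, configDir, "utf-8");
    return;
  }
  const stored = readFileSync(originFile, "utf-8");
  if (stored !== configDir) {
    throw new Error(`Hash collision: ${instanceDir} belongs to ${stored}`);
  }
}
```

With a 12-character (48-bit) hex prefix, accidental collisions are vanishingly rare, but the check turns a silent state-mixing bug into a hard error.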
The central types.ts file is 1084 lines and defines the entire domain model. Key types:
Session (the core entity):
export interface Session {
id: string;
projectId: string;
status: SessionStatus;
activity: ActivityState;
branch?: string;
issueId?: string;
pr?: { number: number; url: string; title?: string };
workspacePath?: string;
runtimeHandle?: RuntimeHandle;
agentInfo?: AgentInfo;
timestamps: SessionTimestamps;
metadata: SessionMetadata;
}

SessionStatus (the state machine):
export const SESSION_STATUS = {
SPAWNING: "spawning",
WORKING: "working",
PR_OPEN: "pr_open",
CI_FAILED: "ci_failed",
REVIEW_PENDING: "review_pending",
CHANGES_REQUESTED: "changes_requested",
APPROVED: "approved",
MERGEABLE: "mergeable",
MERGED: "merged",
CLEANUP: "cleanup",
NEEDS_INPUT: "needs_input",
STUCK: "stuck",
ERRORED: "errored",
KILLED: "killed",
DONE: "done",
TERMINATED: "terminated",
} as const;

ActivityState (runtime observation):
export const ACTIVITY_STATE = {
ACTIVE: "active",
IDLE: "idle",
WAITING_INPUT: "waiting_input",
BLOCKED: "blocked",
EXITED: "exited",
UNKNOWN: "unknown",
} as const;

EventType (33 distinct event types triggering reactions):
export const EVENT_TYPE = {
SESSION_SPAWNED: "session.spawned",
SESSION_KILLED: "session.killed",
AGENT_ACTIVE: "agent.active",
AGENT_IDLE: "agent.idle",
AGENT_STUCK: "agent.stuck",
AGENT_NEEDS_INPUT: "agent.needs_input",
AGENT_EXITED: "agent.exited",
PR_OPENED: "pr.opened",
PR_MERGED: "pr.merged",
CI_PASSING: "ci.passing",
CI_FAILING: "ci.failing",
CI_PENDING: "ci.pending",
REVIEW_APPROVED: "review.approved",
REVIEW_CHANGES_REQUESTED: "review.changes_requested",
// ... 19 more
} as const;

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/types.ts (lines 1-1084)
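A note on the `as const` pattern: it lets union types be derived from the constant objects rather than maintained by hand. A sketch using the ActivityState values quoted above:

```typescript
// Deriving a string-literal union from an `as const` object, as the
// SESSION_STATUS / ACTIVITY_STATE / EVENT_TYPE constants permit.
const ACTIVITY_STATE = {
  ACTIVE: "active",
  IDLE: "idle",
  WAITING_INPUT: "waiting_input",
  BLOCKED: "blocked",
  EXITED: "exited",
  UNKNOWN: "unknown",
} as const;

type ActivityState = (typeof ACTIVITY_STATE)[keyof typeof ACTIVITY_STATE];
// ActivityState = "active" | "idle" | "waiting_input" | "blocked" | "exited" | "unknown"

// Runtime guard for untrusted strings (e.g., values read from metadata files):
function isActivityState(value: string): value is ActivityState {
  return (Object.values(ACTIVITY_STATE) as string[]).includes(value);
}
```

The derived union and the runtime guard stay in sync automatically: adding a new state to the object updates both.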
Confidence: High
The typical AO workflow proceeds as follows:
- Configuration: User creates `agent-orchestrator.yaml` defining projects, plugins, and reactions.
- Start: `ao start` launches the dashboard and spawns an orchestrator meta-agent.
- Spawn: The orchestrator (or user) spawns worker sessions via `ao spawn <project> <issue>`.
- Work: Each worker agent runs in its own tmux session with an isolated git worktree.
- Monitor: The lifecycle manager polls every 30 seconds, tracking status transitions.
- React: When events occur (CI failure, review request, etc.), the reaction engine sends messages to agents or notifies humans.
- Review: PRs appear on the dashboard. Humans review and merge (or agents auto-merge if configured).
- Cleanup: After merge, sessions are cleaned up — worktrees removed, metadata archived.
The sessionManager.spawn() method in session-manager.ts is the most complex operation. Here is the exact sequence:
1. Validate issue exists (tracker.getIssue)
2. Generate session prefix from issue title
3. Reserve session ID atomically (O_EXCL file creation)
4. Create workspace (git worktree with new branch)
5. Run post-create hooks (symlinks, commands)
6. Build agent prompt (3-layer composition)
7. Get agent launch command
8. Get agent environment variables
9. Create runtime (tmux session)
10. Send launch command to runtime
11. Write metadata file (session_id, project_id, issue_id, branch, etc.)
12. Run post-launch setup (e.g., write Claude hooks)
At each step, failure triggers cleanup of previously completed steps:
// session-manager.ts, spawn method (simplified)
try {
const workspace = await workspacePlugin.create(...);
try {
const handle = await runtimePlugin.create(...);
try {
await runtimePlugin.sendMessage(handle, launchCommand);
await writeMetadata(...);
await agentPlugin.setupWorkspaceHooks?.(...);
} catch (err) {
await runtimePlugin.destroy(handle);
throw err;
}
} catch (err) {
await workspacePlugin.destroy(workspace);
throw err;
}
} catch (err) {
// Clean up session ID reservation
await deleteSessionDir(sessionDir);
throw err;
}Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/session-manager.ts (spawn method, approximately lines 80-250)
The ao batch-spawn command handles spawning multiple sessions:
// packages/cli/src/commands/spawn.ts (batch-spawn)
// 1. Check for duplicates against existing sessions
// 2. Check for duplicates within the batch
// 3. Spawn sequentially with 500ms delays
// 4. Report summary (success/failure counts)

The 500ms delay between spawns is a pragmatic rate-limiting measure to avoid overwhelming the system.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/spawn.ts
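The sequential loop with a fixed inter-spawn delay reduces to a few lines. This is a sketch; `spawnOne` is a hypothetical stand-in for the real per-session spawn call:

```typescript
// Sketch of sequential batch spawning with a fixed inter-spawn delay.
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function batchSpawn(
  issueIds: string[],
  spawnOne: (issueId: string) => Promise<void>, // hypothetical spawn call
  delayMs = 500,
): Promise<{ ok: string[]; failed: string[] }> {
  const ok: string[] = [];
  const failed: string[] = [];
  for (const [i, issueId] of issueIds.entries()) {
    try {
      await spawnOne(issueId);
      ok.push(issueId);
    } catch {
      failed.push(issueId); // keep going; summarize at the end
    }
    if (i < issueIds.length - 1) await sleep(delayMs);
  }
  return { ok, failed };
}
```

One failed spawn does not abort the batch; the summary at the end reports both lists, matching the "report summary" step above.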
Confidence: High
AO has a two-tier orchestration model:
- Tier 1 — The Orchestrator: A special agent session (suffixed `-orchestrator`) that receives a comprehensive system prompt listing all AO CLI commands. It can spawn workers, check status, send messages, and manage the workflow.
- Tier 2 — Worker Agents: Individual coding agents, each assigned to a single issue.
The orchestrator is spawned by ao start:
// packages/cli/src/commands/start.ts
const orchestratorPrompt = generateOrchestratorPrompt(config, project);
await sessionManager.spawnOrchestrator({
projectId,
prompt: orchestratorPrompt,
});

The orchestrator prompt (generated in orchestrator-prompt.ts) includes:
- Project information (repo, branch, tracker)
- Quick-start section showing how to spawn agents
- Complete command reference table
- Session management workflows
- Dashboard information
- Configured reaction rules
- Common workflow patterns (bulk issue processing, stuck agent handling, PR review flow)
- Tips for effective orchestration
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/orchestrator-prompt.ts (full file)
The orchestrator communicates with AO only through the CLI — it runs ao spawn, ao status, ao send, etc. as shell commands in its tmux session. There is no programmatic API between the orchestrator agent and the AO core.
This is both a strength and a limitation:
- Strength: The orchestrator uses the same interface as a human. No special plumbing needed.
- Limitation: Shell command parsing introduces latency and potential for error. The orchestrator must interpret CLI text output.
Worker agents receive their initial task via the system prompt and their first message (the issue content). Subsequent communication happens through runtime.sendMessage():
// runtime-tmux/src/index.ts, sendMessage method
async sendMessage(handle: RuntimeHandle, message: string): Promise<void> {
// Clear any partial input first
await sendKeys(handle.id, "C-u", false);
if (message.length > 200) {
// Use tmux named buffer for long messages
await loadBuffer(handle.id, message);
await pasteBuffer(handle.id);
await sleep(300);
await sendKeys(handle.id, "Enter", false);
} else {
await sendKeys(handle.id, message, true);
}
}

The 200-character threshold and named-buffer approach is a workaround for tmux's key-sending limitations. Messages longer than ~1000 characters can be corrupted when sent character-by-character, so the load-buffer/paste-buffer approach is used instead.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/runtime-tmux/src/index.ts (lines 70-95)
Worker agents cannot communicate with each other directly. All coordination goes through:
- The orchestrator (via `ao send`)
- Git (via shared repository)
- GitHub/Linear (via issue comments and PR reviews)
This is a deliberate design choice — it prevents complex agent interaction patterns but keeps the system simple and auditable.
Confidence: High
AO's parallelism is embarrassingly parallel — each agent works on an independent issue in an independent workspace. There is no:
- Shared memory between agents
- Lock coordination
- Task dependency graphs
- Work-stealing queues
- Agent-to-agent communication channels
This simplicity is the system's greatest strength for its intended use case. Each agent produces an independent PR. Conflicts, if any, are handled at the git level (merge conflicts in the target branch).
The system imposes no resource limits at the orchestration level. Each tmux session runs an AI agent process that:
- Consumes API tokens (Claude, OpenAI, etc.)
- Uses CPU for local processing
- Uses disk for workspace files
- Uses network bandwidth for API calls and git operations
There is no mechanism to:
- Limit the number of concurrent sessions
- Throttle API call rates across agents
- Set memory or CPU limits per agent
- Define a total budget ceiling
The only rate-limiting is the 500ms delay between batch spawns.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/spawn.ts (batch-spawn, sequential spawning loop)
The lifecycle manager polls all active sessions concurrently:
// lifecycle-manager.ts, pollAll method
const results = await Promise.allSettled(
activeSessions.map(session => this.pollSession(session))
);But it has a re-entrancy guard to prevent overlapping poll cycles:
if (this._polling) return;
this._polling = true;
try {
// ... poll all sessions
} finally {
this._polling = false;
}

This means if a poll cycle takes longer than 30 seconds (e.g., due to slow GitHub API calls), the next cycle is skipped rather than creating concurrent polls.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/lifecycle-manager.ts (pollAll method)
The web API enriches session data in parallel with timeouts:
// packages/web/src/app/api/sessions/route.ts, lines 39-52
// Metadata enrichment: 3 second timeout
const metaTimeout = new Promise<void>((resolve) => setTimeout(resolve, 3_000));
await Promise.race([enrichSessionsMetadata(...), metaTimeout]);
// PR enrichment: 4 second timeout
const enrichPromises = workerSessions.map((core, i) => {
if (!core.pr) return Promise.resolve();
return enrichSessionPR(dashboardSessions[i], scm, core.pr);
});
const enrichTimeout = new Promise<void>((resolve) => setTimeout(resolve, 4_000));
await Promise.race([Promise.allSettled(enrichPromises), enrichTimeout]);

The dual timeout approach (3s for metadata, 4s for PR data) ensures the dashboard remains responsive even when external APIs are slow. If enrichment times out, the dashboard shows stale or incomplete data rather than hanging.
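The pattern generalizes to a small helper: race the real work against a timer that resolves with a fallback instead of rejecting. This is a sketch, not AO code:

```typescript
// Race a promise against a timeout that resolves with a fallback rather
// than rejecting; the caller gets stale/partial data instead of an error.
function withTimeout<T>(p: Promise<T>, ms: number, fallback: T): Promise<T> {
  const timeout = new Promise<T>((resolve) =>
    setTimeout(() => resolve(fallback), ms),
  );
  return Promise.race([p, timeout]);
}
```

Resolving (rather than rejecting) on timeout is the key design choice: enrichment is best-effort, so a slow GitHub API degrades the dashboard instead of breaking it.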
Confidence: High
Each agent session gets its own git worktree — a separate checkout of the same repository on a different branch:
// workspace-worktree/src/index.ts, create method (simplified)
async create(options: WorkspaceCreateOptions): Promise<string> {
const worktreePath = path.join(worktreeBaseDir, sessionId);
// Fetch latest from origin
await execFile("git", ["fetch", "origin"], { cwd: repoPath });
// Create worktree with new branch from origin/defaultBranch
await execFile("git", [
"worktree", "add",
"-b", branchName,
worktreePath,
`origin/${defaultBranch}`,
], { cwd: repoPath });
return worktreePath;
}

Git worktrees are lightweight (they share the .git object store) but provide complete filesystem isolation. Each agent has its own working directory, index, and branch.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/workspace-worktree/src/index.ts (lines 30-100)
Each agent runs in a separate tmux session with its own:
- PTY (pseudo-terminal)
- Environment variables
- Process tree
- Working directory
// runtime-tmux/src/index.ts, create method
async create(options: RuntimeCreateOptions): Promise<RuntimeHandle> {
await newSession({
name: tmuxSessionName,
startDir: options.workspacePath,
env: options.environment,
detached: true,
});
// Send the launch command
await sendKeys(tmuxSessionName, launchCommand);
return {
id: tmuxSessionName,
type: "tmux",
data: { createdAt: Date.now() },
};
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/runtime-tmux/src/index.ts (lines 20-55)
What IS isolated:
- Filesystem (separate worktrees, separate branches)
- Process (separate tmux sessions)
- Environment variables (set per session)
- Git state (separate index, HEAD, working tree)
What is NOT isolated:
- Network: All agents share the same network. One agent making excessive API calls affects others.
- Credentials: All agents share the same `gh` CLI authentication, the same `~/.claude` config, the same API keys.
- CPU/Memory: No cgroups, no containers, no resource limits.
- Git remote: All worktrees push to the same remote. Branch name collisions are possible (though mitigated by the naming convention).
- Agent configuration directories: Claude Code stores per-project settings in `~/.claude/projects/`. The `toClaudeProjectPath` function converts workspace paths to Claude's directory encoding, but multiple sessions for the same project could potentially interfere.
Branches are named by the tracker plugin:
// tracker-github/src/index.ts
branchName(issueId: string): string {
return `feat/issue-${issueId}`;
}
// tracker-linear/src/index.ts
branchName(issueId: string): string {
return `feat/${identifier}`; // e.g., feat/ENG-123
}

This deterministic naming means two sessions for the same issue would conflict. The batch-spawn command includes deduplication logic to prevent this:
// spawn.ts, batch-spawn
// Check for existing sessions with the same issue
const existing = sessions.filter(s => s.issueId === issueId);
if (existing.length > 0) {
console.warn(`Skipping issue ${issueId}: already has session ${existing[0].id}`);
continue;
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/spawn.ts
The workspace plugin supports symlinking shared resources into worktrees:
// workspace-worktree/src/index.ts, postCreate method
for (const link of project.symlinks ?? []) {
// Path traversal guard
const resolved = path.resolve(worktreePath, link.target);
if (!resolved.startsWith(worktreePath)) {
throw new Error(`Symlink target escapes workspace: ${link.target}`);
}
await fs.symlink(link.source, resolved);
}

This allows sharing large dependencies (like node_modules or build caches) across worktrees without duplicating them. The path traversal guard prevents symlinks from escaping the workspace directory.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/workspace-worktree/src/index.ts (postCreate method)
Confidence: High
The web dashboard provides a Kanban-style view of all sessions grouped by attention level:
// Dashboard.tsx, lines 24, 28-41
const KANBAN_LEVELS = ["working", "pending", "review", "respond", "merge"] as const;
const grouped = useMemo(() => {
const zones: Record<AttentionLevel, DashboardSession[]> = {
merge: [], // Ready to merge
respond: [], // Agent needs human input
review: [], // PR needs human review
pending: [], // Waiting for CI/other
working: [], // Agent actively coding
done: [], // Completed
};
for (const session of sessions) {
zones[getAttentionLevel(session)].push(session);
}
return zones;
}, [sessions]);

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/web/src/components/Dashboard.tsx (lines 24-41)
The dashboard exposes four actions:
- Send Message (`handleSend`): Send a text message to a running agent via `POST /api/sessions/:id/send`
- Kill Session (`handleKill`): Terminate an agent with a confirmation dialog via `POST /api/sessions/:id/kill`
- Merge PR (`handleMerge`): Merge a pull request via `POST /api/prs/:number/merge`
- Restore Session (`handleRestore`): Restore a killed/exited session via `POST /api/sessions/:id/restore`
// Dashboard.tsx, lines 50-86
const handleSend = async (sessionId: string, message: string) => {
const res = await fetch(`/api/sessions/${encodeURIComponent(sessionId)}/send`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ message }),
});
};
const handleKill = async (sessionId: string) => {
if (!confirm(`Kill session ${sessionId}?`)) return;
// ...
};
const handleMerge = async (prNumber: number) => {
const res = await fetch(`/api/prs/${prNumber}/merge`, { method: "POST" });
};
const handleRestore = async (sessionId: string) => {
if (!confirm(`Restore session ${sessionId}?`)) return;
// ...
};

The getAttentionLevel function (in @/lib/types) maps session state to human attention urgency. This drives both the dashboard layout and the dynamic favicon (showing counts of sessions needing attention).
The DynamicFavicon component updates the browser tab to show the project status at a glance, so a human can monitor multiple projects across browser tabs.
Humans are notified through multiple channels:
- Desktop: OS-native notifications (macOS `osascript`, Linux `notify-send`)
- Slack: Rich Block Kit messages to webhook URLs
- Composio: (mentioned in config but plugin not explored in detail)
- Webhook: Generic HTTP webhook
Notifications are routed by priority:
# agent-orchestrator.yaml.example
notificationRouting:
critical: [slack, desktop]
high: [slack, desktop]
normal: [slack]
low: [slack]

Source: /tmp/ai-harness-repos/agent-orchestrator/agent-orchestrator.yaml.example
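Applying this routing table is a single lookup. A sketch, with the priority type and function names invented here for illustration:

```typescript
// Sketch: resolve the notifier channels configured for a given priority,
// mirroring the notificationRouting block from the example config.
type Priority = "critical" | "high" | "normal" | "low";

const notificationRouting: Record<Priority, string[]> = {
  critical: ["slack", "desktop"],
  high: ["slack", "desktop"],
  normal: ["slack"],
  low: ["slack"],
};

function channelsFor(priority: Priority): string[] {
  return notificationRouting[priority];
}
// channelsFor("critical") -> ["slack", "desktop"]
```

Each returned channel name would then be resolved through the notifier plugin slot, keeping routing policy in config rather than code.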
- Before spawn: Human (or orchestrator) decides which issues to assign
- During work: Human can send messages to guide the agent
- At PR creation: Human reviews the PR on GitHub
- At merge: Human (or auto-merge) decides when to merge
- On failure: Human can kill, restore, or send instructions
- Kill switch: `ao stop` terminates everything
- No approval gates: There is no mechanism to require human approval before an agent takes a specific action (e.g., deploying, running tests, modifying security-sensitive files).
- No content filtering: Agent outputs are not screened before being committed or pushed.
- No rollback: If a PR is merged and breaks something, there is no automated rollback mechanism.
- Message-only intervention: The only way to influence a running agent is to send it a text message. There is no way to modify its system prompt, change its tools, or restrict its actions mid-session.
Confidence: High
The prompt-builder.ts composes agent prompts from three layers:
Layer 1 — Base Agent Prompt (hardcoded in prompt-builder.ts):
You are working on a software engineering task...
- Follow the project's existing patterns and conventions
- Create focused, well-scoped commits
- Open a PR when your work is ready for review
- If CI fails, investigate and fix
- If review feedback is received, address it
Layer 2 — Config-Derived Context (from agent-orchestrator.yaml):
Project: {projectName}
Repository: {repo}
Default Branch: {defaultBranch}
Tracker: {tracker.plugin}
Issue: {issueTitle} ({issueUrl})
{issue.body}
Layer 3 — User Rules (from agentRules / agentRulesFile):
{agentRules string}
{contents of agentRulesFile}
The composition is done in buildPrompt():
// prompt-builder.ts (simplified)
export function buildPrompt(options: PromptOptions): string | null {
const parts: string[] = [BASE_AGENT_PROMPT];
if (options.projectName) {
parts.push(`## Project Context\nProject: ${options.projectName}`);
}
if (options.issue) {
parts.push(`## Task\n${options.issue.title}\n${options.issue.body}`);
}
if (options.agentRules) {
parts.push(`## Project Rules\n${options.agentRules}`);
}
if (options.agentRulesFile) {
const content = readFileSync(options.agentRulesFile, "utf-8");
parts.push(`## Additional Rules\n${content}`);
}
return parts.join("\n\n");
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/prompt-builder.ts
The orchestrator receives a much richer prompt generated by orchestrator-prompt.ts. This prompt is essentially an operations manual:
// orchestrator-prompt.ts (key sections)
function generateOrchestratorPrompt(config, project): string {
return `
# Agent Orchestrator — Control Prompt
You are the orchestrator for project "${project.name}".
## Quick Start
To spawn an agent for an issue: ao spawn ${project.name} <issue-id>
## Available Commands
| Command | Description |
| ao spawn | Spawn a worker session |
| ao status | Show all sessions |
| ao send | Send message to session |
| ao session kill | Kill a session |
| ao session restore | Restore a session |
| ao review-check | Check PR review status |
## Configured Reactions
${formatReactions(config.reactions)}
## Common Workflows
### Bulk Issue Processing
1. ao batch-spawn ${project.name} issue1 issue2 issue3
2. ao status (monitor progress)
3. Review PRs as they come in
### Handling Stuck Agents
1. Check status: ao status
2. Send guidance: ao send <session> "Try approach X"
3. If still stuck: ao session kill <session>; ao spawn ...
`;
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/orchestrator-prompt.ts
When spawning a session, the tracker plugin generates context from the issue:
// tracker-github/src/index.ts
async generatePrompt(issueId: string, repo: string): Promise<string> {
const issue = await this.getIssue(issueId, repo);
return [
`# Issue #${issue.number}: ${issue.title}`,
`URL: ${issue.url}`,
`State: ${issue.state}`,
issue.labels.length ? `Labels: ${issue.labels.join(", ")}` : "",
"",
issue.body,
].filter(Boolean).join("\n");
}

For Linear issues, the prompt includes more structured data:
// tracker-linear/src/index.ts
async generatePrompt(issueId: string): Promise<string> {
const issue = await this.getIssue(issueId);
return [
`# ${issue.identifier}: ${issue.title}`,
`URL: ${issue.url}`,
`State: ${issue.state}`,
`Priority: ${issue.priority}`,
issue.labels.length ? `Labels: ${issue.labels.join(", ")}` : "",
"",
issue.body,
].filter(Boolean).join("\n");
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/tracker-github/src/index.ts (lines 90-110)
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/tracker-linear/src/index.ts
When the reaction engine sends messages to agents, it composes context-aware messages:
// lifecycle-manager.ts (reaction execution, simplified)
if (reaction.action === "send-to-agent") {
const message = reaction.message ?? getDefaultMessage(eventType);
await sessionManager.send(session.id, message);
}Default messages are event-specific, e.g.:
- CI failed: "CI checks are failing. Please investigate the failures and fix them."
- Changes requested: "Review feedback has been received. Please address the requested changes."
- Merge conflicts: "There are merge conflicts. Please resolve them."
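The event-to-message mapping reduces to a lookup with a generic fallback. The message strings below are the ones quoted above; the merge-conflict event key and the fallback string are hypothetical:

```typescript
// Sketch of a getDefaultMessage-style lookup: lifecycle events map to
// canned guidance sent to agents. The "pr.conflicts" key and the fallback
// text are illustrative, not confirmed against the source.
const DEFAULT_MESSAGES: Record<string, string> = {
  "ci.failing":
    "CI checks are failing. Please investigate the failures and fix them.",
  "review.changes_requested":
    "Review feedback has been received. Please address the requested changes.",
  "pr.conflicts": "There are merge conflicts. Please resolve them.",
};

function getDefaultMessage(eventType: string): string {
  return (
    DEFAULT_MESSAGES[eventType] ??
    `Event ${eventType} occurred. Please take a look.`
  );
}
```

A reaction rule can override the default with its own `message`, which matches the `reaction.message ?? getDefaultMessage(eventType)` call shown earlier.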
- No conversation history: AO does not maintain or inject previous conversation context when sending messages to agents. Each message is stateless.
- No cross-session context: If Agent A discovers something relevant to Agent B, there is no mechanism to share that context.
- No dynamic context refresh: The agent's system prompt is set at spawn time and never updated. If the issue is updated on GitHub/Linear after spawning, the agent won't see the changes unless told explicitly.
- No context window management: AO does not track or manage the agent's context window usage. Long-running agents may lose their initial instructions as conversation history grows.
Confidence: High
The lifecycle manager implements a state machine with the following transitions:
spawning -> working (agent starts processing)
working -> pr_open (agent creates PR)
working -> needs_input (agent requests human input)
working -> stuck (agent appears stuck)
working -> errored (runtime dies unexpectedly)
pr_open -> ci_failed (CI checks fail)
pr_open -> review_pending (CI passes, awaiting review)
pr_open -> working (agent still working after PR creation)
ci_failed -> working (agent fixing CI issues)
ci_failed -> pr_open (CI re-run passes)
review_pending -> changes_requested (reviewer requests changes)
review_pending -> approved (reviewer approves)
changes_requested -> working (agent addressing feedback)
approved -> mergeable (CI passes + approved)
mergeable -> merged (PR merged)
merged -> cleanup -> done (workspace cleaned up)
needs_input -> working (human sends message)
stuck -> working (agent resumes)
ANY -> killed (human kills session)
ANY -> terminated (orchestrator terminates)
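The transition table can be encoded as a typed adjacency map. The sketch below is illustrative only: AO recomputes status on each poll via `determineStatus` rather than storing an explicit graph.

```typescript
// Illustrative typed encoding of the transition table above (cleanup is
// folded into merged -> done; not AO's actual representation).
type SessionStatus =
  | "spawning" | "working" | "pr_open" | "needs_input" | "stuck" | "errored"
  | "ci_failed" | "review_pending" | "changes_requested" | "approved"
  | "mergeable" | "merged" | "done" | "killed" | "terminated";

const TRANSITIONS: Partial<Record<SessionStatus, SessionStatus[]>> = {
  spawning: ["working"],
  working: ["pr_open", "needs_input", "stuck", "errored"],
  pr_open: ["ci_failed", "review_pending", "working"],
  ci_failed: ["working", "pr_open"],
  review_pending: ["changes_requested", "approved"],
  changes_requested: ["working"],
  approved: ["mergeable"],
  mergeable: ["merged"],
  merged: ["done"],
  needs_input: ["working"],
  stuck: ["working"],
};

// "killed" and "terminated" are reachable from any state.
function canTransition(from: SessionStatus, to: SessionStatus): boolean {
  if (to === "killed" || to === "terminated") return true;
  return TRANSITIONS[from]?.includes(to) ?? false;
}
```

For example, `canTransition("working", "pr_open")` holds, while `canTransition("merged", "working")` does not: once merged, a session only moves forward to cleanup or is killed.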
The determineStatus() function in lifecycle-manager.ts follows this priority order:
// lifecycle-manager.ts, determineStatus (simplified logic)
async function determineStatus(session: Session): Promise<SessionStatus> {
// 1. Runtime dead?
if (!session.runtimeHandle || !await runtime.isAlive(session.runtimeHandle)) {
return session.pr?.merged ? "done" : "errored";
}
// 2. Agent activity
const activity = await agent.getActivityState(session);
if (activity === "waiting_input") return "needs_input";
if (activity === "blocked") return "stuck";
if (activity === "exited") return session.pr ? "pr_open" : "done";
// 3. PR state
if (session.pr) {
const prState = await scm.getPRState(session.pr.number);
if (prState.merged) return "merged";
const ci = await scm.getCISummary(session.pr.number);
if (ci === "failing") return "ci_failed";
const review = await scm.getReviewDecision(session.pr.number);
if (review === "changes_requested") return "changes_requested";
if (review === "approved") {
const mergeable = await scm.getMergeability(session.pr.number);
if (mergeable.canMerge) return "mergeable";
return "approved";
}
return "review_pending";
}
// 4. Default
return "working";
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/lifecycle-manager.ts (determineStatus, approximately lines 100-200)
The Claude Code plugin provides two activity detection mechanisms:
Mechanism 1 — Terminal Output Parsing (deprecated):
// agent-claude-code/src/index.ts
classifyTerminalOutput(output: string): ActivityState {
  const lastLine = output.trimEnd().split("\n").at(-1) ?? "";
  // Look for prompt characters: ❯ > $ #
  if (/[❯>$#]\s*$/.test(lastLine)) return "idle";
  if (/permission/i.test(lastLine)) return "waiting_input";
  return "active";
}

Mechanism 2 — JSONL Introspection (preferred):
// agent-claude-code/src/index.ts, getActivityState
async getActivityState(session): Promise<ActivityState> {
// 1. Check if process is running
const processRunning = await this.isProcessRunning(session);
if (!processRunning) return "exited";
// 2. Read last JSONL entry from Claude's session file
const entry = await readLastJsonlEntry(sessionFile);
switch (entry.type) {
case "user":
case "tool_use":
case "progress":
return "active";
case "assistant":
case "summary":
case "result":
// Check idle threshold
if (Date.now() - entry.timestamp > readyThresholdMs) {
return "idle";
}
return "active"; // "ready" maps to "active" with threshold
case "permission_request":
return "waiting_input";
case "error":
return "blocked";
}
}

The JSONL approach reads Claude Code's internal session files (stored in ~/.claude/projects/), parsing only the last 128KB to avoid reading potentially 100MB+ files:
// agent-claude-code/src/index.ts
async parseJsonlFileTail(filePath: string): Promise<JsonlEntry[]> {
  const TAIL_BYTES = 128 * 1024; // 128KB
  const stat = await fs.stat(filePath);
  const start = Math.max(0, stat.size - TAIL_BYTES);
  // Read from the offset, split by newlines, parse each line as JSON
  // (simplified; the first line is dropped when it may be truncated)
  const fh = await fs.open(filePath, "r");
  const buf = Buffer.alloc(stat.size - start);
  await fh.read(buf, 0, buf.length, start);
  await fh.close();
  const lines = buf.toString("utf8").split("\n").slice(start > 0 ? 1 : 0);
  return lines.filter(Boolean).map((line) => JSON.parse(line));
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/agent-claude-code/src/index.ts (lines 300-500)
The cleanup process is multi-step:
// session-manager.ts, cleanup method
async cleanup(sessionId: string): Promise<void> {
const session = await this.get(sessionId);
// Check prerequisites
if (session.pr) {
const prState = await scm.getPRState(session.pr.number);
if (!prState.merged) {
throw new Error("Cannot cleanup: PR not yet merged");
}
}
// 1. Destroy runtime (kill tmux session)
if (session.runtimeHandle) {
await runtime.destroy(session.runtimeHandle);
}
// 2. Destroy workspace (remove git worktree)
if (session.workspacePath) {
await workspace.destroy(session.workspacePath);
}
// 3. Archive metadata
await archiveMetadata(session.id);
}

Notably, the workspace plugin does NOT delete the git branch when removing a worktree:
// workspace-worktree/src/index.ts, destroy method
// NOTE: Does NOT delete the branch (safety measure)
await execFile("git", ["worktree", "remove", "--force", worktreePath]);

This is a safety measure — branches are kept in case they need to be referenced later.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/workspace-worktree/src/index.ts (destroy method)
Sessions can be restored from archive:
// session-manager.ts, restore method
async restore(sessionId: string): Promise<Session> {
// 1. Find archived metadata
const archived = await readArchivedMetadata(sessionId);
// 2. Validate restorability
if (!isRestorable(archived)) {
throw new SessionNotRestorableError(sessionId, reason);
}
// 3. Recreate workspace if needed
if (!await workspace.exists(archived.workspacePath)) {
await workspace.restore(archived);
}
// 4. Try agent's restore command (e.g., claude --resume <uuid>)
const restoreCmd = await agent.getRestoreCommand(archived);
// 5. Create new runtime with restore command
const handle = await runtime.create({
launchCommand: restoreCmd ?? agent.getLaunchCommand(archived),
workspacePath: archived.workspacePath,
});
// 6. Write new metadata
await writeMetadata(sessionId, { ...archived, runtimeHandle: handle });
return session;
}

The Claude Code agent supports restoration via session UUID:
// agent-claude-code/src/index.ts, getRestoreCommand
async getRestoreCommand(session): Promise<string | null> {
// Find the Claude session UUID from JSONL files
const sessionUuid = await findSessionUuid(session.workspacePath);
if (!sessionUuid) return null;
return `claude --resume ${sessionUuid}`;
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/agent-claude-code/src/index.ts (getRestoreCommand method)
Confidence: Medium
The CI workflow runs on GitHub Actions:
# .github/workflows/ci.yml
jobs:
lint:
- pnpm lint
typecheck:
- pnpm --filter '!@composio/ao-web' build # Build non-web first
- pnpm --filter @composio/ao-web typecheck # Then check web
test:
- pnpm --filter '!@composio/ao-web' test
test-web:
- sudo apt-get install tmux # tmux needed for integration tests
- pnpm --filter @composio/ao-web test

Source: /tmp/ai-harness-repos/agent-orchestrator/.github/workflows/ci.yml
The project uses TypeScript strict mode:
// tsconfig.json (root)
{
"compilerOptions": {
"strict": true,
"module": "Node16",
"moduleResolution": "Node16"
}
}

The CLAUDE.md file codifies conventions:
- `.js` extensions in all imports (ESM requirement)
- `node:` prefix for Node.js builtins
- `type` keyword for type-only imports
- Zod for runtime validation of external data
Source: /tmp/ai-harness-repos/agent-orchestrator/CLAUDE.md
Configuration is validated with Zod schemas:
// config.ts
const ProjectSchema = z.object({
repo: z.string(),
path: z.string().optional(),
defaultBranch: z.string().default("main"),
sessionPrefix: z.string().optional(),
tracker: z.object({
plugin: z.string(),
// ...
}).optional(),
scm: z.object({
plugin: z.string(),
}).optional(),
// ...
});
const ConfigSchema = z.object({
dataDir: z.string().optional(),
port: z.number().optional(),
defaults: DefaultsSchema.optional(),
projects: z.record(ProjectSchema),
notifiers: z.record(z.any()).optional(),
reactions: z.record(ReactionSchema).optional(),
});

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/config.ts
- No linting rules visible: The `pnpm lint` command exists but the specific ESLint/Biome configuration was not explored.
- No test coverage requirements: No coverage thresholds or coverage reporting observed.
- No integration test suite: The `packages/integration-tests` directory exists but its contents were not fully explored.
- No end-to-end tests: No Playwright, Cypress, or similar E2E testing framework observed.
- No API contract testing: The web API endpoints have no schema validation on responses.
Confidence: High
The CLAUDE.md file mandates:
"Shell commands: ALWAYS use execFile with explicit argument arrays, NEVER use exec with string interpolation. Always set timeouts for child processes. Never interpolate user input into shell commands."
This is consistently followed throughout the codebase. Every shell command uses execFile:
// tmux.ts
import { execFile } from "node:child_process";
export function listSessions(): Promise<string[]> {
return new Promise((resolve, reject) => {
execFile("tmux", ["list-sessions", "-F", "#{session_name}"],
{ timeout: 5000 },
(err, stdout) => { /* ... */ }
);
});
}

`exec` is never used anywhere in the codebase. This is the single most important security measure — it eliminates an entire class of command injection vulnerabilities.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/tmux.ts
Source: /tmp/ai-harness-repos/agent-orchestrator/CLAUDE.md
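The difference is easy to demonstrate: with `execFile`, shell metacharacters in an argument reach the binary verbatim and are never interpreted. The snippet below is illustrative, not from the codebase.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// With execFile, arguments are passed to the binary directly: no shell
// parses the string, so metacharacters in user input cannot splice in a
// second command the way exec(`echo ${userInput}`) would.
async function echoLiteral(userInput: string): Promise<string> {
  const { stdout } = await run("echo", [userInput], { timeout: 5000 });
  return stdout.trim();
}
```

`echoLiteral("x; echo INJECTED")` resolves to the literal string `x; echo INJECTED`, whereas the banned `exec` pattern with the same input would execute two commands.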
Multiple layers of defense:
- Session ID validation: `validateSessionId` uses regex `/^[a-zA-Z0-9_-]+$/` to reject path traversal characters.
// metadata.ts
export function validateSessionId(id: string): void {
if (!/^[a-zA-Z0-9_-]+$/.test(id)) {
throw new Error(`Invalid session ID: ${id}`);
}
}

- Symlink target validation: The workspace plugin validates that symlink targets don't escape the workspace:
// workspace-worktree/src/index.ts
const resolved = path.resolve(worktreePath, link.target);
if (!resolved.startsWith(worktreePath)) {
throw new Error(`Symlink target escapes workspace: ${link.target}`);
}

- URL encoding in API routes: Session IDs are encoded/decoded when used in URL paths.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/metadata.ts
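One caveat on the symlink check above: a plain `startsWith` prefix test can be fooled by sibling directories that share a prefix (`/ws-evil` passes a naive check against `/ws`). A stricter variant, shown here as an illustrative sketch rather than AO's code, uses `path.relative`:

```typescript
import path from "node:path";

// Stricter containment check (illustrative, not AO's code): path.relative
// avoids the sibling-prefix pitfall where "/ws-evil/x".startsWith("/ws")
// is true even though the target lies outside the workspace.
function escapesWorkspace(worktreePath: string, target: string): boolean {
  const resolved = path.resolve(worktreePath, target);
  const rel = path.relative(worktreePath, resolved);
  return rel.startsWith("..") || path.isAbsolute(rel);
}
```

With this variant, `../etc/passwd` is rejected while `src/index.ts` passes, and `/ws-evil/x` is correctly flagged as outside `/ws`.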
The security CI workflow runs three checks:
# .github/workflows/security.yml
jobs:
gitleaks:
- uses: gitleaks/gitleaks-action@v2
with:
args: "--full-history" # Scan entire git history
dependency-review:
- uses: actions/dependency-review-action@v4
with:
fail-on-severity: moderate # Block moderate+ vulns
npm-audit:
- run: pnpm audit --audit-level high --prod # Strict on prod deps

Source: /tmp/ai-harness-repos/agent-orchestrator/.github/workflows/security.yml
- No credential storage: AO does not store any credentials itself. It relies on ambient credentials (`gh auth`, `LINEAR_API_KEY`, `SLACK_WEBHOOK_URL`).
- Environment variable passing: Agent sessions receive environment variables via tmux `-e` flags, which means they briefly appear in `ps` output during session creation.
- Historical incident: The SECURITY.md documents a past token leak (OpenClaw token) that was detected and mitigated.
Source: /tmp/ai-harness-repos/agent-orchestrator/SECURITY.md
- No agent sandboxing: Agents have full filesystem and network access. A compromised agent could read credentials, exfiltrate code, or modify other worktrees.
- No output sanitization: Agent-generated code is committed directly. No static analysis, dependency scanning, or security review of generated changes.
- No authentication on web dashboard: The Next.js dashboard runs on localhost with no authentication. Anyone with network access to the port can view sessions, send messages, kill agents, and merge PRs.
- No HTTPS: Dashboard uses plain HTTP on localhost.
- No rate limiting on API endpoints: The web API has no rate limiting or abuse prevention.
Confidence: High
The reaction engine is the core automation mechanism. It maps events to actions:
# agent-orchestrator.yaml.example
reactions:
ci-failed:
trigger: ci.failing
action: send-to-agent
message: "CI checks are failing. Please investigate and fix."
retries: 2
escalation:
action: notify
after: "10m"
changes-requested:
trigger: review.changes_requested
action: send-to-agent
message: "Review feedback received. Please address the changes."
approved-and-green:
trigger: review.approved
condition: ci.passing
action: notify
message: "PR is approved and CI is green. Ready to merge."
agent-stuck:
trigger: agent.stuck
action: notify
priority: high
escalation:
action: notify
after: "15m"
priority: critical

Source: /tmp/ai-harness-repos/agent-orchestrator/agent-orchestrator.yaml.example
// lifecycle-manager.ts, executeReaction (simplified)
async executeReaction(session: Session, eventType: EventType, reaction: ReactionConfig): Promise<void> {
const key = `${session.id}:${reaction.name}`;
const attempts = this.reactionAttempts.get(key) ?? 0;
// Check escalation
if (reaction.escalation) {
const firstAttempt = this.reactionFirstAttempt.get(key);
const duration = firstAttempt ? Date.now() - firstAttempt : 0;
if (
(reaction.retries && attempts >= reaction.retries) ||
(reaction.escalation.after && duration > parseDuration(reaction.escalation.after))
) {
// Execute escalation action instead
return this.executeAction(session, reaction.escalation);
}
}
// Execute primary action
await this.executeAction(session, reaction);
this.reactionAttempts.set(key, attempts + 1);
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/lifecycle-manager.ts (executeReaction, approximately lines 250-330)
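The `parseDuration` call above is referenced but not shown in the excerpt. A minimal sketch of what such a helper typically looks like for strings like "10m" or "15m" (an assumption, not the actual implementation):

```typescript
// Minimal duration parser for strings like "10m", "2h" (sketch; AO's real
// parseDuration helper is assumed, not shown in the excerpt).
const UNITS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

function parseDuration(input: string): number {
  const match = /^(\d+)\s*(ms|s|m|h|d)$/.exec(input.trim());
  if (!match) throw new Error(`Invalid duration: ${input}`);
  return Number(match[1]) * UNITS[match[2]];
}
```

Under this sketch, `parseDuration("10m")` yields 600000 ms, so the `after: "10m"` escalation window is compared directly against `Date.now() - firstAttempt`.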
The Claude Code plugin installs a PostToolUse hook that monitors agent actions:
#!/bin/bash
# METADATA_UPDATER_SCRIPT (embedded in agent-claude-code/src/index.ts)
# Detects PR creation, branch switches, and PR merges
METADATA_FILE="$AO_METADATA_PATH"
case "$TOOL_NAME" in
"Bash")
# Detect: gh pr create
if echo "$TOOL_OUTPUT" | grep -q "github.com.*pull/"; then
PR_URL=$(echo "$TOOL_OUTPUT" | grep -o "https://github.com[^ ]*pull/[0-9]*")
PR_NUM=$(echo "$PR_URL" | grep -o "[0-9]*$")
echo "pr_number=$PR_NUM" >> "$METADATA_FILE"
echo "pr_url=$PR_URL" >> "$METADATA_FILE"
fi
# Detect: git checkout -b / git switch -c
if echo "$TOOL_INPUT" | grep -qE "git (checkout -b|switch -c)"; then
BRANCH=$(echo "$TOOL_INPUT" | grep -oE "(checkout -b|switch -c) [^ ]+" | awk '{print $NF}')
echo "branch=$BRANCH" >> "$METADATA_FILE"
fi
# Detect: gh pr merge
if echo "$TOOL_INPUT" | grep -q "gh pr merge"; then
echo "pr_merged=true" >> "$METADATA_FILE"
fi
;;
esac

This hook runs inside Claude Code's process and updates the session metadata file in real-time, without waiting for the next lifecycle poll.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/agent-claude-code/src/index.ts (METADATA_UPDATER_SCRIPT, approximately lines 30-80)
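The hook's PR detection boils down to two patterns. Transliterated to TypeScript for illustration (the example repo URL is hypothetical, and the digit quantifier is tightened from `*` to `+` for strictness):

```typescript
// The bash hook's PR detection distilled into its two patterns: first grab
// the PR URL from tool output, then take the trailing digits as the number.
function extractPrNumber(toolOutput: string): string | null {
  const url = toolOutput.match(/https:\/\/github\.com[^ ]*pull\/[0-9]+/)?.[0];
  if (!url) return null;
  return url.match(/[0-9]+$/)?.[0] ?? null;
}
```

Given output containing `https://github.com/acme/widgets/pull/123`, this yields `"123"`, the value the hook appends as `pr_number=` to the metadata file.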
After creating a workspace, the system can run arbitrary commands:
# agent-orchestrator.yaml.example
projects:
my-project:
postCreate:
- "npm install"
- "npm run build"

These are executed via `execFile` in the worktree directory after creation and symlinking.
The config loader applies sensible defaults if no reactions are configured:
// config.ts (applyDefaultReactions, simplified)
const DEFAULT_REACTIONS = {
"ci-failed": { trigger: "ci.failing", action: "send-to-agent" },
"changes-requested": { trigger: "review.changes_requested", action: "send-to-agent" },
"bugbot-comments": { trigger: "review.automated_comments", action: "send-to-agent" },
"merge-conflicts": { trigger: "pr.conflicts", action: "send-to-agent" },
"approved-and-green": { trigger: "review.approved", condition: "ci.passing", action: "notify" },
"agent-stuck": { trigger: "agent.stuck", action: "notify", priority: "high" },
"agent-needs-input": { trigger: "agent.needs_input", action: "notify" },
"agent-exited": { trigger: "agent.exited", action: "notify" },
"all-complete": { trigger: "orchestrator.all_complete", action: "notify" },
};

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/config.ts
The auto-merge.yaml example shows aggressive automation:
# examples/auto-merge.yaml
reactions:
auto-merge:
trigger: review.approved
condition: ci.passing
action: auto-merge
message: "Auto-merging approved PR with passing CI."

This allows PRs to be merged automatically when they have both approval and passing CI, with no human confirmation step.
Source: /tmp/ai-harness-repos/agent-orchestrator/examples/auto-merge.yaml
Confidence: High
ao
├── init # Interactive setup wizard
├── start # Start orchestrator + dashboard
├── stop # Stop orchestrator + dashboard
├── status # Show session status table
├── spawn # Spawn a single agent session
├── batch-spawn # Spawn multiple sessions
├── send # Send message to a session
├── review-check # Check PR review status
├── dashboard # Open dashboard in browser
├── open # Open terminal for a session
└── session
├── ls # List sessions
├── kill # Kill a session
├── cleanup # Clean up completed sessions
└── restore # Restore a killed session
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/index.ts
The ao init command provides an interactive setup experience:
// packages/cli/src/commands/init.ts
// Detects environment:
// - git repo presence and remote URL
// - default branch
// - tmux availability
// - gh CLI and authentication
// - LINEAR_API_KEY presence
// - SLACK_WEBHOOK_URL presence
// - Project type (package.json, Cargo.toml, etc.)

It has an `--auto` mode for non-interactive setup and a `--smart` flag that has a TODO for AI-powered rule generation based on the project structure.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/init.ts
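A sketch of one such environment probe (`detectTmux` is a hypothetical name; the real detection logic lives in init.ts):

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Sketch of one environment probe: is tmux on PATH and runnable?
// Probing via `tmux -V` with a timeout mirrors the CLAUDE.md rule that
// every child process must use execFile with an explicit timeout.
async function detectTmux(): Promise<boolean> {
  try {
    await run("tmux", ["-V"], { timeout: 5000 });
    return true;
  } catch {
    return false;
  }
}
```

The same try/catch shape generalizes to the `gh` and git probes: a failed spawn or non-zero exit simply reports the capability as absent rather than aborting the wizard.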
The ao status command renders a rich terminal table:
Session Branch PR CI Review Threads Activity Age
fix-auth-1 feat/issue-42 #123 passing approved 0 active 2h
add-api-2 feat/issue-43 #124 failing pending 2 idle 1h
refactor-3 feat/issue-44 — — — — working 30m
Data is gathered in parallel for responsiveness:
// status.ts (simplified)
const sessions = await sessionManager.list();
const enriched = await Promise.all(
sessions.map(async (s) => {
const [prState, ci, review] = await Promise.all([
scm?.getPRState(s.pr?.number),
scm?.getCISummary(s.pr?.number),
scm?.getReviewDecision(s.pr?.number),
]);
return { ...s, prState, ci, review };
})
);

The status command also has a fallback mode for when no config exists — it discovers tmux sessions directly:
// status.ts
// Fallback: discover tmux sessions matching ao- pattern
const tmuxSessions = await listTmuxSessions();
const aoSessions = tmuxSessions.filter(name => name.match(/^[a-f0-9]{12}-/));

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/status.ts
Strengths:
- Clean Commander.js structure with proper subcommands
- Parallel data fetching for responsive output
- Graceful degradation (fallback when config missing)
- Confirmation prompts for destructive operations (kill, restore)
- Summary reports for batch operations
Limitations:
- No color/formatting library (raw console.log)
- No progress indicators for long operations
- No `--json` output flag for scripting
- No shell completion support
- No `--dry-run` for spawn/batch-spawn (only for cleanup)
Confidence: Medium
The Claude Code plugin extracts cost data from agent session JSONL files:
// agent-claude-code/src/index.ts, extractCost
async extractCost(session): Promise<CostInfo | null> {
const entries = await this.parseJsonlFileTail(sessionFile);
let totalCostUsd = 0;
let inputTokens = 0;
let outputTokens = 0;
for (const entry of entries) {
if (entry.costUSD) {
totalCostUsd += entry.costUSD;
}
if (entry.usage) {
inputTokens += entry.usage.input_tokens ?? 0;
outputTokens += entry.usage.output_tokens ?? 0;
}
}
// Rough estimate if no costUSD field
if (totalCostUsd === 0 && (inputTokens > 0 || outputTokens > 0)) {
// Sonnet 4.5 pricing as default
totalCostUsd = (inputTokens * 3 + outputTokens * 15) / 1_000_000;
}
return { totalCostUsd, inputTokens, outputTokens };
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/agent-claude-code/src/index.ts (extractCost method)
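Plugging hypothetical token counts into the fallback formula makes the estimate concrete ($3 per million input tokens, $15 per million output, per the Sonnet 4.5 defaults in the code):

```typescript
// Worked example of the fallback estimate above. The pricing constants are
// from the quoted code; the token counts are hypothetical.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (inputTokens * 3 + outputTokens * 15) / 1_000_000;
}
```

For a session with 100k input and 20k output tokens, the estimate is (300000 + 300000) / 1e6 = $0.60, which illustrates how output tokens dominate cost at a fraction of the input volume.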
- Token usage: Input and output token counts from JSONL entries
- Cost estimates: Either from explicit `costUSD` fields or rough estimates using Sonnet 4.5 pricing
- Session duration: Computed from `handle.data.createdAt` in the runtime plugin
- No aggregated cost view: No total cost across all sessions, projects, or time periods
- No budget limits: No mechanism to set a maximum spend per session, project, or globally
- No cost alerts: No notification when spend exceeds a threshold
- No automatic shutoff: No kill-switch when costs escalate
- No per-agent model pricing: The rough estimate uses Sonnet 4.5 pricing regardless of which model the agent actually uses
- No API call counting: GitHub API calls (which have rate limits) are not tracked
- No cost display in CLI: The `ao status` command does not show cost information
- No cost display on dashboard: The dashboard does not show per-session or aggregate costs
The dashboard does detect GitHub API rate limiting:
// Dashboard.tsx, lines 90-93
const anyRateLimited = useMemo(
() => sessions.some((s) => s.pr && isPRRateLimited(s.pr)),
[sessions],
);

When rate-limited, a warning banner is shown explaining that PR data may be stale. This is a good UX touch but is reactive rather than preventive.
Confidence: High
| Dependency | Purpose | Version Constraint |
|---|---|---|
| Node.js | Runtime | >= 20 |
| pnpm | Package manager | 9.15.4 (exact) |
| tmux | Terminal multiplexer | Required |
| git | Version control | >= 2.25 (worktree support) |
| gh | GitHub CLI | Required for GitHub integration |
| TypeScript | Language | Strict mode, ESM |
| Next.js | Web dashboard | App Router |
| Commander.js | CLI framework | — |
| Zod | Schema validation | — |
The system has hard dependencies on external CLI tools:
- tmux: Required for process isolation. No fallback. Version checked at runtime via `isTmuxAvailable()`.
- git: Required for workspace management. Must support worktrees (Git 2.25+).
- gh: Required for GitHub integration (SCM + tracker). Must be authenticated (`gh auth status`).
- Claude Code CLI: Required for the primary agent plugin. Must be installed and configured.
- LINEAR_API_KEY: Required only if using the Linear tracker plugin
- SLACK_WEBHOOK_URL: Required only if using the Slack notifier plugin
- Composio SDK: Alternative transport for Linear integration
- iTerm2: Optional terminal integration for macOS
// package.json (root)
{
"scripts": {
"build": "turbo build",
"dev": "turbo dev",
"lint": "turbo lint",
"test": "turbo test",
"typecheck": "turbo typecheck",
"release": "changeset publish"
}
}

The project uses Turborepo for monorepo build orchestration and Changesets for release management.
- macOS: Primary development platform. Desktop notifications use `osascript`.
- Linux: Supported. Desktop notifications use `notify-send`.
- Windows: Not explicitly supported. tmux is not available natively on Windows (would require WSL).
Confidence: High
The GitHub integration is the most developed external integration, implemented across two plugins:
scm-github (581 lines):
- PR detection: `gh pr list --head <branch>`
- PR state: `gh pr view --json state,title,number,url,additions,deletions,files`
- PR merge: `gh pr merge --squash --delete-branch`
- CI checks: `gh pr checks --json name,state,conclusion`
- CI summary: Fail-closed logic for open PRs
- Reviews: `gh pr view --json reviews,reviewDecision`
- Pending comments: GraphQL query for review thread resolution status
- Automated comments: REST API filtering by BOT_AUTHORS
- Mergeability: Composite check (state + mergeable + CI + reviews + conflicts + draft)
The fail-closed CI summary is notable:
// scm-github/src/index.ts, getCISummary
async getCISummary(prNumber: number, repo: string): Promise<CIStatus> {
try {
const checks = await this.getCIChecks(prNumber, repo);
// ... analyze checks
} catch (err) {
// For open PRs, fail closed — report "failing" on error
// This prevents auto-merge when we can't verify CI status
const prState = await this.getPRState(prNumber, repo);
if (prState.state === "open") {
return CI_STATUS.FAILING;
}
return CI_STATUS.NONE;
}
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/scm-github/src/index.ts
tracker-github (304 lines):
- Issue CRUD: get, create, update, close/reopen
- Issue listing with filters (state, label, assignee)
- Branch name generation: `feat/issue-{number}`
- Prompt generation from issue content
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/tracker-github/src/index.ts
The Linear plugin (722 lines) is the second most developed integration:
Dual transport:
// tracker-linear/src/index.ts
// Method 1: Direct API
if (process.env.LINEAR_API_KEY) {
this.transport = "direct";
this.apiKey = process.env.LINEAR_API_KEY;
}
// Method 2: Composio SDK
else if (composioAvailable) {
this.transport = "composio";
}State mapping:
const STATE_MAP: Record<string, IssueState> = {
triage: "open",
backlog: "open",
unstarted: "open",
started: "in_progress",
completed: "closed",
canceled: "cancelled",
};Full GraphQL API: Issues, labels, teams, workflow states, assignees, comments.
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/tracker-linear/src/index.ts
Rich Block Kit messages with structured formatting:
// notifier-slack/src/index.ts
async notify(event: NotificationEvent): Promise<void> {
const blocks = [
{
type: "header",
text: { type: "plain_text", text: event.title },
},
{
type: "section",
text: { type: "mrkdwn", text: event.body },
},
{
type: "context",
elements: [
{ type: "mrkdwn", text: `*Priority:* ${priorityEmoji(event.priority)} ${event.priority}` },
{ type: "mrkdwn", text: `*Session:* ${event.sessionId}` },
],
},
];
if (event.prUrl) {
blocks.push({
type: "actions",
elements: [{
type: "button",
text: { type: "plain_text", text: "View PR" },
url: event.prUrl,
}],
});
}
await fetch(this.webhookUrl, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ blocks }),
});
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/notifier-slack/src/index.ts
Platform-specific implementations:
// notifier-desktop/src/index.ts
if (process.platform === "darwin") {
// macOS: osascript
await execFile("osascript", [
"-e", `display notification "${body}" with title "${title}"${sound ? " sound name \"Ping\"" : ""}`,
]);
} else {
// Linux: notify-send
await execFile("notify-send", [
...(urgency === "critical" ? ["--urgency=critical"] : []),
title,
body,
]);
}

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/notifier-desktop/src/index.ts
Confidence: High
- Single machine: The entire system runs on one machine. No distributed execution.
- Unix-like OS: macOS or Linux required (tmux, POSIX shell commands).
- tmux installed: No alternative runtime is production-ready.
- Git 2.25+: Worktree support required.
- Node.js 20+: ESM module support and modern APIs.
- Agent CLI installed: At minimum, Claude Code CLI must be available.
- GitHub authentication: `gh auth login` must be completed for GitHub features.
- Stable network: Agents need internet access for API calls; dashboard needs GitHub API access for enrichment.
- Sufficient disk space: Each worktree is a full checkout. Many concurrent sessions require proportional disk.
- API rate limits: GitHub API has 5000 requests/hour for authenticated users. With many sessions and 30s polling, this budget can be consumed quickly.
- Agent API keys: Claude API key, OpenAI API key, etc. must be configured in the agent's own config.
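A back-of-envelope check makes the rate-limit pressure concrete. The assumptions are not from the source: roughly three `gh` calls per session per poll (PR state, CI summary, review decision), a 30-second poll interval, and the 5000 req/hr authenticated limit.

```typescript
// Back-of-envelope GitHub API budget. All inputs are assumptions stated in
// the lead-in, not figures from the AO codebase.
function maxSessionsWithinBudget(
  callsPerPoll: number,
  pollSeconds: number,
  hourlyLimit: number,
): number {
  const pollsPerHour = 3600 / pollSeconds; // 120 polls at 30 s
  return Math.floor(hourlyLimit / (callsPerPoll * pollsPerHour));
}
```

Under these assumptions, `maxSessionsWithinBudget(3, 30, 5000)` is 13: barely a dozen PR-bearing sessions saturate the hourly budget, which is why rate limiting, not CPU or memory, is the practical ceiling on concurrency.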
The system was designed for 10-50 concurrent sessions on a single developer machine. Evidence:
- Session list is loaded entirely into memory (no pagination)
- Dashboard renders all sessions in a single view
- Lifecycle polling is a single loop with no sharding
- Metadata is scanned by filesystem directory listing
- No connection pooling for GitHub API calls
Beyond 50 sessions, expect:
- GitHub API rate limiting (5000 req/hr shared across all sessions)
- Lifecycle poll cycles exceeding 30 seconds
- Dashboard becoming sluggish with many cards
- tmux session management overhead
The system assumes a specific project structure:
- Git repository with remote named `origin`
- A single default branch (main/master)
- Issues tracked in GitHub Issues or Linear
- PRs created on GitHub (no GitLab, Bitbucket, etc.)
- Squash merge strategy (hardcoded: `gh pr merge --squash --delete-branch`)
Confidence: High
The spawn sequence has cascading cleanup:
Step 1 fails (issue validation): -> No cleanup needed
Step 2 fails (session ID): -> No cleanup needed
Step 3 fails (workspace creation): -> Delete session directory
Step 4 fails (post-create hooks): -> Destroy workspace + delete session directory
Step 5 fails (runtime creation): -> Destroy workspace + delete session directory
Step 6 fails (launch command): -> Destroy runtime + workspace + delete session directory
Step 7 fails (metadata write): -> Destroy runtime + workspace + delete session directory
Step 8 fails (post-launch setup): -> Destroy runtime + workspace + delete session directory
Each step's failure handler cleans up all previously completed steps. This is implemented with nested try/catch blocks.
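The nested try/catch pattern can be sketched as follows. Every helper here is a hypothetical stand-in that merely records its call; AO's real `spawn()` lives in session-manager.ts.

```typescript
// Sketch of the cascading-cleanup pattern: a failure at step 7 unwinds
// step 5, then step 3, innermost-first. The log records the order.
const log: string[] = [];
const createWorkspace = async () => (log.push("ws+"), "ws");
const destroyWorkspace = async (_ws: string) => { log.push("ws-"); };
const createRuntime = async (_ws: string) => (log.push("rt+"), "rt");
const destroyRuntime = async (_rt: string) => { log.push("rt-"); };
const writeMetadata = async (_rt: string) => { throw new Error("disk full"); };

async function spawn(): Promise<void> {
  const ws = await createWorkspace(); // step 3
  try {
    const rt = await createRuntime(ws); // step 5
    try {
      await writeMetadata(rt); // step 7 — fails here
    } catch (err) {
      await destroyRuntime(rt); // undo step 5
      throw err;
    }
  } catch (err) {
    await destroyWorkspace(ws); // undo step 3
    throw err;
  }
}
```

Running `spawn()` and catching the error leaves `log` as `ws+, rt+, rt-, ws-`: each failure handler unwinds exactly the steps that completed before it, and the error is rethrown so the caller still sees the original failure.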
Source: /tmp/ai-harness-repos/agent-orchestrator/packages/core/src/session-manager.ts (spawn method)
If a tmux session dies unexpectedly:
- Detection: Next lifecycle poll checks `runtime.isAlive()` -> returns false
- Status update: Session status set to "errored" (if no PR) or "pr_open" (if PR exists)
- Notification: "agent-exited" reaction fires, notifying the human
- Recovery: Human can `ao session restore` to restart from archive
The 30-second polling interval means up to 30 seconds can pass before a crash is detected.
The activity detection system identifies stuck agents:
// Stuck: agent process running but no JSONL activity for extended period
if (activity === "idle" && idleDuration > stuckThresholdMs) {
return "stuck";
Default stuck threshold is not explicitly documented, but the reaction config allows an `after` duration for escalation:
reactions:
agent-stuck:
trigger: agent.stuck
action: notify
escalation:
action: notify
after: "15m"
priority: critical

Rate Limiting:
- PR enrichment has a 4-second timeout
- If enrichment fails, dashboard shows stale data with a rate-limit warning banner
- CI summary uses fail-closed: errors -> "failing" status (prevents false merges)
Network Failures:
- SCM calls use `execFile` with timeouts
- Transient failures in lifecycle polling are caught by `Promise.allSettled`
- Failed polls are skipped; the next cycle retries
The flat-file metadata format is simple but fragile:
- Append-only: Multiple values for the same key are resolved by taking the last one
- No atomicity: If the process crashes mid-write, the file could be truncated
- No locking: Multiple writers (agent hook + lifecycle manager) could race
The PostToolUse hook (bash script) appends to the metadata file:
echo "pr_number=$PR_NUM" >> "$METADATA_FILE"

While the lifecycle manager reads the file:
const metadata = parseMetadataFile(metadataPath);

There is no file locking between these operations. In practice, this is unlikely to cause issues because:
- The hook and lifecycle manager write different keys
- The file is small (< 1KB typically)
- POSIX append semantics are usually atomic for small writes
But it is a theoretical correctness gap.
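The append-only, last-wins semantics imply a parser along these lines (an illustrative sketch; `parseMetadata` is a hypothetical name, and the real `parseMetadataFile` lives in packages/core/src/metadata.ts):

```typescript
// Illustrative last-wins parser for AO-style key=value metadata files.
// Later occurrences of a key simply overwrite earlier ones, which is what
// makes append-only writes from the hook safe to layer over older state.
function parseMetadata(content: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const line of content.split("\n")) {
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip blank and malformed lines
    result[line.slice(0, eq)] = line.slice(eq + 1); // later writes win
  }
  return result;
}
```

So a file containing `pr_number=12` followed by an appended `pr_number=34` reads back as 34: the hook never needs to rewrite the file, only append, which is what keeps the no-locking design mostly safe.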
If worktree removal fails (e.g., locked files), the plugin falls back to `rmSync`:
// workspace-worktree/src/index.ts, destroy
try {
await execFile("git", ["worktree", "remove", "--force", worktreePath]);
} catch {
// Fallback: force-remove the directory
fs.rmSync(worktreePath, { recursive: true, force: true });
}

After force removal, stale worktree entries remain in git's worktree list. The restore function handles this:
// workspace-worktree/src/index.ts, restore
await execFile("git", ["worktree", "prune"]); // Clean stale entries

Source: /tmp/ai-harness-repos/agent-orchestrator/packages/plugins/workspace-worktree/src/index.ts
The web service singleton has retry logic:

```typescript
// services.ts
globalForServices._aoServicesInit = initServices().catch((err) => {
  globalForServices._aoServicesInit = undefined; // Clear for retry
  throw err;
});
```

If initialization fails (e.g., config file missing), the cached promise is cleared so the next request triggers a fresh attempt rather than permanently returning the cached error.
Confidence: Medium
- Confirmation dialogs: Kill and restore operations require user confirmation in the dashboard.
- Fail-closed CI: Unknown CI status is treated as "failing" for open PRs.
- Session ID validation: Regex-based validation prevents path traversal.
- Symlink target validation: Prevents workspace escape.
- Shell injection prevention: `execFile` everywhere, never `exec`.
- Issue validation: Sessions cannot be spawned for non-existent issues.
- Duplicate detection: Batch spawn checks for existing sessions with the same issue.
- No approval gates before merge: Auto-merge has no additional safety check beyond CI + review.
- No diff size limits: Agents can create arbitrarily large PRs.
- No file restriction: Agents can modify any file in the repository, including CI configs, security policies, and deployment scripts.
- No branch protection enforcement: AO doesn't verify that branch protection rules are configured on the target branch.
- No code review requirements: Auto-merge can bypass the "requires review" setting if the GitHub config allows it.
- No cost limits: No budget ceiling per session or project.
- No concurrency limits: No maximum number of concurrent sessions.
- No time limits: Sessions can run indefinitely.
- No output validation: Generated code is not scanned for vulnerabilities or malicious content.
There is no permission model. The system runs with the credentials of the user who started it:
- Git operations use the user's SSH keys or HTTPS tokens
- GitHub API uses the user's `gh auth` session
- Claude Code uses the user's API key
- tmux runs as the current user
Any agent can perform any action the user can perform.
The audit trail consists of:
- Metadata files: Show session creation time, issue, branch, PR number
- Archived metadata: Preserved after session cleanup
- Git history: All agent commits are in the git log
- Claude Code JSONL: Complete agent conversation history
- tmux capture: Terminal output can be captured (but not automatically persisted)
There is no centralized audit log, no event store, and no structured logging of orchestration decisions.
Confidence: Medium
Several TODO markers indicate planned features:
- AI-powered init (`init.ts`):

  ```typescript
  // --auto --smart mode
  // TODO: AI-powered rule generation based on project structure
  ```

- Custom plugin loading (`plugin-registry.ts`):

  ```typescript
  // loadFromConfig() — delegates to loadBuiltins,
  // reserved for future custom plugin loading
  ```

- Process runtime (`plugin-registry.ts`): Listed as a built-in plugin but implementation not observed. This would allow running agents without tmux.
- Clone workspace (`plugin-registry.ts`): Listed as a built-in but likely less developed than the worktree plugin. Would provide full repository clones instead of worktrees.
- Plugin slots for Terminal (iterm2, web): Suggests plans for richer terminal integration beyond basic tmux.
- Lifecycle plugin slot: Suggests plans for customizable state machines, possibly for different workflow patterns.
- Composio notifier: Integration with Composio's platform suggests a path toward SaaS deployment.
- Webhook notifier: Generic webhook support enables integration with any service.
- Multiple tracker support: GitHub + Linear suggests plans for Jira, Asana, etc.
| Component | Maturity | Evidence |
|---|---|---|
| Core types | High | 1084 lines, comprehensive, well-structured |
| Session manager | High | ~1100 lines, thorough error handling |
| Lifecycle manager | High | 587 lines, reaction engine, escalation |
| Config system | High | Zod validation, defaults, collision detection |
| Claude Code plugin | High | 786 lines, deep integration |
| GitHub SCM | High | 581 lines, fail-closed CI, GraphQL |
| Linear tracker | High | 722 lines, dual transport |
| tmux runtime | Medium | 184 lines, functional but basic |
| Web dashboard | Medium | Functional UI, basic SSE |
| CLI | Medium | Feature-complete but sparse UX |
| Other agent plugins | Low | Likely thin or placeholder |
| Process runtime | Low | Listed but not observed |
| Clone workspace | Low | Listed but not fully developed |
| Composio notifier | Unknown | Mentioned but not explored |
The project uses Changesets for release management, indicating it follows semver and publishes to npm under the `@composio/ao-*` namespace. The presence of `.github/workflows/ci.yml` and `security.yml` suggests active CI/CD.
Confidence: High
The eight-slot plugin system is clean and extensible. The `PluginManifest` + `PluginModule` pattern with a type-safe registry is worth adopting:

```typescript
interface PluginModule<T> {
  manifest: { name: string; slot: string; version: string };
  create: (ctx?: PluginContext) => T | Promise<T>;
}
```

Why: It enables swapping implementations without touching core logic. Adding a new agent, runtime, or tracker is a self-contained operation.
Adaptation for Maestro: Consider adding a `capabilities` field to the manifest for feature-flag-based plugin selection, and a `healthCheck()` method for runtime validation.
The pattern of reporting "failing" when CI status is unknown for open PRs is a critical safety measure:
```typescript
// On API error for open PRs: return "failing" not "none"
```

Why: Prevents auto-merge of PRs when we can't verify CI status. This is a security-relevant design decision.
Adaptation for Maestro: Apply this pattern to all safety-critical status checks. When in doubt, assume the worst case.
The reaction engine pattern (event -> action, with retries and escalation) is composable and user-configurable:
```yaml
reactions:
  ci-failed:
    trigger: ci.failing
    action: send-to-agent
    retries: 2
    escalation:
      action: notify
      after: "10m"
      priority: critical
```
Adaptation for Maestro: Add more complex conditions (boolean logic, state predicates), support for custom action types, and a reaction history log.
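The core matching step of such an engine is small. A minimal sketch (the types and function names here are assumptions, not AO's actual implementation):

```typescript
// A reaction pairs a triggering event with an action, optionally gated by an
// extra state condition (mirroring the YAML schema above).
type Reaction = {
  trigger: string;     // e.g. "ci.failing"
  condition?: string;  // e.g. "ci.passing" (must be present in current state)
  action: string;      // e.g. "send-to-agent", "notify", "auto-merge"
  message?: string;
};

// Given the configured reactions, an observed event, and the session's
// current state flags, return the reactions that should fire.
export function matchReactions(
  reactions: Record<string, Reaction>,
  event: string,
  state: Set<string>,
): Array<{ name: string; action: string; message?: string }> {
  return Object.entries(reactions)
    .filter(([, r]) => r.trigger === event && (!r.condition || state.has(r.condition)))
    .map(([name, r]) => ({ name, action: r.action, message: r.message }));
}
```

Retries and escalation timers would layer on top of this matcher; the separation keeps the policy (config) independent of the mechanism (matching and dispatch).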
Using the `O_EXCL` flag for race-condition-safe session creation:

```typescript
await fs.open(sessionDir, O_CREAT | O_EXCL);
```

Why: Prevents two concurrent spawn operations from creating sessions with the same ID.
Adaptation for Maestro: Use this pattern for any resource reservation that must be atomic.
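In Node, the same pattern is available through the `"wx"` open flag (O_CREAT | O_EXCL under the hood). A sketch of how a reservation helper might look (names are illustrative, not AO's API):

```typescript
import { mkdir, open } from "node:fs/promises";
import { join } from "node:path";

// Atomically reserve an ID by creating a marker file with "wx", which fails
// with EEXIST if the file already exists. If two spawns race, exactly one
// open() succeeds.
export async function reserveSession(baseDir: string, id: string): Promise<boolean> {
  await mkdir(baseDir, { recursive: true });
  try {
    const fh = await open(join(baseDir, `${id}.lock`), "wx");
    await fh.close();
    return true; // we won the race
  } catch (err: any) {
    if (err?.code === "EEXIST") return false; // someone else holds this ID
    throw err;
  }
}
```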
A SHA-256 hash of the config path yields globally unique directories:

```typescript
createHash("sha256").update(configDir).digest("hex").slice(0, 12);
```

Why: Prevents collisions between multiple projects/configurations on the same machine. Simple but effective.
The discipline of never using `exec` with string interpolation is worth codifying:

```typescript
// ALWAYS this:
execFile("git", ["checkout", "-b", branchName]);

// NEVER this:
exec(`git checkout -b ${branchName}`);
```

Adaptation for Maestro: Make this a lint rule. Block `exec` and `execSync` in the ESLint config.
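One way to enforce such a lint rule (a sketch, assuming an ESLint flat config at `eslint.config.ts`; `no-restricted-imports` is an ESLint core rule, but the project wiring is an assumption):

```typescript
// eslint.config.ts — illustrative sketch, not AO's actual config.
// Bans exec/execSync imports from child_process while leaving execFile available.
const banMessage = "Use execFile with an argument array instead of exec/execSync.";

export default [
  {
    files: ["**/*.ts"],
    rules: {
      // Catches: import { exec } from "child_process"
      "no-restricted-imports": ["error", {
        paths: [
          { name: "child_process", importNames: ["exec", "execSync"], message: banMessage },
          { name: "node:child_process", importNames: ["exec", "execSync"], message: banMessage },
        ],
      }],
    },
  },
];
```

This catches named imports; namespace access like `cp.exec(...)` would additionally need `no-restricted-syntax` with an AST selector.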
Reading Claude Code's JSONL session files for activity detection is clever but tightly coupled:

```typescript
// Read last 128KB of JSONL, parse last entry type
const entry = await readLastJsonlEntry(sessionFile);
```

Why to borrow: Much more accurate than terminal output parsing. Knows exactly what the agent is doing.
Modification needed: Abstract this behind the Agent interface more cleanly. Each agent plugin should expose standardized activity signals rather than having the orchestrator parse agent-specific file formats.
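A sketch of what such tail-reading looks like (illustrative; AO's actual helper in `utils.ts` may differ):

```typescript
import { open } from "node:fs/promises";

// Read only the last `tailBytes` of a JSONL file and return the last complete
// JSON entry. Reading just the tail keeps this cheap for large session logs;
// a partial first line (or a line truncated mid-write) is skipped.
export async function readLastJsonlEntry(
  path: string,
  tailBytes = 128 * 1024,
): Promise<unknown | undefined> {
  const fh = await open(path, "r");
  try {
    const { size } = await fh.stat();
    const start = Math.max(0, size - tailBytes);
    const buf = Buffer.alloc(size - start);
    await fh.read(buf, 0, buf.length, start);
    const lines = buf.toString("utf8").split("\n");
    for (let i = lines.length - 1; i >= 0; i--) {
      const line = lines[i].trim();
      if (!line) continue;
      try {
        return JSON.parse(line); // last parseable entry wins
      } catch {
        continue; // truncated or partial line; try the previous one
      }
    }
    return undefined;
  } finally {
    await fh.close();
  }
}
```

Walking backward over lines and tolerating a parse failure on the final line matters in practice: the agent may be mid-write when the orchestrator polls.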
The base + config + user rules approach is sound but rigid:
Modification needed: Add support for:
- Template variables in prompts
- Conditional sections based on project type
- Prompt versioning and A/B testing
- Dynamic context injection (e.g., related PR context, dependency graph)
Git worktrees are efficient (shared object store) but have limitations:
Modification needed: Support both worktrees (for speed) and full clones (for complete isolation). Consider container-based isolation for stronger security boundaries.
The Kanban grouping by attention level (working/pending/review/respond/merge/done) is intuitive:
Modification needed: Make the attention levels configurable. Different teams may have different workflows and priority signals.
The key=value text file approach is too fragile for production:
- No atomicity guarantees
- No schema evolution support
- No query capability (must read all files to list sessions)
- Race conditions between writers
Alternative for Maestro: Use SQLite (embedded, zero-config, ACID) or a structured file format (JSON with atomic rename-based writes).
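The rename-based variant is only a few lines. A sketch (the function name is illustrative):

```typescript
import { rename, writeFile } from "node:fs/promises";
import { randomBytes } from "node:crypto";

// Write JSON atomically: write to a temp file in the same directory, then
// rename() over the target. On POSIX filesystems rename is atomic, so a
// reader sees either the old document or the new one, never a truncated file.
export async function writeJsonAtomic(path: string, data: unknown): Promise<void> {
  const tmp = `${path}.${randomBytes(6).toString("hex")}.tmp`;
  await writeFile(tmp, JSON.stringify(data, null, 2), "utf8");
  await rename(tmp, path); // atomic replace on the same filesystem
}
```

The temp file must live on the same filesystem as the target; rename across filesystems is not atomic (and fails with EXDEV on Linux).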
While pragmatic, tmux coupling creates issues:
- Not available on Windows
- Message passing is fragile (buffer sizes, timing)
- No structured communication channel
- Output capture is lossy (screen buffer limits)
Alternative for Maestro: Implement a process runtime that uses stdin/stdout for structured communication (JSON-RPC or similar), with tmux as an optional attachment layer for debugging.
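The framing layer for such a channel is trivial: newline-delimited JSON messages over stdin/stdout. A sketch (the message shapes are assumptions, not an existing AO protocol):

```typescript
// One message per line: easy to parse incrementally from a stream, and
// debuggable with plain `cat`/`tee` on the agent's pipes.
type AgentMessage =
  | { kind: "event"; event: string; data?: unknown }
  | { kind: "request"; id: number; method: string; params?: unknown };

export function encodeMessage(msg: AgentMessage): string {
  return JSON.stringify(msg) + "\n";
}

export function decodeMessages(chunk: string): AgentMessage[] {
  return chunk
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as AgentMessage);
}
```

A production version would buffer partial lines across chunks; the point here is that structured messages replace fragile terminal scraping.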
30-second polling is too slow for responsive orchestration and too wasteful for idle systems:
Alternative for Maestro: Use an event-driven architecture with filesystem watches (inotify/FSEvents), agent-reported events (via a sidecar or callback), and webhook-based SCM notifications.
Running a web dashboard without any authentication is a security gap:
Alternative for Maestro: At minimum, implement localhost-only binding with a session token. Better: proper authentication with API keys or OAuth.
`gh pr merge --squash --delete-branch` is hardcoded:
Alternative for Maestro: Make merge strategy configurable per project (squash/merge/rebase, delete branch or not).
- Simplicity wins for v1: AO chose the simplest possible implementation at every layer (files over databases, polling over events, CLI over API). This allowed rapid development and easy debugging.
- Plugin architecture pays off early: Even in a young project, the ability to swap implementations is valuable. It enables both experimentation and user customization.
- Safety must be default-on: Fail-closed CI, confirmation dialogs, and shell injection prevention are good defaults. Auto-merge should require explicit opt-in.
- Agent-specific integration is necessary: Generic agent interfaces are not enough. Deep integration with the specific agent (like reading Claude Code JSONL) provides dramatically better observability.
- The orchestrator-as-agent pattern is powerful: Using an AI agent to orchestrate other AI agents (the meta-agent pattern) leverages the agent's natural language understanding for flexible task management. But it requires a very good system prompt.
This analysis is part of a broader research effort analyzing multiple AI agent orchestration frameworks. Related documents in the /Users/jeffscottward/Github/research/ai-harness/Claude/v1/ directory include:
| Document | Relevance to This Analysis |
|---|---|
| `swe-bench-deep-analysis.md` | SWE-bench is the primary benchmark for evaluating coding agents like those orchestrated by AO. Comparison of evaluation methodologies. |
| `claude-code-deep-analysis.md` | Claude Code is AO's primary agent. Deep understanding of Claude Code's internals (JSONL format, session files, hooks) is essential for understanding AO's agent plugin. |
| `codex-deep-analysis.md` | Codex CLI is a supported agent in AO. Compare how AO integrates Codex vs Claude Code. |
| `aider-deep-analysis.md` | Aider is a supported agent in AO. Compare integration depth and activity detection approaches. |
| `opencode-deep-analysis.md` | OpenCode is a supported agent in AO. Compare plugin maturity. |
| `open-hands-deep-analysis.md` | OpenHands (formerly OpenDevin) provides container-based isolation. Compare with AO's worktree/tmux approach for security and resource isolation. |
| `bolt-diy-deep-analysis.md` | Bolt.diy is a web-based coding assistant. Compare the dashboard/UI patterns. |
| `maestro-architecture.md` | The target architecture document. This analysis directly informs what patterns to adopt, adapt, or avoid. |
- Isolation Models: AO uses worktrees + tmux. OpenHands uses Docker containers. Each has tradeoffs between speed, security, and complexity. Maestro should support both.
- Agent Communication: AO uses terminal message passing. Some frameworks use structured APIs. The hybrid approach (terminal for legacy agents, API for modern ones) may be optimal.
- State Management: AO uses flat files. Most production systems use databases. The trade-off is operational simplicity vs. query capability and reliability.
- Orchestration Patterns: AO's meta-agent pattern (AI orchestrating AI) vs. rule-based orchestration vs. human-in-the-loop. Each has different reliability/flexibility trade-offs.
- Cost Management: All frameworks struggle with cost visibility and control. This is an area where Maestro can differentiate.
| File | Lines | Purpose |
|---|---|---|
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/types.ts` | 1084 | Central type definitions |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/session-manager.ts` | ~1100 | Session CRUD operations |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/lifecycle-manager.ts` | 587 | State machine + reaction engine |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/config.ts` | ~400 | Config loading + validation |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/plugin-registry.ts` | ~100 | Plugin registration + lookup |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/paths.ts` | ~200 | Hash-based directory management |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/metadata.ts` | ~200 | Flat-file metadata management |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/prompt-builder.ts` | ~150 | Three-layer prompt composition |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/orchestrator-prompt.ts` | ~250 | Meta-agent system prompt |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/tmux.ts` | ~200 | Safe tmux wrappers |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/core/src/utils.ts` | ~150 | Shell escape, JSONL parsing |
| File | Lines | Purpose |
|---|---|---|
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/agent-claude-code/src/index.ts` | 786 | Claude Code agent integration |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/runtime-tmux/src/index.ts` | 184 | tmux runtime implementation |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/workspace-worktree/src/index.ts` | 301 | Git worktree workspace |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/scm-github/src/index.ts` | 581 | GitHub SCM integration |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/tracker-github/src/index.ts` | 304 | GitHub Issues tracker |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/tracker-linear/src/index.ts` | 722 | Linear tracker integration |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/notifier-desktop/src/index.ts` | ~80 | OS desktop notifications |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/plugins/notifier-slack/src/index.ts` | ~150 | Slack webhook notifications |
| File | Lines | Purpose |
|---|---|---|
| `/tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/index.ts` | ~80 | CLI entry point |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/spawn.ts` | ~200 | Spawn + batch-spawn |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/start.ts` | ~150 | Start/stop orchestrator |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/status.ts` | ~200 | Status display |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/session.ts` | ~200 | Session subcommands |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/cli/src/commands/init.ts` | ~300 | Init wizard |
| File | Lines | Purpose |
|---|---|---|
| `/tmp/ai-harness-repos/agent-orchestrator/packages/web/src/lib/services.ts` | 84 | Service singleton |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/web/src/components/Dashboard.tsx` | 272 | Main dashboard UI |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/web/src/app/api/sessions/route.ts` | 65 | Sessions API |
| `/tmp/ai-harness-repos/agent-orchestrator/packages/web/src/app/api/events/route.ts` | 104 | SSE events API |
| File | Purpose |
|---|---|
| `/tmp/ai-harness-repos/agent-orchestrator/README.md` | Project overview |
| `/tmp/ai-harness-repos/agent-orchestrator/ARCHITECTURE.md` | Directory architecture |
| `/tmp/ai-harness-repos/agent-orchestrator/CLAUDE.md` | Development conventions |
| `/tmp/ai-harness-repos/agent-orchestrator/SECURITY.md` | Security policy |
| `/tmp/ai-harness-repos/agent-orchestrator/agent-orchestrator.yaml.example` | Full reference config |
| `/tmp/ai-harness-repos/agent-orchestrator/examples/simple-github.yaml` | Minimal config example |
| `/tmp/ai-harness-repos/agent-orchestrator/examples/auto-merge.yaml` | Auto-merge config example |
| Section | Confidence | Reasoning |
|---|---|---|
| 1. Design Philosophy | High | README, ARCHITECTURE.md, and code consistently support conclusions |
| 2. Core Architecture | High | All source files read and analyzed |
| 3. Harness Workflow | High | Spawn sequence traced through code |
| 4. Subagent Orchestration | High | Orchestrator prompt and communication code reviewed |
| 5. Multi-Agent & Parallelization | High | Lifecycle manager and batch-spawn code reviewed |
| 6. Isolation Model | High | Workspace and runtime plugins fully analyzed |
| 7. Human-in-the-Loop | High | Dashboard and API code reviewed |
| 8. Context Handling | High | Prompt builder and tracker plugins reviewed |
| 9. Session Lifecycle | High | State machine and activity detection fully traced |
| 10. Code Quality Gates | Medium | CI config reviewed but lint rules and test coverage not explored |
| 11. Security | High | SECURITY.md, CI workflows, and shell security patterns reviewed |
| 12. Hooks & Automation | High | Reaction engine and PostToolUse hook fully analyzed |
| 13. CLI & UX | High | All CLI commands reviewed |
| 14. Cost & Usage | Medium | Cost extraction code reviewed but display/alerting not found |
| 15. Tooling & Dependencies | High | package.json and imports reviewed |
| 16. External Integrations | High | All plugin code reviewed |
| 17. Operational Assumptions | High | Requirements documented and validated against code |
| 18. Failure Modes | High | Error handling paths traced through code |
| 19. Governance | Medium | Security measures documented but no formal governance framework |
| 20. Roadmap | Medium | Based on TODOs, plugin stubs, and architecture patterns |
| 21. Borrow/Adapt | High | Based on thorough analysis of all sections |
```
┌──────────────────────────────────────────────────────┐
│                   Human Developer                    │
│                                                      │
│  ┌──────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │  ao CLI  │  │  Dashboard   │  │  GitHub UI   │    │
│  │          │  │  (Next.js)   │  │              │    │
│  └────┬─────┘  └──────┬───────┘  └──────┬───────┘    │
└───────┼───────────────┼─────────────────┼────────────┘
        │               │                 │
        ▼               ▼                 │
┌───────────────────────────────┐         │
│            AO Core            │         │
│                               │         │
│  ┌─────────────────────────┐  │         │
│  │     Session Manager     │  │         │
│  │  (spawn/list/kill/etc)  │  │         │
│  └────────────┬────────────┘  │         │
│               │               │         │
│  ┌────────────▼────────────┐  │         │
│  │    Lifecycle Manager    │  │         │
│  │  (poll/react/escalate)  │  │         │
│  └────────────┬────────────┘  │         │
│               │               │         │
│  ┌────────────▼────────────┐  │         │
│  │     Plugin Registry     │  │         │
│  │  (8 slots, 16 plugins)  │  │         │
│  └────────────┬────────────┘  │         │
└───────────────┼───────────────┘         │
                │                         │
    ┌───────────┼───────────┐             │
    │           │           │             │
    ▼           ▼           ▼             ▼
┌────────┐ ┌────────┐ ┌─────────┐    ┌──────────┐
│  tmux  │ │  git   │ │ Claude  │    │  GitHub  │
│sessions│ │worktree│ │  Code   │    │   API    │
│        │ │        │ │   CLI   │    │ (gh CLI) │
└────────┘ └────────┘ └─────────┘    └──────────┘
```
```
                ┌──────────┐
                │ spawning │
                └────┬─────┘
                     │
                     ▼
                ┌──────────┐
          ┌─────│ working  │◄────────────────────┐
          │     └────┬─────┘                     │
          │          │                           │
          ▼          ▼                           │
   ┌───────────┐ ┌──────────┐              ┌─────┴──────┐
   │needs_input│ │ pr_open  │              │changes_req │
   └───────────┘ └────┬─────┘              └─────┬──────┘
                      │                          │
           ┌──────────┼──────────┐               │
           │          │          │               │
           ▼          ▼          ▼               │
      ┌────────┐ ┌──────────┐ ┌──────────┐       │
      │ci_fail │ │ rev_pend │ │ working  │───────┘
      └───┬────┘ └────┬─────┘ └──────────┘
          │           │
          │      ┌────┴────┐
          │      │         │
          │      ▼         ▼
          │ ┌──────────┐  ┌─────────────┐
          │ │ approved │  │ changes_req │
          │ └────┬─────┘  └─────────────┘
          │      │
          │      ▼
          │ ┌──────────┐
          │ │mergeable │
          │ └────┬─────┘
          │      │
          │      ▼
          │ ┌──────────┐
          └►│  merged  │
            └────┬─────┘
                 │
                 ▼
            ┌──────────┐
            │   done   │
            └──────────┘

(Any state) ──► killed / terminated / errored / stuck
```
```
~/.agent-orchestrator/
│
├── a1b2c3d4e5f6-my-project/        # {hash}-{projectId}
│   ├── .origin                     # Original config path
│   ├── sessions/
│   │   ├── fix-auth-1/
│   │   │   ├── metadata            # key=value flat file
│   │   │   └── prompt.md           # Agent system prompt
│   │   ├── add-api-2/
│   │   │   ├── metadata
│   │   │   └── prompt.md
│   │   └── refactor-3/
│   │       ├── metadata
│   │       └── prompt.md
│   ├── archive/
│   │   ├── old-session_1706000000  # Archived metadata
│   │   └── old-session_1706100000
│   └── worktrees/
│       ├── fix-auth-1/             # Git worktree checkout
│       ├── add-api-2/
│       └── refactor-3/
│
└── f6e5d4c3b2a1-other-project/
    ├── sessions/
    ├── archive/
    └── worktrees/
```
```yaml
# agent-orchestrator.yaml

# Global settings
dataDir: "~/.agent-orchestrator"   # Base data directory
worktreeDir: null                  # Override worktree location
port: 3000                         # Dashboard port

# Default plugin selections
defaults:
  runtime: tmux          # Process runtime
  agent: claude-code     # AI coding agent
  workspace: worktree    # Code isolation strategy
  notifiers:             # Notification channels
    - composio
    - desktop

# Project definitions
projects:
  my-project:
    repo: "owner/repo"                 # GitHub repository
    path: "/path/to/local/repo"        # Local repository path
    defaultBranch: main                # Default branch name
    sessionPrefix: "fix"               # Session name prefix
    tracker:                           # Issue tracker
      plugin: github                   # or "linear"
      # Linear-specific:
      # teamId: "TEAM_ID"
    scm:                               # Source code management
      plugin: github
    symlinks:                          # Shared resources in worktrees
      - source: "/path/to/node_modules"
        target: "node_modules"
    postCreate:                        # Commands after workspace creation
      - "npm install"
      - "npm run build"
    agentConfig:                       # Agent-specific configuration
      model: "claude-sonnet-4-5-20250514"
    agentRules: |                      # Inline rules for the agent
      Follow TDD. Write tests first.
    agentRulesFile: ".agent-rules.md"  # External rules file

# Notification configuration
notifiers:
  slack:
    webhookUrl: "${SLACK_WEBHOOK_URL}"
  desktop: {}

# Notification routing by priority
notificationRouting:
  critical: [slack, desktop]
  high: [slack, desktop]
  normal: [slack]
  low: [slack]

# Reaction rules
reactions:
  ci-failed:
    trigger: ci.failing
    action: send-to-agent
    message: "CI is failing. Please investigate and fix."
    retries: 2
    escalation:
      action: notify
      after: "10m"
      priority: critical
  changes-requested:
    trigger: review.changes_requested
    action: send-to-agent
    message: "Review feedback received. Please address."
  approved-and-green:
    trigger: review.approved
    condition: ci.passing
    action: notify
    message: "PR ready to merge."
  agent-stuck:
    trigger: agent.stuck
    action: notify
    priority: high
    escalation:
      action: notify
      after: "15m"
      priority: critical
  auto-merge:                # Optional: auto-merge
    trigger: review.approved
    condition: ci.passing
    action: auto-merge
```

Source: /tmp/ai-harness-repos/agent-orchestrator/agent-orchestrator.yaml.example
End of analysis. Total sections: 22 (21 analysis areas + cross-links). All file paths reference the source repository at /tmp/ai-harness-repos/agent-orchestrator/.