Acolyte Benchmarks

@cniska · Last active March 6, 2026

Measured comparisons of Acolyte against prominent open-source AI coding agents. All metrics are from source code analysis — no opinions, just counts.

Metrics were extracted with `benchmark.sh` (reproduced below).

Projects Compared

| Project | Language | Description | Source lines | Files | Dependencies |
| --- | --- | --- | --- | --- | --- |
| Acolyte | TypeScript | CLI-first AI coding agent with lifecycle, guards, and evaluators | 18,005 | 141 | 12 + 5 |
| Aider | Python | AI pair programming in your terminal | 25,880 | 106 | 480 + 313 |
| OpenCode | TypeScript | Open-source AI coding agent (TUI/web/desktop) | 207,748 | 1,042 | 171 + 76 |
| Pi | TypeScript | Terminal coding agent harness with extensions | 112,692 | 399 | 50 + 19 |
| Goose | Rust | Extensible AI agent from Block with MCP integration | 117,432 | 319 | 143 + 17 |
| OpenHands | Python | AI-driven software development platform | 120,856 | 699 | 163 |
| Continue | TypeScript | AI code assistant for VS Code and JetBrains | 229,431 | 1,458 | 186 + 164 |
| Cline | TypeScript | Autonomous AI coding agent for VS Code | 533,915 | 1,219 | 155 + 69 |
| OpenClaw | TypeScript | Personal AI assistant with coding agent skill | 628,159 | 3,551 | 112 + 46 |

Source lines exclude test files and generated code. Dependencies shown as runtime + dev.
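The densities in the tables below are raw occurrence counts normalized per 1,000 source lines, matching the script's `per_1k` helper; restated here in TypeScript for illustration:

```typescript
// Per-1k density: raw occurrence count normalized by total source lines.
function per1k(count: number, totalLines: number): number {
  return (count / totalLines) * 1000;
}

// Acolyte's single `as any` across its 18,005 source lines:
console.log(per1k(1, 18_005).toFixed(2)); // prints 0.06
```

This is why a single escape hatch in a small codebase can register a visible density while hundreds in a large one round down to similar figures.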

Type Safety (TypeScript projects, per 1k source lines)

| Metric | Acolyte | OpenCode | Pi | Cline | Continue | OpenClaw |
| --- | --- | --- | --- | --- | --- | --- |
| `as any` | 0.06 | 1.5 | 1.2 | 0.3 | 2.3 | 0.1 |
| `: any` annotations | 0.0 | 1.0 | 1.1 | 0.9 | 4.2 | 0.2 |
| Non-null `!.` assertions | 0.0 | — | — | — | — | — |
| `@ts-ignore` / `@ts-expect-error` | 0.0 | 0.2 | 0.0 | 0.1 | 0.4 | 0.0 |
| Lint ignores (`biome-ignore` / `eslint-disable`) | 0.1 | 0.0 | 0.0 | 0.0 | 0.2 | 0.2 |
| `: unknown` usage | 4.6 | 1.4 | 0.8 | 0.1 | 0.3 | 5.3 |

Acolyte has a single `any` in total (an FFI boundary for ast-grep). It uses `unknown` with explicit narrowing at 3–45x the rate of most other projects. OpenClaw also favors `unknown` heavily; Continue has the highest `any` density.
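The practical difference: `as any` silences the type checker, while `unknown` forces a check before the value can be used. A minimal sketch of the two styles (the `Config` shape and loader functions are invented for illustration, not taken from any of the projects above):

```typescript
interface Config { retries: number }

// Escape hatch: `as any` silences the checker, so a malformed payload
// only surfaces later as a runtime error, far from the parse site.
function loadUnsafe(raw: string): Config {
  return JSON.parse(raw) as any;
}

// Boundary narrowing: the parsed value is held as `unknown` and must be
// explicitly checked before it may be treated as a Config.
function loadSafe(raw: string): Config | null {
  const value: unknown = JSON.parse(raw);
  if (
    typeof value === "object" && value !== null &&
    typeof (value as { retries?: unknown }).retries === "number"
  ) {
    return value as Config;
  }
  return null;
}

console.log(loadSafe('{"retries": 3}'));   // prints { retries: 3 }
console.log(loadSafe('{"retries": "3"}')); // prints null
```

Both functions compile; the difference is where a bad payload fails: `loadSafe` rejects it at the boundary, `loadUnsafe` lets it leak into typed code.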

Type Safety (Python / Rust projects, per 1k source lines)

| Metric | Aider | OpenHands |
| --- | --- | --- |
| `type: ignore` | 0.0 | 1.7 |
| `Any` type usage | 0.1 | 3.1 |
| `cast()` calls | 0.0 | 0.3 |

| Metric | Goose |
| --- | --- |
| `unsafe` | 0.1 |
| `.unwrap()` | 11.2 |
| `.expect()` | 1.3 |

Aider is nearly zero on type escape hatches. Goose has a high `.unwrap()` density: 11.2 potential panic sites per 1k lines.

Tech Debt (per 1k source lines)

| Metric | Acolyte | Aider | OpenCode | Pi | Goose | OpenHands | Continue | Cline | OpenClaw |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TODO / FIXME / HACK | 0.0 | 0.3 | 0.4 | 0.0 | 0.2 | 0.5 | 0.8 | 0.2 | 0.0 |
| Comment lines | 3.9 | 55.2 | 10.0 | 47.5 | 40.6 | 60.6 | 42.9 | 20.5 | 14.5 |

Acolyte has zero tech-debt markers. Its low comment density reflects self-documenting code, with documentation kept externally.

Test Quality

| Metric | Acolyte | Aider | OpenCode | Pi | Goose | OpenHands | Continue | Cline | OpenClaw |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Test files | 97 | 41 | 186 | 108 | 17 | 348 | 332 | 165 | 2,076 |
| Test lines | 16,129 | 12,321 | 37,040 | 32,572 | 4,726 | 137,765 | 82,421 | 44,423 | 431,818 |
| Test / source ratio | 0.90 | 0.48 | 0.18 | 0.29 | 0.04 | 1.14 | 0.36 | 0.08 | 0.69 |

Acolyte maintains a structured test taxonomy with four dedicated types: unit (*.test.ts), integration (*.int.test.ts), TUI visual regression (*.tui.test.ts), and performance (*.perf.test.ts). OpenHands leads on raw ratio. Goose and Cline have notably low test density.
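The taxonomy is purely suffix-based, so test kinds can be separated mechanically; a small illustrative sketch (the `testKind` helper is hypothetical, not part of Acolyte):

```typescript
// Classify a test file by Acolyte's suffix taxonomy. The more specific
// suffixes must be checked before the generic *.test.ts fallback, since
// e.g. "guards.int.test.ts" also ends with ".test.ts".
function testKind(file: string): "unit" | "integration" | "tui" | "performance" | null {
  if (file.endsWith(".int.test.ts")) return "integration";
  if (file.endsWith(".tui.test.ts")) return "tui";
  if (file.endsWith(".perf.test.ts")) return "performance";
  if (file.endsWith(".test.ts")) return "unit";
  return null; // not a test file
}

console.log(testKind("guards.int.test.ts")); // prints integration
console.log(testKind("parser.test.ts"));     // prints unit
console.log(testKind("parser.ts"));          // prints null
```

The same ordering concern appears in the script's `find_test_ts` patterns, where the specialized suffixes are matched alongside the generic one.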

Module Cohesion

| Metric | Acolyte | Aider | OpenCode | Pi | Goose | OpenHands | Continue | Cline | OpenClaw |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Avg lines / file | 128 | 244 | 199 | 283 | 368 | 172 | 157 | 438 | 176 |
| Files > 500 lines | 3 (2%) | 14 (13%) | 103 (9%) | 50 (12%) | 75 (23%) | 54 (7%) | 87 (5%) | 69 (5%) | 291 (8%) |
| Largest file | 1,182 | 2,485 | 4,989 | 13,353 | 2,289 | 1,704 | 3,228 | 4,573 | 2,242 |
| Barrel / index files | 0 | 5 | 52 | 26 | 43 | 85 | 73 | 47 | 76 |

Acolyte has the smallest average file size, the fewest files over 500 lines, and zero barrel files: a flat `src/` layout of small, focused modules.

Error Handling (TypeScript projects, per 1k source lines)

| Metric | Acolyte | OpenCode | Pi | Cline | Continue | OpenClaw |
| --- | --- | --- | --- | --- | --- | --- |
| `.safeParse()` calls | 1.5 | 0.1 | 0.0 | 0.0 | 0.1 | 0.0 |
| `try { ... }` blocks | 6.6 | 1.3 | 3.7 | 2.3 | 3.8 | 4.9 |
| `.catch()` calls | 0.5 | 2.2 | 0.3 | 0.4 | 0.3 | 1.0 |

Acolyte validates at boundaries with Zod's `.safeParse()` at 15x the rate of the next-highest project, rather than relying solely on exception-driven error handling.
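Zod's `safeParse` returns a result object instead of throwing, which is what makes boundary validation cheap to apply everywhere. A minimal sketch of the pattern without the library (`safeParseTask` and its task shape are invented for illustration; they are not Acolyte's code):

```typescript
// Shape of a safeParse-style result: a discriminated union where success
// carries the typed data and failure carries the error; neither path throws.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Hypothetical stand-in for a Zod schema validating { path: string }.
// (z.object({ path: z.string() }).safeParse(input) returns the same
// { success, data | error } shape, with a ZodError instead of a string.)
function safeParseTask(input: unknown): ParseResult<{ path: string }> {
  if (
    typeof input === "object" && input !== null &&
    typeof (input as { path?: unknown }).path === "string"
  ) {
    return { success: true, data: input as { path: string } };
  }
  return { success: false, error: "expected { path: string }" };
}

console.log(safeParseTask({ path: "src/main.ts" }).success); // prints true
console.log(safeParseTask({ path: 42 }).success);            // prints false
```

Callers branch on `success`, so invalid input is handled as an ordinary value rather than an exception unwinding the stack.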

GitHub Popularity

| Metric | Acolyte | Aider | OpenCode | Pi | Goose | OpenHands | Continue | Cline | OpenClaw |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Stars | — | 41.5k | 117k | 20.5k | 32.5k | 68.6k | 31.7k | 58.7k | 268k |
| Forks | — | 3,974 | 11,906 | 2,134 | 2,976 | 8,571 | 4,227 | 5,902 | 51,229 |
| Open issues | — | 1,410 | 6,414 | 21 | 391 | 359 | 1,140 | 778 | 12,607 |
| Initial commit | 2026-02-20 | 2023-05-09 | 2025-04-30 | 2025-08-09 | 2024-08-23 | 2024-03-13 | 2023-05-24 | 2024-07-06 | 2025-11-24 |

Acolyte's first commit is from 20 February 2026 (pre-launch, no public repo yet). Stars reflect community adoption, not code quality. OpenClaw and OpenCode dominate on stars — OpenClaw at 268k is the #11 most starred repository on all of GitHub. Pi has the fewest open issues by a wide margin. Aider and Continue are the oldest projects.

Summary

| Dimension | Acolyte | Aider | OpenCode | Pi | Goose | OpenHands | Continue | Cline | OpenClaw |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Type safety | Best | Clean | Weak | Mid | Unwrap-heavy | Type-ignore-heavy | Weakest | Mid | Good |
| Tech debt | Zero | Low | Low | Zero | Low | Mid | Highest | Low | Zero |
| Test density | High (0.90) | Mid (0.48) | Low (0.18) | Low (0.29) | Lowest (0.04) | Highest (1.14) | Mid (0.36) | Low (0.08) | High (0.69) |
| Module size | Smallest (128) | Mid (244) | Mid (199) | Large (283) | Largest (368) | Mid (172) | Mid (157) | Large (438) | Mid (176) |
| Dependencies | Lightest (17) | Heaviest (793) | Heavy (247) | Light (69) | Heavy (160) | Heavy (163) | Heavy (350) | Heavy (224) | Heavy (158) |
| Maturity | Pre-launch | Shipped | Shipped | Shipped | Shipped | Shipped | Shipped | Shipped | Shipped |

Acolyte leads on type safety, tech debt, module size, and dependency count while being the smallest codebase. The quality compounds from commit one because the tool enforces verification on every change.

benchmark.sh

#!/usr/bin/env bash
#
# Reproducible code-quality benchmark extraction.
# Clones each repo into /tmp/acolyte-benchmarks/ and measures metrics.
#
# Usage: ./benchmark.sh
#
set -euo pipefail
WORKDIR="/tmp/acolyte-benchmarks"
mkdir -p "$WORKDIR"
# format: name|repo_url|language
declare -a PROJECTS=(
"aider|https://github.com/Aider-AI/aider.git|python"
"opencode|https://github.com/anomalyco/opencode.git|typescript"
"pi|https://github.com/badlogic/pi-mono.git|typescript"
"goose|https://github.com/block/goose.git|rust"
"openhands|https://github.com/All-Hands-AI/OpenHands.git|python"
"continue|https://github.com/continuedev/continue.git|typescript"
"cline|https://github.com/cline/cline.git|typescript"
"openclaw|https://github.com/openclaw/openclaw.git|typescript"
)
clone_or_update() {
local name="$1" url="$2"
local dir="$WORKDIR/$name"
if [ -d "$dir/.git" ]; then
echo " Updating $name..."
git -C "$dir" pull --ff-only --quiet 2>/dev/null || true
else
echo " Cloning $name..."
git clone --depth 1 --quiet "$url" "$dir"
fi
}
# --- find helpers (language-specific source/test file filters) ---
find_source_ts() {
find "$1" -type f \( -name '*.ts' -o -name '*.tsx' \) \
-not -path '*/node_modules/*' -not -path '*/.git/*' \
-not -path '*/dist/*' -not -path '*/build/*' -not -path '*/generated/*' \
-not -name '*.d.ts' \
-not -name '*.test.ts' -not -name '*.test.tsx' \
-not -name '*.spec.ts' -not -name '*.spec.tsx' \
-not -name '*.int.test.ts' -not -name '*.tui.test.ts' -not -name '*.perf.test.ts'
}
find_test_ts() {
find "$1" -type f \( \
-name '*.test.ts' -o -name '*.test.tsx' \
-o -name '*.spec.ts' -o -name '*.spec.tsx' \
-o -name '*.int.test.ts' -o -name '*.tui.test.ts' -o -name '*.perf.test.ts' \
\) -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*'
}
find_source_py() {
find "$1" -type f -name '*.py' \
-not -path '*/.git/*' -not -path '*/__pycache__/*' \
-not -path '*/test*/*' -not -path '*/migrations/*' -not -path '*/generated/*'
}
find_test_py() {
find "$1" -type f -name '*.py' -path '*/test*/*' \
-not -path '*/__pycache__/*' -not -path '*/.git/*'
}
find_source_rs() {
find "$1" -type f -name '*.rs' \
-not -path '*/.git/*' -not -path '*/target/*' \
-not -path '*/tests/*' -not -name '*_test.rs'
}
find_test_rs() {
find "$1" -type f -name '*.rs' \( -path '*/tests/*' -o -name '*_test.rs' \) \
-not -path '*/target/*' -not -path '*/.git/*'
}
# --- counting helpers ---
count_lines() {
# reads file list from stdin
xargs cat 2>/dev/null | wc -l | tr -d ' '
}
count_files() {
wc -l | tr -d ' '
}
grep_count() {
local pattern="$1"
{ xargs grep -c "$pattern" 2>/dev/null || true; } | awk -F: '{s+=$NF} END{print s+0}'
}
per_1k() {
local count="$1" total="$2"
if [ "$total" -eq 0 ]; then echo "0.0"; return; fi
awk "BEGIN{printf \"%.1f\", ($count / $total) * 1000}"
}
count_deps_ts() {
local dir="$1"
node -e "
const fs = require('fs');
const cp = require('child_process');
const files = cp.execSync('find \"$dir\" -name package.json -not -path \"*/node_modules/*\" -not -path \"*/.git/*\"')
.toString().trim().split('\n').filter(Boolean);
const runtime = new Set();
const dev = new Set();
for (const f of files) {
try {
const p = JSON.parse(fs.readFileSync(f, 'utf8'));
Object.keys(p.dependencies || {}).forEach(d => runtime.add(d));
Object.keys(p.devDependencies || {}).forEach(d => dev.add(d));
} catch {}
}
console.log(runtime.size + '|' + dev.size + '|' + (runtime.size + dev.size));
"
}
pytoml_deps() {
# $1 = toml file, $2 = "runtime" or "dev"
local toml="$1" kind="$2"
node -e '
const fs = require("fs");
const text = fs.readFileSync(process.argv[1], "utf8");
const kind = process.argv[2];
let m;
if (kind === "runtime") m = text.match(/^dependencies\s*=\s*\[([\s\S]*?)\]/m);
else m = text.match(/\[project\.optional-dependencies\]([\s\S]*?)(\n\[|$)/m);
// Each dependency string is quoted, so half the quote count is the dep count.
console.log(m ? Math.floor((m[1].match(/"/g) || []).length / 2) : 0);
' "$toml" "$kind"
}
count_deps_python() {
local dir="$1"
local runtime=0 dev=0
if [ -f "$dir/requirements.txt" ]; then
# `|| true` (not `|| echo 0`): wc already printed the count; a failing grep
# in the pipeline must not append a second line under pipefail.
runtime=$({ grep -v '^#' "$dir/requirements.txt" | grep -v '^$' | grep -v '^-' | wc -l || true; } | tr -d ' ')
elif [ -f "$dir/pyproject.toml" ]; then
runtime=$(pytoml_deps "$dir/pyproject.toml" runtime)
fi
for f in "$dir/requirements-dev.txt" "$dir/requirements/requirements-dev.txt"; do
if [ -f "$f" ]; then
dev=$({ grep -v '^#' "$f" | grep -v '^$' | grep -v '^-' | wc -l || true; } | tr -d ' ')
break
fi
done
if [ "$dev" -eq 0 ] && [ -f "$dir/pyproject.toml" ]; then
dev=$(pytoml_deps "$dir/pyproject.toml" dev)
fi
echo "${runtime}|${dev}|$((runtime + dev))"
}
count_deps_rust() {
local dir="$1"
# Use node to parse TOML-like sections for unique dep names across workspace
node -e "
const cp = require('child_process');
const fs = require('fs');
const files = cp.execSync('find \"$dir\" -name Cargo.toml -not -path \"*/target/*\"')
.toString().trim().split('\n').filter(Boolean);
const runtime = new Set();
const dev = new Set();
for (const f of files) {
const text = fs.readFileSync(f, 'utf8');
let section = '';
for (const line of text.split('\n')) {
if (line.startsWith('[')) section = line;
else if (section === '[dependencies]' && /^[a-z_-]/.test(line)) {
runtime.add(line.split(/[\s=]/)[0]);
} else if (section === '[dev-dependencies]' && /^[a-z_-]/.test(line)) {
dev.add(line.split(/[\s=]/)[0]);
}
}
}
console.log(runtime.size + '|' + dev.size + '|' + (runtime.size + dev.size));
"
}
# --- main ---
echo "=== Cloning / updating repos ==="
for entry in "${PROJECTS[@]}"; do
IFS='|' read -r name url lang <<< "$entry"
clone_or_update "$name" "$url"
done
echo ""
echo "=== Extracting metrics ==="
echo ""
for entry in "${PROJECTS[@]}"; do
IFS='|' read -r name url lang <<< "$entry"
dir="$WORKDIR/$name"
echo "--- $name ($lang) ---"
# Source and test counts
case "$lang" in
typescript)
src_lines=$(find_source_ts "$dir" | count_lines)
src_files=$(find_source_ts "$dir" | count_files)
test_files_count=$(find_test_ts "$dir" | count_files)
test_lines_count=$(find_test_ts "$dir" | count_lines)
deps_raw=$(count_deps_ts "$dir")
;;
python)
src_lines=$(find_source_py "$dir" | count_lines)
src_files=$(find_source_py "$dir" | count_files)
test_files_count=$(find_test_py "$dir" | count_files)
test_lines_count=$(find_test_py "$dir" | count_lines)
deps_raw=$(count_deps_python "$dir")
;;
rust)
src_lines=$(find_source_rs "$dir" | count_lines)
src_files=$(find_source_rs "$dir" | count_files)
test_files_count=$(find_test_rs "$dir" | count_files)
test_lines_count=$(find_test_rs "$dir" | count_lines)
deps_raw=$(count_deps_rust "$dir")
;;
esac
IFS='|' read -r deps_runtime deps_dev deps_total <<< "$deps_raw"
test_ratio="0.00"
if [ "$src_lines" -gt 0 ]; then
test_ratio=$(awk "BEGIN{printf \"%.2f\", $test_lines_count / $src_lines}")
fi
avg_lines=0
if [ "$src_files" -gt 0 ]; then
avg_lines=$(awk "BEGIN{printf \"%d\", $src_lines / $src_files}")
fi
# Module cohesion metrics
files_over_500=0
largest_file_lines=0
largest_file_name=""
barrel_files=0
case "$lang" in
typescript)
find_source_ts "$dir" | while IFS= read -r f; do wc -l < "$f"; done | sort -rn > "$WORKDIR/.linecounts" 2>/dev/null || true
# grep -c already prints 0 on no match; `|| true` only swallows the
# non-zero exit (with `|| echo 0` the result would be "0" twice).
barrel_files=$(find_source_ts "$dir" | { grep -c '/index\.ts$' || true; } | tr -d ' ')
;;
python)
find_source_py "$dir" | while IFS= read -r f; do wc -l < "$f"; done | sort -rn > "$WORKDIR/.linecounts" 2>/dev/null || true
barrel_files=$(find_source_py "$dir" | { grep -c '/__init__\.py$' || true; } | tr -d ' ')
;;
rust)
find_source_rs "$dir" | while IFS= read -r f; do wc -l < "$f"; done | sort -rn > "$WORKDIR/.linecounts" 2>/dev/null || true
barrel_files=$(find_source_rs "$dir" | { grep -c '/mod\.rs$' || true; } | tr -d ' ')
;;
esac
if [ -s "$WORKDIR/.linecounts" ]; then
largest_file_lines=$(head -1 "$WORKDIR/.linecounts" | tr -d ' ')
files_over_500=$(awk '$1 > 500' "$WORKDIR/.linecounts" | wc -l | tr -d ' ')
fi
files_over_500_pct=0
if [ "$src_files" -gt 0 ]; then
files_over_500_pct=$(awk "BEGIN{printf \"%d\", ($files_over_500 / $src_files) * 100}")
fi
echo " Source lines: $src_lines"
echo " Source files: $src_files"
echo " Avg lines/file: $avg_lines"
echo " Files > 500: $files_over_500 ($files_over_500_pct%)"
echo " Largest file: $largest_file_lines"
echo " Barrel files: $barrel_files"
# Repo creation date via GitHub API (proxy for the initial commit; requires an authenticated `gh`)
repo_path=$(echo "$url" | sed 's|https://github.com/||;s|\.git$||')
initial_commit=$(gh api "repos/$repo_path" --jq '.created_at' 2>/dev/null | cut -d'T' -f1 || true)
echo " Initial commit: ${initial_commit:-unknown}"
echo " Dependencies: $deps_runtime runtime + $deps_dev dev = $deps_total total"
echo " Test files: $test_files_count"
echo " Test lines: $test_lines_count"
echo " Test/source: $test_ratio"
# Language-specific quality metrics
if [ "$lang" = "typescript" ]; then
as_any=$(find_source_ts "$dir" | grep_count "as any")
colon_any=$(find_source_ts "$dir" | grep_count ": any")
ts_ignore=$(find_source_ts "$dir" | grep_count "@ts-ignore\|@ts-expect-error")
lint_ignore=$(find_source_ts "$dir" | grep_count "eslint-disable\|biome-ignore")
unknown=$(find_source_ts "$dir" | grep_count ": unknown")
todo=$(find_source_ts "$dir" | grep_count "TODO\|FIXME\|HACK")
comments=$(find_source_ts "$dir" | grep_count '^\s*//')
safe_parse=$(find_source_ts "$dir" | grep_count '\.safeParse(')
try_blocks=$(find_source_ts "$dir" | grep_count 'try {')
catch_calls=$(find_source_ts "$dir" | grep_count '\.catch(')
echo " as any /1k: $(per_1k "$as_any" "$src_lines") ($as_any total)"
echo " : any /1k: $(per_1k "$colon_any" "$src_lines") ($colon_any total)"
echo " @ts-ignore /1k: $(per_1k "$ts_ignore" "$src_lines") ($ts_ignore total)"
echo " lint ignores /1k: $(per_1k "$lint_ignore" "$src_lines") ($lint_ignore total)"
echo " : unknown /1k: $(per_1k "$unknown" "$src_lines") ($unknown total)"
echo " TODO|FIXME /1k: $(per_1k "$todo" "$src_lines") ($todo total)"
echo " Comments /1k: $(per_1k "$comments" "$src_lines") ($comments total)"
echo " .safeParse /1k: $(per_1k "$safe_parse" "$src_lines") ($safe_parse total)"
echo " try {} /1k: $(per_1k "$try_blocks" "$src_lines") ($try_blocks total)"
echo " .catch() /1k: $(per_1k "$catch_calls" "$src_lines") ($catch_calls total)"
elif [ "$lang" = "python" ]; then
type_ignore=$(find_source_py "$dir" | grep_count "type: ignore")
any_type=$(find_source_py "$dir" | grep_count "Any")
cast_calls=$(find_source_py "$dir" | grep_count "cast(")
todo=$(find_source_py "$dir" | grep_count "TODO\|FIXME\|HACK")
comments=$(find_source_py "$dir" | grep_count '^\s*#')
echo " type: ignore /1k: $(per_1k "$type_ignore" "$src_lines") ($type_ignore total)"
echo " Any type /1k: $(per_1k "$any_type" "$src_lines") ($any_type total)"
echo " cast() /1k: $(per_1k "$cast_calls" "$src_lines") ($cast_calls total)"
echo " TODO|FIXME /1k: $(per_1k "$todo" "$src_lines") ($todo total)"
echo " Comments /1k: $(per_1k "$comments" "$src_lines") ($comments total)"
elif [ "$lang" = "rust" ]; then
unsafe=$(find_source_rs "$dir" | grep_count "unsafe")
unwrap=$(find_source_rs "$dir" | grep_count '\.unwrap()')
expect_calls=$(find_source_rs "$dir" | grep_count '\.expect(')
todo=$(find_source_rs "$dir" | grep_count "TODO\|FIXME\|HACK")
comments=$(find_source_rs "$dir" | grep_count '^\s*//')
echo " unsafe /1k: $(per_1k "$unsafe" "$src_lines") ($unsafe total)"
echo " .unwrap() /1k: $(per_1k "$unwrap" "$src_lines") ($unwrap total)"
echo " .expect() /1k: $(per_1k "$expect_calls" "$src_lines") ($expect_calls total)"
echo " TODO|FIXME /1k: $(per_1k "$todo" "$src_lines") ($todo total)"
echo " Comments /1k: $(per_1k "$comments" "$src_lines") ($comments total)"
fi
echo ""
done
echo "Done. All repos at $WORKDIR"