scout.py

Python code scout: metrics + locally-generated summaries.

It’s useful for AI-agent priming because it distills a codebase into a dense signal - enough context for an agent to start acting like it “read the code.”

The summaries are actually quite lit; it runs locally on your CPU (the model is <200MB) and is good at explaining what the code does, not just what it’s called.

  • Scout on itself compresses 5,123 source tokens into 1,628.
  • In another larger project: 22,962 into just 6,372! The agent could describe the whole thing technically, predict bugs, and suggest refactors with no additional context from source.
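
A minimal sketch of that priming flow (hypothetical glue code; ask_agent is a stand-in for whatever LLM call you use):

# Hypothetical priming flow: run scout over a project, hand the digest to an agent.
import subprocess

def prime_agent(project_path: str) -> str:
    # --json for machine-readable output, --ai for per-symbol summaries,
    # -q to suppress progress noise
    scout = subprocess.run(
        ["python", "scout.py", "--ai", "--json", "-q", project_path],
        capture_output=True, text=True, check=True)
    return ("You are working on this codebase. A static-analysis digest "
            "of every symbol follows:\n" + scout.stdout)

# context = prime_agent("src/")
# answer = ask_agent(context + "\nWhere are the likely bug hotspots?")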
python scout.py --help

usage: scout [-h] [--columns COLUMNS] [--list-columns] [--exclude-dirs EXCLUDE_DIRS] [--ai] [--json] [--mypy] [--jobs JOBS] [--cache CACHE] [--no-cache] [-q] [path]

Static analysis for AI agents

positional arguments:
  path                  Path to analyze

options:
  -h, --help            show this help message and exit
  --columns COLUMNS     Columns to display (default: name,lines,summary,h_vol,calls,h_bugs,cc,mi)
  --list-columns        List all possible columns
  --exclude-dirs EXCLUDE_DIRS
                        Extra dirs to exclude (comma-sep)
  --ai                  AI summaries per symbol
  --json                JSON output
  --mypy                Run mypy
  --jobs JOBS           Parallel workers
  --cache CACHE         Cache file
  --no-cache            Disable cache
  -q, --quiet           Suppress progress

examples:
  python scout.py --ai --columns "name,lines,summary,h_vol,calls,h_bugs,cc,mi" scout.py     # default columns

Output example:

❯ python scout.py  \
        --ai          \
        --columns "name,lines,summary,h_vol,h_bugs,cc,mi,calls"      \
        scout.py


Loading AI model...
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────── Project ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Files: 1  Symbols: 34  LOC: 483 (374 source)                                                                                                                                                                                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
NAME                                                                                              LINES    SUMMARY                                                                                         H_VOL   H_BUGS  CC  MI      CALLS
metric(name: str, desc: str='')                                                                   25-30    Decorator to register a metric function as a new metric .                                       0       0.00    1   100.00  3
metrics_for_columns(columns: List) -> List                                                        32-40    Return metric names needed to produce requested columns .                                       4.80    0.00    4   73.90   4
_safe(default, fn)                                                                                42-44    Call a function and return the result if it fails .                                             0       0.00    2   100.00  1
_raw(code: str)                                                                                   46-47    Return the raw code .                                                                           0       0.00    1   100.00  2
m_loc(code: str, _)                                                                               50-50    Return the number of lines in the code .                                                        0       0.00    1   100.00  3
m_sloc(code: str, _)                                                                              53-55    Return the sloc of the code .                                                                   0       0.00    2   100.00  2
m_comments(code: str, _)                                                                          58-61    Return a dict of comments and the ratio of the comments .                                       11.60   0.00    2   75.30   4
m_blank(code: str, _)                                                                             64-66    Return the blank value of the node .                                                            0       0.00    2   100.00  2
m_cc(code: str, _)                                                                                69-71    Return the complexity of the code .                                                             0       0.00    2   100.00  3
m_mi(code: str, _)                                                                                74-75    Return the mi value of the given code .                                                         0       0.00    1   100.00  5
m_halstead(code: str, _)                                                                          78-83    Return the Halstead total of a sequence of words .                                              2.00    0.00    2   77.92   8
m_sig(code: str, node)                                                                            86-99    Generate a signature for a function .                                                           124.90  0.04    10  56.59   13
m_params(code: str, node)                                                                         102-104  Return the number of parameters in the function or async function .                             11.60   0.00    3   76.89   3
m_nesting(code: str, node)                                                                        107-114  Return a dict of the nesting level of the given node .                                          22.50   0.01    2   67.55   6
m_calls(code: str, node)                                                                          117-118  Returns the number of calls in the given code .                                                 0       0.00    3   100.00  4
m_branches(code: str, node)                                                                       121-122  Return the number of branches in the code .                                                     0       0.00    3   100.00  4
m_loops(code: str, node)                                                                          125-126  Return a dictionary of loops .                                                                  0       0.00    3   100.00  4
m_doc(code: str, node)                                                                            129-132  Return a dict with the number of lines and comments in the node s docstring .                   2.00    0.00    3   79.05   7
load_ai(quiet=False)                                                                              162-177  Load the AI model and return a function that can be used to summarize the sequence of tokens .  2.00    0.00    2   87.35   9
run_mypy(path: Path) -> Dict                                                                      182-191  Runs mypy and returns a dict of the number of errors warnings and issues .                      57.40   0.02    7   64.03   6
_sha1(b: bytes) -> str                                                                            198-198  Return the SHA - 1 hash of a bytes object .                                                     0       0.00    1   100.00  2
load_cache(p: Path) -> Dict                                                                       200-205  Load the cache from a JSON file .                                                               11.60   0.00    4   71.19   4
save_cache(p: Path, d: Dict)                                                                      207-209  Save a dictionary to a JSON file .                                                              0       0.00    2   100.00  2
should_skip(p: Path, excl: Set) -> bool                                                           214-215  Returns True if p should be skipped .                                                           4.80    0.00    2   88.42   1
extract_imports(tree: ast.AST) -> List                                                            217-225  Extract imports from an ast . AST .                                                             4.80    0.00    8   71.47   7
mod_to_path(root: Path, mod: str) -> Optional[Path]                                               227-231  Return the path of the module in root if it exists .                                            23.30   0.01    3   73.05   3
build_import_graph(root: Path, file_imports: Dict[str, List], files: Set) -> Dict[str, List]      233-240  Build a graph of import paths to files .                                                        48.10   0.02    9   65.20   11
analyze_file(path: str, columns: List) -> Dict                                                    245-269  Analyze a file and return a dict of metrics .                                                   66.40   0.02    12  55.13   27
get_all_columns() -> List                                                                         274-280  Returns a list of all possible column names .                                                   13.90   0.01    2   73.29   2
format_output(console: Console, symbols: List[Symbol], file_stats: List[FileStats], columns:      282-319  Formats the output of                                                                           99.90   0.03    20  66.67   26
List, show_summary: bool)                                                                                  FailureSummary .
format_json(symbols: List[Symbol], file_stats: List[FileStats], import_graph: Dict)               321-331  Formats a list of symbols and file stats into a JSON file .                                     0       0.00    5   100.00  6
main()                                                                                            339-480  Entry point for the command line tool .                                                         297.50  0.10    55  47.10   100
Symbol                                                                                            138-145  A basic syntax tree for a sequence of tokens .                                                  0       0.00    1   100.00  1
FileStats                                                                                         148-157  Statistics for a n - language language grammar .                                                0       0.00    1   100.00  1
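
With --json the same data comes out machine-readable; the shape (abridged sketch, key names taken from format_json in the source below, values illustrative from the run above) is:

{
  "project": {"files": 1, "symbols": 34, "loc": 483, "sloc": 374},
  "files": [{"file": "scout.py", "loc": 483, "sloc": 374, "symbols": 34,
             "fan_in": 0, "fan_out": 0}],
  "symbols": [{"name": "main", "kind": "func", "file": "scout.py",
               "start": 339, "end": 480, "metrics": {"cc": 55, "mi": 47.1},
               "summary": "Entry point for the command line tool ."}],
  "import_graph": {"scout.py": []}
}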

Columns / metrics (compact reference)

Each row is a column name you can request via --columns. “Source” indicates where the value comes from.

| Column(s) | Meaning | Source |
|---|---|---|
| file, name, kind, lines | Symbol identity + file + line span | Radon visitor metadata |
| summary | Short AI description per symbol | CodeT5 summarizer (--ai) |
| fan_in, fan_out | File dependency counts (in/out degree) | AST import parse → best-effort local import graph |
| loc | Total lines in snippet | splitlines() |
| sloc, blank, comments, comment_ratio | Raw code stats | radon.raw.analyze (+ ratio derived) |
| cc | Cyclomatic complexity | radon.complexity.cc_visit |
| mi | Maintainability index (0–100) | radon.metrics.mi_visit |
| h_vol, h_diff, h_effort, h_time, h_bugs | Halstead totals | radon.metrics.h_visit(...).total.* |
| sig, params | Function signature + param count | AST (ast.unparse, node.args) |
| nesting | Max nesting depth | AST walk (if/for/while/with/try) |
| calls, branches, loops | Call/branch/loop counts | AST walk (Call / If / For+While) |
| has_doc, doc_lines | Docstring presence + line count | AST (ast.get_docstring) |

scout.py --list-columns prints everything supported by your build.
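
If you want to sanity-check where a value comes from, the same radon calls can be run directly (a minimal sketch using only the APIs scout.py itself imports; the sample snippet is arbitrary):

# Compute a few of the table's metrics by hand with radon.
import radon.raw as rr
import radon.complexity as rcc
import radon.metrics as rm

code = "def f(x):\n    return x + 1\n"
raw = rr.analyze(code)                 # sloc / comments / blank counts
blocks = rcc.cc_visit(code)            # per-function complexity blocks
print("sloc:", raw.sloc, "blank:", raw.blank)
print("cc:", blocks[0].complexity if blocks else 0)
print("mi:", round(rm.mi_visit(code, multi=True), 2))
print("h_vol:", round(rm.h_visit(code).total.volume, 1))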

#!/usr/bin/env python3
"""scout.py — static analysis for AI agents (streamlined)"""
from __future__ import annotations
import argparse, ast, hashlib, json, os, subprocess, sys
from concurrent.futures import ProcessPoolExecutor, as_completed
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Set
import radon.complexity as rcc
import radon.metrics as rm
import radon.raw as rr
from radon.visitors import Class as RC, Function as RF
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
# ─────────────────────────────────────────────────────────────
# Metrics (computed on-demand based on --columns)
# ─────────────────────────────────────────────────────────────
MetricFn = Callable[[str, Optional[ast.AST]], Dict[str, Any]]
METRICS: Dict[str, tuple[MetricFn, str, List[str]]] = {} # name -> (fn, desc, output_keys)
def metric(name: str, desc: str = ""):
    def wrap(fn: MetricFn):
        keys = list(fn("", None).keys())
        METRICS[name] = (fn, desc, keys)
        return fn
    return wrap
def metrics_for_columns(columns: List[str]) -> List[str]:
    """Return metric names needed to produce requested columns."""
    needed = set()
    for col in columns:
        for name, (_, _, keys) in METRICS.items():
            if col in keys:
                needed.add(name)
                break
    return list(needed)
def _safe(default, fn):
    try: return fn()
    except Exception: return default
def _raw(code: str):
    return _safe(None, lambda: rr.analyze(code))
@metric("loc", "Lines of code")
def m_loc(code: str, _): return {"loc": len(code.splitlines())}
@metric("sloc", "Source lines")
def m_sloc(code: str, _):
a = _raw(code)
return {"sloc": a.sloc if a else 0}
@metric("comments", "Comment lines + ratio")
def m_comments(code: str, _):
a = _raw(code)
if not a: return {"comments": 0, "comment_ratio": 0.0}
return {"comments": a.comments, "comment_ratio": round(a.comments / max(a.sloc, 1), 2)}
@metric("blank", "Blank lines")
def m_blank(code: str, _):
a = _raw(code)
return {"blank": a.blank if a else 0}
@metric("cc", "Cyclomatic complexity")
def m_cc(code: str, _):
blocks = _safe([], lambda: rcc.cc_visit(code))
return {"cc": blocks[0].complexity if blocks else 0}
@metric("mi", "Maintainability index (0-100)")
def m_mi(code: str, _):
return {"mi": round(float(_safe(0.0, lambda: rm.mi_visit(code, multi=True))), 2)}
@metric("halstead", "Halstead metrics")
def m_halstead(code: str, _):
h = _safe(None, lambda: rm.h_visit(code))
if not h: return {"h_vol": 0, "h_diff": 0, "h_effort": 0, "h_time": 0, "h_bugs": 0}
t = h.total
return {"h_vol": round(t.volume, 1), "h_diff": round(t.difficulty, 1),
"h_effort": round(t.effort, 1), "h_time": round(t.time, 1), "h_bugs": round(t.bugs, 3)}
@metric("sig", "Function signature")
def m_sig(code: str, node):
if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)): return {"sig": ""}
def u(x): return _safe("", lambda: ast.unparse(x))
parts = []
for a in (node.args.args or []):
s = a.arg + (f": {u(a.annotation)}" if a.annotation else "")
parts.append(s)
defs = node.args.defaults or []
if defs:
base = len(parts) - len(defs)
for i, d in enumerate(defs):
if 0 <= base + i < len(parts): parts[base + i] += f"={u(d)}"
ret = f" -> {u(node.returns)}" if node.returns else ""
return {"sig": f"({', '.join(parts)}){ret}"}
@metric("params", "Parameter count")
def m_params(code: str, node):
if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)): return {"params": 0}
return {"params": len(node.args.args or [])}
@metric("nesting", "Max nesting depth")
def m_nesting(code: str, node):
if not node: return {"nesting": 0}
inc = (ast.For, ast.While, ast.If, ast.With, ast.Try)
def depth(n, lvl=0):
if isinstance(n, inc): lvl += 1
kids = [depth(c, lvl) for c in ast.iter_child_nodes(n)]
return max([lvl] + kids) if kids else lvl
return {"nesting": depth(node)}
@metric("calls", "Function calls")
def m_calls(code: str, node):
return {"calls": sum(isinstance(n, ast.Call) for n in ast.walk(node)) if node else 0}
@metric("branches", "if/elif branches")
def m_branches(code: str, node):
return {"branches": sum(isinstance(n, ast.If) for n in ast.walk(node)) if node else 0}
@metric("loops", "for/while loops")
def m_loops(code: str, node):
return {"loops": sum(isinstance(n, (ast.For, ast.While)) for n in ast.walk(node)) if node else 0}
@metric("doc", "Docstring info")
def m_doc(code: str, node):
if not node: return {"has_doc": 0, "doc_lines": 0}
doc = _safe(None, lambda: ast.get_docstring(node))
return {"has_doc": int(bool(doc)), "doc_lines": len(doc.splitlines()) if doc else 0}
# ─────────────────────────────────────────────────────────────
# Data model
# ─────────────────────────────────────────────────────────────
@dataclass
class Symbol:
    name: str
    kind: str  # function|class
    file: str
    start: int
    end: int
    metrics: Dict[str, Any] = field(default_factory=dict)
    summary: Optional[str] = None
@dataclass
class FileStats:
    file: str
    symbols: int = 0
    loc: int = 0
    sloc: int = 0
    comments: int = 0
    blank: int = 0
    imports: List[str] = field(default_factory=list)
    fan_in: int = 0
    fan_out: int = 0
# ─────────────────────────────────────────────────────────────
# AI summarization (CodeT5 small - very fast)
# ─────────────────────────────────────────────────────────────
def load_ai(quiet=False):
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    import torch
    model_id = "Salesforce/codet5-base-multi-sum"  # ~220MB, optimized for summarization
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()
    console = Console(stderr=True) if not quiet else None
    def summarize(code: str) -> str:
        inputs = tok(code[:1024], return_tensors="pt", truncation=True, max_length=512)
        with torch.inference_mode():
            out = model.generate(**inputs, max_new_tokens=48)
        return tok.decode(out[0], skip_special_tokens=True).strip()
    return summarize, console
# ─────────────────────────────────────────────────────────────
# External tools (mypy only)
# ─────────────────────────────────────────────────────────────
def run_mypy(path: Path) -> Dict[str, Any]:
    try:
        r = subprocess.run(["mypy", str(path), "--no-error-summary"],
                           capture_output=True, text=True, timeout=60)
        lines = (r.stdout or "").strip().splitlines()
        errors = [l for l in lines if ": error:" in l]
        warnings = [l for l in lines if ": warning:" in l]
        return {"errors": len(errors), "warnings": len(warnings), "issues": errors + warnings}
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return {"errors": -1, "warnings": -1, "issues": []}  # -1 = mypy missing or timed out
# ─────────────────────────────────────────────────────────────
# Cache
# ─────────────────────────────────────────────────────────────
CACHE_VER = 3
def _sha1(b: bytes) -> str: return hashlib.sha1(b).hexdigest()
def load_cache(p: Path) -> Dict:
    if not p.exists(): return {"v": CACHE_VER, "files": {}, "ai": {}}
    try:
        d = json.loads(p.read_text())
        return d if d.get("v") == CACHE_VER else {"v": CACHE_VER, "files": {}, "ai": {}}
    except Exception: return {"v": CACHE_VER, "files": {}, "ai": {}}
def save_cache(p: Path, d: Dict):
    try: p.write_text(json.dumps(d))
    except Exception: pass
# ─────────────────────────────────────────────────────────────
# Imports + graphs
# ─────────────────────────────────────────────────────────────
def should_skip(p: Path, excl: Set[str]) -> bool:
    return any(x in excl for x in p.parts)
def extract_imports(tree: ast.AST) -> List[str]:
    out = set()
    for n in ast.walk(tree):
        if isinstance(n, ast.Import):
            for a in n.names: out.add(a.name)
        elif isinstance(n, ast.ImportFrom):
            mod = n.module or ""
            for a in n.names: out.add(f"{mod}.{a.name}" if mod else a.name)
    return sorted(out)
def mod_to_path(root: Path, mod: str) -> Optional[Path]:
    rel = Path(*mod.split("."))
    for cand in [root / f"{rel}.py", root / rel / "__init__.py"]:
        if cand.exists(): return cand
    return None
def build_import_graph(root: Path, file_imports: Dict[str, List[str]], files: Set[str]) -> Dict[str, List[str]]:
    g = {f: set() for f in files}
    for f, imps in file_imports.items():
        for mod in imps:
            p = mod_to_path(root, mod.split(".")[0]) or mod_to_path(root, mod)
            if p and str(p) in files and str(p) != f:
                g[f].add(str(p))
    return {k: sorted(v) for k, v in g.items()}
# ─────────────────────────────────────────────────────────────
# Analysis (parallel-safe)
# ─────────────────────────────────────────────────────────────
def analyze_file(path: str, columns: List[str]) -> Dict[str, Any]:
    p = Path(path)
    raw_bytes = p.read_bytes()
    src = raw_bytes.decode("utf-8", errors="replace")
    src_hash = _sha1(raw_bytes)  # hash the raw bytes so it matches the cache check in main()
    tree = ast.parse(src)
    lines = src.splitlines()
    raw = _raw(src)
    metric_names = metrics_for_columns(columns)
    symbols = []
    for b in _safe([], lambda: rcc.cc_visit(src)):
        if not isinstance(b, (RF, RC)): continue
        node = next((n for n in ast.walk(tree)
                     if getattr(n, "lineno", None) == b.lineno and getattr(n, "name", None) == b.name), None)
        snippet = "\n".join(lines[b.lineno - 1:b.endline])
        data = {}
        for m in metric_names:
            if m in METRICS: data.update(METRICS[m][0](snippet, node))
        symbols.append({"name": b.name, "kind": "func" if isinstance(b, RF) else "class",
                        "file": str(p), "start": b.lineno, "end": b.endline, "metrics": data, "snippet": snippet})
    stats = {"file": str(p), "symbols": len(symbols), "loc": len(lines),
             "sloc": raw.sloc if raw else len(lines), "comments": raw.comments if raw else 0,
             "blank": raw.blank if raw else 0, "imports": extract_imports(tree)}
    return {"file": str(p), "hash": src_hash, "symbols": symbols, "stats": stats}
# ─────────────────────────────────────────────────────────────
# Output (rich)
# ─────────────────────────────────────────────────────────────
def get_all_columns() -> List[str]:
    """All possible column names."""
    base = ["file", "name", "kind", "lines"]
    metric_keys = []
    for _, _, keys in METRICS.values():
        metric_keys.extend(keys)
    return base + metric_keys + ["fan_in", "fan_out", "summary"]
def format_output(console: Console, symbols: List[Symbol], file_stats: List[FileStats],
                  columns: List[str], show_summary: bool):
    # Project summary
    total_loc = sum(f.loc for f in file_stats)
    total_sloc = sum(f.sloc for f in file_stats)
    console.print(Panel(
        f"[bold]Files:[/] {len(file_stats)} [bold]Symbols:[/] {len(symbols)} "
        f"[bold]LOC:[/] {total_loc:,} ({total_sloc:,} source)",
        title="Project", border_style="blue"))
    # Build file stats lookup
    fs_map = {f.file: f for f in file_stats}
    # Build table
    table = Table(show_header=True, header_style="bold cyan", box=None, pad_edge=False)
    for col in columns:
        table.add_column(col.upper(), overflow="fold")
    for s in symbols:
        fs = fs_map.get(s.file)
        row = []
        for col in columns:
            if col == "file": row.append(s.file)
            elif col == "name":
                sig = s.metrics.get("sig", "")
                row.append(f"{s.name}{sig}" if sig else s.name)
            elif col == "kind": row.append(s.kind)
            elif col == "lines": row.append(f"{s.start}-{s.end}")
            elif col == "fan_in": row.append(str(fs.fan_in if fs else 0))
            elif col == "fan_out": row.append(str(fs.fan_out if fs else 0))
            elif col == "summary": row.append(s.summary or "")
            elif col in s.metrics:
                v = s.metrics[col]
                row.append(f"{v:.2f}" if isinstance(v, float) else str(v))
            else: row.append("")
        table.add_row(*row)
    console.print(table)
def format_json(symbols: List[Symbol], file_stats: List[FileStats], import_graph: Dict):
    data = {
        "project": {"files": len(file_stats), "symbols": len(symbols),
                    "loc": sum(f.loc for f in file_stats), "sloc": sum(f.sloc for f in file_stats)},
        "files": [{"file": f.file, "loc": f.loc, "sloc": f.sloc, "symbols": f.symbols,
                   "fan_in": f.fan_in, "fan_out": f.fan_out} for f in file_stats],
        "symbols": [{"name": s.name, "kind": s.kind, "file": s.file, "start": s.start,
                     "end": s.end, "metrics": s.metrics, "summary": s.summary} for s in symbols],
        "import_graph": import_graph,
    }
    print(json.dumps(data, indent=2))
# ─────────────────────────────────────────────────────────────
# CLI
# ─────────────────────────────────────────────────────────────
DEFAULT_EXCL = {".venv", "__pycache__", ".git", ".tox", "node_modules", ".eggs", "build", "dist"}
DEFAULT_COLS = "name,lines,summary,h_vol,calls,h_bugs,cc,mi"
def main():
    examples = """
examples:
  python scout.py --ai --columns "name,lines,summary,h_vol,calls,h_bugs,cc,mi" scout.py     # default columns
"""
    ap = argparse.ArgumentParser(prog="scout", description="Static analysis for AI agents",
                                 epilog=examples, formatter_class=argparse.RawDescriptionHelpFormatter)
    ap.add_argument("path", nargs="?", default=".", help="Path to analyze")
    ap.add_argument("--columns", default=DEFAULT_COLS, help=f"Columns to display (default: {DEFAULT_COLS})")
    ap.add_argument("--list-columns", action="store_true", help="List all possible columns")
    ap.add_argument("--exclude-dirs", help="Extra dirs to exclude (comma-sep)")
    ap.add_argument("--ai", action="store_true", help="AI summaries per symbol")
    ap.add_argument("--json", action="store_true", help="JSON output")
    ap.add_argument("--mypy", action="store_true", help="Run mypy")
    ap.add_argument("--jobs", type=int, default=max(os.cpu_count() or 2, 2), help="Parallel workers")
    ap.add_argument("--cache", default=".scoutcache.json", help="Cache file")
    ap.add_argument("--no-cache", action="store_true", help="Disable cache")
    ap.add_argument("-q", "--quiet", action="store_true", help="Suppress progress")
    args = ap.parse_args()
    console = Console(stderr=True)
    if args.list_columns:
        console.print(", ".join(get_all_columns()))
        return
    path = Path(args.path)
    if not path.exists(): sys.exit(f"Error: {path} not found")
    columns = [c.strip() for c in args.columns.split(",") if c.strip()]
    excl = DEFAULT_EXCL | (set(args.exclude_dirs.split(",")) if args.exclude_dirs else set())
    cache = load_cache(Path(args.cache)) if not args.no_cache else {"v": CACHE_VER, "files": {}, "ai": {}}
    # Find files
    if path.is_file():
        files, root = [path], path.parent
    else:
        root = path
        files = [f for f in path.rglob("*.py") if not should_skip(f, excl)]
    if not files: sys.exit("No Python files found")
    files_s = [str(f) for f in files]
    files_set = set(files_s)
    # Check cache
    to_run, results = [], {}
    for f in files_s:
        p = Path(f)
        h = _sha1(p.read_bytes()) if p.exists() else ""
        cached = cache["files"].get(f)
        if cached and cached.get("hash") == h:
            results[f] = cached
        else:
            to_run.append(f)
    # Parallel analysis
    if to_run:
        if not args.quiet: console.print(f"Analyzing {len(to_run)} files...")
        with ProcessPoolExecutor(max_workers=args.jobs) as ex:
            futs = {ex.submit(analyze_file, f, columns): f for f in to_run}
            for fut in as_completed(futs):
                f = futs[fut]
                try:
                    res = fut.result()
                    results[f] = res
                    if not args.no_cache:
                        cache["files"][f] = dict(res)
                        # strip snippets for cache
                        cache["files"][f]["symbols"] = [{k: v for k, v in s.items() if k != "snippet"}
                                                        for s in res.get("symbols", [])]
                except Exception as e:
                    if not args.quiet: console.print(f"[red]Error:[/] {f}: {e}")
    # Load AI if needed
    ai, ai_console, ai_cache = None, None, cache.get("ai", {})
    if args.ai:
        if not args.quiet: console.print("Loading AI model...")
        ai, ai_console = load_ai(args.quiet)
    # Build symbols + file stats
    all_symbols, all_stats, file_imports = [], [], {}
    for f in files_s:
        res = results.get(f)
        if not res: continue
        st = res.get("stats", {})
        fs = FileStats(file=st.get("file", f), symbols=st.get("symbols", 0), loc=st.get("loc", 0),
                       sloc=st.get("sloc", 0), comments=st.get("comments", 0),
                       blank=st.get("blank", 0), imports=st.get("imports", []))
        file_imports[fs.file] = fs.imports
        all_stats.append(fs)
        # Symbols (need snippets for AI)
        lines = Path(f).read_text(errors="replace").splitlines() if args.ai else None
        for s in res.get("symbols", []):
            summary = None
            if args.ai and ai and lines:
                snippet = "\n".join(lines[s["start"] - 1:s["end"]])
                h = _sha1(snippet.encode())
                summary = ai_cache.get(h)
                if not summary:
                    if not args.quiet: console.print(f"  AI: {s['name']}...", end="\r")
                    summary = _safe(None, lambda: ai(snippet))
                    if summary: ai_cache[h] = summary
            all_symbols.append(Symbol(name=s["name"], kind=s["kind"], file=s["file"],
                                      start=s["start"], end=s["end"], metrics=s.get("metrics", {}),
                                      summary=summary))
    if args.ai: cache["ai"] = ai_cache
    # Import graph + fan-in/out
    import_graph = build_import_graph(root, file_imports, files_set)
    fan_in = {f: 0 for f in files_set}
    fan_out = {f: len(import_graph.get(f, [])) for f in files_set}
    for src, dsts in import_graph.items():
        for d in dsts: fan_in[d] = fan_in.get(d, 0) + 1
    fs_map = {f.file: f for f in all_stats}
    for f, fs in fs_map.items():
        fs.fan_in, fs.fan_out = fan_in.get(f, 0), fan_out.get(f, 0)
    if not args.no_cache: save_cache(Path(args.cache), cache)
    # Output
    out_console = Console()  # stdout
    if args.json:
        format_json(all_symbols, all_stats, import_graph)
    else:
        format_output(out_console, all_symbols, all_stats, columns, args.ai)
    # Mypy
    if args.mypy:
        r = run_mypy(path)
        if args.json:
            print(json.dumps({"mypy": r}, indent=2))
        elif r["errors"] < 0:
            out_console.print("\n[bold]Mypy:[/] [yellow]unavailable or timed out[/]")
        else:
            style = "red" if r["errors"] > 0 else "green"
            out_console.print(f"\n[bold]Mypy:[/] [{style}]{r['errors']} errors[/], {r['warnings']} warnings")
            for issue in r.get("issues", [])[:10]:
                out_console.print(f"  {issue}")
if __name__ == "__main__":
    main()