@bkase
Created January 4, 2026 09:25
3-phases breakdown of Ian's request https://x.com/seagreen__/status/2007409001999245376?s=20

Phases

Phase 1 — Language core + toolchain + test harness (foundation)

Build the minimal compile-to-JS language end-to-end (lexer → parser → compiler → runtime), define the v0.1 spec, stand up Bun+TypeScript repo tooling (oxlint/oxfmt), and establish a serious testing spine (unit + a first e2e smoke test). Deliver a tiny “runner” web page and a build pipeline that can emit a single dist/index.html even if the UI is still barebones.

Phase 2 — In-browser “studio”: virtual filesystem + editor + live recompile + graphics sandbox

Implement the graphical computing environment (canvas-based), virtual filesystem (seeded with the project’s own sources), file explorer, and the live recompilation loop. Establish robust browser sandboxing for running user code safely. Grow e2e coverage around IDE workflows.

Phase 3 — “Product” layer: two small games + polish + documentation-grade code + final single-file build

Ship two small games written in the new language (or at least using its stdlib + graphics module), improve ergonomics (errors, traces, perf), add doc-quality in-code documentation, ensure everything works offline as a single index.html, and harden tests (including flake controls + CI gates).


Engineering Plan — Phase 1

0. Executive summary

We will create a tiny language (“Twig”, placeholder name) that compiles to JavaScript and runs in a browser with no FFI exposed to user programs (i.e., no direct “call arbitrary JS” escape hatch). The compiler and runtime will be written in TypeScript and bundled with Bun. We’ll establish strict correctness with Bun’s built-in test runner (TypeScript-supported, Jest-like) and snapshots for compiler output, plus a minimal Playwright smoke test to verify the browser runner loads and executes a sample program. (bun.com)

Phase 1 is explicitly about de-risking:

  • “Can we implement a language pipeline cleanly?”
  • “Can we run compiled output deterministically (tests)?”
  • “Can we emit a single index.html artifact from Bun tooling?”

1. Goals and non-goals

Goals (Phase 1)

  1. Language v0.1 implemented:

    • Lexer → parser → AST with spans
    • Compiler to JS
    • Minimal runtime + stdlib
    • Deterministic behavior for tests
  2. Tooling:

    • Bun workspace (TS-first)
    • oxlint for lint, oxfmt for format
  3. Tests:

    • Many unit tests for each layer (lexer/parser/compiler/runtime)
    • Snapshot (“golden”) tests for compiler output
    • Minimal e2e smoke test using Playwright (loads runner page, runs “hello”)
  4. Build artifact proof:

    • A build script that produces one dist/index.html containing everything needed to run the minimal system offline.
    • This is a proof that the final “single file” constraint is feasible early. (The full IDE arrives in later phases.)

Non-goals (Phase 1)

  • Full-feature language (types, macros, advanced modules, optimization passes)
  • Rich IDE (file tree, editor UX, live recompile UI) — Phase 2+
  • Two games — Phase 3
  • Sophisticated security sandbox — Phase 2+ (Phase 1 just establishes the execution strategy and constraints)

2. Key constraints translated into concrete engineering requirements

“No FFI”

Interpretation we will implement:

  • User programs cannot call arbitrary JavaScript APIs.
  • Compiled programs run against a sealed runtime object (e.g., __rt) that exposes only our standard library and (later) graphics/file APIs.
  • We do not implement constructs like js.call("document.querySelector", ...).

Enforcement strategy (Phase 1 baseline):

  • Generated JS references only:

    • local variables
    • __rt (runtime)
  • Generated JS executes in a wrapper that shadows common globals (window, document, globalThis, Function) to reduce accidental/intentional escape. eval cannot be rebound in strict-mode code, so the emitter must instead never output it as an identifier. (Not perfect security, but aligned with “minimal system”; hardening later.)

“Pure bun as much as possible; TypeScript as much as possible”

  • Compiler, runtime, web runner, build scripts: TypeScript.
  • Use Bun bundler and Bun test runner as defaults. (bun.com)
  • For Playwright: we’ll use Bun for dependency management and scripts; but we will not force Playwright to run under Bun runtime if it causes instability (there are known Bun-runtime compatibility issues when forcing Bun execution). (GitHub)

“Use ox tools for lint and format”

  • Use oxlint for linting and oxfmt for formatting. (Oxc)

“Compile it to index.html”

  • Phase 1 will build a minimal dist/index.html single-file output.
  • We’ll implement a bundling+inlining build script using Bun.build() outputs (which are BuildArtifacts and can be read via .text()), then inject JS/CSS into an HTML template. (bun.com)

3. Proposed language v0.1 (minimal but future-friendly)

3.1 Syntax choice

Pick an s-expression syntax to minimize parser complexity and maximize extensibility:

Examples:

; hello
(print "hi")

; let + function
(let add (fn (a b) (+ a b)))
(print (add 1 2))

; if
(if (> x 10)
  (print "big")
  (print "small"))

Rationale:

  • Lexer+parser are straightforward.
  • “Special forms” are explicit.
  • We can grow features later without grammar wars.

3.2 Core forms (Phase 1)

  • Literals: number, string, boolean, nil

  • Identifier: foo, bar-baz (decide exact identifier charset in spec)

  • Call: (f a b c)

  • Special forms:

    • (do expr1 expr2 ... exprN) → sequencing
    • (let name expr) → immutable binding in current scope
    • (fn (arg1 arg2 ...) body) → function
    • (if cond then else) → expression if
  • Comments:

    • ; to end of line

Phase 1 deliberately excludes:

  • mutation / assignment
  • macros
  • pattern matching
  • user-defined types

3.3 Semantics (Phase 1)

  • Lexical scoping, closures.
  • Everything is an expression; do returns last expression.
  • Truthiness: only false and nil are falsy (or we can do JS-like; pick and document).

3.4 Error model

All compiler stages return structured diagnostics:

  • Diagnostic { kind, message, span, notes[] }
  • span references the source text by { start, end, line/col } computed from a line map.
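
A minimal sketch of the line map mentioned above, assuming spans carry character offsets with an exclusive end. The names buildLineMap and toLineCol, and the exact Diagnostic field types, are illustrative assumptions; the plan only fixes the field names.

```typescript
// Assumed shapes: the plan names kind/message/span/notes but not their types.
type Span = { start: number; end: number }; // character offsets, end exclusive
type Diagnostic = { kind: string; message: string; span: Span; notes: string[] };

// Offsets at which each line begins; the first line always starts at offset 0.
function buildLineMap(source: string): number[] {
  const starts = [0];
  for (let i = 0; i < source.length; i++) {
    if (source.charAt(i) === "\n") starts.push(i + 1);
  }
  return starts;
}

// Binary search for the last line start <= offset; returns 1-based line/col.
function toLineCol(lineMap: number[], offset: number): { line: number; col: number } {
  let lo = 0;
  let hi = lineMap.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (lineMap[mid]! <= offset) lo = mid;
    else hi = mid - 1;
  }
  return { line: lo + 1, col: offset - lineMap[lo]! + 1 };
}

const demoSource = "(print 1)\n(oops";
const demoDiag: Diagnostic = {
  kind: "parse",
  message: "unexpected end of input",
  span: { start: 10, end: 15 },
  notes: [],
};
const demoPos = toLineCol(buildLineMap(demoSource), demoDiag.span.start);
// demoPos is { line: 2, col: 1 }
```

Computing line/col lazily from offsets keeps tokens and AST nodes small; only diagnostics that are actually shown pay for the lookup.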

4. High-level architecture

4.1 Repository layout

A clean “documentation-as-code” structure:

/src
  /lang
    ast.ts
    span.ts
    lexer.ts
    parser.ts
    printer.ts        (optional, for debugging + tests)
    diagnostics.ts
    compiler.ts
  /runtime
    index.ts
    std.ts
    values.ts         (tagging / runtime helpers)
  /host
    host.ts           (Host interface: readText, now, random, etc.)
    bun-host.ts
    browser-host.ts   (stub in phase1; real use in phase2)
  /web
    template.html
    app.ts            (minimal runner UI: run one example program)
  /build
    build-single-html.ts
/tests
  lexer.test.ts
  parser.test.ts
  compiler.test.ts
  runtime.test.ts
  integration.test.ts
/e2e
  smoke.spec.ts

4.2 Interfaces to keep Phase 2 easy

Even though Phase 1 won’t ship a full IDE, we should design for it now.

Host abstraction

interface Host {
  readText(path: string): Promise<string | null>; // phase2: virtual FS
  now(): number;
  random(): number; // injectable deterministic PRNG for tests
}

Compiler API

type CompileResult =
  | { ok: true; js: string; diagnostics: Diagnostic[] }
  | { ok: false; js?: string; diagnostics: Diagnostic[] };

declare function compileSource(source: string, opts: CompileOpts): Promise<CompileResult>;

Later, Phase 2 will extend this to compileModule(entryPath, host) and a module graph. We keep the types aligned so we don’t rewrite everything.


5. Compiler pipeline design (Phase 1)

5.1 Stages

  1. Lex: input string → tokens with spans
  2. Parse: tokens → AST with spans
  3. Lower: AST → “core AST” (optional in Phase 1; but we should reserve a file/module)
  4. Emit: core AST → JS string

5.2 JS output structure (for “no FFI” posture)

Generated JS is a module-like string that exports a single function:

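export function __run(__rt) {
  "use strict"; // redundant inside an ES module, kept for non-module evaluation
  // Shadow common globals so compiled code cannot reach them by name.
  const window = undefined, document = undefined, globalThis = undefined;
  const Function = undefined;
  // `eval` cannot be shadowed here: strict-mode code may not bind that name,
  // so the emitter must never output `eval` as an identifier.
  // compiled code...
  return result;
}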

The runtime provides a controlled API surface:

type Runtime = {
  std: StdLib;
  // phase2: gfx, fs, ui...
};

5.3 Execution strategy (tests + browser)

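  • In unit/integration tests: evaluate generated JS via dynamic import from a data URL, or via new Function if the emitter also offers a variant without the export keyword (a new Function body cannot contain export).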
  • In browser runner: create a Blob URL from generated JS and import() it, then call __run(runtime).
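
A sketch of the new Function path for tests, under the assumption that the emitter can produce a plain function body (no export) that ends in a return statement. wrapForEval, runCompiled, and the one-function Runtime are hypothetical names for illustration. The shadow list leaves out eval because strict-mode code cannot declare a binding with that name.

```typescript
// Hypothetical minimal runtime: just enough stdlib to print.
type Runtime = { std: { print: (x: unknown) => void } };

// Prepend the strict-mode directive and the global shadows to a compiled body.
function wrapForEval(body: string): string {
  return [
    '"use strict";',
    "const window = undefined, document = undefined, globalThis = undefined;",
    "const Function = undefined;",
    body,
  ].join("\n");
}

// Evaluate a compiled body with __rt as its only parameter.
function runCompiled(body: string, rt: Runtime): unknown {
  const run = new Function("__rt", wrapForEval(body)) as (rt: Runtime) => unknown;
  return run(rt);
}

const printed: string[] = [];
const testRuntime: Runtime = { std: { print: (x) => { printed.push(String(x)); } } };

// Stand-in for emitter output: print through the runtime, then probe a shadowed global.
const wrapResult = runCompiled('__rt.std.print("hi"); return typeof window;', testRuntime);
// wrapResult is "undefined" and printed is ["hi"]
```

This is best-effort shadowing only, as the plan says: it narrows what compiled code can name, it is not a security boundary.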

6. Runtime + stdlib (Phase 1 subset)

6.1 Value model

We will keep values “JS-native” where possible:

  • numbers → JS number
  • strings → JS string
  • booleans → JS boolean
  • nil → null (or a unique symbol; pick and document)
  • functions → JS functions, but only those created by compiled code or stdlib

6.2 Stdlib v0.1 (small but usable)

Core

  • print(x) → append stringified output to a runtime buffer (and optionally console.log)
  • str(x) → string conversion
  • =, <, >, <=, >=

Math

  • +, -, *, /, %

Logic

  • and, or, not (or just rely on if + truthiness; choose explicit functions for clarity)

Lists (minimal)

  • list(a b c) → array
  • len(xs)
  • get(xs i) with bounds checks → nil or error (decide)
  • push(xs x) (returns new array, immutable style)

We should keep phase1 stdlib pure and deterministic, except print.

6.3 Determinism hooks for testing

  • runtime gets random() from Host; tests inject a seeded PRNG.
  • runtime gets now() from Host; tests can fix it.
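
One way the determinism hooks could look in a test host. This is a sketch: createTestHost is a hypothetical helper, and mulberry32 is swapped in as a convenient seedable generator, not something the plan prescribes.

```typescript
interface Host {
  readText(path: string): Promise<string | null>;
  now(): number;
  random(): number;
}

// mulberry32: a small, widely used 32-bit PRNG; any seedable generator would do.
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // in [0, 1)
  };
}

// Hypothetical test host: in-memory files, frozen clock, seeded randomness.
function createTestHost(files: Record<string, string>, seed: number, fixedNow: number): Host {
  const random = mulberry32(seed);
  return {
    readText: async (path) => files[path] ?? null,
    now: () => fixedNow,
    random,
  };
}

const hostA = createTestHost({}, 42, 1_000);
const hostB = createTestHost({}, 42, 1_000);
const seqA = [hostA.random(), hostA.random(), hostA.random()];
const seqB = [hostB.random(), hostB.random(), hostB.random()];
// seqA and seqB are identical because both hosts share the seed
```

With this in place, a program that calls random or now produces the same output on every test run, so its printed output can be snapshotted.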

7. Tooling and quality gates

7.1 Bun

We will rely on:

  • Bun test runner for unit/integration/snapshot tests. (bun.com)
  • Bun bundler/build APIs for producing browser bundles and the single-file artifact. (bun.com)
  • Bun can run HTML entrypoints in dev (useful later), but Phase 1 only needs a minimal runner and a build. (bun.com)

7.2 Lint + format (“ox tools”)

  • Add oxlint and oxfmt as dev dependencies and wire scripts. (Oxc)
  • Use .oxlintrc.json (json/jsonc supported) and keep rules sane for a compiler codebase. (Oxc)

Recommended scripts

  • bun run lint → oxlint
  • bun run lint:fix → oxlint --fix
  • bun run format → oxfmt .
  • bun run format:check → oxfmt --check . (verify flag support when implementing; if not available, the alternative is to run the formatter and check git diff)

(We’ll keep formatting config minimal and documented in-repo.)


8. Build plan: emit a single dist/index.html in Phase 1

8.1 Why this in Phase 1?

Single-file build is a high-risk “integration constraint.” If we postpone it, we risk a painful Phase 3 surprise. Phase 1 will prove the mechanism with the minimal runner.

8.2 Strategy

  1. Use Bun.build() on src/web/app.ts (browser target, minify optional).

  2. From BuildOutput.outputs, pick:

    • the JS entry-point/chunk
    • optionally a CSS artifact (if we produce CSS as a separate file; we can also inline CSS by writing it in the HTML template directly)
  3. Read artifact content using await artifact.text() (supported on BuildArtifact). (bun.com)

  4. Inject JS into src/web/template.html:

    • <script type="module"> ...bundled JS... </script>
    • <style> ...css... </style>
  5. Write the resulting HTML to dist/index.html.

Result: a single file that can be opened directly.
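
The injection step (4) can be sketched as a pure string function, which keeps it testable without running a build. The placeholder comments in the template are assumptions; the real template.html may mark its slots differently. The js and css arguments would come from reading the Bun.build() artifacts with .text(), as described above.

```typescript
// Hypothetical slot markers expected inside template.html.
const JS_SLOT = "<!-- INLINE_JS -->";
const CSS_SLOT = "<!-- INLINE_CSS -->";

function inlineIntoTemplate(template: string, js: string, css: string): string {
  // A literal "</script" inside the bundle (e.g. in a string) would end the
  // inline tag early; "<\/script" reads the same inside JS string literals.
  const safeJs = js.replace(/<\/script/gi, "<\\/script");
  // Function replacers stop "$&"-style patterns in the bundle text from being
  // interpreted by String.prototype.replace.
  return template
    .replace(CSS_SLOT, () => `<style>${css}</style>`)
    .replace(JS_SLOT, () => `<script type="module">${safeJs}</script>`);
}

const demoTemplate =
  "<html><head><!-- INLINE_CSS --></head><body><!-- INLINE_JS --></body></html>";
const demoPage = inlineIntoTemplate(
  demoTemplate,
  'const s = "</script>"; const d = "$&";',
  "body{margin:0}",
);
```

The two hazards handled here (an early closing tag and replacement patterns) are easy to miss because small bundles rarely trigger them; the seeded source files in later phases are much more likely to contain such text.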


9. Testing plan (Phase 1)

9.1 Unit tests (Bun)

Bun’s test runner supports TypeScript and snapshots, so we will lean on that heavily. (bun.com)

Lexer tests

  • tokenization of:

    • identifiers
    • numbers (including edge cases: -1, 1.0, .5 if allowed)
    • strings (escapes, unterminated)
    • parentheses
    • comments
  • span accuracy: start/end indexes correct

Parser tests

  • parse each syntactic form

  • precedence is irrelevant in s-exprs, but we test:

    • nesting
    • empty lists ()
    • error recovery: unexpected EOF, mismatched parens
  • spans: AST nodes carry correct ranges

Compiler tests

  • “golden” snapshot tests:

    • compile small inputs and snapshot emitted JS
    • keep snapshots stable by normalizing whitespace and deterministic temp names
  • semantic tests:

    • compile expression → execute → compare result
    • compile if, let, fn + closure capture

Runtime/std tests

  • print output buffer behavior
  • numeric ops correctness
  • list ops correctness and bounds behavior

9.2 Integration tests (Bun)

Test full pipeline:

  • compile sample programs (fibonacci, map/reduce, closure) and run them through runtime host.

  • verify both:

    • final returned value
    • captured printed output (exact strings)

9.3 E2E tests (Playwright) — smoke only in Phase 1

We want e2e “as we go,” but Phase 1 keeps it minimal:

  • Build dist/index.html

  • Serve it via a tiny HTTP server (recommended; avoids file:// quirks)

  • Playwright opens the page and asserts:

    • the page loads without console errors
    • the sample program output area contains expected text (e.g., “hi”)

Playwright usage is via its standard playwright test workflow. (playwright.dev)

Important runtime note (pragmatic Bun posture): There are documented issues when forcing Playwright to run under Bun’s runtime; we will not force that mode in Phase 1. We will run Playwright in its stable configuration (typically Node execution) while still using Bun for dependency management and scripting. (GitHub)

9.4 Test-writing workflow guidance

  • Every new language feature ships with:

    1. parser unit tests
    2. compiler snapshot
    3. runtime semantic test (compile+run)
  • Bugs get a “regression test first” rule.

9.5 Quality gates for Phase 1

  • bun test is green
  • bun run lint is green
  • bun run format:check is green
  • bun run build produces a single dist/index.html
  • bun run e2e passes smoke test

10. Concrete Phase 1 deliverables checklist

Deliverable A — Language v0.1 spec (repo docs)

  • docs/language-v0.1.md describing:

    • syntax
    • semantics
    • stdlib surface
    • error philosophy
  • Small examples (hello, let/fn/if)

Deliverable B — Compiler + runtime in TS

  • Clean module boundaries
  • Diagnostics with spans
  • Deterministic test harness host

Deliverable C — Minimal browser runner

  • template.html + app.ts:

    • compiles embedded example program string
    • runs it
    • shows output in the DOM (simple <pre id="output">)

Deliverable D — Single-file build script

  • build-single-html.ts (Bun-run) producing dist/index.html fully self-contained (inline JS/CSS)

Deliverable E — Tests

  • Unit tests for lexer/parser/compiler/runtime
  • Integration tests for compile+run
  • Playwright smoke test loads the page and checks expected output

11. Risks and mitigations (Phase 1)

Risk: “Single-file index.html bundling gets messy”

Mitigation:

  • De-risk immediately by implementing the inlining build step in Phase 1 using Bun.build() + BuildArtifact.text(). (bun.com)
  • Avoid external assets (images/fonts) in Phase 1; everything is inline code.

Risk: “Playwright + Bun runtime instability”

Mitigation:

  • Don’t force Bun runtime for Playwright execution; use the stable Playwright runner path.
  • Keep e2e smoke minimal in Phase 1; expand in Phase 2. (GitHub)

Risk: “Language design painted into a corner”

Mitigation:

  • Keep the compiler API and host abstraction future-proof (module graph later).
  • Adopt a tiny core language; add features only when demanded by Phase 2/3 needs.

12. Definition of Done for Phase 1

Phase 1 is complete when:

  1. We can write a small program in the new language and:

    • compile it to JS
    • execute it under Bun tests
    • execute it in the browser runner
  2. We can produce a single dist/index.html that runs offline.

  3. Lint/format/test/e2e gates are wired and green.


If you want, when you ask for Phase 2, I’ll write it as a similarly detailed engineering doc focused on:

  • virtual filesystem representation + seeding with source code
  • file explorer UI model
  • editor component plan (minimal textarea vs richer editor)
  • live compilation loop and sandbox strategy
  • graphics runtime API design and rendering loop
  • deeper Playwright coverage for IDE workflows

Engineering Plan — Phase 2

Main-thread studio with VFS + editor + live recompile + graphics sandbox (no Web Worker)

Phase 2 builds the “graphical computing environment” on top of the Phase 1 compiler/runtime, while honoring your updated constraint:

  • No Web Worker: user programs execute on the main thread.
  • UI stays responsive: execution is cooperative (step-budgeted) and yields to the event loop between slices.
  • Single-file output: still emits one offline-openable dist/index.html.

The big technical shift vs the earlier Phase 2 draft is the runner: instead of isolating user code in a Worker, we’ll make the runtime/compiled output structurally incapable of hogging the thread for long by running under a trampoline + step budget.


1. Goals and non-goals

Goals (Phase 2)

  1. Studio UI

    • File explorer (system + workspace)
    • Text editor
    • Output panel (stdout)
    • Diagnostics panel (compile/runtime errors)
    • Canvas panel (graphics)
    • Run controls: Run, Auto-run, Stop, Reset
  2. Virtual filesystem (VFS)

    • In-memory hierarchical FS
    • Seeded with the project’s own source code (so you can explore it)
    • Workspace persisted (browser storage)
    • Read-only system tree by default, with “Copy to workspace” affordance
  3. Live recompilation

    • Debounced compile on edit when Auto-run is enabled
    • Diagnostics with spans (file + line/col + excerpt)
    • Successful compile triggers a clean “restart”
  4. Graphics computing environment

    • Canvas-based gfx/* stdlib API
    • input/* for keyboard/mouse
    • gfx/on-frame for animation sketches
  5. Main-thread cooperative execution

    • Programs run in bounded slices (N steps per slice)
    • Yield between slices (setTimeout(0) / requestAnimationFrame) to keep UI alive
    • Stop button cancels the job
  6. Testing expands substantially

    • Lots of Bun unit tests (VFS, runner, scheduling, gfx state machine)
    • DOM/UI tests under Bun (Bun’s test runner supports “UI & DOM testing”) (Bun)
    • Playwright e2e covering core workflows (playwright test, --ui for debugging) (Playwright)

Non-goals (Phase 2)

  • Two “little games” (Phase 3)
  • Fancy editor features (syntax highlighting, multi-cursor, LSP, etc.)
  • Hardened security isolation from a malicious program (no Worker means we can only do “best effort” sandboxing)

2. Architecture overview

2.1 High-level layers

  1. UI Layer

    • Explorer, Editor, Output, Diagnostics, Canvas
  2. Studio Model

    • Pure state + actions (open file, edit buffer, run, etc.)
  3. Services

    • VfsService (overlay + persistence)
    • CompileService (compile entry file → runnable artifact)
    • RunService (cooperative executor + cancellation + on-frame loop)
  4. Runtime

    • stdlib modules: std/*, gfx/*, input/*, ref/*, fs/*
    • stdout buffer + diagnostic helpers
  5. Renderer

    • Canvas rendering + input event wiring

This structure keeps most code testable without a real browser.

2.2 Repo layout (Phase 2 additions)

/src
  /studio
    model.ts            # state + actions + reducer (pure)
    controller.ts       # wires UI events to services (thin)
    view.ts             # DOM creation + minimal component helpers
    debounce.ts
  /vfs
    path.ts
    vfs.ts
    seed.ts             # types
    persist.ts
  /runner
    step.ts             # Step/Thunk types, trampoline
    scheduler.ts        # yield strategy, cancellation, budgets
    run-service.ts      # run/stop/on-frame orchestration
  /runtime
    runtime.ts          # Runtime object + buffer
    std.ts
    ref.ts
    fs.ts
    gfx.ts
    input.ts
  /web
    template.html
    app.ts              # boots studio
  /build
    generate-seed.ts
    build-single-html.ts
/tests
  ...lots...
/e2e
  ...playwright specs...

3. The critical piece: main-thread cooperative execution

With no Worker, the only real way to prevent UI lock-ups is to ensure the language runtime never runs “unbounded” in one JS turn.

3.1 Execution strategy: trampoline + step budget

We will execute compiled programs using a trampoline:

  • The compiler emits code where evaluation proceeds via thunks (zero-arg functions).
  • Each thunk represents “one small step”.
  • The trampoline runs up to budget steps, then yields to the event loop, then continues.

Core types:

type Value = unknown;

type Done = { done: true; value: Value };
type Thunk = () => Step;
type Step = Thunk | Done;

Trampoline slice:

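// Result of one slice: either the program finished, or here is where to resume.
type SliceResult = { done: true; value: Value } | { done: false; next: Thunk };

function runSlice(step: Thunk, budget: number): SliceResult {
  let current: Thunk = step;
  for (let i = 0; i < budget; i++) {
    const s = current(); // run exactly one small step
    if (typeof s === "function") {
      current = s; // more work remains
    } else {
      return { done: true, value: s.value };
    }
  }
  return { done: false, next: current }; // budget exhausted; caller yields, then resumes
}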

3.2 Yield strategy

We need rendering opportunities, not just microtasks. So the yield strategy should schedule a macrotask or a frame:

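  • For “one-shot” runs (press Run / auto-run compile success): yield via setTimeout(0) to let the browser paint. Browsers clamp deeply nested timers to roughly 4 ms, so the slice budget should be large enough that a long run does not spend most of its time waiting.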
  • For animation (gfx/on-frame): run the callback in a rAF tick and cap steps per frame. If it doesn’t finish inside the per-frame budget, we carry it over to the next frame (and optionally warn).

3.3 Why this implies a compiler constraint

To make step budgeting meaningful, the generated JS must avoid:

  • raw while(true) / for(;;) loops
  • deep JS recursion
  • calling user-defined functions through JS array methods that hide control flow

So in Phase 2, we enforce:

  • the language does not have built-in loop constructs yet (animation uses gfx/on-frame)
  • function calls are trampolined (no JS stack growth)

This is the main technical work in Phase 2’s language/runtime area: adjusting codegen so it can resume execution after a yield.
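
To illustrate what "trampolined function calls" means for codegen, here is a hand-written stand-in for emitter output, assuming a tail-recursive source function. It only covers tail calls; a non-tail call would need an explicit continuation, which this sketch leaves out. sumTo and runToCompletion are illustrative names.

```typescript
type Value = unknown;
type Done = { done: true; value: Value };
type Thunk = () => Step;
type Step = Thunk | Done;

const done = (value: Value): Done => ({ done: true, value });

// Source-level idea:
//   (let sum-to (fn (n acc) (if (= n 0) acc (sum-to (- n 1) (+ acc n)))))
// Each call returns a thunk instead of calling itself, so the JS stack stays flat.
function sumTo(n: number, acc: number): Step {
  if (n === 0) return done(acc);
  return () => sumTo(n - 1, acc + n);
}

// Unbudgeted driver, only to show that recursion depth no longer matters.
function runToCompletion(start: Thunk): Value {
  let step: Step = start;
  while (typeof step === "function") step = step();
  return step.value;
}

// 100000 nested calls would risk a stack overflow as direct recursion.
const total = runToCompletion(() => sumTo(100000, 0));
// total is 5000050000
```

Because every call boundary returns to the driver, the driver is also the natural place to count steps and decide when to yield.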

3.4 Cancellation

A “Stop” button should:

  • mark the current job as cancelled
  • prevent further slices from scheduling
  • clear animation loop callback(s)

Because we’re stepping, we can check job.cancelled at slice boundaries and optionally every N steps.
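
A sketch of how budgeted slices, yielding, and cancellation might fit together. The schedule callback is injected so tests can drive slices by hand; in the browser it could be (cb) => setTimeout(cb, 0). startJob and the Job shape are assumptions, not the plan's final API.

```typescript
type Value = unknown;
type Thunk = () => Step;
type Step = Thunk | { done: true; value: Value };

type Job = { cancelled: boolean };

function startJob(
  entry: Thunk,
  budget: number,
  schedule: (cb: () => void) => void,
  onDone: (value: Value) => void,
): Job {
  const job: Job = { cancelled: false };
  let current: Thunk = entry;
  const slice = (): void => {
    if (job.cancelled) return; // checked at every slice boundary
    for (let i = 0; i < budget; i++) {
      const s = current();
      if (typeof s !== "function") {
        onDone(s.value);
        return;
      }
      current = s;
    }
    schedule(slice); // budget exhausted: yield, then continue
  };
  schedule(slice);
  return job;
}

// Manual queue standing in for the event loop.
const queue: Array<() => void> = [];
const enqueue = (cb: () => void): void => { queue.push(cb); };

const countdown = (n: number): Step =>
  n === 0 ? { done: true, value: "finished" } : () => countdown(n - 1);

let jobResult: Value = "pending";
startJob(() => countdown(10), 4, enqueue, (v) => { jobResult = v; });
queue.shift()!(); // first slice: 4 steps, not finished yet
const afterFirstSlice = jobResult;
while (queue.length > 0) queue.shift()!(); // drain remaining slices
```

Stop then amounts to setting job.cancelled = true: the next scheduled slice returns immediately and schedules nothing further.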

3.5 Runtime errors

Runner wraps each slice in try/catch:

  • On exception, stop the job and report a runtime diagnostic in the diagnostics panel.
  • Keep the UI alive.

4. Virtual filesystem design

4.1 Requirements

  • Two top-level trees:

    • /system/** — seeded, read-only
    • /workspace/** — user-editable, persisted
  • Deterministic directory listings

  • Strict path normalization (no .. escape)
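
A possible shape for the path rules, assuming that ".." is resolved inside the root and rejected only when it would climb above "/". normalizePath and isWritable are illustrative names.

```typescript
// Returns an absolute, normalized path, or null when the path would escape the root.
function normalizePath(input: string): string | null {
  const out: string[] = [];
  for (const part of input.split("/")) {
    if (part === "" || part === ".") continue; // collapse "//" and "./"
    if (part === "..") {
      if (out.length === 0) return null; // would climb above "/"
      out.pop();
      continue;
    }
    out.push(part);
  }
  return "/" + out.join("/");
}

// Write checks run on the normalized path, never on the raw input.
function isWritable(path: string): boolean {
  const p = normalizePath(path);
  return p !== null && (p === "/workspace" || p.startsWith("/workspace/"));
}
```

Checking permissions after normalization matters: "/workspace/../system/x" starts with "/workspace/" as raw text but resolves to "/system/x", which must stay read-only.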

4.2 Data model

Use an overlay filesystem:

  • SeedFS: Map<string, string> (path → file text)
  • UserFS: Map<string, string> + directory index
  • OverlayFS: resolves reads from UserFS first, then SeedFS

Operations:

interface VFS {
  readText(path: string): string | null;
  writeText(path: string, text: string): void;   // workspace only
  listDir(path: string): DirEntry[];
  stat(path: string): Stat | null;
  mkdir(path: string): void;
  remove(path: string): void;
  rename(from: string, to: string): void;
}
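
A reduced sketch of the overlay idea covering only readText, writeText, and listDir. It assumes paths are already normalized and returns plain names from listDir instead of the DirEntry type, which the plan leaves undefined.

```typescript
class OverlayFS {
  private seed: ReadonlyMap<string, string>;
  private user = new Map<string, string>();

  constructor(seed: ReadonlyMap<string, string>) {
    this.seed = seed;
  }

  // UserFS wins over SeedFS; ?? keeps empty files ("" is a valid file body).
  readText(path: string): string | null {
    return this.user.get(path) ?? this.seed.get(path) ?? null;
  }

  writeText(path: string, text: string): void {
    if (!path.startsWith("/workspace/")) throw new Error(`read-only path: ${path}`);
    this.user.set(path, text);
  }

  // Sorted so listings are deterministic regardless of insertion order.
  listDir(dir: string): string[] {
    const prefix = dir.endsWith("/") ? dir : dir + "/";
    const names = new Set<string>();
    const all = Array.from(this.seed.keys()).concat(Array.from(this.user.keys()));
    for (const p of all) {
      if (p.startsWith(prefix)) names.add(p.slice(prefix.length).split("/")[0]!);
    }
    return Array.from(names).sort();
  }
}

const demoFs = new OverlayFS(
  new Map([
    ["/system/src/lang/lexer.ts", "// lexer"],
    ["/system/src/app.ts", "// app"],
  ]),
);
demoFs.writeText("/workspace/main.tw", '(print "hi")');
```

Directories are implied by file paths here; a real implementation needs an explicit directory index to support mkdir on empty directories.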

4.3 Persistence

Phase 2 baseline: persist UserFS to localStorage as JSON.

  • Pros: simple, no async, small code
  • Cons: size limit (but acceptable for this project’s goals)

We keep the persistence logic behind an interface so IndexedDB can replace it later without rewriting VFS.
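
A sketch of that persistence seam. KeyValueStore is an assumed interface covering the two Web Storage methods needed (window.localStorage provides both), and the storage key is a made-up name. Corrupted data falls back to an empty workspace instead of throwing.

```typescript
// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "twig.workspace.v1"; // hypothetical key name

function saveWorkspace(store: KeyValueStore, files: Map<string, string>): void {
  store.setItem(STORAGE_KEY, JSON.stringify(Object.fromEntries(files)));
}

// Returns an empty workspace when the stored JSON is missing or corrupted.
function loadWorkspace(store: KeyValueStore): Map<string, string> {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return new Map();
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return new Map();
    }
    const files = new Map<string, string>();
    for (const [path, text] of Object.entries(parsed)) {
      if (typeof text === "string") files.set(path, text); // drop malformed entries
    }
    return files;
  } catch {
    return new Map();
  }
}
```

Tests can pass a Map-backed fake for KeyValueStore, so the serialize/deserialize and recovery paths run under Bun without a browser.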

4.4 Seeding with “its own source code”

Build step generates a module containing a manifest:

export const SEED_FILES: Record<string, string> = {
  "/system/src/runtime/runtime.ts": "....",
  ...
};

This makes the environment “self-inspecting” offline.


5. Studio UI (Explorer + Editor + Output + Canvas)

5.1 Layout

A simple three-pane layout:

  • Left: file explorer
  • Center: editor
  • Right: tabs or stacked panels: Output / Diagnostics / Canvas

Top toolbar:

  • Entry file (default /workspace/main.tw)
  • Run
  • Auto-run toggle
  • Stop
  • Reset (clears stdout + gfx state, restarts runtime without changing editor buffer)

5.2 Editor MVP

Use <textarea> (monospace) for Phase 2:

  • loads file content
  • Ctrl/Cmd+S saves to VFS
  • optionally show line numbers (not required, but helpful)

This keeps the system dependency-free and readable.

5.3 File explorer MVP

  • Expand/collapse directories
  • Click file to open
  • Create/rename/delete for /workspace/** only
  • System tree is read-only; opening works; writing triggers “Copy to workspace” flow

6. Live compilation loop

6.1 Triggers

  • On edit:

    • mark dirty
    • if Auto-run: debounce (e.g. 300ms), then compile + run
  • On Run:

    • save (optional policy: either auto-save current buffer or run buffer directly; choose one and test it)
    • compile + run
  • On Stop:

    • cancel current run job
  • On Reset:

    • clear runtime state + restart animation loop if program registers it again
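
The debounce in the edit trigger could be sketched as below. The timer functions are injected (an assumption made for testability) so unit tests can fire timers by hand; in the browser they would wrap setTimeout and clearTimeout.

```typescript
type TimerApi = {
  set: (cb: () => void, ms: number) => number;
  clear: (id: number) => void;
};

function debounce(
  fn: () => void,
  ms: number,
  timers: TimerApi,
): { call: () => void; cancel: () => void } {
  let id: number | null = null;
  const cancel = (): void => {
    if (id !== null) {
      timers.clear(id);
      id = null;
    }
  };
  const call = (): void => {
    cancel(); // every new edit restarts the wait
    id = timers.set(() => {
      id = null;
      fn();
    }, ms);
  };
  return { call, cancel };
}

// Fake timers: callbacks are stored and fired manually.
const pendingTimers = new Map<number, () => void>();
let nextTimerId = 1;
const fakeTimers: TimerApi = {
  set: (cb) => {
    const id = nextTimerId++;
    pendingTimers.set(id, cb);
    return id;
  },
  clear: (id) => { pendingTimers.delete(id); },
};

let compileRuns = 0;
const onEdit = debounce(() => { compileRuns++; }, 300, fakeTimers);
onEdit.call();
onEdit.call();
onEdit.call(); // three quick edits leave a single pending timer
```

The cancel handle is useful for the Stop and Reset triggers, which should also drop any compile that is still waiting on the debounce.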

6.2 Compile service

Phase 2 compile reads from VFS:

type CompileResult =
  | { ok: true; program: CompiledProgram; diagnostics: Diagnostic[] }
  | { ok: false; diagnostics: Diagnostic[] };

declare function compileFromVfs(entryPath: string, vfs: VFS): CompileResult;

CompiledProgram is the thing the runner executes (e.g., a top-level Thunk, plus metadata for sourcemaps/diagnostics).

6.3 Diagnostics UX

Show:

  • file path
  • line:col
  • excerpt with caret underline
  • diagnostic kind: parse / compile / runtime
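
A sketch of rendering one diagnostic with its excerpt and caret underline. The Located shape (1-based line/col plus a length) and the header wording are assumptions; tabs and multi-line spans are not handled.

```typescript
// Assumed input: a diagnostic already resolved to 1-based line/col.
type Located = {
  path: string;
  line: number;
  col: number;
  length: number; // number of characters to underline
  kind: string; // parse / compile / runtime
  message: string;
};

function renderDiagnostic(source: string, d: Located): string {
  const lineText = source.split("\n")[d.line - 1] ?? "";
  const underline = " ".repeat(d.col - 1) + "^".repeat(Math.max(1, d.length));
  return [
    `${d.path}:${d.line}:${d.col} ${d.kind} error: ${d.message}`,
    lineText,
    underline,
  ].join("\n");
}

const demoRendered = renderDiagnostic('(print "hi")\n(print (+ 1 2)', {
  path: "/workspace/main.tw",
  line: 2,
  col: 8,
  length: 7,
  kind: "parse",
  message: "missing )",
});
```

Keeping the renderer a pure function of source text and diagnostic means the diagnostics panel can be snapshot-tested without a DOM.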

7. Graphics environment design (gfx/*)

Even on main thread, we keep gfx as a disciplined API rather than letting programs poke canvas directly.

7.1 Canvas host

  • One <canvas> element
  • Resizable via gfx/size

7.2 Command buffer

gfx/* functions enqueue drawing commands into a buffer:

type GfxCommand =
  | { op: "size"; w: number; h: number }
  | { op: "clear"; r: number; g: number; b: number; a: number }
  | { op: "fill"; r: number; g: number; b: number; a: number }
  | { op: "rect"; x: number; y: number; w: number; h: number }
  | ...

The runner flushes:

  • after each run slice yield (optional)
  • and at least once per rAF frame if an animation is active

This keeps rendering consistent and testable.
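
A sketch of the buffer-and-flush idea using only the four commands listed above. Surface is a hypothetical adapter interface that a thin wrapper around the canvas 2D context could implement; colour channels are assumed to be r/g/b in 0..255 and a in 0..1.

```typescript
type Rgba = { r: number; g: number; b: number; a: number };
type GfxCommand =
  | ({ op: "size" } & { w: number; h: number })
  | ({ op: "clear" } & Rgba)
  | ({ op: "fill" } & Rgba)
  | ({ op: "rect" } & { x: number; y: number; w: number; h: number });

// Hypothetical drawing surface; tests supply a recording fake.
interface Surface {
  resize(w: number, h: number): void;
  setFill(cssColor: string): void;
  fillRect(x: number, y: number, w: number, h: number): void;
}

const toCss = (c: Rgba): string => `rgba(${c.r}, ${c.g}, ${c.b}, ${c.a})`;

class GfxBuffer {
  private queue: GfxCommand[] = [];
  private fill = "rgba(0, 0, 0, 1)"; // current fill persists across flushes
  private width = 0;
  private height = 0;

  push(cmd: GfxCommand): void {
    this.queue.push(cmd);
  }

  // Applies all queued commands in order and returns how many were applied.
  flush(surface: Surface): number {
    const cmds = this.queue;
    this.queue = [];
    for (const cmd of cmds) {
      switch (cmd.op) {
        case "size":
          this.width = cmd.w;
          this.height = cmd.h;
          surface.resize(cmd.w, cmd.h);
          break;
        case "fill":
          this.fill = toCss(cmd);
          break;
        case "clear":
          surface.setFill(toCss(cmd));
          surface.fillRect(0, 0, this.width, this.height);
          break;
        case "rect":
          surface.setFill(this.fill);
          surface.fillRect(cmd.x, cmd.y, cmd.w, cmd.h);
          break;
      }
    }
    return cmds.length;
  }
}
```

Since user programs only ever enqueue plain data, unit tests can assert on the exact command sequence, and the canvas itself is touched in one place.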

7.3 Animation: gfx/on-frame

Design:

  • gfx/on-frame registers a callback (fn (t) ...)

  • Studio has a rAF loop:

    • each frame calls the callback with t (ms since start)
    • callback executes with a per-frame step budget
    • flush gfx buffer once per frame

If the callback does not complete within budget:

  • we pause it and resume next frame
  • (optional) show “frame budget exceeded” warning in diagnostics

7.4 Input: input/*

We wire DOM events and maintain state:

  • mouse x/y
  • mouse down
  • set of keys down

Stdlib accessors:

  • input/mouse-x, input/mouse-y
  • input/mouse-down?
  • input/key-down? "ArrowLeft"

8. Runtime stdlib extensions for Phase 2

In addition to Phase 1 core stdlib:

8.1 ref/* (mutable cells for interactive programs)

  • ref/new x
  • ref/get r
  • ref/set r x

This avoids adding assignment syntax yet still allows stateful sketches and, later, games.

8.2 fs/* (controlled access to VFS)

We do not expose the real browser FS. This is still “no FFI”.

  • fs/read-text "/workspace/foo.tw" → string or nil
  • fs/write-text "/workspace/foo.tw" "..." → ok/nil
  • fs/list "/workspace" → list of entries

Enforce workspace-only writes.


9. Build pipeline: still emits one dist/index.html

9.1 Inputs that must be embedded

  • Studio app bundle (JS + CSS)
  • Seed file manifest (source code text)
  • Any default workspace files (like /workspace/main.tw sample)

9.2 Bun build approach

Use Bun.build() to bundle the browser app. (Bun) Bundler outputs are BuildArtifact objects (Blob-like), so we can read them and inline them into HTML. (Bun)

Key build config choices:

  • splitting: false (single output) (Bun)
  • minify: false (keeps the shipped bundle readable, so what users inspect stays close to the documented source)
  • sourcemap: "inline" in dev build (optional; controlled by env)

9.3 Single-file HTML emitter

build-single-html.ts:

  1. Read template.html
  2. Run Bun.build({ entrypoints: ["src/web/app.ts"], ... })
  3. Inline the JS bundle into <script type="module">...</script>
  4. Inline CSS similarly (or keep CSS inside JS if bundler does that)
  5. Write dist/index.html

10. Tooling and quality gates (Bun + oxlint + oxfmt)

10.1 Bun as the default toolchain

  • Tests: Bun’s test runner supports TypeScript and snapshot testing, plus UI/DOM testing. (Bun)
  • Watch mode available (bun test --watch) per Bun test docs. (Bun)

10.2 Linting: oxlint

  • oxlint --fix for safe automatic fixes. (Oxc)

10.3 Formatting: oxfmt

  • oxfmt --check for CI validation. (Oxc)

Suggested scripts:

  • lint: oxlint
  • lint:fix: oxlint --fix
  • format: oxfmt --write .
  • format:check: oxfmt --check . (Oxc)
  • test: bun test
  • test:watch: bun test --watch (Bun)
  • build: bun run src/build/build-single-html.ts
  • e2e: playwright test

11. Testing plan (Phase 2)

11.1 Unit tests (Bun) — heavy emphasis

Bun test runner supports TS + snapshots and explicitly calls out UI/DOM testing support. (Bun)

VFS tests

  • Path normalization and traversal rules
  • Overlay precedence correctness
  • Read-only enforcement for /system/**
  • Workspace mutations: create/rename/delete, directory listing stability
  • Persistence: serialize/deserialize; corrupted storage recovery

Studio model reducer tests (pure)

  • Open file updates buffer
  • Edit marks dirty
  • Save writes to VFS and clears dirty
  • Auto-run toggle changes behavior
  • Status transitions: compiling → ok/error, running → stopped/error

Cooperative scheduler / trampoline tests (core!)

  • Budget slicing: run a program that requires many steps and assert it completes across multiple slices
  • Yield behavior: with a tiny budget, assert the scheduler yields and resumes
  • Cancellation: start a long job, cancel, ensure it stops scheduling further slices
  • No stack blowup: recursive function compiled into thunks should not overflow JS stack

Runtime/std/gfx tests

  • print buffering
  • ref/* correctness
  • gfx command queue correctness
  • gfx/on-frame registration semantics (single callback vs multiple; define and test)

11.2 DOM/UI tests under Bun

Bun’s test runner has no DOM of its own, but it can run DOM tests once a DOM implementation such as happy-dom is registered in a preload script. With that in place, we can run lightweight DOM tests for:

  • file explorer renders expected nodes
  • clicking a file opens it
  • Ctrl/Cmd+S triggers save
  • diagnostics panel shows errors

(Keep these tests narrow: wiring + rendering correctness, not “full browser fidelity”.)

11.3 E2E tests (Playwright)

Playwright’s standard workflow is playwright test, with --ui for interactive debugging. (Playwright)

E2E scenarios for Phase 2:

  1. Boot smoke

    • open served dist/index.html
    • assert explorer + editor + output + canvas exist
  2. Edit → live recompile → stdout updates

    • open /workspace/main.tw
    • type (print "hello")
    • assert output contains hello
  3. Compile error surfaced

    • introduce parse error (missing ))
    • assert diagnostics panel shows message + line/col
  4. Canvas deterministic draw

    • program draws a solid rect at known coordinates
    • sample canvas pixel at center; assert RGBA
  5. Stop button works

    • run a program that registers gfx/on-frame and increments a counter in stdout each frame
    • press Stop
    • assert counter stops increasing

Note on Bun runtime mode for Playwright

There are known issues reported when forcing Playwright to run under Bun’s runtime flag (e.g., hangs/segfaults). We should run Playwright in its stable default mode (Node) while still using Bun for installs/scripts. (GitHub)


12. Implementation sequence (Phase 2 milestones)

Milestone 2.1 — VFS + seed generation

  • Implement path normalization utilities + tests
  • Implement SeedFS + UserFS + OverlayFS + tests
  • Implement generate-seed.ts build step
  • Minimal explorer UI listing seeded files (read-only)
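
As a starting point for the path normalization utilities, here is a minimal sketch. The rules it applies are assumptions rather than decisions from this plan: all VFS paths are treated as absolute, "." and empty segments are dropped, and ".." never climbs above the root. `normalizePath` is a hypothetical name.

```typescript
// Normalizes a VFS path to a canonical absolute form, e.g. "/workspace/main.tw".
function normalizePath(path: string): string {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue; // skip "//" and "./"
    if (part === "..") {
      out.pop(); // popping an empty array is a no-op, so the root is a floor
      continue;
    }
    out.push(part);
  }
  return "/" + out.join("/");
}
```

Keeping this a pure string function makes it easy to cover with table-style unit tests before any filesystem layer exists.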

Milestone 2.2 — Editor MVP + open/save

  • Textarea editor
  • Open file from explorer
  • Save to /workspace + persistence
  • Add DOM tests for open/save wiring

Milestone 2.3 — CompileService from VFS + diagnostics panel

  • Compile entry file read from VFS
  • Display diagnostics with excerpts
  • Auto-run debounce plumbing

Milestone 2.4 — Main-thread RunService with cooperative scheduling

  • Implement trampoline engine + scheduler + cancellation
  • Adjust compiler backend (or codegen mode) so execution is thunked/stepwise
  • Run button executes program without freezing UI

Milestone 2.5 — Graphics + input + on-frame loop

  • Canvas panel
  • gfx command queue + flush
  • gfx/on-frame + rAF integration with per-frame budget
  • Input state from DOM

Milestone 2.6 — Single-file build hardening

  • Ensure dist/index.html contains everything
  • Ensure it runs offline (no network requests)
  • Add a build “smoke test” (optional): parse output and check required markers exist

Milestone 2.7 — Expand Playwright coverage

  • Add the five e2e tests above
  • CI-friendly headless mode
  • Add --ui debugging instructions for local dev

13. Risks and mitigations (no Worker edition)

Risk: You can’t truly preempt main-thread code

Mitigation:

  • Ensure language execution is always under trampoline control.
  • Avoid emitting JS constructs that can run unbounded without returning control.

Risk: Compilation itself can block the UI

Mitigation:

  • Debounce compile.
  • Keep language small.
  • If it becomes an issue later: split compile work into chunks and yield between pipeline stages (rarely needed for small projects).

Risk: Users can write “forever” programs

Mitigation:

  • Distinguish “top-level run must finish” vs “on-frame is allowed to be endless”.
  • Provide Stop button and a clear “running” indicator.

Risk: Single-file seed manifest becomes huge

Mitigation:

  • Only seed what’s needed for exploration:

    • /src/**, /docs/** (optional), maybe omit /tests/** in the shipped index, or provide a build flag to include/exclude.
  • Keep a manifest size report in build output so it’s visible.


14. Definition of Done (Phase 2)

Phase 2 is done when:

  1. dist/index.html (single file) opens offline and shows:

    • explorer + editor + stdout + diagnostics + canvas
  2. Editing /workspace/main.tw:

    • auto-compiles (when enabled)
    • runs on success (stdout/canvas updates)
    • shows diagnostics on error
  3. Graphics demo runs via gfx/on-frame without freezing UI.

  4. Stop works (cancels animation).

  5. Tests:

    • substantial Bun unit tests across VFS/model/scheduler/runtime/gfx
    • Playwright e2e covers at least the five workflows above using playwright test (Playwright)
    • oxlint and oxfmt checks are wired (oxlint --fix available; oxfmt --check in CI). (Oxc)

If you want, when we move to Phase 3, I’ll structure it around: (1) two small games built in the language using gfx/* + input/* + ref/*, (2) better in-app documentation tour (openable files + “start here”), and (3) tightening the runtime/compiler ergonomics (errors, performance, UX polish) while keeping the single-file constraint.

Engineering Plan — Phase 3

Two little games + “documentation as code” structure (clear, navigable, runnable)

Phase 3 is intentionally narrow:

  1. Ship two tiny games written in the new language (using gfx/*, input/*, ref/*, plus a couple small helper libs).
  2. Make the repo and the in-browser filesystem feel like a self-teaching system: the code is the documentation, and the docs are runnable.

Everything still compiles into one offline index.html and is browsable/editable via the studio.


1. Deliverables at the end of Phase 3

1. Two games (in-language)

  • Pong (classic paddles + ball + score)
  • Snake (grid snake + food + score)

Both:

  • run from the studio with a single click (or “Run current file”)

  • have a short How to play header comment inside the .tw file

  • are structured as a readable reference implementation for:

    • gfx/on-frame
    • input handling
    • state management with ref
    • deterministic update loops (fixed timestep or frame-delta)

2. Documentation-as-code structure

  • A numbered, curated docs directory that reads top-to-bottom
  • A set of runnable examples (small .tw programs) that the docs reference
  • A source tour that points you to the most important TypeScript files (compiler/runtime/studio) and tells you what to notice
  • A “Docs don’t rot” test suite: examples compile and (where feasible) run under Bun tests

3. Testing

  • Unit tests (Bun) for:

    • shared game math / collision helpers
    • snake movement + food placement determinism
    • pong collision + scoring invariants
  • Integration tests that compile & run game modules in a headless runtime with stubbed gfx/input

  • Playwright e2e flows:

    • launch each game
    • simulate a few inputs
    • assert visible scoreboard changes / canvas pixel changes
    • assert Stop works and UI stays responsive

2. Game selection rationale (why these two)

Pong

  • Minimal asset needs: rectangles + text.
  • Demonstrates continuous motion + collision response.
  • Great showcase for gfx/on-frame and per-frame dt.

Snake

  • Discrete grid logic + deterministic random placement (food).
  • Demonstrates “game state as data” and list operations.
  • Great showcase for fixed-timestep updates and using host-seeded RNG deterministically in tests.

These two together prove the platform is useful without needing fancy language features.


3. Required minimal language/runtime helpers (only what games need)

To keep game code readable (and documentation-grade), Phase 3 should add/standardize a couple small helpers. Keep these tiny and well-tested.

3.1 Minimal “record” helpers (for readable state)

If game state is only nested lists, it becomes hard to read. A tiny record/dict API makes code dramatically clearer.

Proposed stdlib additions (backed by plain JS objects internally):

  • rec/new k1 v1 k2 v2 ... → record/object
  • rec/get r k → value or nil
  • rec/set r k v → record (mutable or persistent; pick one and document)
  • rec/has? r k

This is still “no FFI” because the only way to interact is through stdlib.

Testing requirement: strict unit tests around behavior, including missing keys and key type constraints (strings only).
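
A minimal sketch of how the host side of these helpers might look. It assumes the persistent option (rec/set returns a copy), maps nil to `null`, and uses prototype-less objects so that keys like "toString" behave as missing. The function names `recNew`, `recGet`, `recSet`, and `recHas` are hypothetical.

```typescript
type Rec = { readonly [key: string]: unknown };

// Enforces the "strings only" key constraint.
function assertKey(k: unknown): string {
  if (typeof k !== "string") throw new TypeError("record keys must be strings");
  return k;
}

// rec/new k1 v1 k2 v2 ...
function recNew(...kv: unknown[]): Rec {
  if (kv.length % 2 !== 0) throw new TypeError("rec/new expects key/value pairs");
  const out: { [key: string]: unknown } = Object.create(null);
  for (let i = 0; i < kv.length; i += 2) {
    out[assertKey(kv[i])] = kv[i + 1];
  }
  return out;
}

// rec/has? r k
function recHas(r: Rec, k: unknown): boolean {
  return Object.prototype.hasOwnProperty.call(r, assertKey(k));
}

// rec/get r k: value, or null (nil) when the key is missing
function recGet(r: Rec, k: unknown): unknown {
  return recHas(r, k) ? r[assertKey(k)] : null;
}

// rec/set r k v: returns a new record and leaves `r` untouched (assumed semantics)
function recSet(r: Rec, k: unknown, v: unknown): Rec {
  const out: { [key: string]: unknown } = Object.assign(Object.create(null), r);
  out[assertKey(k)] = v;
  return out;
}
```

If the mutable option is chosen instead, only `recSet` changes; the key checks and missing-key behavior stay the same.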

3.2 Math helpers

  • math/abs, math/min, math/max
  • math/clamp x lo hi
  • math/floor (for Snake grid)
  • math/rand-int n (uses host random; returns integer in [0, n))

3.3 List helpers (just enough for Snake)

Prefer keeping this minimal and orthogonal:

  • list/cons x xs (prepend)
  • list/head xs
  • list/tail xs
  • list/slice xs start end (or list/take xs n)
  • list/contains? xs x (requires equality semantics; for Snake positions we can store as strings "x,y" to simplify)

If you want to keep stdlib smaller, we can implement these in a shared game library file instead of stdlib—but then we want a way to reuse them cleanly (see includes below).

3.4 Minimal “include” mechanism (recommended for docs-quality)

To make the games readable, we want shared helpers in one place.

Add a compile-time special form:

  • (include "path") where path is a string literal
  • The compiler resolves relative to the including file, reads from VFS, parses, and inlines the AST.
  • No dynamic imports, no runtime loading, no cycles (detect and error).
  • This is not an FFI; it’s a compile-time convenience.

If you absolutely don’t want include yet, keep each game as a single file and accept some duplication. For “documentation as code,” I strongly recommend include.
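
To make the resolution and cycle rules concrete, here is a minimal sketch. It is a simplification: it expands includes on source text, one `(include "...")` form per line, whereas the plan calls for inlining parsed ASTs. The VFS is modeled as a plain `Map` from path to source, and `resolveFrom` and `expandIncludes` are hypothetical names.

```typescript
type Vfs = ReadonlyMap<string, string>;

function dirname(path: string): string {
  const i = path.lastIndexOf("/");
  return i <= 0 ? "/" : path.slice(0, i);
}

// Resolves `rel` against the directory of the including file.
function resolveFrom(includingFile: string, rel: string): string {
  const joined = rel.startsWith("/") ? rel : dirname(includingFile) + "/" + rel;
  const out: string[] = [];
  for (const part of joined.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// Matches a line consisting only of (include "path") with a string literal.
const INCLUDE = /^\s*\(include\s+"([^"]+)"\)\s*$/;

// `stack` holds the chain of files currently being expanded; revisiting one is a cycle.
function expandIncludes(vfs: Vfs, path: string, stack: string[] = []): string {
  if (stack.includes(path)) {
    throw new Error(`include cycle: ${[...stack, path].join(" -> ")}`);
  }
  const source = vfs.get(path);
  if (source === undefined) throw new Error(`include not found: ${path}`);
  return source
    .split("\n")
    .map((line) => {
      const target = INCLUDE.exec(line)?.[1];
      return target === undefined
        ? line
        : expandIncludes(vfs, resolveFrom(path, target), [...stack, path]);
    })
    .join("\n");
}
```

Because the stack tracks only the active chain, the same helper file can be included from two different places without being reported as a cycle.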


4. Game architecture pattern (shared by Pong and Snake)

Make both games follow the same structure so they teach the system consistently.

4.1 The “three-function” pattern

Each game file exposes:

  • init() -> State
  • update(state, input, dtMs) -> State
  • render(state) -> nil (emits gfx commands)

Then the program wires it up:

  • stateRef = ref/new(init())

  • lastTRef = ref/new(0)

  • (gfx/on-frame (fn (t) ...)):

    • compute dt = clamp(t - lastT, 0, 33) (or similar), then set lastT := t
    • snapshot input (read input/* once per frame)
    • state := update(state, input, dt)
    • render(state)
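
The wiring above can be modeled on the host side to show the data flow. This is a sketch in TypeScript rather than in the language itself; `Game` and `makeFrameHandler` are hypothetical names, and plain closure variables stand in for stateRef and lastTRef.

```typescript
interface Game<S, I> {
  init(): S;
  update(state: S, input: I, dtMs: number): S;
  render(state: S): void;
}

const clamp = (x: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, x));

// Returns the callback that would be registered with gfx/on-frame.
function makeFrameHandler<S, I>(
  game: Game<S, I>,
  readInput: () => I,
  maxDtMs: number = 33,
): (t: number) => S {
  let state = game.init(); // stateRef
  let lastT = 0; // lastTRef, starting at 0 as in the outline above
  return (t: number): S => {
    const dt = clamp(t - lastT, 0, maxDtMs); // a long stall never produces a huge step
    lastT = t;
    state = game.update(state, readInput(), dt); // input is read once per frame
    game.render(state);
    return state;
  };
}
```

Tests can call the returned handler with hand-picked timestamps instead of waiting for real frames.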

4.2 Input snapshot shape

Do not sprinkle input/key-down? throughout update; keep it centralized.

Example record:

  • input = rec/new "up" (input/key-down? "ArrowUp") "down" (input/key-down? "ArrowDown") "left" (input/key-down? "ArrowLeft") "right" (input/key-down? "ArrowRight") "w" (input/key-down? "w") "s" (input/key-down? "s")

That input object is easy to stub in tests.
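
A sketch of what that stubbing could look like in a host-side test helper. `snapshotInput` and `stubKeys` are hypothetical names; the key names are the ones from the example record above.

```typescript
type KeyDown = (key: string) => boolean;

interface InputSnapshot {
  up: boolean;
  down: boolean;
  left: boolean;
  right: boolean;
  w: boolean;
  s: boolean;
}

// Reads every key exactly once so update() sees one consistent view per frame.
function snapshotInput(keyDown: KeyDown): InputSnapshot {
  return {
    up: keyDown("ArrowUp"),
    down: keyDown("ArrowDown"),
    left: keyDown("ArrowLeft"),
    right: keyDown("ArrowRight"),
    w: keyDown("w"),
    s: keyDown("s"),
  };
}

// Test stub: pretends exactly the listed keys are held down.
function stubKeys(...pressed: string[]): KeyDown {
  const held = new Set(pressed);
  return (key: string): boolean => held.has(key);
}
```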

4.3 Rendering convention

  • Always clear the screen (gfx/clear) each frame.
  • Draw UI text (score, instructions) as a consistent overlay.
  • Keep rendering side-effect-only; state changes happen in update.

4.4 Determinism conventions (for tests)

  • Snake’s food placement must use math/rand-int and the host RNG should be seedable in Bun tests.
  • Pong should not use randomness.
  • Both games should avoid relying on “real time” except through dtMs provided by on-frame, which tests can simulate.
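
One way to make the host RNG seedable is to inject the random source behind math/rand-int. The sketch below uses a mulberry32-style generator, a small well-known PRNG chosen here for illustration; the plan does not name an algorithm. `mulberry32` and `makeRandInt` are hypothetical helper names.

```typescript
// Small deterministic PRNG: same seed, same sequence of floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return (): number => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds the host function behind math/rand-int: an integer in [0, n).
function makeRandInt(random: () => number): (n: number) => number {
  return (n: number): number => Math.floor(random() * n);
}

// The studio would pass Math.random; tests pass a seeded generator instead.
const productionRandInt = makeRandInt(Math.random);
const testRandInt = makeRandInt(mulberry32(42));
```

With this shape, a Snake test can replay the same food positions on every run without touching game code.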

5. Pong design and implementation details

5.1 Gameplay

  • Two paddles: left controlled by W/S, right by Up/Down.
  • Ball bounces off paddles and top/bottom walls.
  • If ball passes a side boundary, opposite player scores, ball resets to center.

5.2 State representation (record)

Example keys:

  • "w", "h": canvas size
  • "paddleH", "paddleW"
  • "p1y", "p2y": paddle y positions
  • "ballX", "ballY", "ballVx", "ballVy"
  • "score1", "score2"
  • "serving": 1 or 2 (optional)

5.3 Update logic (continuous)

  • Paddle movement:

    • p1y += speed * dt * (w? - s?)
    • clamp to [0, h - paddleH]
  • Ball movement:

    • ballX += ballVx * dt
    • ballY += ballVy * dt
  • Wall collision:

    • if ballY < 0 or ballY > h: invert ballVy, clamp inside bounds
  • Paddle collision:

    • treat paddles as AABBs; if intersect and ball moving toward paddle:

      • invert ballVx

      • optionally add “english” based on impact point:

        • ballVy += (ballY - paddleCenterY) * factor
      • clamp ballVy to avoid runaway speeds

  • Score:

    • if ballX < 0: score2++, reset ball
    • if ballX > w: score1++, reset ball
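
The paddle-collision rule is the part most worth pinning down in tests. A minimal TypeScript sketch follows, with several assumptions: the ball is a square with a `size` field (not in the state keys above), velocities are in pixels per millisecond, and the `english` and `maxVy` defaults are placeholder values.

```typescript
interface Rect { x: number; y: number; w: number; h: number }
interface Ball { x: number; y: number; vx: number; vy: number; size: number }

const clamp = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

// Axis-aligned bounding boxes overlap when they overlap on both axes.
function aabbIntersect(a: Rect, b: Rect): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

// Returns the ball after a possible bounce; the same object if nothing happens.
function bounceOffPaddle(
  ball: Ball,
  paddle: Rect,
  side: "left" | "right",
  english: number = 0.01, // placeholder tuning value
  maxVy: number = 0.6, // placeholder tuning value
): Ball {
  // Only bounce when the ball travels toward the paddle, so a ball that
  // already bounced cannot be flipped again while still overlapping.
  const movingToward = side === "left" ? ball.vx < 0 : ball.vx > 0;
  const ballRect: Rect = { x: ball.x, y: ball.y, w: ball.size, h: ball.size };
  if (!movingToward || !aabbIntersect(ballRect, paddle)) return ball;

  const ballCenterY = ball.y + ball.size / 2;
  const paddleCenterY = paddle.y + paddle.h / 2;
  const vy = clamp(ball.vy + (ballCenterY - paddleCenterY) * english, -maxVy, maxVy);
  return { ...ball, vx: -ball.vx, vy };
}
```

The "moving toward" guard is what prevents the ball from jittering inside a paddle for several frames.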

5.4 Render logic

  • gfx/size w h
  • clear background
  • draw center line (optional)
  • draw paddles + ball as rectangles
  • draw score text

5.5 “Doc header” inside the file

At the top of /system/games/01-pong.tw:

  • How to play
  • What to look for (update loop, collision)
  • Pointers to shared helpers used (e.g., include "lib/math.tw", resolved relative to this file)

5.6 Pong acceptance criteria

  • Ball bounces correctly
  • Scoring works
  • UI remains responsive
  • “Stop” stops animation
  • Code is readable and heavily commented at key points (not every line)

6. Snake design and implementation details

6.1 Gameplay

  • Grid-based snake moves at fixed tick (e.g., 120ms per step).
  • Arrow keys change direction (no immediate 180° reversal).
  • Eat food → grow by 1, score increases.
  • Hit wall or self → game over; press R to restart.

6.2 Fixed timestep update (important)

Snake should not advance one cell per frame, because that would make its speed frame-rate dependent. Use an accumulator:

State keys:

  • "tickMs" (e.g., 120)
  • "accMs" accumulated ms since last move
  • "dir" one of "up"|"down"|"left"|"right"
  • "nextDir" (input wants to turn)
  • "snake" list of cells, head first (cells can be "x,y" strings for simplicity)
  • "food" cell string
  • "score"
  • "alive" boolean

Update(state, input, dt):

  • accMs += dt

  • process turn input → update nextDir

  • while accMs >= tickMs:

    • accMs -= tickMs

    • perform one grid step:

      • apply direction change rules

      • compute new head cell

      • if wall collision → alive=false

      • if self collision → alive=false

      • else:

        • prepend new head

        • if head == food:

          • score++
          • place new food using RNG avoiding snake cells
        • else:

          • remove last tail cell
  • return new state
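
The accumulator loop above, modeled in TypeScript so the rules can be checked in isolation. This is a sketch with assumptions: `cols` and `rows` are extra state keys for the grid size, input is reduced to a single requested direction, food placement is injected as a function, and stepping stops once the snake is dead.

```typescript
type Dir = "up" | "down" | "left" | "right";

interface SnakeState {
  tickMs: number;
  accMs: number;
  dir: Dir;
  nextDir: Dir;
  snake: string[]; // head first, cells encoded as "x,y"
  food: string;
  score: number;
  alive: boolean;
  cols: number; // grid size: assumed keys, not in the list above
  rows: number;
}

const DELTA: Record<Dir, { dx: number; dy: number }> = {
  up: { dx: 0, dy: -1 },
  down: { dx: 0, dy: 1 },
  left: { dx: -1, dy: 0 },
  right: { dx: 1, dy: 0 },
};
const OPPOSITE: Record<Dir, Dir> = { up: "down", down: "up", left: "right", right: "left" };

function parseCell(cell: string): { x: number; y: number } {
  const [x = 0, y = 0] = cell.split(",").map(Number);
  return { x, y };
}

// One grid step: turn rule, new head, wall check, self check, then eat or move.
function stepOnce(s: SnakeState, placeFood: (snake: string[]) => string): SnakeState {
  const dir = s.nextDir === OPPOSITE[s.dir] ? s.dir : s.nextDir; // no 180° reversal
  const head = parseCell(s.snake[0] ?? "0,0");
  const x = head.x + DELTA[dir].dx;
  const y = head.y + DELTA[dir].dy;
  const cell = `${x},${y}`;
  const hitWall = x < 0 || y < 0 || x >= s.cols || y >= s.rows;
  if (hitWall || s.snake.includes(cell)) return { ...s, dir, nextDir: dir, alive: false };
  const grown = [cell, ...s.snake];
  if (cell === s.food) {
    return { ...s, dir, nextDir: dir, snake: grown, score: s.score + 1, food: placeFood(grown) };
  }
  return { ...s, dir, nextDir: dir, snake: grown.slice(0, -1) }; // drop the tail cell
}

function update(
  s: SnakeState,
  wantedDir: Dir | null,
  dtMs: number,
  placeFood: (snake: string[]) => string,
): SnakeState {
  if (!s.alive) return s;
  let next: SnakeState = { ...s, accMs: s.accMs + dtMs, nextDir: wantedDir ?? s.nextDir };
  // A long frame can contain several ticks; a short one may contain none.
  while (next.alive && next.accMs >= next.tickMs) {
    next = stepOnce({ ...next, accMs: next.accMs - next.tickMs }, placeFood);
  }
  return next;
}
```

Because `update` takes elapsed milliseconds as a parameter, a test can feed it 100 ms, then 20 ms, and observe exactly one step.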

Food placement:

  • use math/rand-int with a bounded attempt loop:

    • randomly pick cell
    • if not in snake → use
    • else retry (cap attempts; if full grid, win state)
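
A sketch of the bounded attempt loop. The `randInt` parameter stands in for math/rand-int so tests can script it, and `null` signals the full-grid win state. The linear scan after the attempts run out is an addition not described above, included so a nearly full grid cannot be mistaken for a win.

```typescript
function placeFood(
  snake: readonly string[],
  cols: number,
  rows: number,
  randInt: (n: number) => number,
  maxAttempts: number = 64, // placeholder cap
): string | null {
  const occupied = new Set(snake);
  if (occupied.size >= cols * rows) return null; // full grid: caller treats this as a win

  for (let i = 0; i < maxAttempts; i++) {
    const cell = `${randInt(cols)},${randInt(rows)}`;
    if (!occupied.has(cell)) return cell;
  }

  // Attempts exhausted (unlucky or nearly full grid): take the first free cell.
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      const cell = `${x},${y}`;
      if (!occupied.has(cell)) return cell;
    }
  }
  return null;
}
```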

6.3 Rendering

  • Compute cell size from canvas size / grid dims.
  • Draw background.
  • Draw snake head in brighter color (or same; keep minimal).
  • Draw food.
  • Draw score + instructions (“R to restart”).

6.4 Snake acceptance criteria

  • Deterministic tick-based movement.
  • No 180° immediate reversal.
  • Eating food grows snake.
  • Game over state and restart.
  • Fully testable update logic (see testing plan).

7. How the studio exposes the games (UX + doc friendliness)

Phase 3 should make the games and docs discoverable without needing external instructions.

7.1 Default startup behavior

On first load:

  • open /system/docs/00-start-here.md in the editor

  • show a “Quick links” section at the top of that file:

    • open Pong
    • open Snake
    • open Graphics tutorial
    • where to find compiler/runtime sources

7.2 Run current file (tiny but huge UX win)

Add a toolbar button:

  • Run current file

    • compiles and runs whatever file is currently open
    • if it’s a read-only system file, it still runs (fine)
  • Keep existing “entryPath” mechanism as advanced usage, but make “Run current file” the primary flow for docs + examples + games.

7.3 “Fork to workspace” for safe editing

When the user edits a /system/** file:

  • show a non-intrusive banner:

    • “System files are read-only. [Fork to workspace]”
  • Fork action copies file to /workspace/... and opens the fork.

This makes exploration safe and encourages modification.


8. Documentation-as-code structure

This is the “clear structure” part: the filesystem itself is a guided textbook.

8.1 Seeded directory layout (in VFS)

Use numeric prefixes so the explorer naturally orders things:

/system
  /docs
    00-start-here.md
    01-how-to-run.md
    02-language-tour.md
    03-stdlib-tour.md
    04-graphics-tour.md
    05-studio-architecture-tour.md
    06-testing-tour.md
    07-games-tour.md

  /examples
    00-hello.tw
    01-functions.tw
    02-refs.tw
    03-graphics-basics.tw
    04-input-basics.tw
    05-animation-basics.tw

  /games
    00-readme.md
    lib/
      00-readme.md
      math.tw
      collision.tw
      input.tw
      render.tw
    01-pong.tw
    02-snake.tw

And separately, the “system source” is already seeded from your repo, e.g.:

  • /system/src/lang/*
  • /system/src/runtime/*
  • /system/src/studio/*

8.2 What goes in each doc file (content contracts)

Keep docs short, concrete, and file-path oriented. The system is a codebase, not a blog post.

00-start-here.md

  • What this is

  • 60-second checklist:

    • open Pong, press Run current file
    • open Snake, press Run current file
    • open graphics example, tweak numbers
  • How to fork a file to workspace

02-language-tour.md

  • Tiny language spec (s-exprs)

  • Links to examples:

    • /system/examples/00-hello.tw
    • /system/examples/01-functions.tw
  • Where the parser/compiler live in TS and what to read first

04-graphics-tour.md

  • gfx/* overview
  • runnable examples that draw and animate
  • how gfx/on-frame works

07-games-tour.md

  • game architecture pattern: init/update/render

  • explicit pointers:

    • “read Pong first”
    • “Snake shows fixed timestep + RNG”
  • how to run and how to modify

8.3 “Documentation in code” conventions (for TS and .tw)

Conventions that make the source teach itself:

Every file starts with a header comment answering:

  • What this module does
  • What it exports (or “top-level program”)
  • Invariants / assumptions
  • How to test it (path to test file)
  • For .tw programs: how to run it and controls

Keep helpers tiny and named

  • prefer clamp, aabb-intersect?, step-snake over inline arithmetic
  • name intermediate values; avoid “magic math” without a comment

Tests are part of documentation

  • test file names mirror module names
  • tests are written as readable behavioral specs

8.4 Executable docs (“examples are chapters”)

Every .md doc should link to at least one runnable .tw file that demonstrates the feature, so users can:

  1. open the example
  2. hit Run current file
  3. tweak it and see changes

This is what makes the environment a “computing environment” rather than a set of static docs.


9. Testing plan for Phase 3 (games + docs)

9.1 Unit tests (Bun) — shared helpers

  • math/clamp correctness
  • collision helpers (AABB intersections) correctness
  • deterministic RNG behaviors if math/rand-int is new/modified

9.2 Language-level integration tests (Bun) — game logic

Goal: verify gameplay rules without needing a browser canvas.

Approach:

  • compile the game module (or an extracted pure module) and evaluate it in Bun tests

  • provide a runtime with:

    • stubbed gfx that records commands (optional)
    • deterministic random() for Snake tests
  • call init, update, and inspect returned state (records)

Recommended: keep update pure and make it return the next state (record), so TS tests can assert:

  • Pong:

    • ball reflects off paddle
    • scoring increments and ball resets
  • Snake:

    • snake advances exactly one cell per tick
    • reversal is disallowed
    • food-eating grows snake and increments score
    • self-collision causes alive=false

9.3 “Docs don’t rot” tests

Add a test that compiles all files under:

  • /system/examples/*.tw
  • /system/games/*.tw

For animation examples, don’t run indefinitely:

  • run only a bounded number of frames/steps (your cooperative scheduler makes this easy)
  • ensure no runtime errors and at least one render flush occurred (for gfx examples)

9.4 Playwright e2e (browser) — minimal but meaningful

E2E scenarios:

  1. Pong launches

    • open /system/games/01-pong.tw
    • click Run current file
    • assert canvas changes from blank (pixel sample) or scoreboard text appears
  2. Pong input affects paddle

    • press ArrowUp
    • wait a few frames
    • verify paddle position changed (can be via visible text overlay like P2Y: or via pixel sample in paddle region)
  3. Snake launches and moves

    • open snake, run
    • wait ~500ms
    • verify snake head moved (again via overlay text like HEAD: x,y or via pixel sample)
  4. Snake eats food deterministically

    • run with a known seed (either default fixed seed in studio for e2e, or expose a “seed” input)
    • simulate direction changes
    • verify score increments
  5. Stop works

    • start a game
    • click Stop
    • verify scoreboard stops updating / animation halts

Reliability tip: For tests, render some state text (score/head position) to a dedicated DOM element (e.g., <pre id="hud">) so e2e can assert without brittle pixel sampling. That is not user-facing “FFI”—it’s just UI.


10. Implementation sequence (Phase 3 work plan)

Milestone 3.1 — Docs skeleton + quick links + Run current file

  • Create /system/docs/* and /system/examples/*
  • Ensure explorer ordering (numeric prefixes)
  • Add “Run current file” button
  • On startup, open /system/docs/00-start-here.md

Milestone 3.2 — Shared game library (tiny, readable)

  • Add /system/games/lib/* with:

    • math helpers (clamp, etc.)
    • collision helpers (pong)
    • input snapshot helpers
  • Unit tests for helpers

Milestone 3.3 — Pong

  • Implement init/update/render pattern
  • Add “How to play” header comment
  • Add Bun integration tests for collision + scoring invariants
  • Add Playwright “Pong launches” e2e

Milestone 3.4 — Snake

  • Implement fixed timestep accumulator
  • Deterministic RNG food placement (seeded)
  • Add Bun integration tests for movement/eating/game over
  • Add Playwright “Snake moves” + “Snake eats food (seeded)” e2e

Milestone 3.5 — Docs polish + “docs compile” gate

  • Ensure every doc references runnable examples
  • Add “compile all examples/games” test
  • Add /system/games/00-readme.md that lists controls and explains structure

11. Definition of Done (Phase 3)

Phase 3 is complete when:

  • There are two playable games in /system/games/ (Pong + Snake), runnable from the studio, responsive, with Stop working.

  • The seeded filesystem contains a numbered docs set that:

    • teaches usage in 5–10 minutes
    • points to runnable examples
    • points to source code locations for deeper reading
  • Tests:

    • Bun tests verify core game invariants and compile all examples
    • Playwright e2e verifies both games launch and respond to input
  • The final build remains a single dist/index.html containing docs, games, and system source.


If you want, next I can turn this Phase 3 plan into:

  • a concrete file-by-file outline (exact contents + key functions per file), and
  • a prioritized test list with specific test names and the minimal harness needed to run .tw game logic headlessly under Bun.