@regenrek
Last active January 14, 2026 22:53
Codex Overdose
# Word Frequency Analysis of ~/.codex/sessions
This prompt performs an intent-based word frequency analysis of conversation content in ~/.codex/sessions, focusing on **what you're trying to communicate** rather than individual words.
---
## Instructions
Execute the following steps and provide **only the summary tables** with grouped intents.
## Step 1: Extract Conversations
```bash
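# Concatenate every session log and keep only user/assistant message text.
# "output_text" is included on the assumption that assistant messages use that content type.
find ~/.codex/sessions -type f -name "*.jsonl" -exec cat {} + 2>/dev/null | \
jq -r 'select(.type == "response_item") |
select(.payload.role == "user" or .payload.role == "assistant") |
.payload.content[]? |
select(.type == "input_text" or .type == "output_text" or .type == "text") |
.text' 2>/dev/null > /tmp/full_conversations.txt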
```
## Step 2: Clean Conversations
```bash
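# Drop injected system blocks, HTML comments, bullet lines, headings, and blank lines.
# POSIX character classes replace \s and \+ so this also works outside GNU grep.
grep -v -e "<user_instructions>" \
  -e "<environment_context>" \
  -e "Generated by Ruler" \
  -e "^<!--" \
  -e "^[[:space:]]*\*[[:space:]]" \
  -e "^#" \
  -e "^[[:space:]]*$" \
  /tmp/full_conversations.txt > /tmp/clean_conv.txt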
```
## Step 3: Generate Word Frequency
```bash
# Split on non-alphanumerics (one token per line), lowercase, drop empty tokens, then count.
tr -cs 'A-Za-z0-9' '\n' < /tmp/clean_conv.txt | \
tr 'A-Z' 'a-z' | \
grep -v '^$' | \
sort | uniq -c | sort -rn > /tmp/raw_words.txt
```
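Step 3 counts single tokens only, so a two-word phrase such as "tech debt" never appears in /tmp/raw_words.txt. The sketch below is one possible way to count a phrase separately; the function name `count_phrase` is hypothetical, and it assumes the phrase is passed in lowercase with single spaces and does not span a line break.

```bash
# Hypothetical helper: count whole-word occurrences of a phrase, ignoring case.
# $1 = text file, $2 = lowercase phrase with single spaces (e.g. "tech debt")
count_phrase() {
  tr 'A-Z' 'a-z' < "$1" |           # lowercase everything
    tr -cs 'a-z0-9\n' ' ' |         # collapse punctuation runs into one space
    grep -o -w -F -- "$2" |         # print each whole-word match on its own line
    wc -l
}
```

For example, `count_phrase /tmp/clean_conv.txt "tech debt"` prints the number of matches, which can then be added to an intent group by hand.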
## Step 4: Filter System Noise
Remove the following words from /tmp/raw_words.txt before grouping:
- **System Metadata**: tokens, input, output, window, credits, policy, limits, cached, usage, resets, timestamp, payload, event, msg
- **Tech Stack**: react, node, rust, go, ts, js, tauri, popover, ui, api, server, modules, types, frontend, backend, cli
- **Tooling**: tmux, shell, git, grep, find, cat, sed, awk, curl, wget, npm, pnpm, yarn, bun, cargo, docker
- **File Operations**: file, files, src, path, paths, dir, dirs, repo, repos, root, folder, cwd
- **Agent/System**: codex, skill, skills, agent, user, users, context, instructions, rules, model, provider, claude, gpt
- **Stopwords**: the, and, a, or, to, for, in, with, if, that, it, of, is, you, not, no, when, from, at, on, be, as, so, any, all, we, can, do, are, this, them
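
The filter itself is not spelled out above. A minimal sketch, assuming the noise words have been written one per line to a file (the name /tmp/noise_words.txt and the function `filter_noise` are hypothetical), and that the frequency file keeps the `count word` layout produced by `uniq -c`:

```bash
# Hypothetical helper: drop noise words from a "count word" frequency list.
# $1 = noise list (one lowercase word per line, must not be empty)
# $2 = frequency file from Step 3
filter_noise() {
  awk 'NR == FNR { noise[$1] = 1; next }   # first file: load the noise words
       NF == 2 && !($2 in noise)           # second file: print rows whose word is not noise
      ' "$1" "$2"
}

# Example usage (file names are assumptions):
# printf '%s\n' tokens input output the and > /tmp/noise_words.txt
# filter_noise /tmp/noise_words.txt /tmp/raw_words.txt > /tmp/filtered_words.txt
```

Matching on the exact word keeps "files" from also removing "profiles", which a substring-based `grep -v` would do.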
---
## Intent Grouping Guidelines
When analyzing the remaining words, **group related words by intent** and sum their counts. Examples:
**Quality Standards** → clean (6482) + solid (3576) + proper (1319) + correct (2493) + stable (1058) = **14,928**
- Intent name: "Demand Quality"
- Meaning: "Require high-quality, proper implementations"
**Technical Debt** → tech debt (4482) + technical debt (59) + legacy (3204) + refactor (2940) = **10,685**
- Intent name: "Manage Technical Debt"
- Meaning: "Actively managing and reducing debt"
**Production Ready** → production (1894) + ready (2215) = **4,109**
- Intent name: "Production Focus"
- Meaning: "Focus on deployable, production-grade solutions"
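
The summing in these examples can be sketched as a small lookup over the frequency file. The function name `sum_intent` is hypothetical, and the sketch only handles single-token words, so multi-word entries such as "tech debt" would have to be counted separately.

```bash
# Hypothetical helper: sum the counts of every word in one intent group.
# $1 = frequency file ("count word" rows); remaining arguments = words in the group
sum_intent() {
  freq_file=$1
  shift
  printf '%s\n' "$@" |
    awk 'NR == FNR { group[$1] = 1; next }   # stdin: the words in this group
         ($2 in group) { total += $1 }       # frequency file: add matching counts
         END { print total + 0 }' - "$freq_file"
}

# Example usage:
# sum_intent /tmp/raw_words.txt clean solid proper correct stable
```

Words that are missing from the frequency file simply contribute nothing to the total.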
---
## Required Output (Summary Tables Only)
### Table 1: Overview Statistics
| Metric | Count |
|--------|-------|
| Session Files Analyzed | [count] |
| Conversation Lines | [count] |
| Unique Words (Raw) | [count] |
| After Noise Filter | [count] |
| Noise Removed | [count] ([%]) |
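
The Table 1 metrics can be derived from the intermediate files. A minimal sketch, assuming the filtered list was saved to a separate file in the same `count word` layout and that "Noise Removed" means unique words removed rather than total occurrences; the function name `overview_stats` is hypothetical:

```bash
# Hypothetical helper: print the Table 1 metrics as "metric<TAB>value" rows.
# $1 = sessions directory, $2 = cleaned conversation file,
# $3 = raw frequency file, $4 = filtered frequency file (assumed to exist)
overview_stats() {
  files=$(find "$1" -type f -name '*.jsonl' 2>/dev/null | wc -l)
  lines=$(wc -l < "$2")
  raw=$(awk 'NF == 2 { n++ } END { print n + 0 }' "$3")
  kept=$(awk 'NF == 2 { n++ } END { print n + 0 }' "$4")
  awk -v f="$files" -v l="$lines" -v r="$raw" -v k="$kept" 'BEGIN {
    removed = r - k
    pct = (r > 0) ? 100 * removed / r : 0   # guard against an empty raw list
    printf "Session Files Analyzed\t%d\n", f
    printf "Conversation Lines\t%d\n", l
    printf "Unique Words (Raw)\t%d\n", r
    printf "After Noise Filter\t%d\n", k
    printf "Noise Removed\t%d (%.1f%%)\n", removed, pct
  }'
}

# Example usage (the last file name is an assumption):
# overview_stats ~/.codex/sessions /tmp/clean_conv.txt /tmp/raw_words.txt /tmp/filtered_words.txt
```
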
### Table 2: 🎯 Primary Intents (Top 8)
Group related words by what you're trying to accomplish
| Intent | Count | Meaning |
|--------|-------|---------|
| Build & Fix | 29,254 | "Build and fix things" |
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
**Format rules:**
- Count column: Just the total number (clean, no breakdown)
- Meaning column: First-person quote about what this communicates
- Sort by Count descending
- Max 8 rows
### Table 3: ⚑ Quality Standards (Top 8)
Group words indicating how you want things done
| Intent | Count | Meaning |
|--------|-------|---------|
| Demand Quality | 14,928 | "Require high-quality implementations" |
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
(Max 8 rows, sort by Count descending)
### Table 4: 🚨 Anti-Patterns to Avoid (Top 8)
Group words indicating what to reject
| Intent | Count | Meaning |
|--------|-------|---------|
| Reject Hacks | 4,267 | "No quick fixes or dirty solutions" |
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
(Max 8 rows)
### Table 5: 📏 Constraints & Requirements (Top 8)
Group words indicating rules or boundaries
| Intent | Count | Meaning |
|--------|-------|---------|
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
(Max 8 rows)
### Table 6: ⏱️ Time Horizon (Top 8)
Group words indicating urgency or timeline
| Intent | Count | Meaning |
|--------|-------|---------|
| Long-term Focus | 4,344 | "Prefer sustainable solutions" |
| Avoid Rushed | 1,398 | "Only 32% of long-term mentions" |
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
(Max 8 rows)
### Table 7: 💬 Communication Style (Top 8)
Group words indicating how you interact
| Intent | Count | Meaning |
|--------|-------|---------|
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
(Max 8 rows)
### Table 8: 🧠 Architecture Concerns (Top 8)
Group words indicating technical/architectural focus
| Intent | Count | Meaning |
|--------|-------|---------|
| State & Data Flow | 21,401 | "Heavy focus on state management and data flow" |
| Concurrency | 8,012 | "Significant concurrency concerns" |
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
(Max 8 rows)
### Table 9: 📊 Status & Progress (Top 8)
Group words indicating completion or progress
| Intent | Count | Meaning |
|--------|-------|---------|
| [intent name] | [total] | [what this communicates] |
| ... | ... | ... |
(Max 8 rows)
### Table 10: 🆕 Emerging Patterns (Top 8)
New intent groupings discovered this run (words with counts above 100 that form new patterns)
| Intent | Count | Meaning |
|--------|-------|---------|
| [intent name] | [total] | [what this might indicate] |
| ... | ... | ... |
(Max 8 rows)
---
## Intent Grouping Rules
1. **Sum counts** for all words in the intent group
2. **Name the intent** based on what the words communicate (e.g., "Demand Quality" not "Quality Words")
3. **Write meaning** as a first-person quote: what this says about your communication style
4. **Count column**: Only the total number (e.g., 29,254)
5. **Sort by total count** descending
6. **Max 8 rows per table**
7. **Only create groups with total count > 500** (ignore smaller patterns)
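
Rules 4 to 7 are mechanical and can be sketched as a short pipeline. This assumes the grouped intents have already been written as tab-separated `intent<TAB>count<TAB>meaning` rows; that intermediate format and the function name `top_intents` are hypothetical.

```bash
# Hypothetical helper: apply the threshold, sort, and row limit, then emit a markdown table.
# $1 = file of "intent<TAB>count<TAB>meaning" rows (assumed intermediate format)
top_intents() {
  tab=$(printf '\t')
  awk -F '\t' '$2 > 500' "$1" |        # rule 7: ignore groups with a total of 500 or less
    sort -t "$tab" -k2,2nr |           # rule 5: sort by total count, descending
    head -n 8 |                        # rule 6: at most 8 rows
    awk -F '\t' '
      # Insert thousands separators, e.g. 29254 -> 29,254
      function commas(n,    s) {
        s = sprintf("%d", n)
        while (match(s, /^[0-9]+/) && RLENGTH > 3)
          s = substr(s, 1, RLENGTH - 3) "," substr(s, RLENGTH - 2)
        return s
      }
      BEGIN { print "| Intent | Count | Meaning |"; print "|--------|-------|---------|" }
      { printf "| %s | %s | \"%s\" |\n", $1, commas($2), $3 }'
}
```

Naming the intent and writing the meaning (rules 2 and 3) still require judgment, so they stay outside the script.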
---
## Formatting Requirements
- **Only output the 10 tables above**
- No explanations between tables
- Use markdown table format
- Include emoji icons in table headers
- Sort Tables 2-10 by Count column (descending)
- Maximum 8 rows per table
- Count column: clean number only (no formula, no breakdown)
- Meaning column: quoted statement
- No horizontal rules between tables
- Compact output only