bgauryy/LOCAL_TOOLS_TEST_PLAN.md

## LOCAL_TOOLS_TEST_PLAN.md

      
# Local Tools Test Plan: AI Coding Assistants vs Octocode MCP


Comprehensive test plan comparing AI coding assistant internal tools with Octocode MCP local tools
Objective: Validate that Octocode local tools provide superior context, efficiency, output quality, token safety, and security compared to built-in tools in Claude Code and Cursor.


## Tool Mapping Overview

| # | Claude Code Tool | Cursor Tool | Octocode MCP Tool | Primary Use Case |
|---|---|---|---|---|
| 1 | Grep | grep | localSearchCode | Pattern search in code files |
| 2 | Bash(ls) / Glob | list_dir | localViewStructure | Directory listing and exploration |
| 3 | Read | read_file | localGetFileContent | Reading file contents |
| 4 | Glob | glob_file_search | localFindFiles | Finding files by name/pattern/metadata |


## Test Dimensions

| Dimension | Description | Weight |
|---|---|---|
| Context Quality | Does the tool provide actionable, structured context for AI agents? | Critical |
| Efficiency | Speed, bulk operations, resource usage | High |
| Output Quality | Structured responses, metadata richness, usability | High |
| Token Safety | Output size control, pagination, LLM budget awareness | Critical |
| Security | Path validation, secret detection, access control | Critical |
| Error Handling | Graceful failures, helpful hints, recovery guidance | Medium |
| Research Context | Goal tracking, reasoning preservation, workflow continuity | Medium |
| Large File Handling | Character/line pagination, context-aware extraction, memory efficiency | Critical |
| Large Repository Scale | Search speed at scale, quality results in monorepos, resource management | Critical |
| Monorepo Awareness | Package detection, scoped operations, cross-package search | High |


## Test Suite A: Search Tools

Claude Code Tool: Grep

Cursor Tool: grep

Octocode Tool: localSearchCode

### A1. Context Quality Tests

| Test ID | Test Name | Motivation | Claude Code Grep Behavior | Cursor grep Behavior | Octocode Expected Behavior | Success Criteria |
|---|---|---|---|---|---|---|
| A1.1 | Basic pattern search | Verify structured results vs raw output | Returns file paths or content with line numbers | Returns file:line:content plain text | Returns structured JSON with file path, line number, column, byte offset, match content | Octocode provides richer metadata |
| A1.2 | Context lines display | Verify surrounding context quality | `-A`/`-B`/`-C` flags show context lines | `-C N` flag shows raw lines, no grouping | contextLines param provides smart grouping with omission markers | Context is organized and scannable |
| A1.3 | Multi-file search results | Verify results organization across files | List of matches per file | Flat list of matches, no grouping | Grouped by file with match counts, file statistics | Results are navigable |
| A1.4 | Empty results handling | Verify guidance on no matches | Empty results returned | Exit code 1, empty output, no guidance | status: empty with semantic hints for alternatives | Agent receives actionable guidance |
| A1.5 | Match location precision | Verify byte-level accuracy | Line number with `-n` flag | Line number only | Line, column, byte offset, char offset | Enables precise navigation |
| A1.6 | Regex pattern support | Verify complex pattern handling | Full regex via ripgrep | Basic regex support | Full PCRE/Perl regex with multiline support | Complex patterns work |
| A1.7 | Case sensitivity control | Verify case handling options | `-i` flag for insensitive | `-i` flag for insensitive | caseSensitive, caseInsensitive, smartCase options | Flexible case handling |
| A1.8 | File type filtering | Verify extension-based filtering | glob and type params | `--include` flag | type, include, exclude params | Easy file type targeting |
| A1.9 | Research context tracking | Verify goal/reasoning preservation | No concept of research context | No concept of research context | mainResearchGoal, researchGoal, reasoning in output | Research continuity maintained |
| A1.10 | Match statistics | Verify count and distribution info | output_mode: count for counts | Count requires separate `-c` flag | totalMatches, distribution by file included | Statistics built-in |
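
Tests A1.1 and A1.5 need an independent oracle for the expected line, column, and offset values. The following is a minimal sketch of one, assuming 1-based line and column numbering and UTF-8 byte offsets. The helper `locate_matches` and its field names are hypothetical and are not taken from the Octocode response schema; align them with the real output before use.

```python
import re


def locate_matches(text: str, pattern: str) -> list[dict]:
    """Return line, column, char offset and UTF-8 byte offset for each regex match."""
    results = []
    for m in re.finditer(pattern, text):
        start = m.start()
        # Index of the first character on the line that contains the match.
        line_start = text.rfind("\n", 0, start) + 1
        results.append({
            "line": text.count("\n", 0, start) + 1,   # 1-based (assumed)
            "column": start - line_start + 1,          # 1-based, counted in characters
            "charOffset": start,                       # 0-based character offset
            "byteOffset": len(text[:start].encode("utf-8")),
            "match": m.group(0),
        })
    return results
```

Character and byte offsets differ as soon as the file contains non-ASCII text, so a fixture with at least one multi-byte character is worth including.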


### A2. Efficiency Tests

| Test ID | Test Name | Motivation | Test Scenario | Success Criteria |
|---|---|---|---|---|
| A2.1 | Single pattern performance | Baseline search speed | Search pattern in 10,000 files | Response time < 500ms |
| A2.2 | Bulk query efficiency | Validate parallel execution | 5 different patterns vs 5 sequential calls | Bulk >= 3x faster than sequential |
| A2.3 | Large directory handling | Memory efficiency under load | Search in 100MB directory | Peak memory < 50MB |
| A2.4 | Ripgrep to grep fallback | Graceful degradation | Force ripgrep unavailability | Falls back to grep without crash |
| A2.5 | Incremental results | Early termination capability | Stop after N matches | maxMatchesPerFile, maxFiles respected |
| A2.6 | Pattern complexity scaling | Performance with complex regex | Simple vs complex regex patterns | Linear degradation, no timeout |
| A2.7 | Concurrent bulk queries | Parallelization efficiency | 5 queries executing simultaneously | CPU utilization balanced |
| A2.8 | Cold vs warm cache | Subsequent query speed | Same query twice | Second query >= 2x faster |


### A3. Output Quality Tests

| Test ID | Test Name | Motivation | Claude Code Grep Output | Cursor grep Output | Octocode Expected Output | Success Criteria |
|---|---|---|---|---|---|---|
| A3.1 | Response structure | Verify consistent format | Text output with file paths/content | Plain text lines | Structured YAML/JSON with fields | Parseable, consistent schema |
| A3.2 | Metadata richness | Verify useful metadata | Basic file/line info | None | File stats, match metadata, hints | Rich context provided |
| A3.3 | Hint generation | Verify agent guidance | No hints | No hints | Dynamic hints based on results | Actionable next steps |
| A3.4 | Status indication | Verify clear status | Tool success/failure | Exit code only | `status: hasResults\|empty\|error` | Clear success/failure |
| A3.5 | Pagination info | Verify navigation data | head_limit/offset params | None | pagination object with page/total/hasMore | Enables continuation |
| A3.6 | Warning messages | Verify edge case alerts | None | None | warnings array for truncation, fallback | Agent aware of limitations |
| A3.7 | Error detail quality | Verify error helpfulness | Generic error messages | Generic error messages | Specific error with errorCode and hints | Debuggable errors |
| A3.8 | Match highlighting | Verify match visibility | No highlighting | No highlighting | Match boundaries indicated | Easy to locate matches |


### A4. Token Safety Tests

| Test ID | Test Name | Motivation | Risk Without Control | Octocode Mitigation | Success Criteria |
|---|---|---|---|---|---|
| A4.1 | Large result set | Prevent token overflow | 10K matches returned | matchesPerPage pagination | Output bounded |
| A4.2 | Long line handling | Prevent single-line overflow | 10KB line returned fully | matchContentLength truncation | Lines truncated |
| A4.3 | Binary file exclusion | Prevent garbage output | Binary content included | binaryFiles: without-match default | Clean text only |
| A4.4 | Many files matched | Prevent file count overflow | 1000 files in response | filesPerPage pagination | Files paginated |
| A4.5 | Deep context expansion | Prevent context bloat | Unlimited context lines | contextLines max limit | Context bounded |
| A4.6 | Output size estimation | Proactive limit warning | No warning before overflow | Size estimation + hints before large output | Early warning |
| A4.7 | Minified file handling | Prevent single-line megafiles | Huge minified JS searched | Detect and warn about minified content | Appropriate handling |
| A4.8 | Total response size | Global output limit | Unbounded response | Response size cap with continuation | Response bounded |
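
The pagination behavior that A4.1 and A4.4 check can be expressed as a small reference model for comparing against tool output. This is only a sketch: the function `paginate` is hypothetical, and the field names `page`, `totalPages`, and `hasMore` loosely follow the "page/total/hasMore" wording in A3.5 rather than a confirmed schema.

```python
def paginate(items: list, page: int, per_page: int) -> dict:
    """Return one 1-based page of items plus the metadata needed to continue."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    # Ceiling division; an empty result set still counts as one (empty) page.
    total_pages = max(1, -(-len(items) // per_page))
    start = (page - 1) * per_page
    return {
        "items": items[start:start + per_page],
        "pagination": {
            "page": page,
            "totalPages": total_pages,
            "hasMore": page < total_pages,
        },
    }
```

A test can then assert that no page exceeds `per_page` items and that walking pages until `hasMore` is false visits every match exactly once.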


### A5. Security Tests

| Test ID | Test Name | Motivation | Attack Vector | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| A5.1 | Path traversal - basic | Prevent escape to parent | `path: "../../etc/passwd"` | Rejected with error | Path blocked |
| A5.2 | Path traversal - encoded | Prevent encoded escape | `path: "..%2F..%2Fetc"` | Rejected with error | Encoded path blocked |
| A5.3 | Path traversal - absolute | Prevent absolute escape | `path: "/etc/passwd"` | Rejected with error | Absolute outside workspace blocked |
| A5.4 | Symlink resolution | Prevent symlink escape | Symlink pointing to `/etc` | Resolved and blocked | Symlink target validated |
| A5.5 | Command injection - pattern | Prevent shell injection | `pattern: "; rm -rf /"` | Pattern escaped safely | No command execution |
| A5.6 | Command injection - path | Prevent path injection | `path: "file; cat /etc/passwd"` | Path sanitized | No command execution |
| A5.7 | Null byte injection | Prevent null truncation | `path: "file\x00/etc/passwd"` | Rejected | Null byte blocked |
| A5.8 | Ignored path access | Prevent node_modules access | `path: "node_modules"` | Blocked by default | Ignored paths respected |
| A5.9 | .git directory access | Prevent git data leak | `path: ".git/config"` | Blocked | Sensitive directories blocked |
| A5.10 | Secret in search results | Prevent secret exposure | Search result contains AWS key | Secret redacted in output | Secrets masked |
| A5.11 | Unicode path manipulation | Prevent unicode tricks | Unicode lookalike characters | Normalized and validated | Unicode handled safely |
| A5.12 | Very long path | Prevent buffer overflow | 10KB path string | Rejected with limit error | Path length limited |
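
Several of the path checks above (A5.1, A5.3, A5.4, A5.7, A5.12) reduce to one rule: resolve the requested path, then confirm it still sits inside the workspace. The sketch below illustrates that rule only. `validate_path` and `MAX_PATH_LENGTH` are hypothetical names with an assumed limit, and this is not Octocode's actual validator. Percent-encoded input (A5.2) and Unicode normalization (A5.11) are out of scope here and would need their own handling before this check.

```python
import os

MAX_PATH_LENGTH = 4096  # assumed limit, for illustration only


def validate_path(workspace: str, requested: str) -> str:
    """Return the resolved path if it stays inside workspace, else raise ValueError."""
    if "\x00" in requested:
        raise ValueError("null byte in path")
    if len(requested) > MAX_PATH_LENGTH:
        raise ValueError("path too long")
    root = os.path.realpath(workspace)
    # realpath collapses '..' segments and follows symlinks, so the
    # containment check below sees the real target of the request.
    resolved = os.path.realpath(os.path.join(root, requested))
    if os.path.commonpath([root, resolved]) != root:
        raise ValueError("path escapes workspace")
    return resolved
```

Resolving before comparing is the important design choice: a plain string-prefix check on the unresolved path would accept both `..` segments and symlinks that point outside the workspace.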


### A6. Error Handling Tests

| Test ID | Test Name | Motivation | Error Scenario | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| A6.1 | Non-existent path | Graceful missing path | Path does not exist | Clear error message + suggestions | Helpful error |
| A6.2 | Permission denied | Handle access errors | Read-protected file | Skip with warning, continue others | Graceful skip |
| A6.3 | Invalid regex | Handle bad patterns | Malformed regex pattern | Parse error with position indicated | Debuggable error |
| A6.4 | Timeout handling | Prevent hung queries | Search takes > 30s | Timeout with partial results | Graceful timeout |
| A6.5 | Bulk partial failure | Isolate query failures | 3/5 queries succeed | Successful queries return, failures isolated | Partial success |
| A6.6 | Empty workspace | Handle empty directory | No files in path | Empty result with hint | Clear empty state |
| A6.7 | Circular symlinks | Handle symlink loops | Symlink loop detected | Warning and skip | No infinite loop |
| A6.8 | Encoding issues | Handle non-UTF8 | Binary/unknown encoding | Skip or warn | Clean handling |
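
The failure isolation expected in A6.5 can be modeled as a bulk runner that wraps each query separately. This is a sketch under assumptions: `run_bulk` and the result fields are hypothetical, the status values reuse the `hasResults`/`empty`/`error` wording from A3.4, and a thread pool stands in for whatever concurrency mechanism the real tool uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bulk(queries: list, handler) -> list[dict]:
    """Run handler on every query; one failure never discards the other results."""
    def safe(query):
        try:
            data = handler(query)
        except Exception as exc:  # isolate the failure to this single query
            return {"status": "error", "query": query, "error": str(exc)}
        return {"status": "hasResults" if data else "empty", "query": query, "data": data}

    with ThreadPoolExecutor(max_workers=5) as pool:
        # Executor.map yields results in input order, so callers can pair
        # each result with the query that produced it.
        return list(pool.map(safe, queries))
```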


## Test Suite B: Directory Listing Tools

Claude Code Tool: Bash(ls) / Glob

Cursor Tool: list_dir

Octocode Tool: localViewStructure

### B1. Context Quality Tests

| Test ID | Test Name | Motivation | Claude Code Bash(ls)/Glob Behavior | Cursor list_dir Behavior | Octocode Expected Behavior | Success Criteria |
|---|---|---|---|---|---|---|
| B1.1 | Basic listing | Verify output richness | ls output or glob patterns | Array of filenames only | Entries with type, size, extension, permissions | Rich metadata |
| B1.2 | Recursive listing | Verify depth support | `ls -R` or Glob with `**` | Requires multiple calls | depth parameter for tree view | Single call for tree |
| B1.3 | Type filtering | Verify filter capability | Manual filtering | No filtering | filesOnly, directoriesOnly params | Easy type filtering |
| B1.4 | Extension filtering | Verify extension filter | Glob patterns (e.g., `*.ts`) | Manual post-filtering | extension, extensions params | Built-in extension filter |
| B1.5 | Sorting options | Verify sort capability | ls flags (`-t`, `-S`) | Alphabetical only | sortBy: name, size, time, extension | Flexible sorting |
| B1.6 | Size display | Verify human-readable sizes | `ls -lh` for sizes | No size info | humanReadable size formatting | 4.2KB instead of 4301 |
| B1.7 | Modified time display | Verify timestamp access | `ls -l` shows timestamps | No timestamp | showFileLastModified option | Timestamps available |
| B1.8 | Summary statistics | Verify aggregate info | Requires wc or counting | None | totalFiles, totalDirectories, summary | Quick overview |
| B1.9 | Hidden file handling | Verify dotfile access | `ls -a` for dotfiles | May hide dotfiles | hidden flag to include/exclude | Controllable |
| B1.10 | Pattern filtering | Verify glob support | Glob tool supports patterns | No pattern matching | pattern param for glob filter | Built-in glob |


### B2. Efficiency Tests

| Test ID | Test Name | Motivation | Test Scenario | Success Criteria |
|---|---|---|---|---|
| B2.1 | Large directory | Performance under scale | 1000 files in directory | Response < 1s |
| B2.2 | Deep recursion | Recursive performance | depth=3 on large tree | Response < 5s |
| B2.3 | Bulk listing | Multiple directories at once | 5 directories vs sequential | Bulk >= 2x faster |
| B2.4 | Stats-only mode | Lightweight overview | Summary without full listing | Response < 100ms |
| B2.5 | Filtered vs unfiltered | Filter performance | With vs without extension filter | Filtered same or faster |
| B2.6 | Sort overhead | Sorting cost | Different sort options | Sorting < 10% overhead |


### B3. Output Quality Tests

| Test ID | Test Name | Motivation | Claude Code Bash(ls)/Glob Output | Cursor list_dir Output | Octocode Expected Output | Success Criteria |
|---|---|---|---|---|---|---|
| B3.1 | Tree visualization | Verify readable structure | Flat list from ls/glob | Flat list | Tree-like indented output | Visual hierarchy |
| B3.2 | Entry annotations | Verify type markers | `ls -F` adds markers | None | `[FILE]`, `[DIR]`, `[LINK]` markers | Clear type indication |
| B3.3 | Pagination info | Verify navigation | None | None | Page number, total pages, hasMore | Continuation enabled |
| B3.4 | Hints generation | Verify next steps | None | None | Hints for deeper exploration | Agent guidance |
| B3.5 | Empty directory | Verify empty handling | Empty output | Empty array | status: empty + hints | Clear empty state |
| B3.6 | Truncation warning | Verify limit indication | No indication | No indication | Warning when truncated | Awareness of limits |


### B4. Token Safety Tests

| Test ID | Test Name | Motivation | Risk Without Control | Octocode Mitigation | Success Criteria |
|---|---|---|---|---|---|
| B4.1 | Directory with 10K files | Prevent massive output | All 10K entries returned | entriesPerPage pagination | Output bounded |
| B4.2 | Very long filenames | Handle edge case | Long names bloat output | Filename truncation with ellipsis | Names bounded |
| B4.3 | Deep recursion output | Prevent tree explosion | Full tree dumped | Depth limits + pagination | Tree bounded |
| B4.4 | Detailed mode scaling | Control metadata size | All metadata included | details flag toggles verbosity | Controllable detail |
| B4.5 | Pre-generation check | Proactive limit | Large output generated then rejected | Estimate size, error before generating | Early rejection |
| B4.6 | Summary-only mode | Minimal token option | Full listing required | summary: true without entries | Ultra-light option |
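
For B4.1 and B4.3, a bounded listing combines a depth limit with a cap on the number of entries. The sketch below shows how the two limits interact. `list_tree`, its parameters, and the `truncated` flag are hypothetical, `max_depth` is assumed to be at least 1, and entries directly inside the root count as depth 1.

```python
import os


def list_tree(root: str, max_depth: int = 2, max_entries: int = 100) -> dict:
    """Collect relative paths down to max_depth, stopping once max_entries is reached."""
    entries, truncated = [], False
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        dirnames.sort()  # in-place sort makes the traversal order deterministic
        for name in dirnames + sorted(filenames):
            if len(entries) >= max_entries:
                truncated = True  # at least one more entry existed past the cap
                break
            entries.append(name if rel == "." else os.path.join(rel, name))
        if truncated:
            break
        if depth + 1 >= max_depth:
            dirnames[:] = []  # prune: os.walk will not descend any further
    return {"entries": entries, "truncated": truncated}
```

Pruning `dirnames` in place stops the walk from ever visiting the deeper directories, which keeps the cost proportional to the bounded output and not to the full tree.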


### B5. Security Tests

| Test ID | Test Name | Motivation | Attack Vector | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| B5.1 | Path traversal listing | Prevent escape | `path: "../../../"` | Rejected | Traversal blocked |
| B5.2 | Symlink directory | Prevent link escape | Symlink to `/etc` | Resolved and blocked | Symlink validated |
| B5.3 | Hidden sensitive dirs | Prevent leaks | List `.aws`, `.ssh` directories | Blocked by default | Sensitive dirs hidden |
| B5.4 | Absolute path outside | Prevent arbitrary access | `/etc` or `/root` | Rejected | Outside workspace blocked |
| B5.5 | Filename with secrets | Prevent credential leaks | Listing a directory that contains a file named `password.txt` | Only name shown, no content | No content exposed |
| B5.6 | Large depth DoS | Prevent resource exhaustion | `depth: 100` | Max depth enforced | Depth limited |


### B6. Error Handling Tests

| Test ID | Test Name | Motivation | Error Scenario | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| B6.1 | Non-existent directory | Handle missing path | Directory doesn't exist | Clear error + suggestion | Helpful error |
| B6.2 | Permission denied | Handle access errors | Read-protected directory | Error with explanation | Clear access error |
| B6.3 | File path given | Handle wrong type | File path instead of directory | Error suggesting file read tool | Appropriate guidance |
| B6.4 | Empty pattern match | Handle no glob matches | Pattern matches nothing | Empty result + hint | Clear empty state |
| B6.5 | Bulk partial failure | Isolate failures | Some directories inaccessible | Successful ones return | Partial success |


## Test Suite C: File Content Tools

Claude Code Tool: Read

Cursor Tool: read_file

Octocode Tool: localGetFileContent

### C1. Context Quality Tests

| Test ID | Test Name | Motivation | Claude Code Read Behavior | Cursor read_file Behavior | Octocode Expected Behavior | Success Criteria |
|---|---|---|---|---|---|---|
| C1.1 | Full file read | Verify metadata inclusion | Content with line numbers (`cat -n` format) | Raw content only | Content + totalLines + fileSize + pagination | Rich context |
| C1.2 | Pattern extraction | Verify targeted reading | Manual line calculation | Manual line calculation | matchString with context lines | Easy targeting |
| C1.3 | Line range reading | Verify range support | offset/limit params | offset/limit params | startLine/endLine with bounds checking | Clear line ranges |
| C1.4 | Multiple patterns | Verify multi-match | Multiple read calls | Multiple read calls | Single call with multiple matchStrings | Efficient multi-match |
| C1.5 | File metadata | Verify file info | None (content only) | None | path, contentLength, encoding, mimeType | File info included |
| C1.6 | Partial read indication | Verify completeness status | No indication | No indication | isPartial flag with bounds | Clear partial status |
| C1.7 | Line number preservation | Verify line reference | Line numbers included (`cat -n`) | No line numbers | Content with line numbers annotated | Lines referenceable |
| C1.8 | Match highlighting | Verify match visibility | No highlighting | No highlighting | Match positions indicated | Matches locatable |
| C1.9 | Context line control | Verify context flexibility | Fixed via offset/limit | Fixed context | matchStringContextLines adjustable | Flexible context |
| C1.10 | Regex pattern support | Verify regex matching | Not supported | Not supported | matchStringIsRegex option | Regex enabled |


### C2. Efficiency Tests

| Test ID | Test Name | Motivation | Test Scenario | Success Criteria |
|---|---|---|---|---|
| C2.1 | Large file read | Single file performance | 1MB file | Response < 200ms |
| C2.2 | Minification savings | Token efficiency | JSON/YAML minification | 20-50% size reduction |
| C2.3 | Bulk file reads | Multiple files at once | 5 files vs sequential | Bulk >= 3x faster |
| C2.4 | Partial read efficiency | Range read speed | Read 100 lines from 10K file | Response < 50ms |
| C2.5 | Pattern search speed | matchString performance | Find pattern in 1MB file | Response < 300ms |
| C2.6 | Cache effectiveness | Repeated read speed | Same file twice | Second read >= 2x faster |


### C3. Output Quality Tests

| Test ID | Test Name | Motivation | Claude Code Read Output | Cursor read_file Output | Octocode Expected Output | Success Criteria |
|---|---|---|---|---|---|---|
| C3.1 | Structured response | Verify consistent format | Content with line numbers | Raw content string | Structured object with fields | Parseable format |
| C3.2 | Content minification | Verify token efficiency | Full formatting preserved | Full formatting preserved | Optional minification for JSON/YAML | Smaller output |
| C3.3 | Pagination info | Verify continuation | Via offset/limit params | None | charOffset, charLength, hasMore | Continuable |
| C3.4 | Line bounds | Verify range reporting | Based on offset/limit | None | actualStartLine, actualEndLine | Range clarity |
| C3.5 | Encoding info | Verify charset | Assumed UTF-8 | Assumed UTF-8 | Detected encoding reported | Encoding known |
| C3.6 | Warning messages | Verify edge alerts | Truncation at 2000 lines | None | Warnings for truncation, bounds adjustment | Issues flagged |
| C3.7 | Hints for next steps | Verify guidance | None | None | Hints for deeper reading, search | Agent guidance |


### C4. Token Safety Tests

| Test ID | Test Name | Motivation | Risk Without Control | Octocode Mitigation | Success Criteria |
|---|---|---|---|---|---|
| C4.1 | Large file full read | Prevent massive dump | 10MB file returned | Size limit + error with guidance | Output bounded |
| C4.2 | Character pagination | Incremental reading | Full content required | charLength/charOffset pagination | Paginated reading |
| C4.3 | Many pattern matches | Prevent match explosion | Pattern matches 1000 times | Max matches + truncation | Matches bounded |
| C4.4 | Long line handling | Single line overflow | 100KB single line | Line truncation with indicator | Lines bounded |
| C4.5 | Binary file content | Prevent garbage | Binary file requested | Detection + warning or rejection | Clean handling |
| C4.6 | Context expansion limit | Prevent context bloat | Large context requested | Max context lines enforced | Context bounded |
| C4.7 | Pre-read size check | Early rejection | Check size before reading | File size check + appropriate error | Early detection |
| C4.8 | Minified file detection | Handle special files | 1MB single-line JS | Minified detection + appropriate handling | Minified handled |
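
The character pagination named in C4.2 can be pinned down with a small model operating on already-decoded text. This is a sketch, not the tool's implementation: `read_chars` is a hypothetical helper, and the field names `charOffset`, `charLength`, `totalChars`, and `hasMore` follow the wording used elsewhere in this plan. A real reader would stream from disk instead of holding the whole file in memory.

```python
def read_chars(text: str, char_offset: int = 0, char_length: int = 10_000) -> dict:
    """Return one character window of text plus continuation metadata."""
    if char_offset < 0 or char_length < 1:
        raise ValueError("char_offset must be >= 0 and char_length >= 1")
    chunk = text[char_offset:char_offset + char_length]
    end = char_offset + len(chunk)
    return {
        "content": chunk,
        "charOffset": char_offset,
        "charLength": len(chunk),      # may be shorter than requested on the last page
        "totalChars": len(text),
        "hasMore": end < len(text),
    }
```

Feeding each response's `charOffset + charLength` back in as the next offset until `hasMore` is false should reproduce the original text exactly, which makes a convenient round-trip assertion.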


### C5. Security Tests

| Test ID | Test Name | Motivation | Attack Vector | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| C5.1 | Path traversal | Prevent escape | `../../../etc/shadow` | Rejected | Traversal blocked |
| C5.2 | Absolute path outside | Prevent arbitrary read | `/etc/passwd` | Rejected | Outside blocked |
| C5.3 | Symlink to sensitive | Prevent link escape | Symlink to `/etc/passwd` | Resolved and blocked | Symlink validated |
| C5.4 | .env file access | Prevent credential leak | `.env` or `.env.local` | Blocked or heavily redacted | Env files protected |
| C5.5 | AWS credentials | Redact cloud creds | File with AWS keys | AKIAXXXXXXXX redacted | AWS keys masked |
| C5.6 | API keys | Redact API credentials | File with API keys | Keys redacted | API keys masked |
| C5.7 | Passwords in code | Redact hardcoded creds | `password = "secret"` | Value redacted | Passwords masked |
| C5.8 | JWT tokens | Redact tokens | Bearer token in file | Token redacted | JWTs masked |
| C5.9 | Private keys | Redact crypto keys | RSA/EC private key content | Key content redacted | Private keys masked |
| C5.10 | GitHub tokens | Redact VCS tokens | `ghp_xxxx` or `gho_xxxx` | Token redacted | GitHub tokens masked |
| C5.11 | Database URLs | Redact connection strings | `postgresql://user:pass@host` | Password portion redacted | DB creds masked |
| C5.12 | Config files | Handle sensitive configs | `.npmrc`, `.gitconfig` with tokens | Tokens redacted | Config tokens masked |
| C5.13 | Null byte in path | Prevent truncation | `file.txt\x00/etc/passwd` | Rejected | Null byte blocked |
| C5.14 | Protocol injection | Prevent scheme abuse | `file:///etc/passwd` | Rejected | Protocol blocked |
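
Redaction tests such as C5.5, C5.10, and C5.11 need fixtures with known secrets and a way to state the expected masked output. The sketch below covers three of the listed secret types with deliberately simplified patterns. `REDACTION_RULES`, `redact`, and the replacement markers are hypothetical and do not describe how Octocode detects secrets; production detectors use far broader rule sets.

```python
import re

# Simplified, illustrative patterns only.
REDACTION_RULES = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
    # GitHub personal/OAuth tokens: "ghp_" or "gho_" plus at least 36 alphanumerics.
    (re.compile(r"\bgh[po]_[A-Za-z0-9]{36,}\b"), "[REDACTED_GITHUB_TOKEN]"),
    # Connection strings: keep scheme, user and host, mask only the password.
    (re.compile(r"(\w+://[^\s:/@]+:)[^\s@]+(@)"), r"\1[REDACTED]\2"),
]


def redact(text: str) -> str:
    """Apply every redaction rule in order and return the masked text."""
    for pattern, replacement in REDACTION_RULES:
        text = pattern.sub(replacement, text)
    return text
```

For example, `redact("postgresql://user:pass@host/db")` keeps the user and host visible and masks only the password portion, matching the expectation stated in C5.11.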


### C6. Error Handling Tests

| Test ID | Test Name | Motivation | Error Scenario | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| C6.1 | File not found | Handle missing file | Non-existent path | Clear error + suggestions | Helpful error |
| C6.2 | Directory path | Handle wrong type | Directory instead of file | Error suggesting list tool | Appropriate guidance |
| C6.3 | Permission denied | Handle access error | Read-protected file | Clear permission error | Access error clear |
| C6.4 | Line range overflow | Handle invalid range | endLine > totalLines | Adjusted range + warning | Graceful adjustment |
| C6.5 | Pattern not found | Handle no match | matchString not in file | Empty match result + hints | Clear no-match |
| C6.6 | Invalid encoding | Handle charset issues | Non-UTF8 file | Warning + best-effort decode | Graceful handling |
| C6.7 | Bulk partial failure | Isolate failures | Some files inaccessible | Successful reads return | Partial success |
| C6.8 | Empty file | Handle edge case | 0-byte file | Clear empty indication | Empty handled |
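
The "adjusted range + warning" behavior in C6.4 is easy to state precisely. Below is a minimal sketch, assuming 1-based inclusive line numbers and a file with at least one line. `clamp_line_range` is a hypothetical helper and the warning wording is illustrative.

```python
def clamp_line_range(start_line: int, end_line: int, total_lines: int):
    """Clamp a 1-based inclusive range to the file and report any adjustment."""
    warnings = []
    start, end = start_line, end_line
    if start < 1:
        warnings.append(f"startLine {start_line} adjusted to 1")
        start = 1
    if end > total_lines:
        warnings.append(f"endLine {end_line} adjusted to {total_lines}")
        end = total_lines
    if start > end:
        # Nothing sensible to return: the range is empty or inverted.
        raise ValueError(f"invalid range {start_line}-{end_line} for {total_lines} lines")
    return start, end, warnings
```

A request for lines 90 to 150 of a 100-line file would come back as 90 to 100 with one warning, while a request that starts past the end of the file is treated as an error and not silently adjusted.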


## Test Suite D: File Finding Tools

Claude Code Tool: Glob

Cursor Tool: glob_file_search

Octocode Tool: localFindFiles

### D1. Context Quality Tests

| Test ID | Test Name | Motivation | Claude Code Glob Behavior | Cursor glob_file_search Behavior | Octocode Expected Behavior | Success Criteria |
|---|---|---|---|---|---|---|
| D1.1 | Basic file find | Verify output richness | Array of file paths sorted by mtime | Array of file paths only | Entries with type, size, permissions, modified | Rich metadata |
| D1.2 | Modified time filter | Verify time-based search | Not supported | Not supported | modifiedWithin, modifiedBefore params | Time filtering works |
| D1.3 | Size filter | Verify size-based search | Not supported | Not supported | sizeGreater, sizeLess params | Size filtering works |
| D1.4 | Multiple patterns | Verify OR logic | Multiple calls required | Multiple calls required | names array with OR logic | Single call for multi |
| D1.5 | Depth control | Verify recursion limit | Implicit via pattern | Unlimited or fixed | maxDepth, minDepth params | Depth controllable |
| D1.6 | Type filtering | Verify type-based search | Files only | Files only | type: f (file), d (dir), l (link) | Type filtering works |
| D1.7 | Permission filter | Verify access-based search | Not supported | Not supported | executable, readable, writable | Permission filtering works |
| D1.8 | Regex pattern support | Verify regex names | Glob patterns only | Glob only | regex param for regex names | Regex enabled |
| D1.9 | Path pattern support | Verify path matching | Glob patterns (e.g., `**/*.ts`) | Basic glob | pathPattern for full path matching | Path patterns work |
| D1.10 | Empty detection | Verify empty file find | Not supported | Not supported | empty flag for zero-byte files | Empty files findable |


### D2. Efficiency Tests

| Test ID | Test Name | Motivation | Test Scenario | Success Criteria |
|---|---|---|---|---|
| D2.1 | Large filesystem | Performance at scale | 50,000 files in tree | Response < 5s |
| D2.2 | Bulk find queries | Multiple finds at once | 5 patterns vs sequential | Bulk >= 2x faster |
| D2.3 | Filtered vs unfiltered | Filter performance | With vs without filters | Filtered faster |
| D2.4 | Deep vs shallow | Depth impact | maxDepth=1 vs maxDepth=10 | Linear scaling |
| D2.5 | Pattern complexity | Regex cost | Simple vs complex regex | Reasonable overhead |


### D3. Output Quality Tests

| Test ID | Test Name | Motivation | Claude Code Glob Output | Cursor glob_file_search Output | Octocode Expected Output | Success Criteria |
|---|---|---|---|---|---|---|
| D3.1 | Structured response | Verify format | Array of paths sorted by mtime | Array of paths | Structured entries with fields | Rich structure |
| D3.2 | Metadata inclusion | Verify file info | Paths only | None | Size, type, modified, permissions | Metadata included |
| D3.3 | Sorting support | Verify ordering | Modification time only | Modification time only | sortBy option | Flexible sorting |
| D3.4 | Pagination info | Verify continuation | None | None | Page, total, hasMore | Continuable |
| D3.5 | Hints generation | Verify guidance | None | None | Hints for refinement | Agent guidance |
| D3.6 | Match count | Verify totals | Array length only | Array length only | totalFiles, distribution | Stats included |


### D4. Token Safety Tests

| Test ID | Test Name | Motivation | Risk Without Control | Octocode Mitigation | Success Criteria |
|---|---|---|---|---|---|
| D4.1 | Many matches | Prevent overflow | 10K files returned | filesPerPage pagination | Output bounded |
| D4.2 | Detail mode scaling | Control verbosity | All metadata for all files | details flag toggles | Controllable detail |
| D4.3 | Pre-find estimation | Early rejection | Large result generated | Estimate count first | Early detection |
| D4.4 | Limit enforcement | Hard cap on results | Unlimited finds | limit param enforced | Results capped |
| D4.5 | Summary mode | Minimal output | Full listing required | Count-only option | Ultra-light option |


### D5. Security Tests

| Test ID | Test Name | Motivation | Attack Vector | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| D5.1 | Path traversal | Prevent escape | `path: "../../"` | Rejected | Traversal blocked |
| D5.2 | Absolute outside | Prevent arbitrary access | `/etc` or `/home` | Rejected | Outside blocked |
| D5.3 | Symlink escape | Prevent link attack | Symlink directory | Resolved and blocked | Symlink validated |
| D5.4 | Sensitive patterns | Prevent leaks | `name: "*.env*"` | Excluded from results | Sensitive excluded |
| D5.5 | Hidden directories | Prevent exposure | `.ssh`, `.aws` | Excluded by default | Hidden protected |
| D5.6 | Regex DoS | Prevent ReDoS | Catastrophic backtracking regex | Timeout or rejection | ReDoS prevented |


### D6. Error Handling Tests

| Test ID | Test Name | Motivation | Error Scenario | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| D6.1 | Non-existent path | Handle missing | Path doesn't exist | Clear error + suggestion | Helpful error |
| D6.2 | Invalid pattern | Handle bad glob | Malformed glob pattern | Parse error message | Debuggable error |
| D6.3 | Permission denied | Handle access | Unreadable directory | Skip with warning | Graceful skip |
| D6.4 | No matches | Handle empty | Pattern matches nothing | Empty result + hints | Clear empty state |
| D6.5 | Bulk partial failure | Isolate failures | Some paths inaccessible | Successful queries return | Partial success |
| D6.6 | Invalid time format | Handle bad input | `modifiedWithin: "invalid"` | Parse error with format hint | Helpful parse error |


## Test Suite E: Integration & Workflow Tests

### E1. End-to-End Workflows

| Test ID | Test Name | Motivation | Workflow Steps | Success Criteria |
|---|---|---|---|---|
| E1.1 | Code exploration flow | Validate full workflow | Find files -> Search patterns -> Read matches | All steps succeed with context |
| E1.2 | Bug investigation flow | Validate debugging | Search error -> Find related files -> Read context | Investigation enabled |
| E1.3 | Refactoring preparation | Validate refactor support | Find usages -> List structure -> Read implementations | Refactor info complete |
| E1.4 | Security audit flow | Validate audit | Find sensitive files -> Search patterns -> Read for secrets | Audit enabled with redaction |
| E1.5 | Documentation discovery | Validate doc search | Find markdown -> Search headings -> Read sections | Doc discovery works |
| E1.6 | Dependency analysis | Validate import tracing | Search imports -> Find packages -> Read configs | Dep analysis enabled |
| E1.7 | Test coverage flow | Validate test search | Find test files -> Search assertions -> Read test bodies | Test analysis works |
| E1.8 | Configuration audit | Validate config discovery | Find configs -> Read settings -> Search for values | Config audit enabled |


### E2. Token Budget Compliance

| Test ID | Test Name | Motivation | Budget Constraint | Workflow | Success Criteria |
|---|---|---|---|---|---|
| E2.1 | Single tool budget | One tool fits budget | 10K tokens | Complex search | Output < 10K tokens |
| E2.2 | Session budget | Full session fits | 25K tokens | Find + Search + Read | Total < 25K tokens |
| E2.3 | Bulk operation budget | Bulk within limits | 15K tokens | 5 concurrent searches | Output < 15K tokens |
| E2.4 | Iterative exploration | Multiple rounds fit | 50K tokens | 5 exploration rounds | Total < 50K tokens |
| E2.5 | Pagination continuation | Continued exploration | 10K per page | Multi-page results | Each page < 10K tokens |
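
The budget checks in this table need some way to count tokens. The sketch below uses a rough characters-per-token ratio, which is an assumption and not a real tokenizer; for precise numbers, swap in the tokenizer of the model under test. `estimate_tokens` and `check_budget` are hypothetical helper names.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the 4 chars/token default is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def check_budget(outputs: list[str], budget_tokens: int) -> dict:
    """Sum the estimated tokens of all tool outputs and compare to the budget."""
    used = sum(estimate_tokens(o) for o in outputs)
    return {
        "usedTokens": used,
        "budgetTokens": budget_tokens,
        "withinBudget": used < budget_tokens,  # strict, matching "Total < 25K tokens"
    }
```

For E2.2, the list of outputs would hold the raw responses from the find, search, and read steps of one session, with `budget_tokens=25_000`.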


### E3. Research Context Preservation

| Test ID | Test Name | Motivation | Context Element | Success Criteria |
|---|---|---|---|---|
| E3.1 | Goal tracking | Verify goal persists | mainResearchGoal | Goal in all responses |
| E3.2 | Sub-goal tracking | Verify sub-goal persists | researchGoal | Sub-goal in all responses |
| E3.3 | Reasoning tracking | Verify reasoning persists | reasoning | Reasoning in all responses |
| E3.4 | Cross-tool context | Context across tools | Research context | Context maintained across tool switches |
| E3.5 | Hint relevance | Verify contextual hints | Hints match research goal | Hints aligned with goals |


## Test Suite F: Performance Benchmarks

### F1. Response Time Targets

| Test ID | Operation | Tool | Input Scale | Target P50 | Target P95 | Target Max |
|---|---|---|---|---|---|---|
| F1.1 | Pattern search | localSearchCode | 10K files | 300ms | 800ms | 2s |
| F1.2 | Directory list | localViewStructure | 1K entries | 200ms | 400ms | 1s |
| F1.3 | File read | localGetFileContent | 1MB file | 100ms | 200ms | 500ms |
| F1.4 | File find | localFindFiles | 50K files | 1s | 3s | 5s |
| F1.5 | Bulk search (5) | localSearchCode | 5 queries | 500ms | 1.5s | 3s |
| F1.6 | Bulk read (5) | localGetFileContent | 5 files | 200ms | 500ms | 1s |
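
Evaluating the P50, P95, and Max targets requires an agreed percentile definition. The sketch below uses the nearest-rank method, which is one common choice and an assumption here, since the plan does not say which definition applies. `percentile` and `meets_targets` are hypothetical helpers, and all values are in milliseconds.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted samples
    return ordered[rank - 1]


def meets_targets(samples_ms: list[float], p50_ms: float, p95_ms: float, max_ms: float) -> bool:
    """True when the measured latencies satisfy all three targets."""
    return (
        percentile(samples_ms, 50) <= p50_ms
        and percentile(samples_ms, 95) <= p95_ms
        and max(samples_ms) <= max_ms
    )
```

With only a handful of samples, P95 collapses onto the maximum, so each benchmark should be repeated enough times for the two targets to be distinguishable.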


### F2. Memory Usage Targets

| Test ID | Operation | Tool | Input Scale | Target Peak | Target Average |
|---|---|---|---|---|---|
| F2.1 | Large search | localSearchCode | 100MB directory | 100MB | 50MB |
| F2.2 | Deep recursion | localViewStructure | depth=5 tree | 50MB | 25MB |
| F2.3 | Large file | localGetFileContent | 10MB file | 30MB | 15MB |
| F2.4 | Many files | localFindFiles | 100K files | 100MB | 50MB |
| F2.5 | Bulk operations | All tools | 5 concurrent | 200MB | 100MB |


### F3. Throughput Targets

| Test ID | Operation | Tool | Target Throughput |
|---|---|---|---|
| F3.1 | Sequential searches | localSearchCode | 20 queries/second |
| F3.2 | File reads | localGetFileContent | 50 files/second |
| F3.3 | Directory listings | localViewStructure | 30 directories/second |
| F3.4 | File finds | localFindFiles | 10 queries/second |


## Test Suite G: Edge Cases & Stress Tests

### G1. Edge Cases

| Test ID | Test Name | Motivation | Edge Case | Expected Behavior | Success Criteria |
|---|---|---|---|---|---|
| G1.1 | Empty file | Zero-byte file handling | 0 bytes file | Read succeeds, empty content | No error |
| G1.2 | Empty directory | Empty directory handling | 0 files/dirs | List succeeds, empty results | No error |
| G1.3 | Single character | Minimal content | 1 byte file | Read succeeds | No error |
| G1.4 | Very long filename | Filename edge | 255 character name | Handled correctly | No truncation error |
| G1.5 | Unicode filename | International names | Chinese filename | Handled correctly | Unicode works |
| G1.6 | Spaces in path | Special characters | `path with spaces/file.txt` | Handled correctly | Spaces work |
| G1.7 | Special characters | Path edge cases | `file[1](2){3}.txt` | Escaped correctly | Special chars work |
| G1.8 | No extension | Extension-less file | Makefile, Dockerfile | Typed correctly | No extension works |
| G1.9 | Multiple extensions | Complex extension | `file.test.spec.ts` | Correct extension detected | Multi-ext works |
| G1.10 | Dotfile | Hidden files | `.gitignore` | Correctly typed | Dotfiles work |


### G2. Stress Tests

| Test ID | Test Name | Motivation | Stress Factor | Success Criteria |
|---|---|---|---|---|
| G2.1 | Maximum file count | Scale limits | 100K files in search path | Completes without crash |
| G2.2 | Maximum file size | Large file handling | 100MB file | Handled with pagination |
| G2.3 | Maximum directory depth | Deep nesting | 50 levels deep | Completes with depth limit |
| G2.4 | Maximum concurrent queries | Parallelism limits | 10 concurrent bulk queries | All complete |
| G2.5 | Rapid sequential calls | Rate handling | 100 calls in 10 seconds | All succeed |
| G2.6 | Mixed operations | Combined load | Search + List + Read + Find concurrent | All succeed |


Test Suite H: Large File Handling


Focus: Character/line pagination, context-aware reading, memory efficiency for large files
Applicable Tools: localGetFileContent (Octocode) vs Read (Claude Code) vs read_file (Cursor)


H1. Large File Detection & Awareness


| Test ID | Test Name | Motivation | File Size | Claude Code `Read` | Cursor `read_file` | Octocode Expected | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H1.1 | File size detection | Awareness before reading | 10MB file | Reads up to 2000 lines | Attempts full read | Reports `fileSize` before content | Size known upfront |
| H1.2 | Line count detection | Know total lines | 100K lines | No line count | No line count | Reports `totalLines` | Line count available |
| H1.3 | Character count detection | Know total chars | 5M chars | No char count | No char count | Reports `totalChars` | Char count available |
| H1.4 | Large file warning | Proactive guidance | >1MB file | Truncates at 2000 lines | No warning | Warning + pagination hints | User informed |
| H1.5 | Giant file rejection | Prevent catastrophic read | 500MB file | Truncates output | May crash/timeout | Clear error + guidance | Safe rejection |
| H1.6 | Memory estimation | Predict resource usage | 50MB file | No estimation | No estimation | Estimated memory in metadata | Resource awareness |


H2. Character-Based Pagination


| Test ID | Test Name | Motivation | Scenario | Claude Code Behavior | Cursor Behavior | Octocode Expected | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H2.1 | First page read | Initial chunk | 10MB file, `charLength=10000` | Reads 2000 lines (no char control) | Full file or error | First 10K chars + pagination | Bounded first page |
| H2.2 | Middle page read | Continue reading | `charOffset=50000`, `charLength=10000` | Uses `offset`/`limit` (line-based) | Not supported | Chars 50K-60K returned | Mid-file access |
| H2.3 | Last page detection | Know when done | Near end of file | No indication | No indication | `hasMore: false` when complete | End detection |
| H2.4 | Page boundary alignment | Clean breaks | `charOffset=9999` | Line-based, not char-based | May break mid-word | Attempts word boundary | Clean breaks |
| H2.5 | UTF-8 boundary safety | Handle multi-byte | Offset lands mid-character | N/A (line-based) | May corrupt | Adjusts to char boundary | Safe UTF-8 |
| H2.6 | Continuation hint | Guide next page | After any page | No hint | No hint | `nextCharOffset` provided | Easy continuation |
| H2.7 | Page size limits | Enforce max page | `charLength=1000000` | 2000 line limit | Allowed | Max limit enforced | Page bounded |
| H2.8 | Zero offset handling | Start from beginning | `charOffset=0` | Works (`offset=0`) | Works | Works with metadata | Clean start |
| H2.9 | Negative offset rejection | Invalid input | `charOffset=-100` | Undefined | Undefined | Clear error | Invalid rejected |
| H2.10 | Beyond-file offset | Past EOF | `charOffset > fileSize` | Undefined | Undefined | Empty with warning | EOF handled |
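
The pagination contract these rows describe (bounded pages, `hasMore`, `nextCharOffset`, rejection of negative offsets, empty result past EOF) can be modelled in a few lines. This is a minimal sketch under stated assumptions, not Octocode's implementation: `read_page`, the `Page` fields, and the `MAX_CHAR_LENGTH` cap of 50,000 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CHAR_LENGTH = 50_000  # assumed cap for illustration; the real limit is not stated


@dataclass
class Page:
    content: str
    char_offset: int
    total_chars: int
    has_more: bool
    next_char_offset: Optional[int]


def read_page(text: str, char_offset: int = 0, char_length: int = 10_000) -> Page:
    """Return one bounded page of `text`, addressed in characters."""
    if char_offset < 0:
        raise ValueError("charOffset must be >= 0")       # H2.9
    if char_length <= 0:
        raise ValueError("charLength must be > 0")
    length = min(char_length, MAX_CHAR_LENGTH)             # H2.7
    total = len(text)
    start = min(char_offset, total)                        # H2.10: past EOF -> empty page
    end = min(start + length, total)
    has_more = end < total                                 # H2.3
    return Page(text[start:end], start, total, has_more,
                end if has_more else None)                 # H2.6
```

Because Python strings are indexed by code point, a slice can never split a multi-byte character, which sidesteps H2.5. An implementation that paginates over raw bytes would have to adjust offsets to a character boundary explicitly.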


H3. Line-Based Pagination


| Test ID | Test Name | Motivation | Scenario | Claude Code Behavior | Cursor Behavior | Octocode Expected | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H3.1 | Line range read | Specific lines | `startLine=100`, `endLine=200` | `offset`/`limit` params | `offset`/`limit` params | Lines 100-200 with numbers | Exact line range |
| H3.2 | First N lines | File header | `startLine=1`, `endLine=50` | `offset=0`, `limit=50` | `offset=0`, `limit=50` | First 50 lines | Header access |
| H3.3 | Last N lines | File footer | Last 100 lines | Not directly supported | Not supported | Tail functionality | Footer access |
| H3.4 | Single line read | Precise line | `startLine=500`, `endLine=500` | `offset` + `limit=1` | Manual calculation | Exactly line 500 | Single line |
| H3.5 | Line overflow handling | Beyond total | `endLine=1000000` | Reads up to EOF | May error | Adjusts to `totalLines` + warning | Graceful adjustment |
| H3.6 | Line number annotation | Reference lines | Any range | `cat -n` format includes numbers | No numbers | Lines prefixed with numbers | Numbers present |
| H3.7 | Empty line handling | Preserve blanks | Range with empty lines | Included | Included | Empty lines preserved | Blanks kept |
| H3.8 | Very long lines | Single huge line | 1MB single line | Truncated at 2000 chars | Full line | Line truncated with marker | Line bounded |
| H3.9 | Mixed line endings | CR/LF/CRLF | Mixed endings | May miscalculate | May miscalculate | Correct line count | Endings handled |
| H3.10 | Continuation metadata | Next range info | After any range | None | None | `nextStartLine` provided | Easy continuation |
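
Several of these rows (H3.5, H3.6, H3.7, H3.9, H3.10) describe one behavior: read a 1-based inclusive line range, clamp it to the file, number the lines, and say where to continue. The sketch below illustrates that behavior; `read_line_range` and its result keys are hypothetical names chosen to echo the `totalLines` and `nextStartLine` fields in the table.

```python
from typing import Dict


def read_line_range(text: str, start_line: int, end_line: int) -> Dict[str, object]:
    """Return lines start_line..end_line (1-based, inclusive) with numbering."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("expected 1 <= startLine <= endLine")
    # splitlines() treats \n, \r\n and \r as line breaks (H3.9). It also splits
    # on a few rarer separators such as \x0c and \u2028, which may or may not
    # be what a given tool wants.
    lines = text.splitlines()
    total = len(lines)
    warnings = []
    if end_line > total:                                   # H3.5
        warnings.append(f"endLine {end_line} exceeds totalLines {total}; adjusted")
        end_line = total
    selected = lines[start_line - 1:end_line]              # blanks kept (H3.7)
    numbered = [f"{n}: {line}" for n, line in enumerate(selected, start=start_line)]
    return {
        "content": numbered,                               # H3.6
        "totalLines": total,
        "nextStartLine": end_line + 1 if end_line < total else None,  # H3.10
        "warnings": warnings,
    }
```

For example, `read_line_range("a\r\nb\rc\n\nd", 2, 3)` counts five lines in total and returns lines 2 and 3 with `nextStartLine` set to 4.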


H4. Context-Aware Large File Reading


| Test ID | Test Name | Motivation | Scenario | Claude Code Behavior | Cursor Behavior | Octocode Expected | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H4.1 | Pattern in large file | Find specific content | `matchString` in 50MB file | Use `Grep` tool instead | Not supported | Finds pattern + context | Pattern found |
| H4.2 | Pattern context lines | Surrounding context | `matchStringContextLines=10` | Grep `-C` flag | Not supported | 10 lines before/after | Context included |
| H4.3 | Multiple pattern matches | All occurrences | Pattern appears 100 times | Grep returns all | N/A | First N matches + count | Matches bounded |
| H4.4 | Pattern near file start | Beginning match | Match in first 100 lines | Grep handles | N/A | Correct context, no negative lines | Start bounded |
| H4.5 | Pattern near file end | Ending match | Match in last 100 lines | Grep handles | N/A | Correct context, no overflow | End bounded |
| H4.6 | No match in large file | Handle missing | Pattern not in 50MB file | Grep returns empty | N/A | `status: empty` + hints | Clear no-match |
| H4.7 | Regex in large file | Complex pattern | Regex `matchString` | Grep supports regex | Not supported | Regex matching works | Regex enabled |
| H4.8 | Case-insensitive match | Flexible matching | `matchStringCaseSensitive=false` | Grep `-i` flag | N/A | Case-insensitive search | Case flexible |
| H4.9 | Context overlap handling | Adjacent matches | Two matches 5 lines apart | Grep shows all | N/A | Merged context, no duplication | Context merged |
| H4.10 | Match distribution info | Where are matches | Pattern with many matches | Grep count mode | N/A | Distribution by line ranges | Distribution shown |
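
Rows H4.4, H4.5, and H4.9 all come down to how context windows around matches are computed: they must be clamped to the file and merged when they overlap. One way to express that, as a sketch with a hypothetical `context_ranges` helper:

```python
from typing import Iterable, List, Tuple


def context_ranges(match_lines: Iterable[int], context: int,
                   total_lines: int) -> List[Tuple[int, int]]:
    """Turn 1-based match line numbers into merged (start, end) context ranges."""
    ranges: List[Tuple[int, int]] = []
    for line in sorted(match_lines):
        start = max(1, line - context)              # H4.4: never below line 1
        end = min(total_lines, line + context)      # H4.5: never past the last line
        if ranges and start <= ranges[-1][1] + 1:
            # Overlapping or directly adjacent window: extend the previous one (H4.9).
            prev_start, prev_end = ranges[-1]
            ranges[-1] = (prev_start, max(prev_end, end))
        else:
            ranges.append((start, end))
    return ranges


print(context_ranges([3, 100, 105, 998], context=10, total_lines=1000))
# [(1, 13), (90, 115), (988, 1000)]
```

Lines between two emitted ranges are exactly the ones a tool would replace with an omission marker.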


H5. Large File Token Efficiency


| Test ID | Test Name | Motivation | Scenario | Token Risk | Octocode Mitigation | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- |
| H5.1 | JSON minification | Reduce JSON size | 5MB JSON file | 5M tokens | Minified output | 30-50% reduction |
| H5.2 | YAML minification | Reduce YAML size | 2MB YAML file | 2M tokens | Minified output | 20-40% reduction |
| H5.3 | Code minification | Reduce code size | 1MB TypeScript | 1M tokens | Whitespace normalized | 10-20% reduction |
| H5.4 | Minification toggle | Control minification | `minified=false` | Full output | Full formatting preserved | Toggle works |
| H5.5 | Selective minification | File type aware | Mixed file types | Varies | Type-appropriate minification | Type-aware |
| H5.6 | Token estimation | Budget awareness | Any large file | Unknown | Estimated tokens in metadata | Estimation provided |
| H5.7 | Budget warning | Proactive alert | Output > 10K tokens | No warning | Warning before output | Early warning |
| H5.8 | Chunked JSON | Split large JSON | 50MB JSON | Massive | Split by structure | Structure-aware split |
| H5.9 | Log file handling | Handle log patterns | 100MB log file | Massive | Intelligent sampling | Sampled output |
| H5.10 | Repeated content detection | Optimize redundancy | Highly repetitive | Bloated | Repetition collapsed | Redundancy removed |
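
The reduction percentages in H5.1 and the estimate in H5.6 need a concrete way to be measured. A minimal sketch follows, assuming that JSON minification means re-serializing without insignificant whitespace and that tokens are approximated at four characters each; both are assumptions for illustration, and a real tokenizer will give different counts.

```python
import json
import math


def minify_json(text: str) -> str:
    """Re-serialize JSON with no insignificant whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)


def reduction_percent(original: str, minified: str) -> float:
    """Percentage of characters saved by minification."""
    if not original:
        return 0.0
    return 100.0 * (len(original) - len(minified)) / len(original)


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; the 4 chars/token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


pretty = json.dumps({"name": "octocode", "tags": ["a", "b"], "nested": {"k": 1}}, indent=2)
small = minify_json(pretty)
print(f"{reduction_percent(pretty, small):.1f}% smaller, "
      f"~{estimate_tokens(small)} tokens instead of ~{estimate_tokens(pretty)}")
```

Round-tripping through a parser is not fully lossless: duplicate keys collapse and some number spellings are normalized, which a test for H5.1 may want to account for.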


H6. Large File Performance


| Test ID | Test Name | Motivation | File Size | Operation | Target Time | Target Memory |
| --- | --- | --- | --- | --- | --- | --- |
| H6.1 | Size detection only | Fast metadata | 100MB file | Get size/lines | < 50ms | < 5MB |
| H6.2 | First page read | Initial access | 100MB file | First 10K chars | < 100ms | < 20MB |
| H6.3 | Pattern search | Find in large | 50MB file | `matchString` | < 500ms | < 100MB |
| H6.4 | Line range extraction | Specific lines | 100MB file | 100 lines mid-file | < 200ms | < 30MB |
| H6.5 | Full minification | Compress output | 10MB JSON | Read + minify | < 1s | < 50MB |
| H6.6 | Multiple patterns | Bulk match | 50MB file, 5 patterns | All patterns | < 2s | < 150MB |
| H6.7 | Streaming read | Continuous pages | 100MB file, 10 pages | Sequential reads | < 3s total | < 50MB peak |
| H6.8 | Concurrent large files | Parallel access | 5x 20MB files | Bulk read | < 2s | < 200MB |


H7. Large File Edge Cases


| Test ID | Test Name | Motivation | Edge Case | Expected Behavior | Success Criteria |
| --- | --- | --- | --- | --- | --- |
| H7.1 | Single line file | No newlines | 10MB single line | Handled with truncation | No hang |
| H7.2 | Binary-like content | Non-text patterns | File with binary segments | Detection + warning | Clean handling |
| H7.3 | Empty large file | Sparse file | Reported 1GB, actually empty | Correct detection | Size accurate |
| H7.4 | Unicode heavy file | Multi-byte heavy | 10MB emoji file | Correct char count | Chars accurate |
| H7.5 | Growing file | File being written | File changing during read | Warning + snapshot | Consistent read |
| H7.6 | Compressed content | Gzip/encoded | `.gz` or base64 content | Warning, no decode | No expansion |
| H7.7 | Null bytes | Binary markers | Text with null bytes | Filtered or warned | Clean output |
| H7.8 | Extremely long lines | Code dumps | 100K char single line | Line truncated | Line bounded |
| H7.9 | Millions of lines | Line count scale | 10M lines, 500MB | Line count correct | Count accurate |
| H7.10 | Nested JSON depth | Deep structure | 100 levels nested | Handles gracefully | No stack overflow |


Test Suite I: Large Repository Handling


Focus: Fast searching at scale, quality results in monorepos, resource efficiency
Applicable Tools: All local tools (localSearchCode, localViewStructure, localFindFiles, localGetFileContent)


I1. Large Repository Detection & Awareness


| Test ID | Test Name | Motivation | Repository Scale | Claude Code Behavior | Cursor Behavior | Octocode Expected | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| I1.1 | Repository size detection | Know scale upfront | 100K files | No awareness | No awareness | File count in metadata | Scale known |
| I1.2 | Directory depth detection | Know nesting | 50 levels deep | No awareness | No awareness | Max depth reported | Depth known |
| I1.3 | Total size estimation | Know data volume | 5GB total | No awareness | No awareness | Total size in metadata | Volume known |
| I1.4 | Monorepo detection | Identify structure | `packages/`, `apps/` | No detection | No detection | Monorepo pattern detected | Structure detected |
| I1.5 | Language distribution | Know tech stack | Mixed languages | No info | No info | Language breakdown | Stack known |
| I1.6 | Hot path identification | Frequently used | Build artifacts heavy | No info | No info | Size distribution by path | Hot paths known |


I2. Large Repository Search Performance


| Test ID | Test Name | Motivation | Scale | Pattern | Target Time | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- |
| I2.1 | Simple pattern search | Baseline speed | 50K files | `function` | < 2s | Fast baseline |
| I2.2 | Complex regex search | Regex overhead | 50K files | Complex regex | < 5s | Acceptable regex |
| I2.3 | Rare pattern search | Needle in haystack | 100K files | Unique string | < 3s | Fast sparse match |
| I2.4 | Common pattern search | High match count | 50K files | `import` | < 3s | Fast common match |
| I2.5 | Multi-pattern bulk | Parallel patterns | 50K files, 5 patterns | Varied | < 5s total | Efficient bulk |
| I2.6 | Incremental search | Early termination | 100K files | First 10 matches | < 500ms | Fast early stop |
| I2.7 | Filtered search | With type filter | 50K files, `*.ts` only | Any pattern | < 1.5s | Filter speedup |
| I2.8 | Path-scoped search | Subdirectory only | 100K files, `src/` only | Any pattern | < 1s | Scope speedup |
| I2.9 | Exclusion patterns | Skip directories | 100K files, skip `node_modules` | Any pattern | < 2s | Exclusion works |
| I2.10 | Cold vs warm search | Cache benefit | 50K files | Same pattern twice | 2nd < 50% of 1st | Cache effective |


I3. Large Repository Search Quality


| Test ID | Test Name | Motivation | Scenario | Claude Code `Grep` | Cursor `grep` | Octocode Expected | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| I3.1 | Result relevance ranking | Most useful first | Common pattern | File order | Random order | Relevance-based hints | Ranked results |
| I3.2 | Match distribution info | Where are matches | Pattern across repo | File list | Flat list | Distribution by path/file | Distribution shown |
| I3.3 | File importance hints | Guide exploration | Many matches | No guidance | No guidance | Hints for key files | Key files highlighted |
| I3.4 | Duplicate detection | Avoid redundancy | Copied files | All shown | All shown | Duplicates noted | Duplicates flagged |
| I3.5 | Test vs source distinction | Code vs test | Pattern in both | Mixed together | Mixed together | Source/test separation hint | Distinction clear |
| I3.6 | Generated file detection | Skip generated | Build output matches | Included | Included | Generated files flagged | Generated noted |
| I3.7 | Stale file detection | Freshness info | Old vs new matches | No dates | No dates | Modified dates included | Freshness visible |
| I3.8 | Context completeness | Useful context | Matches need context | Context via `-A`/`-B`/`-C` | Raw lines | Smart context boundaries | Context useful |
| I3.9 | Cross-reference hints | Related searches | Pattern found | No suggestions | No suggestions | Related pattern hints | Cross-refs provided |
| I3.10 | Confidence scoring | Match quality | Varied matches | All equal | All equal | Match confidence indicated | Confidence shown |


I4. Large Repository Directory Listing


| Test ID | Test Name | Motivation | Scale | Claude Code `Bash(ls)`/`Glob` | Cursor `list_dir` | Octocode Expected | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| I4.1 | Root overview | Quick orientation | 50K file repo | `ls` output or glob | Full flat list | Summary + top entries | Quick overview |
| I4.2 | Deep tree exploration | Navigate structure | `depth=3` | Multiple calls or `ls -R` | Multiple calls | Single call, paginated | Single call works |
| I4.3 | Monorepo packages | List packages | `packages/*` | Manual navigation | Manual navigation | Package detection + list | Packages listed |
| I4.4 | Size-sorted listing | Find large paths | Large repo | `ls -lS` | No sorting | Sorted by size desc | Large paths first |
| I4.5 | Recently modified | Find active areas | Large repo | `ls -lt` | No time sort | Sorted by modified | Recent first |
| I4.6 | Empty directory skip | Hide empty | Sparse structure | All shown | All shown | Empty dirs noted/hidden | Empty handled |
| I4.7 | Symlink handling | Handle links | Many symlinks | May follow | May follow | Links noted, not followed deep | Links safe |
| I4.8 | Permission issues | Handle restricted | Mixed permissions | May error | May error | Skipped with warning | Restricted handled |
| I4.9 | Very deep paths | Handle nesting | 100 levels | May timeout | May timeout | Depth-limited with hint | Depth bounded |
| I4.10 | Wide directories | Many children | 10K files in one dir | All returned | All returned | Paginated | Wide paginated |


I5. Large Repository File Finding


| Test ID | Test Name | Motivation | Scale | Pattern | Target Time | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- |
| I5.1 | Find by name | Specific file | 100K files | `package.json` | < 1s | Fast name find |
| I5.2 | Find by extension | Type search | 100K files | `*.ts` | < 2s | Fast type find |
| I5.3 | Find by size | Large files | 100K files | > 1MB | < 3s | Fast size find |
| I5.4 | Find by date | Recent files | 100K files | modified < 7d | < 3s | Fast date find |
| I5.5 | Combined filters | Multi-criteria | 100K files | `*.ts`, > 10KB, recent | < 3s | Fast combined |
| I5.6 | Find in subtree | Scoped find | 100K files, `src/` | Any pattern | < 1s | Scope speedup |
| I5.7 | Find with exclusions | Skip paths | 100K files | Skip `node_modules`, `dist` | < 2s | Exclusion works |
| I5.8 | Regex filename | Complex names | 50K files | Regex pattern | < 3s | Regex works |
| I5.9 | Bulk find patterns | Multiple patterns | 100K files, 5 patterns | Varied | < 5s | Bulk efficient |
| I5.10 | Find empty files | Zero-byte | 100K files | `empty=true` | < 2s | Empty found |
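
A reference implementation is useful for validating the results of I5.2 to I5.7 (name pattern, size, age, and directory exclusions combined). The sketch below is a plain-Python baseline, not the tool under test; `find_files` and its parameters are hypothetical.

```python
import fnmatch
import os
import time
from typing import List, Optional, Sequence


def find_files(root: str, pattern: str = "*", min_size: int = 0,
               modified_within_days: Optional[float] = None,
               exclude_dirs: Sequence[str] = ("node_modules", "dist")) -> List[str]:
    """Walk `root` and return sorted paths matching every given filter."""
    cutoff = None
    if modified_within_days is not None:
        cutoff = time.time() - modified_within_days * 86_400
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
        for name in filenames:
            if not fnmatch.fnmatch(name, pattern):
                continue
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file: skip rather than fail
            if info.st_size < min_size:
                continue
            if cutoff is not None and info.st_mtime < cutoff:
                continue
            results.append(path)
    return sorted(results)
```

Pruning `dirnames` before descending is what makes exclusions cheap: excluded subtrees are never enumerated at all, which is the speedup I5.7 is meant to confirm.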


I6. Large Repository Resource Management


| Test ID | Test Name | Motivation | Scale | Operation | Target Memory | Target CPU |
| --- | --- | --- | --- | --- | --- | --- |
| I6.1 | Memory during search | RAM efficiency | 100K files | Full search | < 200MB peak | < 50% avg |
| I6.2 | Memory during listing | RAM for tree | 50K entries | `depth=3` list | < 100MB peak | < 30% avg |
| I6.3 | Memory during find | RAM for find | 100K files | Pattern find | < 150MB peak | < 40% avg |
| I6.4 | CPU during search | CPU efficiency | 100K files | Complex search | N/A | < 80% peak |
| I6.5 | Concurrent operations | Parallel efficiency | 50K files | 5 concurrent ops | < 300MB total | Balanced |
| I6.6 | Idle resource release | Cleanup after ops | Any | After completion | < 20MB retained | Cleanup works |
| I6.7 | Rate limiting | Prevent overload | Rapid requests | 100 ops/minute | Stable | Rate controlled |
| I6.8 | Graceful degradation | Under load | High concurrent | 20 parallel | All complete | No crashes |


I7. Large Repository Token Budget


| Test ID | Test Name | Motivation | Scale | Operation | Token Risk | Mitigation | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- | --- |
| I7.1 | Search result cap | Prevent overflow | 10K matches | Pattern search | 100K tokens | Pagination | Output < 10K tokens |
| I7.2 | Directory tree cap | Prevent tree dump | 50K entries | `depth=3` list | 200K tokens | Pagination | Output < 10K tokens |
| I7.3 | Find results cap | Prevent file dump | 10K matches | File find | 50K tokens | Pagination | Output < 10K tokens |
| I7.4 | Bulk operation cap | Combined limit | Mixed | 5 operations | 500K tokens | Per-op limits | Total < 25K tokens |
| I7.5 | Summary mode | Minimal output | Large repo | Any operation | High | Summary only | Output < 2K tokens |
| I7.6 | Count-only mode | Stats without data | Large repo | Search/find | High | Counts only | Output < 500 tokens |
| I7.7 | Progressive disclosure | Expand on demand | Large results | Any | Varies | Start minimal | Controllable expansion |
| I7.8 | Hint-based exploration | Guide not dump | Large repo | Any | Varies | Hints over data | Hints actionable |
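
The caps in I7.1 to I7.3 imply that a page is cut by estimated token cost rather than by a fixed item count. A minimal sketch of that idea follows, assuming a four-characters-per-token heuristic; `page_within_budget`, `estimate_tokens`, and the result keys are hypothetical names.

```python
from typing import Dict, List


def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate via ceiling division; the ratio is an assumption."""
    return -(-len(text) // chars_per_token)


def page_within_budget(results: List[str], offset: int, max_tokens: int) -> Dict[str, object]:
    """Return as many results from `offset` as fit in `max_tokens`."""
    page: List[str] = []
    used = 0
    for item in results[offset:]:
        cost = estimate_tokens(item)
        # Always emit at least one item so that pagination makes progress,
        # even when a single result is larger than the whole budget.
        if page and used + cost > max_tokens:
            break
        page.append(item)
        used += cost
    next_offset = offset + len(page)
    has_more = next_offset < len(results)
    return {
        "items": page,
        "estimatedTokens": used,
        "hasMore": has_more,
        "nextOffset": next_offset if has_more else None,
    }
```

A test for I7.1 can then assert both sides of the contract: `estimatedTokens` stays under the cap for ordinary results, and following `nextOffset` until `hasMore` is false visits every match exactly once.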


I8. Large Repository Monorepo Support


| Test ID | Test Name | Motivation | Monorepo Type | Scenario | Expected Behavior | Success Criteria |
| --- | --- | --- | --- | --- | --- | --- |
| I8.1 | Package detection | Identify packages | npm workspaces | List packages | Packages enumerated | Packages found |
| I8.2 | Cross-package search | Search all packages | Lerna/Yarn | Pattern search | Results grouped by package | Grouped results |
| I8.3 | Package-scoped search | Single package | Any | Search in `packages/foo` | Scoped correctly | Scope works |
| I8.4 | Shared dependencies | Common `node_modules` | npm workspaces | Find shared | Hoisted deps detected | Hoisting aware |
| I8.5 | Package relationships | Dep graph hints | Any | List packages | Inter-package deps noted | Deps shown |
| I8.6 | Root vs package config | Config hierarchy | Any | Find configs | Root + package configs | Hierarchy shown |
| I8.7 | Build output detection | Skip dist/build | Any | Search code | Generated excluded | Generated skipped |
| I8.8 | Test file organization | Test location | Various | Find tests | Tests grouped by package | Tests organized |
| I8.9 | Changelog navigation | Version history | Any | Find changelogs | Package changelogs listed | Changelogs found |
| I8.10 | Documentation search | Doc discovery | Any | Find docs | Package docs listed | Docs found |
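
For I8.1, the expected package list can be derived independently from the root `package.json`. In npm the `workspaces` field is an array of glob patterns, and Yarn classic also accepts an object with a `packages` array. The sketch below builds the expected list from that field; `find_workspace_packages` is a hypothetical helper, and negated patterns (a leading `!`) are deliberately not handled.

```python
import json
from pathlib import Path
from typing import List


def find_workspace_packages(root: Path) -> List[str]:
    """List workspace directories (relative to root) that contain a package.json."""
    manifest = json.loads((root / "package.json").read_text(encoding="utf-8"))
    workspaces = manifest.get("workspaces", [])
    if isinstance(workspaces, dict):
        # Object form: {"workspaces": {"packages": ["packages/*"]}}
        workspaces = workspaces.get("packages", [])
    packages = set()
    for pattern in workspaces:
        for candidate in root.glob(pattern):
            # A directory only counts as a package if it has its own manifest.
            if (candidate / "package.json").is_file():
                packages.add(candidate.relative_to(root).as_posix())
    return sorted(packages)
```

Comparing this list with what the tool reports gives a concrete pass condition for "Packages enumerated" instead of a visual check.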


I9. Large Repository Error Resilience


| Test ID | Test Name | Motivation | Error Scenario | Expected Behavior | Success Criteria |
| --- | --- | --- | --- | --- | --- |
| I9.1 | Partial permission | Some files restricted | Mixed permissions | Accessible files returned | Partial success |
| I9.2 | Corrupted files | Bad files in repo | Unreadable files | Skipped with warning | Corruption handled |
| I9.3 | Missing symlinks | Broken symlinks | Dangling links | Skipped with warning | Broken links handled |
| I9.4 | Very long paths | Path length limits | 500+ char paths | Truncated or skipped | Long paths handled |
| I9.5 | Special characters | Unusual filenames | Unicode/special chars | Escaped correctly | Special chars handled |
| I9.6 | Concurrent modification | Files changing | Active development | Warning + snapshot | Concurrent handled |
| I9.7 | Disk space issues | Low disk | During operation | Graceful error | Disk errors handled |
| I9.8 | Network paths | Mounted drives | NFS/SMB paths | Warning about latency | Network paths noted |
| I9.9 | Large binary files | Non-text in repo | Binary blobs | Skipped in search | Binaries handled |
| I9.10 | Recursive symlinks | Symlink loops | Circular links | Loop detection | Loops prevented |
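
I9.3 and I9.10 both concern symlinks encountered while walking a tree. One common way to stay safe is to remember the resolved path of every directory already visited and to report, rather than follow, anything that cannot be resolved. The sketch below illustrates that approach; `walk_without_loops` is a hypothetical helper and says nothing about how Octocode traverses directories.

```python
import os
from typing import List, Tuple


def walk_without_loops(root: str) -> Tuple[List[str], List[str]]:
    """Collect files under `root`, following symlinks but never revisiting a directory."""
    seen = set()
    files: List[str] = []
    warnings: List[str] = []
    stack = [root]
    while stack:
        current = stack.pop()
        real = os.path.realpath(current)
        if real in seen:
            # Either a symlink loop or a second link to the same directory.
            warnings.append(f"skipped already-visited directory: {current}")
            continue
        seen.add(real)
        try:
            entries = list(os.scandir(current))
        except OSError as exc:
            warnings.append(f"cannot list {current}: {exc}")
            continue
        for entry in entries:
            try:
                if entry.is_dir():
                    stack.append(entry.path)
                elif entry.is_file():
                    files.append(entry.path)
                elif entry.is_symlink():
                    warnings.append(f"dangling symlink: {entry.path}")
            except OSError as exc:
                warnings.append(f"cannot inspect {entry.path}: {exc}")
    return sorted(files), warnings
```

A fixture for these tests needs only three things: one regular file, one symlink pointing back at an ancestor directory, and one symlink whose target does not exist.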


Acceptance Criteria Summary


P0 - Must Pass (Critical)


| Category | Test IDs | Requirement |
| --- | --- | --- |
| Security - Path Validation | A5.1-A5.8, B5.1-B5.4, C5.1-C5.3, D5.1-D5.3 | All path traversal attacks blocked |
| Security - Secret Detection | C5.4-C5.14 | All credential types redacted |
| Token Safety - Pagination | A4.1, A4.4, B4.1, C4.1, D4.1, H2.1-H2.6, I7.1-I7.4 | Large outputs paginated |
| Token Safety - Limits | A4.2, A4.5, B4.3, C4.2-C4.4, D4.4, H5.6-H5.7 | Output size bounded |
| Large File Safety | H1.4-H1.5, H2.7, H3.8, H4.3 | Large file access controlled |
| Large Repo Safety | I6.1-I6.4, I7.1-I7.8 | Large repo resource bounded |
| Error Handling - Basic | A6.1-A6.3, B6.1-B6.3, C6.1-C6.3, D6.1-D6.3, I9.1-I9.5 | Graceful error messages |


P1 - Should Pass (High Priority)


| Category | Test IDs | Requirement |
| --- | --- | --- |
| Context Quality | A1.1-A1.5, B1.1-B1.4, C1.1-C1.6, D1.1-D1.4 | Structured, rich context |
| Output Quality | A3.1-A3.5, B3.1-B3.4, C3.1-C3.5, D3.1-D3.4 | Consistent, useful output |
| Efficiency - Bulk | A2.2, B2.3, C2.3, D2.2 | Bulk faster than sequential |
| Performance | F1.1-F1.6, H6.1-H6.6, I2.1-I2.7 | Response times within targets |
| Hint Generation | A1.4, A3.3, B3.4, C3.7, D3.5 | Actionable hints provided |
| Large File Pagination | H2.1-H2.10, H3.1-H3.10 | Char/line pagination works |
| Large File Context | H4.1-H4.10 | Context-aware reading |
| Large Repo Search | I2.1-I2.10, I3.1-I3.10 | Fast, quality search |


P2 - Nice to Have (Medium Priority)


| Category | Test IDs | Requirement |
| --- | --- | --- |
| Research Context | A1.9, E3.1-E3.5 | Research goals preserved |
| Advanced Filtering | A1.7, B1.3-B1.5, C1.9-C1.10, D1.2-D1.7 | Advanced filter options |
| Integration Workflows | E1.1-E1.8 | End-to-end workflows succeed |
| Edge Cases | G1.1-G1.10, H7.1-H7.10 | Edge cases handled |
| Memory Efficiency | F2.1-F2.5, I6.1-I6.8 | Memory within targets |
| Large File Efficiency | H5.1-H5.10, H6.1-H6.8 | Token-efficient file reading |
| Monorepo Support | I8.1-I8.10 | Monorepo patterns detected |


P3 - Bonus (Low Priority)


| Category | Test IDs | Requirement |
| --- | --- | --- |
| Stress Tests | G2.1-G2.6 | Stress tests pass |
| Throughput | F3.1-F3.4 | Throughput targets met |
| Fallback Mechanisms | A2.4 | Graceful degradation |
| Large Repo Resilience | I9.1-I9.10 | Error resilience at scale |
| Advanced Large File | H1.1-H1.6 | Proactive large file detection |
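
The priority tables refer to tests by ID and by ID range, so a test runner needs to expand entries such as `C4.2-C4.4` into individual IDs before it can select a phase. A small sketch of that expansion follows, assuming every ID has the form letter, section number, dot, test number; `expand_ids` is a hypothetical helper.

```python
import re
from typing import List

_ID = re.compile(r"^([A-Z]\d+)\.(\d+)$")


def expand_ids(spec: str) -> List[str]:
    """Expand 'A4.1, C4.2-C4.4' into ['A4.1', 'C4.2', 'C4.3', 'C4.4']."""
    ids: List[str] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (p.strip() for p in part.split("-", 1))
        else:
            first = last = part
        m_first, m_last = _ID.match(first), _ID.match(last)
        if not m_first or not m_last or m_first.group(1) != m_last.group(1):
            raise ValueError(f"unsupported test ID or range: {part!r}")
        low, high = int(m_first.group(2)), int(m_last.group(2))
        if high < low:
            raise ValueError(f"descending range: {part!r}")
        ids.extend(f"{m_first.group(1)}.{n}" for n in range(low, high + 1))
    return ids


print(expand_ids("A4.1, A4.4, C4.2-C4.4"))
# ['A4.1', 'A4.4', 'C4.2', 'C4.3', 'C4.4']
```

Expanding the ranges also makes it easy to spot IDs that appear under more than one priority level.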


Test Execution Plan


Phase 1: Security & Critical


Execute all P0 tests
All security tests must pass
All token safety tests must pass
Large file/repo safety controls validated

Phase 2: Quality & Performance


Execute all P1 tests
Context quality validated
Performance benchmarks met
Large file pagination working
Large repo search quality confirmed

Phase 3: Integration & Edge Cases


Execute all P2 tests
Integration workflows validated
Edge cases handled
Monorepo support validated
Large file efficiency optimized

Phase 4: Stress & Optimization


Execute all P3 tests
Stress tests validated
Large repo resilience confirmed
Final optimization


Expected Outcomes


| Dimension | Claude Code Built-in Tools | Cursor Built-in Tools | Octocode MCP Tools | Winner |
| --- | --- | --- | --- | --- |
| Context Quality | Text output with line numbers | Raw text output | Structured JSON/YAML with metadata | Octocode |
| Efficiency | Sequential calls | Sequential only | Bulk parallel operations | Octocode |
| Output Quality | Text with basic formatting | Plain text | Structured with hints | Octocode |
| Token Safety | 2000 line limit, truncation | Unbounded output | Paginated with limits | Octocode |
| Security | Basic OS-level | Basic OS-level | Multi-layer validation + secret redaction | Octocode |
| Error Handling | Generic errors | Generic errors | Contextual errors with guidance | Octocode |
| Research Context | None | None | Goal/reasoning tracking | Octocode |
| Large File Handling | Line-based offset/limit | Full dump or manual ranges | Char/line pagination + context search | Octocode |
| Large Repo Performance | Linear slowdown | Linear slowdown | Optimized search + smart pagination | Octocode |
| Monorepo Support | No awareness | No awareness | Package detection + scoped operations | Octocode |


Test Suite Summary


| Suite | Focus Area | Test Count | Tools Compared |
| --- | --- | --- | --- |
| A | Search Tools | 50+ | `Grep`/`grep` vs `localSearchCode` |
| B | Directory Listing | 32 | `Bash(ls)`/`Glob`/`list_dir` vs `localViewStructure` |
| C | File Content | 50+ | `Read`/`read_file` vs `localGetFileContent` |
| D | File Finding | 36 | `Glob`/`glob_file_search` vs `localFindFiles` |
| E | Integration Workflows | 16 | All tools combined |
| F | Performance Benchmarks | 20 | All tools benchmarked |
| G | Edge Cases & Stress | 16 | All tools stress tested |
| H | Large File Handling | 64 | `Read`/`read_file` vs `localGetFileContent` |
| I | Large Repository | 82 | All tools at scale |


ACTUAL TEST RESULTS: Claude Code vs Octocode MCP Local Tools


Execution Date: 2026-01-01
Repository: Linux Kernel (100K+ files)
Model: Claude Opus 4.5


Test 1: Code Search Comparison

Query: Find mutex_lock in kernel/sched/ (C files only, with context)


| Metric | Octocode `localSearchCode` | Claude Code `Grep` | Winner |
| --- | --- | --- | --- |
| Files Found | 8 files | 8 files (via `-C` context) | Tie |
| Total Matches | 31 matches | 31 matches (verified) | Tie |
| Response Structure | Structured JSON with byte/char offsets | Plain text with grep-style output | Octocode |
| Metadata Richness | Line, column, `byteOffset`, `charOffset`, `matchCount` per file | Line numbers only | Octocode |
| Pagination | Built-in (page 1/1, `filesPerPage`, `hasMore`) | `head_limit`/`offset` params | Octocode |
| Context Display | Truncated match values with byte boundaries | Full context with `-C` flag | Claude Code |
| Hints/Guidance | 18 actionable hints for next steps | None | Octocode |
| Research Context | `researchGoal`, `reasoning` preserved | None | Octocode |


Winner: Octocode - Richer metadata, pagination info, and actionable hints. Claude Code has simpler output but lacks structured data.

Test 2: Directory Structure Exploration

Query: Explore drivers/net/ at depth 2


| Metric | Octocode `localViewStructure` | Claude Code `Glob` | Winner |
| --- | --- | --- | --- |
| Total Files | 595 files (reported) | ~100 files (truncated) | Octocode |
| Total Directories | 191 directories (reported) | Not provided | Octocode |
| Total Size | 11,164,674 bytes (reported) | Not provided | Octocode |
| Output Format | Tree-style with `[FILE]` and `[DIR]` markers | Flat file path list | Octocode |
| File Sizes | Per-file sizes (e.g., "88.9KB") | Not provided | Octocode |
| Pagination | Page 1/40, `entriesPerPage=20` | Truncated with warning | Octocode |
| Sorting | Controllable (name, size, time) | Modification time only | Octocode |
| Depth Control | `depth` param (1-5) | Implicit via glob pattern | Octocode |


Winner: Octocode - Comprehensive structure view with statistics. Claude Code Glob is simpler but lacks metadata and truncates output.

Test 3: File Content Reading

Query: Read include/linux/sched.h focusing on struct task_struct


| Metric | Octocode `localGetFileContent` | Claude Code `Read` | Winner |
| --- | --- | --- | --- |
| Pattern Matching | `matchString` with context extraction | Not supported (line offset only) | Octocode |
| Content Returned | 11,889 chars with omission markers | First 100 lines (as requested) | Octocode |
| Total Lines | 2,444 (reported) | Not reported | Octocode |
| Match Ranges | 13 match ranges identified | Not applicable | Octocode |
| Token Efficiency | Smart extraction with `...N lines omitted...` | Full lines returned | Octocode |
| Warnings | "Pattern matched 94 lines. Truncated to first 50" | None | Octocode |
| Line Numbers | Included with content | Cat-style `N->` format | Tie |
| Multimodal | Text only | Images, PDFs, Notebooks | Claude Code |


Winner: Octocode - Smart pattern extraction with context is far more efficient. Claude Code wins for multimodal files (images, PDFs).

Test 4: File Finding

Query: Find .c files in net/ directory


| Metric | Octocode `localFindFiles` | Claude Code `Glob` | Winner |
| --- | --- | --- | --- |
| Files Found | 1,000 files (reported total) | ~100 files (truncated) | Octocode |
| Metadata Per File | path, type, size, permissions, modified date | path only | Octocode |
| Pagination | Page 1/67, `filesPerPage=15` | Truncated with warning | Octocode |
| Time Filtering | `modifiedWithin`, `modifiedBefore` | Not supported | Octocode |
| Size Filtering | `sizeGreater`, `sizeLess` | Not supported | Octocode |
| Sorting | Modification time (with options) | Modification time | Tie |
| Permission Filtering | `executable`, `readable`, `writable` | Not supported | Octocode |
| Details Toggle | `details` flag | Not applicable | Octocode |


Winner: Octocode - Rich metadata and filtering options. Claude Code Glob is simpler but lacks forensic capabilities.

Overall Summary


| Test Category | Claude Code Tool | Octocode Tool | Winner | Margin |
| --- | --- | --- | --- | --- |
| Code Search | `Grep` | `localSearchCode` | Octocode | Large |
| Directory Exploration | `Glob`/`Bash` | `localViewStructure` | Octocode | Large |
| File Content Reading | `Read` | `localGetFileContent` | Octocode | Large |
| File Finding | `Glob` | `localFindFiles` | Octocode | Large |
Large


Claude Code Exclusive Advantages


| Feature | Advantage |
| --- | --- |
| Multimodal Files | Native support for images, PDFs, Jupyter notebooks |
| Agent Delegation | `Task` tool with Explore, Plan, and General-purpose agents |
| Shell Access | Full Bash for git, build, arbitrary commands |
| Multiline Regex | Cross-line pattern matching with `multiline: true` |
| Simplicity | Lower learning curve for simple queries |


Octocode Exclusive Advantages


| Feature | Advantage |
| --- | --- |
| Structured Output | JSON with byte/char offsets, match metadata |
| Pagination | Built-in pagination with page info and `hasMore` |
| Research Context | `researchGoal`, `reasoning`, `mainResearchGoal` tracking |
| Actionable Hints | Dynamic hints for next exploration steps |
| Token Efficiency | Smart content extraction with omission markers |
| Rich Metadata | File sizes, permissions, timestamps, match counts |
| Forensic Filtering | Time-based, size-based, permission-based file discovery |
| Bulk Operations | Multiple queries in single call with parallel execution |


Recommendations


| Use Case | Recommended Tool | Reason |
| --- | --- | --- |
| Quick code search | Claude Code `Grep` | Simple, fast, familiar output |
| Deep code research | Octocode `localSearchCode` | Rich metadata, hints, pagination |
| Codebase structure overview | Octocode `localViewStructure` | Summary stats, tree view, sizes |
| Reading specific file sections | Octocode `localGetFileContent` | Smart pattern extraction |
| Reading images/PDFs | Claude Code `Read` | Native multimodal support |
| Finding files by metadata | Octocode `localFindFiles` | Time/size/permission filters |
| Finding files by pattern | Either | Both effective for glob patterns |
| Complex multi-step research | Claude Code `Task` (Explore) | Agent delegation is powerful |
| Shell operations | Claude Code `Bash` | Full shell access |
| Token-constrained contexts | Octocode tools | Better pagination and efficiency |


Test Plan Version: 3.0 (Unified Edition)
Last Updated: 2026-01-01
Total Test Cases: 360+
#	Claude Code Tool	Cursor Tool	Octocode MCP Tool	Primary Use Case
1	`Grep`	`grep`	`localSearchCode`	Pattern search in code files
2	`Bash(ls)` / `Glob`	`list_dir`	`localViewStructure`	Directory listing and exploration
3	`Read`	`read_file`	`localGetFileContent`	Reading file contents
4	`Glob`	`glob_file_search`	`localFindFiles`	Finding files by name/pattern/metadata
Dimension	Description	Weight
Context Quality	Does the tool provide actionable, structured context for AI agents?	Critical
Efficiency	Speed, bulk operations, resource usage	High
Output Quality	Structured responses, metadata richness, usability	High
Token Safety	Output size control, pagination, LLM budget awareness	Critical
Security	Path validation, secret detection, access control	Critical
Error Handling	Graceful failures, helpful hints, recovery guidance	Medium
Research Context	Goal tracking, reasoning preservation, workflow continuity	Medium
Large File Handling	Character/line pagination, context-aware extraction, memory efficiency	Critical
Large Repository Scale	Search speed at scale, quality results in monorepos, resource management	Critical
Monorepo Awareness	Package detection, scoped operations, cross-package search	High
Test ID	Test Name	Motivation	Claude Code `Grep` Behavior	Cursor `grep` Behavior	Octocode Expected Behavior	Success Criteria
A1.1	Basic pattern search	Verify structured results vs raw output	Returns file paths or content with line numbers	Returns `file:line:content` plain text	Returns structured JSON with file path, line number, column, byte offset, match content	Octocode provides richer metadata
A1.2	Context lines display	Verify surrounding context quality	`-A/-B/-C` flags show context lines	`-C N` flag shows raw lines, no grouping	`contextLines` param provides smart grouping with omission markers	Context is organized and scannable
A1.3	Multi-file search results	Verify results organization across files	List of matches per file	Flat list of matches, no grouping	Grouped by file with match counts, file statistics	Results are navigable
A1.4	Empty results handling	Verify guidance on no matches	Empty results returned	Exit code 1, empty output, no guidance	`status: empty` with semantic hints for alternatives	Agent receives actionable guidance
A1.5	Match location precision	Verify byte-level accuracy	Line number with `-n` flag	Line number only	Line, column, byte offset, char offset	Enables precise navigation
A1.6	Regex pattern support	Verify complex pattern handling	Full regex via ripgrep	Basic regex support	Full PCRE/Perl regex with multiline support	Complex patterns work
A1.7	Case sensitivity control	Verify case handling options	`-i` flag for insensitive	`-i` flag for insensitive	`caseSensitive`, `caseInsensitive`, `smartCase` options	Flexible case handling
A1.8	File type filtering	Verify extension-based filtering	`glob` and `type` params	`--include` flag	`type`, `include`, `exclude` params	Easy file type targeting
A1.9	Research context tracking	Verify goal/reasoning preservation	No concept of research context	No concept of research context	`mainResearchGoal`, `researchGoal`, `reasoning` in output	Research continuity maintained
A1.10	Match statistics	Verify count and distribution info	`output_mode: count` for counts	Count requires separate `-c` flag	`totalMatches`, `distribution` by file included	Statistics built-in
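To make A1.1, A1.3, and A1.10 concrete, the sketch below groups flat `file:line:content` text into a per-file structure with match counts. It is only an illustration of the shape a test could assert on: the function name `group_matches` and the keys `files` and `matchCount` are assumptions for this example, not the confirmed `localSearchCode` schema, and the parser assumes paths without colons.

```python
import re

# Matches the common `file:line:content` shape of grep-style text output.
# Assumption: file paths contain no colon (so no Windows drive letters).
LINE_RE = re.compile(r"^(?P<path>[^:]+):(?P<line>\d+):(?P<content>.*)$")


def group_matches(raw_output: str) -> dict:
    """Group flat `file:line:content` lines by file and attach match counts."""
    files = {}
    for raw in raw_output.splitlines():
        m = LINE_RE.match(raw)
        if m is None:
            continue  # skip separators and anything that is not a match line
        entry = files.setdefault(m["path"], {"path": m["path"], "matches": []})
        entry["matches"].append({"line": int(m["line"]), "content": m["content"]})

    for entry in files.values():
        entry["matchCount"] = len(entry["matches"])

    total = sum(entry["matchCount"] for entry in files.values())
    return {
        # Status values follow A3.4; key names here are hypothetical.
        "status": "hasResults" if total else "empty",
        "totalMatches": total,
        "files": list(files.values()),
    }
```

A test for A1.3 could then assert that every file appears exactly once and that the per-file counts sum to `totalMatches`.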
Test ID	Test Name	Motivation	Test Scenario	Success Criteria
A2.1	Single pattern performance	Baseline search speed	Search pattern in 10,000 files	Response time < 500ms
A2.2	Bulk query efficiency	Validate parallel execution	5 different patterns in one bulk call vs 5 sequential calls	Bulk >= 3x faster than sequential
A2.3	Large directory handling	Memory efficiency under load	Search in 100MB directory	Peak memory < 50MB
A2.4	Ripgrep to grep fallback	Graceful degradation	Force ripgrep unavailability	Falls back to grep without crash
A2.5	Incremental results	Early termination capability	Stop after N matches	`maxMatchesPerFile`, `maxFiles` respected
A2.6	Pattern complexity scaling	Performance with complex regex	Simple vs complex regex patterns	Linear degradation, no timeout
A2.7	Concurrent bulk queries	Parallelization efficiency	5 queries executing simultaneously	CPU utilization balanced
A2.8	Cold vs warm cache	Subsequent query speed	Same query twice	Second query >= 2x faster
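A harness for A2.2 and A2.7 might look like the following sketch. It assumes a `search` callable supplied by the tester (hypothetical here) and approximates a bulk call with a thread pool. A real bulk query would be a single tool call carrying several queries, so this only shows how the timing and the speedup threshold could be checked.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_sequential(search, patterns):
    """Run one search per pattern, one after another."""
    return [search(p) for p in patterns]


def run_bulk(search, patterns):
    """Stand-in for a bulk call: run all searches concurrently, keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(patterns))) as pool:
        return list(pool.map(search, patterns))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single invocation."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def meets_speedup(sequential_s: float, bulk_s: float, factor: float = 3.0) -> bool:
    """True when bulk is at least `factor` times faster than sequential."""
    return bulk_s > 0 and sequential_s / bulk_s >= factor
```

Timing runs are noisy, so a practical test would repeat each measurement several times and compare medians before applying `meets_speedup`.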
Test ID	Test Name	Motivation	Claude Code `Grep` Output	Cursor `grep` Output	Octocode Expected Output	Success Criteria
A3.1	Response structure	Verify consistent format	Text output with file paths/content	Plain text lines	Structured YAML/JSON with fields	Parseable, consistent schema
A3.2	Metadata richness	Verify useful metadata	Basic file/line info	None	File stats, match metadata, hints	Rich context provided
A3.3	Hint generation	Verify agent guidance	No hints	No hints	Dynamic hints based on results	Actionable next steps
A3.4	Status indication	Verify clear status	Tool success/failure	Exit code only	`status: hasResults\|empty\|error`	Clear success/failure
A3.5	Pagination info	Verify navigation data	`head_limit`/`offset` params	None	`pagination` object with page/total/hasMore	Enables continuation
A3.6	Warning messages	Verify edge case alerts	None	None	`warnings` array for truncation, fallback	Agent aware of limitations
A3.7	Error detail quality	Verify error helpfulness	Generic error messages	Generic error messages	Specific error with errorCode and hints	Debuggable errors
A3.8	Match highlighting	Verify match visibility	No highlighting	No highlighting	Match boundaries indicated	Easy to locate matches
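The schema checks in A3.1, A3.4, A3.5, and A3.7 can be automated with a small validator. The sketch below assumes the response has already been parsed into a dictionary and uses the field names mentioned in the table (`status`, `pagination`, `warnings`, `errorCode`). The exact nesting and the rule that only `hasResults` responses need pagination are assumptions for this example.

```python
VALID_STATUSES = {"hasResults", "empty", "error"}


def validate_response(resp: dict) -> list:
    """Return a list of schema problems; an empty list means the response passes."""
    problems = []

    status = resp.get("status")
    if status not in VALID_STATUSES:
        problems.append(f"unexpected status: {status!r}")

    # Assumption: only responses that carry results need pagination data.
    if status == "hasResults":
        page = resp.get("pagination")
        if not isinstance(page, dict):
            problems.append("missing pagination object")
        else:
            for key in ("page", "total", "hasMore"):
                if key not in page:
                    problems.append(f"pagination missing {key}")

    if status == "error" and "errorCode" not in resp:
        problems.append("error response missing errorCode")

    if "warnings" in resp and not isinstance(resp["warnings"], list):
        problems.append("warnings must be a list")

    return problems
```

Returning a list of problems rather than raising on the first one lets a single test run report every schema deviation at once.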
Test ID	Test Name	Motivation	Risk Without Control	Octocode Mitigation	Success Criteria
A4.1	Large result set	Prevent token overflow	10K matches returned	`matchesPerPage` pagination	Output bounded
A4.2	Long line handling	Prevent single-line overflow	10KB line returned fully	`matchContentLength` truncation	Lines truncated
A4.3	Binary file exclusion	Prevent garbage output	Binary content included	`binaryFiles: without-match` default	Clean text only
A4.4	Many files matched	Prevent file count overflow	1000 files in response	`filesPerPage` pagination	Files paginated
A4.5	Deep context expansion	Prevent context bloat	Unlimited context lines	`contextLines` max limit	Context bounded
A4.6	Output size estimation	Proactive limit warning	No warning before overflow	Size estimation + hints before large output	Early warning
A4.7	Minified file handling	Prevent single-line megafiles	Huge minified JS searched	Detect and warn about minified content	Appropriate handling
A4.8	Total response size	Global output limit	Unbounded response	Response size cap with continuation	Response bounded
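The two core mitigations in this table, per-line truncation (A4.2) and match pagination (A4.1), can be sketched in a few lines. The helper names `truncate_line` and `paginate`, the default limits, and the `...` marker are assumptions for illustration. Only the general idea of a bounded page with a `hasMore` flag comes from the plan.

```python
def truncate_line(text: str, max_len: int = 200, marker: str = "...") -> str:
    """Cut a long line to `max_len` characters and append a visible marker."""
    if len(text) <= max_len:
        return text
    return text[:max_len] + marker


def paginate(matches: list, page: int = 1, per_page: int = 10) -> dict:
    """Return one bounded page of matches plus the data needed to continue."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    start = (page - 1) * per_page
    end = start + per_page
    return {
        "matches": matches[start:end],
        "pagination": {
            "page": page,
            "total": len(matches),
            "hasMore": end < len(matches),
        },
    }
```

With both helpers applied, the worst-case output size of one page is roughly `per_page * (max_len + len(marker))` characters, which gives a test a concrete bound to assert against.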
Test ID	Test Name	Motivation	Attack Vector	Expected Behavior	Success Criteria
A5.1	Path traversal - basic	Prevent escape to parent	`path: "../../etc/passwd"`	Rejected with error	Path blocked
A5.2	Path traversal - encoded	Prevent encoded escape	`path: "..%2F..%2Fetc"`	Rejected with error	Encoded path blocked
A5.3	Path traversal - absolute	Prevent absolute escape	`path: "/etc/passwd"`	Rejected with error	Absolute outside workspace blocked
A5.4	Symlink resolution	Prevent symlink escape	Symlink pointing to `/etc`	Resolved and blocked	Symlink target validated
A5.5	Command injection - pattern	Prevent shell injection	`pattern: "; rm -rf /"`	Pattern escaped safely	No command execution
A5.6	Command injection - path	Prevent path injection	`path: "file; cat /etc/passwd"`	Path sanitized	No command execution
A5.7	Null byte injection	Prevent null truncation	`path: "file\x00/etc/passwd"`	Rejected	Null byte blocked
A5.8	Ignored path access	Prevent node_modules access	`path: "node_modules"`	Blocked by default	Ignored paths respected
A5.9	.git directory access	Prevent git data leak	`path: ".git/config"`	Blocked	Sensitive directories blocked
A5.10	Secret in results	Prevent secret exposure	Search result contains AWS key	Secret redacted in output	Secrets masked
A5.11	Unicode path manipulation	Prevent unicode tricks	Unicode lookalike characters	Normalized and validated	Unicode handled safely
A5.12	Very long path	Prevent buffer overflow	10KB path string	Rejected with limit error	Path length limited
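Several of the path checks above (A5.1, A5.3, A5.4, A5.7, A5.12) reduce to one rule: resolve the requested path fully, then confirm it is still inside the workspace. The sketch below shows that rule with the standard library. `resolve_in_workspace` and the 4096-character limit are assumptions for this example, and the sketch does not cover URL-decoding (A5.2), ignored directories (A5.8, A5.9), or Unicode normalization (A5.11).

```python
import os

MAX_PATH_LENGTH = 4096  # assumed limit for this sketch


def resolve_in_workspace(workspace: str, user_path: str) -> str:
    """Resolve `user_path` and raise ValueError unless it stays inside `workspace`."""
    if "\x00" in user_path:
        raise ValueError("null byte in path")
    if len(user_path) > MAX_PATH_LENGTH:
        raise ValueError("path too long")

    root = os.path.realpath(workspace)
    # realpath collapses `..` segments and follows symlinks, so the check
    # below sees the real target. An absolute user_path replaces root in join.
    candidate = os.path.realpath(os.path.join(root, user_path))

    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("path escapes workspace")
    return candidate
```

Comparing with `os.path.commonpath` rather than a string prefix avoids accepting a sibling such as `/work/project-evil` when the workspace is `/work/project`.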
Test ID	Test Name	Motivation	Error Scenario	Expected Behavior	Success Criteria
A6.1	Non-existent path	Graceful missing path	Path does not exist	Clear error message + suggestions	Helpful error
A6.2	Permission denied	Handle access errors	Read-protected file	Skip with warning, continue others	Graceful skip
A6.3	Invalid regex	Handle bad patterns	Malformed regex pattern	Parse error with position indicated	Debuggable error
A6.4	Timeout handling	Prevent hung queries	Search takes > 30s	Timeout with partial results	Graceful timeout
A6.5	Bulk partial failure	Isolate query failures	3/5 queries succeed	Successful queries return, failures isolated	Partial success
A6.6	Empty workspace	Handle empty directory	No files in path	Empty result with hint	Clear empty state
A6.7	Circular symlinks	Handle symlink loops	Symlink loop detected	Warning and skip	No infinite loop
A6.8	Encoding issues	Handle non-UTF8	Binary/unknown encoding	Skip or warn	Clean handling
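A6.3 and A6.5 can be exercised together: one malformed pattern in a bulk request should produce a positioned error for that query while the others still return. The sketch below uses Python's `re` module as a stand-in for the real search engine. `run_bulk_queries` and the `INVALID_REGEX` error code are hypothetical names, and the position comes from the `pos` attribute that `re.error` carries.

```python
import re


def run_bulk_queries(patterns: list, text: str) -> list:
    """Run each pattern independently so one bad regex cannot fail the batch."""
    results = []
    for pattern in patterns:
        try:
            compiled = re.compile(pattern)
        except re.error as exc:
            # Isolate the failure and report where parsing stopped.
            results.append({
                "pattern": pattern,
                "status": "error",
                "errorCode": "INVALID_REGEX",  # hypothetical code
                "message": exc.msg,
                "position": exc.pos,
            })
            continue

        hits = [
            number
            for number, line in enumerate(text.splitlines(), start=1)
            if compiled.search(line)
        ]
        results.append({
            "pattern": pattern,
            "status": "hasResults" if hits else "empty",
            "lines": hits,
        })
    return results
```

A partial-success test then checks that the result list has one entry per input pattern and that only the malformed pattern has `status: error`.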
Test ID	Test Name	Motivation	Claude Code `Bash(ls)`/`Glob` Behavior	Cursor `list_dir` Behavior	Octocode Expected Behavior	Success Criteria
B1.1	Basic listing	Verify output richness	ls output or glob patterns	Array of filenames only	Entries with type, size, extension, permissions	Rich metadata
B1.2	Recursive listing	Verify depth support	`ls -R` or `Glob` with `**`	Requires multiple calls	`depth` parameter for tree view	Single call for tree
B1.3	Type filtering	Verify filter capability	Manual filtering	No filtering	`filesOnly`, `directoriesOnly` params	Easy type filtering
B1.4	Extension filtering	Verify extension filter	Glob patterns (e.g., `*.ts`)	Manual post-filtering	`extension`, `extensions` params	Built-in extension filter
B1.5	Sorting options	Verify sort capability	`ls` flags (-t, -S)	Alphabetical only	`sortBy`: name, size, time, extension	Flexible sorting
B1.6	Size display	Verify human-readable sizes	`ls -lh` for sizes	No size info	`humanReadable` size formatting	Sizes shown as 4.2KB instead of 4301
B1.7	Modified time display	Verify timestamp access	`ls -l` shows timestamps	No timestamp	`showFileLastModified` option	Timestamps available
B1.8	Summary statistics	Verify aggregate info	Requires wc or counting	None	`totalFiles`, `totalDirectories`, `summary`	Quick overview
B1.9	Hidden file handling	Verify dotfile access	`ls -a` for dotfiles	May hide dotfiles	`hidden` flag to include/exclude	Controllable
B1.10	Pattern filtering	Verify glob support	Glob tool supports patterns	No pattern matching	`pattern` param for glob filter	Built-in glob
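For B1.1, B1.5, B1.6, and B1.8, it helps to have a reference listing to compare tool output against. The sketch below builds one with `os.scandir`. The entry keys (`humanSize`, `extension`, `modified`) and the `list_directory` signature are assumptions for this example and are not taken from the `localViewStructure` schema. Only the sort options and summary counters mirror the table.

```python
import os


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units, e.g. 4301 -> '4.2KB'."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{int(size)}B" if unit == "B" else f"{size:.1f}{unit}"
        size /= 1024


def list_directory(path: str, sort_by: str = "name") -> dict:
    """List one directory level with per-entry metadata and summary counts."""
    entries = []
    with os.scandir(path) as it:
        for entry in it:
            info = entry.stat(follow_symlinks=False)
            is_dir = entry.is_dir(follow_symlinks=False)
            entries.append({
                "name": entry.name,
                "type": "directory" if is_dir else "file",
                "size": info.st_size,
                "humanSize": human_size(info.st_size),
                "extension": os.path.splitext(entry.name)[1].lstrip("."),
                "modified": info.st_mtime,
            })

    sort_keys = {
        "name": lambda e: e["name"],
        "size": lambda e: e["size"],
        "time": lambda e: e["modified"],
        "extension": lambda e: e["extension"],
    }
    entries.sort(key=sort_keys[sort_by])

    total_dirs = sum(1 for e in entries if e["type"] == "directory")
    return {
        "entries": entries,
        "totalFiles": len(entries) - total_dirs,
        "totalDirectories": total_dirs,
    }
```

Keeping the raw `size` next to `humanSize` lets a test verify the formatting without losing the exact byte count.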
Test ID	Test Name	Motivation	Test Scenario	Success Criteria
B2.1	Large directory	Performance under scale	1000 files in directory	Response < 1s
B2.2	Deep recursion	Recursive performance	depth=3 on large tree	Response < 5s
B2.3	Bulk listing	Multiple directories at once	5 directories in one bulk call vs 5 sequential calls	Bulk >= 2x faster
B2.4	Stats-only mode	Lightweight overview	Summary without full listing	Response < 100ms
B2.5	Filtered vs unfiltered	Filter performance	With vs without extension filter	Filtered same or faster
B2.6	Sort overhead	Sorting cost	Different sort options	Sorting < 10% overhead