pedramamini/json-structure.py

## json-structure.py
#!/usr/bin/env python3
"""
JSON Structure Analyzer and Reducer
===================================

A powerful command-line tool for analyzing, reducing, and exploring JSON file structures.
Designed to help developers understand large JSON datasets by providing multiple modes
of interaction: structure reduction for LLM analysis, hierarchical size analysis, and
interactive terminal-based exploration.

WHAT THIS PROGRAM DOES
======================

This tool addresses the common problem of working with large JSON files that exceed
context limits for Large Language Models (LLMs) or are simply too large to understand
at a glance. It provides three main modes of operation:

1. **Structure Reduction Mode (Default)**
   - Reduces JSON files while preserving structure and examples
   - Truncates long strings to manageable lengths (1000 chars)
   - Reduces arrays and array-like dictionaries to their first N items
   - Detects and handles GUID-based dictionaries intelligently
   - Perfect for feeding to LLMs while maintaining structural understanding

2. **Size Analysis Mode (-a/--analyzer)**
   - Provides hierarchical analysis of JSON structure with byte sizes
   - Shows size distribution at specified depth levels
   - Helps identify the largest data sections quickly
   - Displays results as an indented table with human-readable sizes

3. **Interactive TUI Mode (-t/--tui)**
   - Terminal User Interface for real-time JSON exploration
   - Split-panel design: tree navigation + value display
   - Keyboard-driven navigation with expand/collapse functionality
   - Syntax highlighting and smart size formatting
   - Perfect for understanding complex nested structures
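
As a rough illustration of mode 1, the sketch below applies a simplified reduction
to a small document. `reduce_sketch` is a hypothetical stand-in written for this
example (it omits the array-like and GUID-like dictionary handling); it is not the
reduce_json_structure() function defined in this file.

```python
import json

def reduce_sketch(data, max_items=1, max_length=1000):
    # Simplified stand-in: keep the first max_items list elements,
    # truncate long strings, and leave dictionary keys untouched.
    if isinstance(data, dict):
        return {k: reduce_sketch(v, max_items, max_length) for k, v in data.items()}
    if isinstance(data, list):
        kept = [reduce_sketch(v, max_items, max_length) for v in data[:max_items]]
        if len(data) > max_items:
            kept.append(f"... (array had {len(data)} total items)")
        return kept
    if isinstance(data, str) and len(data) > max_length:
        return data[:max_length] + "... (truncated)"
    return data

sample = {"users": [{"id": 1}, {"id": 2}, {"id": 3}], "note": "x" * 1500}
reduced = reduce_sketch(sample)
print(json.dumps(reduced["users"]))
# prints: [{"id": 1}, "... (array had 3 total items)"]
```

With the default of one item, the three-element list collapses to its first element
plus a marker string recording the original length.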

KEY FEATURES
============

• **Smart Dictionary Detection**: Distinguishes between semantic dictionaries and
  array-like dictionaries (sequential keys) or GUID-based collections
• **Configurable Item Count**: The -d/--depth option sets how many items to keep in
  arrays and array-like or GUID-like dicts
• **Lazy Tree Building**: TUI child nodes are created only when a node is expanded
• **Multiple Output Formats**: JSON, tabular analysis, or interactive exploration
• **Size-Aware Analysis**: Analyzer and TUI modes report the serialized byte size of each node
• **Portable**: Runs on Python 3.6+ wherever the curses module is available (on Windows
  this typically means installing an add-on package such as windows-curses)
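
The dictionary detection can be pictured roughly as follows. This is a simplified
sketch under assumptions: `looks_sequential`, `looks_like_uuids`, and `UUID_RE` are
hypothetical names, the regular expression stands in for the manual parsing used in
this file, and the hash-like key check and minimum-size rule are left out.

```python
import re

# 8-4-4-4-12 hexadecimal layout of a textual UUID
UUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")

def looks_sequential(keys):
    # True when the keys parse as integers forming 0..n-1 or 1..n, in any order.
    try:
        nums = sorted(int(k) for k in keys)
    except ValueError:
        return False
    n = len(nums)
    return nums == list(range(n)) or nums == list(range(1, n + 1))

def looks_like_uuids(keys, threshold=0.8):
    # True when at least `threshold` of the keys match the UUID layout.
    hits = sum(1 for k in keys if UUID_RE.fullmatch(k))
    return bool(keys) and hits / len(keys) >= threshold

print(looks_sequential(["0", "1", "2"]))         # True
print(looks_sequential(["name", "email"]))       # False
print(looks_like_uuids(["123e4567-e89b-12d3-a456-426614174000"]))  # True
```

A dictionary with keys such as "name" and "email" fails both checks, so it would be
treated as a semantic dictionary and keep all of its keys.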

USAGE EXAMPLES
==============

# Basic structure reduction (keep 1 item per array/dict)
json-structure large-data.json

# Keep more examples for better structure understanding
json-structure -d 5 large-data.json

# Analyze size distribution at different depths
json-structure -a 1 large-data.json    # Top-level only
json-structure -a 3 large-data.json    # 3 levels deep

# Interactive exploration
json-structure -t large-data.json

# When -a is given, analyzer mode runs and -d is ignored
json-structure -d 3 -a 2 api-response.json
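
The sizes shown in analyzer mode are byte lengths of each value's compact JSON
serialization. A minimal sketch of that measurement, where `compact_size` is a
hypothetical helper mirroring what get_json_size() does in this file:

```python
import json

def compact_size(value):
    # Serialize without whitespace and count the resulting bytes.
    return len(json.dumps(value, separators=(",", ":")).encode("utf-8"))

doc = {"a": [1, 2, 3], "b": "hello"}
sizes = {key: compact_size(val) for key, val in doc.items()}
print(sizes)              # {'a': 7, 'b': 7}
print(compact_size(doc))  # 25
```

The per-key sizes do not add up to the total because the total also counts the key
names, colons, commas, and braces of the enclosing object.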

CODE ORGANIZATION
=================

Function Hierarchy and Edit Locations:

**CORE DATA PROCESSING**
├── truncate_string()              → Modify string truncation behavior
├── get_json_size()               → Change size calculation method
├── format_size() / format_size_kb() → Adjust size display formats
└── reduce_json_structure()        → **Main reduction logic - edit for new reduction rules**

**SMART DETECTION SYSTEM**
├── is_array_like_dict()          → Modify sequential key detection (0,1,2...)
└── is_guid_like_dict()           → **Edit GUID/UUID pattern recognition**

**SIZE ANALYSIS MODE**
├── analyze_json_structure()      → Change depth-based analysis logic
└── print_analysis()              → Modify tabular output format

**INTERACTIVE TUI MODE**
├── TreeNode class                → Edit node behavior and data structure
│   ├── build_children()          → Modify lazy loading logic
│   ├── toggle_expand()           → Change expand/collapse behavior
│   └── get_display_name()        → Adjust node naming
├── JSONTreeTUI class             → **Main TUI interface - edit for UI changes**
│   ├── init_colors()             → Modify color scheme
│   ├── draw_tree()               → Change left panel tree rendering
│   ├── draw_value_panel()        → **Edit right panel value display**
│   ├── format_value_for_display() → Modify value formatting logic
│   └── handle_input()            → **Add new keyboard shortcuts**
└── run_tui()                     → TUI initialization and error handling

**APPLICATION ENTRY POINT**
└── main()                        → **Edit for new command-line options**
    ├── Argument parsing          → Add new CLI flags
    ├── Mode selection           → Route to different processing modes
    └── Error handling           → Modify error messages and exit codes
"""

import json
import sys
import argparse
import curses
from typing import Any, List, Optional, Tuple

def truncate_string(value, max_length=1000):
    """Truncate strings to keep them manageable while preserving content examples"""
    if isinstance(value, str) and len(value) > max_length:
        return value[:max_length] + "... (truncated)"
    return value

def is_array_like_dict(data):
    """
    Detect if a dictionary is actually an array in disguise.
    Returns True if it has at least 3 keys and they all parse as integers that
    form an unbroken sequence starting from 0 or 1.
    """
    if not isinstance(data, dict) or len(data) < 3:  # Need at least 3 items to be worth reducing
        return False

    keys = list(data.keys())

    # Check if all keys are string representations of integers
    try:
        int_keys = [int(k) for k in keys]
    except (ValueError, TypeError):
        return False

    # Sort the integer keys
    int_keys.sort()

    # Check if they form a sequence starting from 0 or 1
    if int_keys == list(range(len(int_keys))):  # 0, 1, 2, 3...
        return True
    elif int_keys == list(range(1, len(int_keys) + 1)):  # 1, 2, 3, 4...
        return True

    return False

def is_guid_like_dict(data):
    """
    Detect if a dictionary has GUID-like keys (UUIDs or long hex hashes).
    Returns True if it has at least 3 keys and at least 80% of them look like
    auto-generated identifiers.
    """
    if not isinstance(data, dict) or len(data) < 3:
        return False

    keys = list(data.keys())

    # Check for UUID pattern (8-4-4-4-12 hex digits with dashes)
    uuid_pattern_count = 0
    hash_like_count = 0

    for key in keys:
        if not isinstance(key, str):
            continue

        # UUID pattern: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
        if len(key) == 36 and key.count('-') == 4:
            parts = key.split('-')
            if (len(parts) == 5 and
                len(parts[0]) == 8 and len(parts[1]) == 4 and
                len(parts[2]) == 4 and len(parts[3]) == 4 and len(parts[4]) == 12):
                try:
                    # Check if all parts are hex
                    for part in parts:
                        int(part, 16)
                    uuid_pattern_count += 1
                    continue
                except ValueError:
                    pass

        # Hash-like pattern: hex strings of 20 to 64 characters
        if 20 <= len(key) <= 64:
            try:
                int(key, 16)
                hash_like_count += 1
                continue
            except ValueError:
                pass

    # If most keys match GUID/hash patterns, consider it GUID-like
    total_keys = len(keys)
    guid_ratio = (uuid_pattern_count + hash_like_count) / total_keys

    return guid_ratio >= 0.8  # 80% of keys are GUID-like

def reduce_json_structure(data, max_items=1):
    """
    Reduce JSON structure by:
    1. Truncating long strings to 1000 chars
    2. Reducing arrays to max_items representative elements
    3. Detecting array-like and GUID-like dictionaries and reducing them too
    4. Preserving all other object keys and structure

    Args:
        data: The JSON data to reduce
        max_items: Maximum number of items to keep in arrays, array-like dicts,
            and GUID-like dicts
    """
    if isinstance(data, dict):
        # Check if this dictionary is actually an array in disguise or has GUID-like keys
        if is_array_like_dict(data):
            # Treat it like an array - keep first max_items entries
            keys = sorted(data.keys(), key=int)
            reduced_data = {}

            # Keep first max_items entries (in numeric key order)
            for key in keys[:max_items]:
                reduced_data[key] = reduce_json_structure(data[key], max_items)

            # Add indicator if we truncated
            if len(keys) > max_items:
                reduced_data["..."] = f"(array-like dict had {len(keys)} total entries)"

            return reduced_data
        elif is_guid_like_dict(data):
            # GUID-like dictionary - keep first max_items entries
            keys = list(data.keys())
            reduced_data = {}

            # Keep first max_items entries (in insertion order)
            for key in keys[:max_items]:
                reduced_data[key] = reduce_json_structure(data[key], max_items)

            # Add indicator if we truncated
            if len(keys) > max_items:
                reduced_data["..."] = f"(GUID-like dict had {len(keys)} total entries)"

            return reduced_data
        else:
            # Regular dictionary - keep all keys
            reduced_data = {}
            for key, value in data.items():
                reduced_data[key] = reduce_json_structure(value, max_items)
            return reduced_data

    elif isinstance(data, list):
        if not data:
            return []

        # Keep first max_items elements
        reduced_items = [reduce_json_structure(item, max_items) for item in data[:max_items]]

        # Add metadata about the original array size if we truncated
        if len(data) > max_items:
            reduced_items.append(f"... (array had {len(data)} total items)")

        return reduced_items

    else:
        # Apply string truncation to leaf values
        return truncate_string(data)

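def get_json_size(data):
    """Get the byte size of JSON data when serialized compactly (no whitespace).

    json.dumps defaults to ensure_ascii=True, so non-ASCII characters are
    counted in their escaped ASCII form, not as raw UTF-8 bytes.
    """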
    return len(json.dumps(data, separators=(',', ':')).encode('utf-8'))

def format_size(size_bytes):
    """Format size in human-readable format"""
    if size_bytes < 1024:
        return f"{size_bytes}B"
    elif size_bytes < 1024 * 1024:
        return f"{size_bytes/1024:.1f}KB"
    else:
        return f"{size_bytes/(1024*1024):.1f}MB"

def format_size_kb(size_bytes):
    """Format size in KB with appropriate decimal precision for TUI"""
    kb = size_bytes / 1024.0
    if kb < 0.1:
        return f"{kb:.3f}KB"
    elif kb < 1.0:
        return f"{kb:.2f}KB"
    elif kb < 10.0:
        return f"{kb:.1f}KB"
    else:
        return f"{kb:.0f}KB"

class TreeNode:
    """Represents a node in the JSON tree structure"""
    def __init__(self, key: str, value: Any, path: str = "", parent: Optional['TreeNode'] = None):
        self.key = key
        self.value = value
        self.path = path
        self.parent = parent
        self.children: List['TreeNode'] = []
        self.expanded = False
        self.size = get_json_size(value)
        self.data_type = self._get_data_type()
        self.count = self._get_count()

    def _get_data_type(self) -> str:
        """Get the data type string for display"""
        if isinstance(self.value, dict):
            return "dict"
        elif isinstance(self.value, list):
            return "array"
        elif isinstance(self.value, str):
            return "string"
        elif isinstance(self.value, bool):
            # bool is a subclass of int, so it must be checked before the number test
            return "boolean"
        elif isinstance(self.value, (int, float)):
            return "number"
        elif self.value is None:
            return "null"
        else:
            return "unknown"

    def _get_count(self) -> int:
        """Get count for display (length for containers, char count for strings)"""
        if isinstance(self.value, (dict, list)):
            return len(self.value)
        elif isinstance(self.value, str):
            return len(self.value)
        else:
            return 1

    def build_children(self):
        """Build child nodes if not already built"""
        if self.children or not isinstance(self.value, (dict, list)):
            return

        if isinstance(self.value, dict):
            for key, val in self.value.items():
                child_path = f"{self.path}.{key}" if self.path else key
                child = TreeNode(str(key), val, child_path, self)
                self.children.append(child)
        elif isinstance(self.value, list):
            for i, val in enumerate(self.value):
                child_path = f"{self.path}[{i}]"
                child = TreeNode(f"[{i}]", val, child_path, self)
                self.children.append(child)

    def toggle_expand(self):
        """Toggle expansion state"""
        if isinstance(self.value, (dict, list)) and len(self.value) > 0:
            self.expanded = not self.expanded
            if self.expanded:
                self.build_children()

    def get_display_name(self) -> str:
        """Get the display name for this node"""
        if self.key == "<root>":
            return "<root>"
        return self.key

    def is_expandable(self) -> bool:
        """Check if this node can be expanded"""
        return isinstance(self.value, (dict, list)) and len(self.value) > 0

class JSONTreeTUI:
    """Terminal User Interface for JSON tree visualization"""

    def __init__(self, data: Any, filename: str):
        self.root = TreeNode("<root>", data)
        self.filename = filename
        self.current_node = 0
        self.scroll_offset = 0
        self.visible_nodes: List[TreeNode] = []
        self.total_size = get_json_size(data)

        # Color pairs
        self.COLOR_NORMAL = 1
        self.COLOR_SELECTED = 2
        self.COLOR_DICT = 3
        self.COLOR_ARRAY = 4
        self.COLOR_STRING = 5
        self.COLOR_NUMBER = 6
        self.COLOR_BOOLEAN = 7
        self.COLOR_NULL = 8
        self.COLOR_SIZE = 9
        self.COLOR_STATUS = 10

    def init_colors(self):
        """Initialize color pairs"""
        curses.start_color()
        curses.use_default_colors()

        curses.init_pair(self.COLOR_NORMAL, curses.COLOR_WHITE, -1)
        curses.init_pair(self.COLOR_SELECTED, curses.COLOR_BLACK, curses.COLOR_WHITE)
        curses.init_pair(self.COLOR_DICT, curses.COLOR_CYAN, -1)
        curses.init_pair(self.COLOR_ARRAY, curses.COLOR_YELLOW, -1)
        curses.init_pair(self.COLOR_STRING, curses.COLOR_GREEN, -1)
        curses.init_pair(self.COLOR_NUMBER, curses.COLOR_BLUE, -1)
        curses.init_pair(self.COLOR_BOOLEAN, curses.COLOR_MAGENTA, -1)
        curses.init_pair(self.COLOR_NULL, curses.COLOR_RED, -1)
        curses.init_pair(self.COLOR_SIZE, curses.COLOR_WHITE, -1)
        curses.init_pair(self.COLOR_STATUS, curses.COLOR_BLACK, curses.COLOR_CYAN)

    def get_type_color(self, data_type: str) -> int:
        """Get color pair for data type"""
        color_map = {
            "dict": self.COLOR_DICT,
            "array": self.COLOR_ARRAY,
            "string": self.COLOR_STRING,
            "number": self.COLOR_NUMBER,
            "boolean": self.COLOR_BOOLEAN,
            "null": self.COLOR_NULL,
        }
        return color_map.get(data_type, self.COLOR_NORMAL)

    def collect_visible_nodes(self, node: TreeNode, depth: int = 0) -> List[Tuple[TreeNode, int]]:
        """Collect all visible nodes with their depth"""
        result = [(node, depth)]

        if node.expanded:
            for child in node.children:
                result.extend(self.collect_visible_nodes(child, depth + 1))

        return result

    def get_tree_prefix(self, depth: int, is_last: bool, parent_prefixes: List[bool]) -> str:
        """Generate tree prefix with box-drawing characters"""
        if depth == 0:
            return ""

        prefix = ""
        for i in range(depth - 1):
            if parent_prefixes[i]:
                prefix += "│   "
            else:
                prefix += "    "

        if is_last:
            prefix += "└── "
        else:
            prefix += "├── "

        return prefix

    def draw_tree(self, stdscr):
        """Draw the tree structure with split panel layout"""
        height, width = stdscr.getmaxyx()
        tree_height = height - 2  # Reserve space for status bar

        # Calculate panel widths (60% tree, 40% value panel)
        tree_width = int(width * 0.6)
        value_width = width - tree_width - 1  # -1 for separator

        # Collect visible nodes
        visible_with_depth = self.collect_visible_nodes(self.root)
        self.visible_nodes = [node for node, _ in visible_with_depth]

        # Calculate scroll bounds
        if self.current_node >= len(self.visible_nodes):
            self.current_node = len(self.visible_nodes) - 1
        if self.current_node < 0:
            self.current_node = 0

        # Adjust scroll offset
        if self.current_node < self.scroll_offset:
            self.scroll_offset = self.current_node
        elif self.current_node >= self.scroll_offset + tree_height:
            self.scroll_offset = self.current_node - tree_height + 1

        # Clear screen
        stdscr.clear()

        # Draw tree panel
        for i in range(tree_height):
            node_index = self.scroll_offset + i
            if node_index >= len(visible_with_depth):
                break

            node, depth = visible_with_depth[node_index]

            # Determine if this is the last child at its level
            is_last = True
            if node.parent:
                siblings = node.parent.children
                is_last = siblings[-1] == node

            # Build parent prefixes for tree drawing
            parent_prefixes = []
            current = node.parent
            while current and current.parent:  # Skip root
                if current.parent.children[-1] != current:
                    parent_prefixes.insert(0, True)
                else:
                    parent_prefixes.insert(0, False)
                current = current.parent

            # Generate tree prefix
            tree_prefix = self.get_tree_prefix(depth, is_last, parent_prefixes)

            # Expansion indicator
            if node.is_expandable():
                expand_char = "▼" if node.expanded else "▶"
            else:
                expand_char = " "

            # Node display name
            display_name = node.get_display_name()

            # Size information
            size_str = format_size_kb(node.size)

            # Type and count info
            type_info = f"({node.data_type}"
            if node.count > 1 or node.data_type in ["dict", "array"]:
                type_info += f", {node.count}"
            type_info += ")"

            # Build display line
            line = f"{tree_prefix}{expand_char} {display_name} {size_str} {type_info}"

            # Truncate to fit tree panel
            if len(line) > tree_width - 1:
                line = line[:tree_width - 4] + "..."

            # Determine colors
            if node_index == self.current_node:
                attr = curses.color_pair(self.COLOR_SELECTED) | curses.A_BOLD
            else:
                attr = curses.color_pair(self.get_type_color(node.data_type))

            try:
                stdscr.addstr(i, 0, line, attr)
            except curses.error:
                pass  # Ignore errors from writing to edge of screen

        # Draw vertical separator
        for i in range(tree_height):
            try:
                stdscr.addstr(i, tree_width, "│", curses.color_pair(self.COLOR_NORMAL))
            except curses.error:
                pass

        # Draw value panel
        self.draw_value_panel(stdscr, tree_width + 1, value_width, tree_height)

        # Draw status bar
        self.draw_status_bar(stdscr)

        stdscr.refresh()

    def draw_value_panel(self, stdscr, start_x, panel_width, panel_height):
        """Draw the value panel showing the current node's content"""
        if self.current_node >= len(self.visible_nodes):
            return

        current_node = self.visible_nodes[self.current_node]

        # Panel header
        header = f" Value: {current_node.get_display_name()} "
        if len(header) > panel_width:
            header = header[:panel_width - 3] + "..."

        try:
            stdscr.addstr(0, start_x, header,
                         curses.color_pair(self.COLOR_STATUS) | curses.A_BOLD)
        except curses.error:
            pass

        # Format the value for display
        value_lines = self.format_value_for_display(current_node.value, panel_width - 2)

        # Display value lines
        for i, line in enumerate(value_lines[:panel_height - 2]):  # Leave space for header
            display_line = line[:panel_width - 1]  # Ensure it fits

            # Determine color based on content
            attr = curses.color_pair(self.COLOR_NORMAL)
            if line.strip().startswith('"') and line.strip().endswith('"'):
                attr = curses.color_pair(self.COLOR_STRING)
            elif line.strip() in ['true', 'false']:
                attr = curses.color_pair(self.COLOR_BOOLEAN)
            elif line.strip() == 'null':
                attr = curses.color_pair(self.COLOR_NULL)
            elif line.strip().replace('.', '').replace('-', '').isdigit():
                attr = curses.color_pair(self.COLOR_NUMBER)
            elif line.strip().startswith('{') or line.strip().startswith('['):
                attr = curses.color_pair(self.COLOR_DICT if line.strip().startswith('{') else self.COLOR_ARRAY)

            try:
                stdscr.addstr(i + 1, start_x + 1, display_line, attr)
            except curses.error:
                pass

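    def format_value_for_display(self, value, max_width):
        """Format a value for display in the value panel"""
        # Guard against very narrow panels: a tiny or non-positive width would
        # keep the wrapping loops below from ever terminating
        max_width = max(max_width, 4)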
        if isinstance(value, (dict, list)):
            # For containers, show formatted JSON
            try:
                json_str = json.dumps(value, indent=2, ensure_ascii=False)
                lines = json_str.split('\n')

                # Truncate very long output
                if len(lines) > 50:
                    lines = lines[:47] + ['...', f'({len(lines) - 47} more lines)']

                # Wrap long lines
                wrapped_lines = []
                for line in lines:
                    if len(line) <= max_width:
                        wrapped_lines.append(line)
                    else:
                        # Hard-wrap at max_width; continuation lines are indented
                        while len(line) > max_width:
                            wrapped_lines.append(line[:max_width])
                            line = '  ' + line[max_width:]  # Indent continuation
                        if line.strip():
                            wrapped_lines.append(line)

                return wrapped_lines
            except Exception:
                return [f"<{type(value).__name__} with {len(value)} items>"]

        elif isinstance(value, str):
            # For strings, show with quotes and handle long strings
            if len(value) > 1000:
                display_value = value[:1000] + "... (truncated)"
            else:
                display_value = value

            # Add quotes and wrap lines
            quoted_value = json.dumps(display_value, ensure_ascii=False)
            lines = []
            current_line = ""

            for char in quoted_value:
                if len(current_line) >= max_width - 1:
                    lines.append(current_line)
                    current_line = ""
                current_line += char

            if current_line:
                lines.append(current_line)

            return lines

        else:
            # For primitives, show as JSON
            json_str = json.dumps(value, ensure_ascii=False)
            if len(json_str) <= max_width:
                return [json_str]
            else:
                # Split long primitive values
                lines = []
                while len(json_str) > max_width:
                    lines.append(json_str[:max_width])
                    json_str = json_str[max_width:]
                if json_str:
                    lines.append(json_str)
                return lines

    def draw_status_bar(self, stdscr):
        """Draw the status bar at the bottom"""
        height, width = stdscr.getmaxyx()
        status_y = height - 1

        # Current path
        current_path = ""
        if self.current_node < len(self.visible_nodes):
            current_path = self.visible_nodes[self.current_node].path or "<root>"

        # Status information
        total_size_str = format_size_kb(self.total_size)
        status_left = f"Path: {current_path}"
        status_right = f"Size: {total_size_str} | ↑↓:Navigate Enter:Expand/Collapse q:Quit"

        # Truncate left side if needed
        available_width = width - len(status_right) - 3
        if len(status_left) > available_width:
            status_left = status_left[:available_width - 3] + "..."

        # Create full status line
        padding = width - len(status_left) - len(status_right)
        status_line = status_left + " " * padding + status_right

        try:
            stdscr.addstr(status_y, 0, status_line[:width],
                         curses.color_pair(self.COLOR_STATUS) | curses.A_BOLD)
        except curses.error:
            pass

    def handle_input(self, key: int) -> bool:
        """Handle keyboard input. Returns False to quit."""
        if key == ord('q') or key == ord('Q'):
            return False
        elif key == curses.KEY_UP:
            if self.current_node > 0:
                self.current_node -= 1
        elif key == curses.KEY_DOWN:
            if self.current_node < len(self.visible_nodes) - 1:
                self.current_node += 1
        elif key == curses.KEY_ENTER or key == 10 or key == 13:
            if self.current_node < len(self.visible_nodes):
                self.visible_nodes[self.current_node].toggle_expand()
        elif key == curses.KEY_LEFT:
            # Collapse current node or go to parent
            if self.current_node < len(self.visible_nodes):
                node = self.visible_nodes[self.current_node]
                if node.expanded:
                    node.expanded = False
                elif node.parent and node.parent != self.root:
                    # Find parent in visible nodes
                    for i, visible_node in enumerate(self.visible_nodes):
                        if visible_node == node.parent:
                            self.current_node = i
                            break
        elif key == curses.KEY_RIGHT:
            # Expand current node
            if self.current_node < len(self.visible_nodes):
                node = self.visible_nodes[self.current_node]
                if node.is_expandable() and not node.expanded:
                    node.toggle_expand()

        return True

    def run(self, stdscr):
        """Main TUI loop"""
        self.init_colors()
        curses.curs_set(0)  # Hide cursor
        stdscr.keypad(True)  # Enable special keys

        while True:
            self.draw_tree(stdscr)
            key = stdscr.getch()

            if not self.handle_input(key):
                break

def run_tui(data: Any, filename: str):
    """Run the TUI application"""
    tui = JSONTreeTUI(data, filename)
    curses.wrapper(tui.run)

def analyze_json_structure(data, path="", current_depth=0, target_depth=1):
    """
    Analyze JSON structure up to specified depth and return size information.
    Returns a list of (path, size, type, count, depth) tuples.
    """
    results = []
    current_size = get_json_size(data)

    if isinstance(data, dict):
        # Add info about this dictionary
        results.append((path, current_size, "dict", len(data), current_depth))

        # Only recurse if we haven't reached target depth
        if current_depth < target_depth:
            for key, value in data.items():
                key_path = f"{path}.{key}" if path else key
                results.extend(analyze_json_structure(value, key_path, current_depth + 1, target_depth))

    elif isinstance(data, list):
        # Add info about this array
        results.append((path, current_size, "array", len(data), current_depth))

        # Only recurse if we haven't reached target depth
        if current_depth < target_depth:
            # Analyze first few items to show structure variety
            for i, item in enumerate(data[:3]):  # Only analyze first 3 items
                item_path = f"{path}[{i}]"
                results.extend(analyze_json_structure(item, item_path, current_depth + 1, target_depth))

            if len(data) > 3:
                # Add summary for remaining items at this depth level
                remaining_size = sum(get_json_size(item) for item in data[3:])
                results.append((f"{path}[3...{len(data)-1}]", remaining_size, "array_tail", len(data) - 3, current_depth + 1))

    else:
        # Leaf value - only show if we're at or below target depth
        if current_depth <= target_depth:
            value_type = type(data).__name__
            if isinstance(data, str):
                count = len(data)  # String length
            else:
                count = 1
            results.append((path, current_size, value_type, count, current_depth))

    return results

def print_analysis(results, target_depth):
    """Print the analysis results as an indented table, grouped by depth"""
    # Sort by depth first, then by size within each depth level
    results.sort(key=lambda x: (x[4], -x[1]))  # x[4] is depth, x[1] is size

    print(f"JSON Structure Analysis (depth: {target_depth})")
    print("=" * 70)
    print(f"{'Path':<45} {'Size':<10} {'Type':<12} {'Count'}")
    print("-" * 70)

    for path, size, data_type, count, depth in results:
        # Add indentation based on actual depth
        indent = "  " * depth

        # Get the display name (last dot-separated part of the path)
        if path == "":
            display_name = "<root>"
        else:
            display_name = path.split('.')[-1]

        display_path = indent + display_name

        # Truncate very long paths
        if len(display_path) > 43:
            display_path = display_path[:40] + "..."

        size_str = format_size(size)
        count_str = str(count) if count != 1 or data_type in ['dict', 'array', 'array_tail'] else ""

        print(f"{display_path:<45} {size_str:<10} {data_type:<12} {count_str}")

def main():
    """Main function with argument parsing"""
    parser = argparse.ArgumentParser(
        description="Analyze and reduce JSON file structure",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  json-structure data.json              # Reduce structure for LLM analysis (1 item per array)
  json-structure -d 3 data.json         # Keep 3 items per array/dict
  json-structure -a 1 data.json         # Analyze top-level keys only
  json-structure -a 2 data.json         # Analyze 2 levels deep
  json-structure -t data.json           # Interactive TUI mode
  json-structure --tui data.json        # Same as -t
        """
    )

    parser.add_argument('file_path', help='Path to JSON file')
    parser.add_argument('-d', '--depth', type=int, default=1, metavar='N',
                       help='Number of items to keep in arrays and array-like dictionaries (default: 1)')
    parser.add_argument('-a', '--analyzer', type=int, metavar='DEPTH',
                       help='Analyzer mode: show structure tree with size information up to DEPTH levels')
    parser.add_argument('-t', '--tui', action='store_true',
                       help='Interactive TUI mode: navigate JSON structure with keyboard')

    args = parser.parse_args()

    # Load JSON file
    try:
        with open(args.file_path, 'r', encoding='utf-8') as file:
            data = json.load(file)
    except FileNotFoundError:
        print(f"Error: File '{args.file_path}' not found.", file=sys.stderr)
        sys.exit(1)
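    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in '{args.file_path}': {e}", file=sys.stderr)
        sys.exit(1)
    except OSError as e:
        # Covers unreadable paths such as directories or permission problems
        print(f"Error: Could not read '{args.file_path}': {e}", file=sys.stderr)
        sys.exit(1)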

    if args.tui:
        # TUI mode - interactive tree navigation
        try:
            run_tui(data, args.file_path)
        except KeyboardInterrupt:
            print("\nExited by user.")
        except Exception as e:
            print(f"TUI Error: {e}", file=sys.stderr)
            sys.exit(1)
    elif args.analyzer is not None:
        # Analyzer mode - show structure and sizes
        depth = args.analyzer if args.analyzer > 0 else 1
        results = analyze_json_structure(data, target_depth=depth)
        print_analysis(results, depth)
    else:
        # Normal mode - reduce structure
        max_items = max(1, args.depth)  # Ensure at least 1 item
        reduced_data = reduce_json_structure(data, max_items)
        output = json.dumps(reduced_data, indent=4, ensure_ascii=False)
        print(output)

if __name__ == "__main__":
    main()