@LalatenduMohanty
Last active November 21, 2025 14:04
--test-mode Design Document

Table of Contents

  1. Overview
  2. Problem Statement
  3. Design Goals
  4. Architecture
  5. Implementation Details
  6. API Design
  7. Error Handling
  8. Performance Considerations
  9. Security Considerations
  10. Testing Strategy
  11. Real-World Example
  12. Conclusion
  13. Limitations and Future Work

Overview

--test-mode enables resilient bootstrap by continuing processing after package failures, attempting to download pre-built wheels as fallback, and reporting all failures at completion.

⚠️ Important: Test mode is only supported with bootstrap (serial mode), not bootstrap-parallel. This ensures comprehensive failure detection including wheel compilation failures. Most packages will use cached wheels, so serial mode performance is acceptable for testing scenarios.

Problem Statement

Currently, bootstrap stops at the first package failure, which prevents discovering every problematic package in a single run.

Failure Types Detected

With bootstrap --test-mode, all failure types are detected:

  • Wheel compilation failures (C extensions, Rust, missing build tools, etc.)
  • Source preparation failures (download, unpack, patch)
  • Dependency resolution failures
  • Build system requirement failures

Design Goals

  1. Complete Discovery: Identify all failures in single run
  2. Efficient Processing: Each package is processed once, so run time grows linearly, O(n), with the number of packages
  3. Clear Reporting: Actionable failure information
  4. Compatibility: Maintain existing workflow behavior
  5. Simplicity: Serial mode only; most packages come from the wheel cache, so performance is acceptable

Architecture

High-Level Design

graph TD
    A[Bootstrap Command] --> B{Test Mode?}
    B -->|Yes| C[Test Mode Bootstrap]
    B -->|No| D[Normal Bootstrap]
    
    C --> E[Process Package]
    E --> F{Build Success?}
    F -->|Yes| G[Continue to Next]
    F -->|No| H[Attempt Pre-built Download]
    H --> I{Download Success?}
    I -->|Yes| G
    I -->|No| J[Add to Failed Set]
    J --> G
    
    G --> K{More Packages?}
    K -->|Yes| E
    K -->|No| L[Generate Report]
    L --> M[Exit with Status]
    
    D --> N[Process Package]
    N --> O{Build Success?}
    O -->|Yes| P[Continue to Next]
    O -->|No| Q[Fail Fast]
    
    P --> R{More Packages?}
    R -->|Yes| N
    R -->|No| S[Complete]

Core Components

  1. CLI Interface: --test-mode flag with error handling and reporting
  2. Bootstrapper Class: Test mode state and failure tracking (package settings are not modified at runtime)
  3. Build Pipeline: Individual package processing with pre-built fallback strategy

Implementation Details

Key Data Structures

@dataclasses.dataclass
class BuildResult:
    wheel_filename: pathlib.Path | None = None
    sdist_filename: pathlib.Path | None = None
    unpack_dir: pathlib.Path | None = None
    source_url_type: str = "unknown"
    sdist_root_dir: pathlib.Path | None = None
    build_env: build_environment.BuildEnvironment | None = None
    failed: bool = False
    
    # Context fields for error tracking and reporting
    req: Requirement | None = None
    resolved_version: Version | None = None
    exception: Exception | None = None
    exception_type: str | None = None  # Serializable: exception.__class__.__name__
    exception_message: str | None = None  # Serializable: str(exception)

# In Bootstrapper.__init__():
self.failed_builds: list[BuildResult] = []  # Replaces old self.failed_packages: set[str]
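
The algorithm below calls `BuildResult.failure()`, which the listing above does not show. A minimal sketch of what such a constructor might look like follows; it uses a trimmed-down stand-in class with plain strings in place of `Requirement` and `Version`, so it is an illustration under those assumptions rather than the actual implementation.

```python
from __future__ import annotations

import dataclasses


@dataclasses.dataclass
class BuildResult:
    """Trimmed-down stand-in: strings replace Requirement and Version."""

    failed: bool = False
    req: str | None = None
    resolved_version: str | None = None
    exception: Exception | None = None
    exception_type: str | None = None
    exception_message: str | None = None

    @classmethod
    def failure(cls, req: str, resolved_version: str, exception: Exception) -> BuildResult:
        """Create a failed result and copy the serializable exception details."""
        return cls(
            failed=True,
            req=req,
            resolved_version=resolved_version,
            exception=exception,
            exception_type=exception.__class__.__name__,
            exception_message=str(exception),
        )


result = BuildResult.failure("lxml", "4.9.0", ValueError("No matching wheel found"))
print(result.exception_type, result.exception_message)
```

Storing the type name and message as strings alongside the exception object keeps the result easy to serialize even when the exception itself is not.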

Core Algorithm

Single-Pass Processing: Each package is processed once, with immediate failure handling and a pre-built fallback.

def _build_package(self, req, resolved_version, pbi, build_sdist_only) -> BuildResult:
    """Build or download package - handles test mode failures gracefully."""
    try:
        # Attempt normal build process
        return self._build_wheel_and_sdist(req, resolved_version, pbi, build_sdist_only)
    except Exception as build_error:
        if not self.test_mode:
            raise  # Re-raise in normal mode (fail-fast behavior)
        
        # Test mode: try pre-built fallback
        logger.warning("test mode: build failed for %s==%s, attempting fallback to pre-built", 
                      req.name, resolved_version, exc_info=True)
        
        try:
            # Directly resolve and download pre-built wheel
            wheel_url, _ = self._resolve_prebuilt_with_history(
                req=req, req_type=RequirementType.TOP_LEVEL
            )
            wheel_filename, unpack_dir = self._download_prebuilt(
                req=req,
                req_type=RequirementType.TOP_LEVEL,
                resolved_version=resolved_version,
                wheel_url=wheel_url,
            )
            logger.info("test mode: successfully handled %s as pre-built after build failure", req.name)
            return BuildResult(
                wheel_filename=wheel_filename,
                unpack_dir=unpack_dir,
                source_url_type=str(SourceType.PREBUILT),
            )
        except Exception as prebuilt_error:
            # Even pre-built fallback failed - track failure with full context
            logger.error("test mode: failed to handle %s as pre-built: %s", 
                        req.name, prebuilt_error, exc_info=True)
            
            # Create failure result with captured exception details
            result = BuildResult.failure(
                req=req, 
                resolved_version=resolved_version, 
                exception=build_error  # Original build error, not prebuilt fallback error
            )
            self.failed_builds.append(result)
            return result

How it works:

  1. Single Attempt: Each package gets exactly one build attempt
  2. Immediate Failure Handling: Exception caught immediately, no retry loops
  3. Test Mode Check: Only applies fallback logic in test mode
  4. Pre-built Fallback: Directly resolve and download pre-built wheel from wheel servers
  5. Failure Tracking: Capture full context (requirement, version, exception) for detailed reporting
  6. Continue Processing: Return result (success or failure) and continue to next package
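
The steps above can be exercised in isolation with a small simulation. This is a sketch only: `process_all`, `fake_build`, and `fake_prebuilt` are hypothetical names, and plain callables stand in for the real build and download steps.

```python
from __future__ import annotations

from typing import Callable


def process_all(
    packages: list[str],
    build: Callable[[str], str],
    fetch_prebuilt: Callable[[str], str],
    test_mode: bool,
) -> tuple[dict[str, str], list[tuple[str, Exception]]]:
    """Process each package once; return (wheels by package, recorded failures)."""
    wheels: dict[str, str] = {}
    failures: list[tuple[str, Exception]] = []
    for name in packages:
        try:
            wheels[name] = build(name)
            continue
        except Exception as build_error:
            if not test_mode:
                raise  # normal mode keeps fail-fast behavior
            saved_error = build_error  # keep a reference past the except block
        try:
            wheels[name] = fetch_prebuilt(name)
        except Exception:
            # Record the original build error, not the fallback error.
            failures.append((name, saved_error))
    return wheels, failures


def fake_build(name: str) -> str:
    if name in {"lxml", "cryptography"}:
        raise RuntimeError(f"cannot compile {name}")
    return f"{name}-built.whl"


def fake_prebuilt(name: str) -> str:
    if name == "lxml":
        raise LookupError("no matching wheel")
    return f"{name}-prebuilt.whl"


wheels, failures = process_all(
    ["pbr", "cryptography", "lxml"], fake_build, fake_prebuilt, test_mode=True
)
print(wheels)
print([name for name, _ in failures])
```

In this simulation only `lxml` ends up in the failure list, because `cryptography` is recovered by the fallback.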

Pre-built Fallback Mechanism: When a build fails in test mode, the system directly attempts to resolve and download a pre-built wheel from configured wheel servers.

# In Bootstrapper class
def _resolve_prebuilt_with_history(
    self,
    req: Requirement,
    req_type: RequirementType,
) -> tuple[str, Version]:
    """Resolve pre-built wheel URL, checking previous bootstrap graph first."""
    # Check previous bootstrap graph for cached resolution
    cached_resolution = self._resolve_from_graph(
        req=req,
        req_type=req_type,
        pre_built=True,
    )
    
    if cached_resolution and not req.url:
        wheel_url, resolved_version = cached_resolution
        logger.debug("resolved from previous bootstrap to %s", resolved_version)
    else:
        # Resolve from wheel servers (PyPI, cache server, etc.)
        servers = wheels.get_wheel_server_urls(
            self.ctx, req, cache_wheel_server_url=resolver.PYPI_SERVER_URL
        )
        wheel_url, resolved_version = wheels.resolve_prebuilt_wheel(
            ctx=self.ctx, req=req, wheel_server_urls=servers, req_type=req_type
        )
    return (wheel_url, resolved_version)

def _download_prebuilt(
    self,
    req: Requirement,
    req_type: RequirementType,
    resolved_version: Version,
    wheel_url: str,
) -> tuple[pathlib.Path, pathlib.Path]:
    """Download pre-built wheel and unpack metadata."""
    logger.info("%s requirement %s uses a pre-built wheel", req_type, req)
    
    wheel_filename = wheels.download_wheel(req, wheel_url, self.ctx.wheels_prebuilt)
    unpack_dir = self._create_unpack_dir(req, resolved_version)
    # Update the wheel mirror so pre-built wheels are indexed
    # and available to subsequent builds that need them as dependencies
    server.update_wheel_mirror(self.ctx)
    return (wheel_filename, unpack_dir)

How it works:

  1. Direct Resolution: When build fails, directly resolve pre-built wheel URL from wheel servers
  2. History Check: First checks previous bootstrap graph for cached resolutions
  3. Server Fallback: Falls back to resolving from configured wheel servers (PyPI, cache server, etc.)
  4. Download and Index: Downloads wheel and updates wheel mirror for subsequent builds
  5. No State Mutation: Does not modify package settings or mark packages as pre-built
  6. Simple and Efficient: Direct approach without runtime state management
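
The "history first, servers second" order described above can be illustrated without any fromager internals. In this sketch, `resolve_prebuilt`, `previous_graph`, and `fake_server_lookup` are hypothetical names, a plain dictionary stands in for the previous bootstrap graph, and the URLs are placeholders.

```python
from __future__ import annotations

from typing import Callable


def resolve_prebuilt(
    name: str,
    previous_graph: dict[str, tuple[str, str]],
    resolve_from_servers: Callable[[str], tuple[str, str]],
    has_explicit_url: bool = False,
) -> tuple[str, str]:
    """Return (wheel_url, version), preferring a resolution recorded in a previous run."""
    cached = previous_graph.get(name)
    if cached and not has_explicit_url:
        return cached
    # No usable history: ask the configured wheel servers.
    return resolve_from_servers(name)


calls: list[str] = []


def fake_server_lookup(name: str) -> tuple[str, str]:
    calls.append(name)
    return (f"https://wheels.example.invalid/{name}-2.0-py3-none-any.whl", "2.0")


graph = {"pbr": ("https://wheels.example.invalid/pbr-1.0-py3-none-any.whl", "1.0")}

print(resolve_prebuilt("pbr", graph, fake_server_lookup))   # served from history
print(resolve_prebuilt("lxml", graph, fake_server_lookup))  # falls through to the servers
```

Checking history first keeps repeated runs reproducible and avoids network lookups for packages that were already resolved.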

API Design

CLI Interface

Command Line Options

@click.option(
    "--test-mode",
    "test_mode",
    is_flag=True,
    default=False,
    help="Test mode: continue after build failures, fall back to pre-built wheels, report all failures at end",
)

Usage Examples

# Basic test mode usage
fromager bootstrap --test-mode package1 package2 package3

# Test mode with requirements file
fromager bootstrap --test-mode -r requirements.txt

# Test mode with requirements and constraints files
fromager -c constraints.txt bootstrap --test-mode -r requirements.txt

# Test mode discovers all build failures in serial mode
# Most packages will use cached wheels, so performance is acceptable
fromager bootstrap --test-mode -r large-requirements.txt

Programmatic Interface

# Bootstrapper initialization
Bootstrapper(ctx: WorkContext, test_mode: bool = False)

# Package build method with full type annotations
def _build_package(
    self,
    req: Requirement,
    resolved_version: Version,
    pbi: PackageBuildInfo,
    build_sdist_only: bool
) -> BuildResult

Error Handling

Recovery Strategy

  1. Build Failure: Attempt to download pre-built wheel from configured servers
  2. Fallback Failure: Return BuildResult.failure(), continue processing
  3. Dependency Failure: Skip dependencies for failed packages
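
A sketch of the third rule, skipping the dependencies of failed packages, is shown below. It is a simplification: the set of failed packages is given up front, whereas a real run discovers failures while processing, and `walk_dependencies` is a hypothetical helper.

```python
from __future__ import annotations


def walk_dependencies(
    roots: list[str],
    dependencies: dict[str, list[str]],
    failed: set[str],
) -> list[str]:
    """Return packages in visit order without descending into failed packages."""
    visited: list[str] = []
    seen: set[str] = set()
    stack = list(reversed(roots))
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        visited.append(name)
        if name in failed:
            continue  # dependencies of a failed package are skipped
        stack.extend(reversed(dependencies.get(name, [])))
    return visited


deps = {"a": ["c"], "b": ["d"]}
print(walk_dependencies(["a", "b"], deps, failed={"a"}))
```

Here `c` is never visited because it is only reachable through the failed package `a`, while `b` and its dependency `d` are still processed.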

Performance Considerations

Time Complexity

  • Single-Pass Processing: O(n) where n = number of packages
  • Each package processed exactly once
  • No retry loops or backtracking
  • Efficient failure handling with immediate fallback

Serial Mode Performance

  • Cached Wheels: Most packages use cached wheels (fast downloads)
  • Build-Required Packages: Only packages requiring compilation are slow
  • Acceptable for Testing: Complete failure discovery justifies serial processing
  • Not for Production: For production builds, use regular bootstrap mode

Memory Usage

  • Failed Builds List: O(f) where f = number of failures
  • Typical Case: f << n (failures much less than total packages)
  • Lightweight Objects: BuildResult contains metadata only, no source code
  • Minimal Overhead: Memory usage proportional to failure count, not total packages

Comparison with Parallel Mode

  • Serial Mode: Detects failures in every phase (source preparation and wheel builds)
  • Parallel Mode: Would detect only source preparation failures, an estimated ~20% of all failures
  • Trade-off: Slightly slower but complete failure discovery

Security Considerations

Exception Information Disclosure

  • Exception messages may contain sensitive path information
  • Logs should be reviewed before sharing publicly
  • Consider sanitizing paths in exception messages for public reports
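
One possible way to sanitize paths before sharing a report is to replace known directory prefixes with placeholders. This is a minimal sketch; `sanitize_message` and the placeholder names are assumptions, not part of the existing design.

```python
from __future__ import annotations


def sanitize_message(message: str, replacements: dict[str, str]) -> str:
    """Replace known directory prefixes in a message with neutral placeholders."""
    # Handle longer prefixes first so nested directories win over their parents.
    for prefix in sorted(replacements, key=len, reverse=True):
        message = message.replace(prefix, replacements[prefix])
    return message


raw = "cannot open /home/alice/work/sdists/lxml/setup.py"
clean = sanitize_message(
    raw,
    {"/home/alice": "<home>", "/home/alice/work": "<work-dir>"},
)
print(clean)
```

Prefix replacement only hides paths that are known in advance, so logs should still be reviewed before they are published.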

Pre-built Package Fallback

  • Pre-built fallback directly downloads wheels from configured servers
  • Does not modify package settings or configuration files
  • Only attempts download when build fails in test mode
  • No persistent state changes, scoped to current bootstrap run only

Wheel Download Security

  • Pre-built fallback downloads from configured wheel servers
  • Uses existing authentication and verification mechanisms
  • No new attack vectors introduced
  • Follows same security policies as normal bootstrap

Testing Strategy

Core Tests

Tests in tests/test_bootstrap_test_mode.py:

  1. test_test_mode_tracks_complete_failures

    • Verifies test mode tracks failures with full context (req, version, exception)
    • Tests pre-built fallback mechanism when build fails
    • Validates failed_builds list population when both build and fallback fail
  2. test_normal_mode_still_fails_fast

    • Ensures test_mode=False preserves fail-fast behavior
    • Regression test for existing functionality
  3. test_build_result_captures_exception_context

    • Tests BuildResult.failure() enhancement
    • Verifies exception_type and exception_message fields

Real-World Example

Scenario: Testing a Large Requirements File

# Start test mode bootstrap
$ fromager bootstrap --test-mode -r requirements.txt

INFO stevedore: building wheel...
INFO pbr: building wheel...
WARNING test mode: build failed for cryptography==41.0.0, attempting fallback to pre-built
INFO test mode: successfully handled cryptography as pre-built after build failure
WARNING test mode: build failed for lxml==4.9.0, attempting fallback to pre-built
ERROR test mode: failed to handle lxml as pre-built: No matching wheel found
WARNING test mode: build failed for pillow==10.0.0, attempting fallback to pre-built
ERROR test mode: failed to handle pillow as pre-built: RuntimeError: zlib not found

...

ERROR test mode: the following packages failed to build:
ERROR   - cryptography==41.0.0
ERROR     Error: CalledProcessError: Command '...' returned non-zero exit status 1
ERROR   - lxml==4.9.0
ERROR     Error: ValueError: No matching wheel found
ERROR   - pillow==10.0.0
ERROR     Error: RuntimeError: zlib development headers not found

ERROR
ERROR test mode: failure breakdown by type:
ERROR   CalledProcessError: 1 package(s)
ERROR   RuntimeError: 1 package(s)
ERROR   ValueError: 1 package(s)

ERROR test mode: 3 package(s) failed to build
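
The summary section of this output could be produced from the recorded failures roughly as follows. This is a sketch, not the actual reporting code: `format_failure_report` is a hypothetical helper, and each failure is reduced to a tuple of (name, version, exception type, exception message).

```python
from __future__ import annotations

from collections import Counter


def format_failure_report(failures: list[tuple[str, str, str, str]]) -> list[str]:
    """Build the per-package list, the breakdown by exception type, and the total."""
    lines = ["test mode: the following packages failed to build:"]
    for name, version, exc_type, exc_message in failures:
        lines.append(f"  - {name}=={version}")
        lines.append(f"    Error: {exc_type}: {exc_message}")
    lines.append("test mode: failure breakdown by type:")
    counts = Counter(exc_type for _, _, exc_type, _ in failures)
    for exc_type, count in sorted(counts.items()):
        lines.append(f"  {exc_type}: {count} package(s)")
    lines.append(f"test mode: {len(failures)} package(s) failed to build")
    return lines


sample = [
    ("lxml", "4.9.0", "ValueError", "No matching wheel found"),
    ("pillow", "10.0.0", "RuntimeError", "zlib development headers not found"),
]
for line in format_failure_report(sample):
    print(line)
```

Grouping by exception type gives a quick view of whether failures share a root cause.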

Actionable Insights

This output helps you:

  1. Install missing build dependencies

    # From the error messages, we know we need:
    sudo apt-get install libssl-dev libxml2-dev libxslt1-dev zlib1g-dev
  2. Verify wheel availability

    • cryptography==41.0.0: Pre-built wheel available (fallback succeeded)
    • lxml==4.9.0: No wheel available, must build from source
    • pillow==10.0.0: Pre-built fallback failed; the source build may succeed once the zlib headers are installed
  3. Complete failure list in one run

    • No need to fix one issue and re-run repeatedly
    • All 3 problems discovered in a single bootstrap attempt
    • Save time by batching fixes (install all missing dependencies at once)

Conclusion

The --test-mode feature enables resilient bootstrap processing with comprehensive failure discovery. Key design decisions:

  1. Single-pass processing for O(n) efficiency
  2. Direct pre-built wheel download for immediate fallback without state mutation
  3. Comprehensive error reporting for actionable information
  4. Compatibility with existing workflows
  5. Serial mode only - simplifies implementation while catching all failure types

Scope: Test mode is only supported with bootstrap (serial mode):

  • Detects all failure types: wheel compilation, source preparation, and dependency resolution
  • Serial mode performance is acceptable since most packages use cached wheels
  • Simpler implementation without the complexity of parallel build coordination

Rationale for Serial Only: Adding test mode to bootstrap-parallel would:

  • Only catch ~20% of failures (source prep), missing ~80% (wheel builds)
  • Add significant complexity to parallel build phase
  • Give false confidence about test coverage

This design enables complete failure discovery in a single run while keeping the implementation maintainable.

Limitations and Future Work

Current Limitations

  1. Serial Mode Only

    • Not supported in bootstrap-parallel
    • Rationale: Parallel mode would only catch ~20% of failures (source prep phase)
    • Wheel compilation failures (~80% of issues) would be missed in parallel mode
    • Serial mode provides complete failure visibility
  2. Fixed Retry Strategy

    • Each package gets one build attempt, then a single pre-built fallback attempt
    • No configurable retry policies
  3. Human-Readable Output Only

    • Logs are formatted for human consumption
    • No machine-readable output format (e.g., JSON)
    • Integration with CI/CD tools requires log parsing
  4. No Resume Capability

    • Cannot resume from last successful package
    • Must reprocess all packages on subsequent runs
    • No checkpoint/state persistence between runs

Future Enhancements

  1. Machine-Readable Output

    fromager bootstrap --test-mode --json-output failures.json -r requirements.txt
    • Export failure details in JSON format for tooling integration
    • Enable automated analysis and reporting
    • Support CI/CD pipeline integration
  2. Configurable Retry Policies

    # settings.yaml
    test_mode:
      max_retries: 3
      retry_delay: 5
      fallback_strategies:
        - pre_built_wheel
        - cached_wheel
        - skip
  3. Resume from Checkpoint

    fromager bootstrap --test-mode --resume-from checkpoint.json -r requirements.txt
    • Save progress to checkpoint file
    • Resume from last successful package
    • Useful for large requirement files with long build times
  4. Enhanced Failure Classification

    • Categorize failures by root cause (missing build tools, dependency issues, etc.)
    • Suggest specific remediation steps
    • Group related failures for batch fixing
  5. CI/CD Integration

    # GitHub Actions annotations
    fromager bootstrap --test-mode --github-actions -r requirements.txt
    
    # GitLab CI format
    fromager bootstrap --test-mode --gitlab-ci -r requirements.txt
    • Native integration with GitHub Actions workflow commands
    • GitLab CI test reports
    • Better visibility in CI/CD environments
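
To make the machine-readable output idea from item 1 concrete, the failure records could be serialized with the standard `json` module. The schema below (`failed_count`, `failures`, and the per-failure keys) is purely hypothetical; the design above does not define one.

```python
from __future__ import annotations

import json


def failures_to_json(failures: list[dict[str, str]]) -> str:
    """Serialize failure records into a JSON report (hypothetical schema)."""
    report = {"failed_count": len(failures), "failures": failures}
    return json.dumps(report, indent=2, sort_keys=True)


example = [
    {
        "package": "lxml",
        "version": "4.9.0",
        "exception_type": "ValueError",
        "exception_message": "No matching wheel found",
    },
]
print(failures_to_json(example))
```

The `exception_type` and `exception_message` fields on `BuildResult` are already plain strings, which is what makes this kind of export straightforward.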