Container Image Testing Design

This document describes the testing strategy for ODH base container images.

Goals

  1. Validate built images - Ensure images meet quality standards before merge
  2. Fast feedback - Tests run on every PR as pre-merge checks
  3. Local reproducibility - Developers can run the same tests locally
  4. No special hardware - Tests run on standard CI runners (no GPU required)

Test Architecture

┌─────────────────────────────────────────────────────────────┐
│                      GitHub Actions                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ Lint Job    │  │ Build Python│  │ Build CUDA          │  │
│  │ (hadolint)  │  │ + Test      │  │ + Test (no GPU)     │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    pytest test suite                        │
│  ┌─────────────────┐  ┌─────────────────────────────────┐   │
│  │ conftest.py     │  │ test_python_image.py            │   │
│  │ - fixtures      │  │ test_cuda_image.py              │   │
│  │ - podman helper │  └─────────────────────────────────┘   │
│  └─────────────────┘                                        │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│              Container Runtime (podman)                     │
│  ┌─────────────────┐       ┌─────────────────┐              │
│  │ Python Image    │       │ CUDA Image      │              │
│  │ (under test)    │       │ (under test)    │              │
│  └─────────────────┘       └─────────────────┘              │
└─────────────────────────────────────────────────────────────┘

Test Categories

All tests below are blocking - PRs cannot merge if any fail.

1. Smoke Tests

Basic sanity checks that the image starts and core tools work.

| Test | Command | Expected |
| --- | --- | --- |
| Python version | python --version | Python 3.12.x |
| pip available | pip --version | Exit 0 |
| uv available | uv --version | Exit 0 |

2. User & Permission Tests

Verify OpenShift compatibility (non-root user, correct permissions).

| Test | Check | Expected |
| --- | --- | --- |
| User ID | id -u | 1001 |
| Group ID | id -g | 0 (root group) |
| Not running as root | whoami | Not root |
| Workdir writable | touch /opt/app-root/src/test | Exit 0 |

3. Configuration Tests

Verify package index configuration files exist and are valid.

| Test | Check | Expected |
| --- | --- | --- |
| pip.conf exists | /etc/pip.conf | File exists |
| pip.conf valid | /etc/pip.conf | Contains [global] |
| uv.toml exists | /etc/uv/uv.toml | File exists |
| UV_CONFIG_FILE set | printenv UV_CONFIG_FILE | /etc/uv/uv.toml |

4. Image Metadata Tests

Verify Dockerfile directives are set correctly via podman inspect.

| Directive | Check | Expected |
| --- | --- | --- |
| WORKDIR | .Config.WorkingDir | /opt/app-root/src |
| USER | .Config.User | 1001 |

5. Environment Variable Tests

Verify expected environment variables are set.

| Variable | Expected Value |
| --- | --- |
| HOME | /opt/app-root/src |
| PATH | Contains /opt/app-root/bin |
| PYTHONDONTWRITEBYTECODE | 1 |
| PYTHONUNBUFFERED | 1 |
| PIP_NO_CACHE_DIR | 1 |
| UV_SYSTEM_PYTHON | 1 |

6. OCI Label Tests

Verify required OCI and OpenShift labels are present.

| Label | Expected |
| --- | --- |
| name | Image name set |
| version | Version string set |
| io.k8s.display-name | Kubernetes display name |
| org.opencontainers.image.source | GitHub URL |
| com.opendatahub.accelerator | cpu or cuda |
| com.opendatahub.python | 3.12 |
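
A hedged sketch of how these label checks might be written in test_common.py, assuming the parameterized container fixture and the get_labels() helper shown later in this document:

# Sketch of common label tests for test_common.py; assumes the container
# fixture and the ContainerRunner.get_labels() helper defined below.

REQUIRED_LABELS = [
    "name",
    "version",
    "io.k8s.display-name",
    "org.opencontainers.image.source",
    "com.opendatahub.accelerator",
    "com.opendatahub.python",
]


def test_required_labels_present(container):
    labels = container.get_labels()
    missing = [label for label in REQUIRED_LABELS if not labels.get(label)]
    assert not missing, f"Missing or empty labels: {missing}"


def test_accelerator_label_value(container):
    assert container.get_labels().get("com.opendatahub.accelerator") in ("cpu", "cuda")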

7. File System Structure Tests

Verify expected directories and files exist.

| Path | Type | Expected |
| --- | --- | --- |
| /opt/app-root/src | Directory | Exists, is WORKDIR |
| /etc/pip.conf | File | Exists |
| /etc/uv/uv.toml | File | Exists |
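
The pip.conf and uv.toml checks appear in test_common.py below; the directory check could use the dir_exists() helper from conftest.py, for example (a sketch):

def test_app_root_src_dir_exists(container):
    assert container.dir_exists("/opt/app-root/src")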

8. Security Tests

Basic security posture checks.

| Test | Check | Expected |
| --- | --- | --- |
| User is non-root | Container starts as UID 1001 | Not root |
| Sensitive files protected | cat /etc/shadow | Permission denied |

9. CUDA-Specific Tests

Additional tests for CUDA image (no GPU required).

| Test | Check | Expected |
| --- | --- | --- |
| CUDA_VERSION | Environment variable | 12.8.x |
| NVIDIA_VISIBLE_DEVICES | Environment variable | all |
| nvcc exists | which nvcc | /usr/local/cuda/bin/nvcc |
| CUDA in PATH | printenv PATH | Contains /usr/local/cuda/bin |
| CUDA toolkit dir | /usr/local/cuda | Directory exists |

10. CUDA Library Tests (no GPU required)

Verify CUDA shared libraries are present.

| Library | Check |
| --- | --- |
| libcudart | ldconfig -p \| grep libcudart |
| libcublas | ldconfig -p \| grep libcublas |
| libcudnn | ldconfig -p \| grep libcudnn |

11. CUDA Label Tests

Verify CUDA-specific labels.

| Label | Expected |
| --- | --- |
| com.nvidia.cuda.version | CUDA version string |
| com.opendatahub.accelerator | cuda |

Note: nvidia-smi requires GPU hardware and is skipped in CI.

Test Implementation

Directory Structure

tests/
├── conftest.py              # Shared fixtures and helpers
├── test_common.py           # Tests that apply to BOTH images
├── test_python_image.py     # Python-specific tests (labels)
└── test_cuda_image.py       # CUDA-specific tests

Test Dependencies

Create requirements-test.txt:

pytest>=8.0.0

Fixtures (conftest.py)

The test runner uses a session-scoped container for efficiency. Instead of starting a new container for each test (~30 container startups), we start one container per image and use podman exec to run commands. This reduces test time significantly.

import os
import subprocess
import json
import shlex
import pytest


class ContainerRunner:
    """Efficient container runner using session-scoped container with exec.

    Starts a single container per test session and uses 'podman exec' to run
    commands. This avoids the overhead of starting a new container for each test.
    """

    def __init__(self, image: str):
        self.image = image
        self.container_id = None

    def start(self):
        """Start container in background with sleep infinity."""
        result = subprocess.run(
            ["podman", "run", "-d", "--rm", self.image, "sleep", "infinity"],
            capture_output=True,
            text=True,
            timeout=60,
        )
        if result.returncode != 0:
            raise RuntimeError(f"Failed to start container: {result.stderr}")
        self.container_id = result.stdout.strip()

    def stop(self):
        """Stop and remove container."""
        if self.container_id:
            subprocess.run(
                ["podman", "stop", "-t", "1", self.container_id],
                capture_output=True,
                timeout=30,
            )
            self.container_id = None

    def run(self, command: str, timeout: int = 30) -> subprocess.CompletedProcess:
        """Execute command in running container using podman exec."""
        if not self.container_id:
            raise RuntimeError("Container not started. Call start() first.")
        return subprocess.run(
            ["podman", "exec", self.container_id, "bash", "-c", command],
            capture_output=True,
            text=True,
            timeout=timeout,
        )

    def get_env(self, var: str) -> str:
        """Get an environment variable value safely."""
        if not var.replace("_", "").isalnum():
            raise ValueError(f"Invalid environment variable name: {var}")
        result = self.run(f"printenv {var}")
        return result.stdout.strip() if result.returncode == 0 else ""

    def file_exists(self, path: str) -> bool:
        """Check if a file exists."""
        result = self.run(f"test -f {shlex.quote(path)}")
        return result.returncode == 0

    def dir_exists(self, path: str) -> bool:
        """Check if a directory exists."""
        result = self.run(f"test -d {shlex.quote(path)}")
        return result.returncode == 0

    def get_labels(self) -> dict:
        """Get image labels using podman inspect."""
        result = subprocess.run(
            ["podman", "inspect", "--format", "{{json .Config.Labels}}", self.image],
            capture_output=True,
            text=True,
            timeout=30,
        )
        if result.returncode == 0:
            return json.loads(result.stdout)
        return {}

    def get_config(self, key: str) -> str:
        """Get image config value using podman inspect."""
        result = subprocess.run(
            ["podman", "inspect", "--format", f"{{{{json .Config.{key}}}}}", self.image],
            capture_output=True,
            text=True,
            timeout=30,
        )
        if result.returncode == 0:
            return json.loads(result.stdout)
        return None


@pytest.fixture(scope="session")
def python_image():
    """Image name for Python base image."""
    return os.environ.get(
        "PYTHON_IMAGE",
        "localhost/odh-midstream-python-base:3.12-ubi9"
    )


@pytest.fixture(scope="session")
def cuda_image():
    """Image name for CUDA base image."""
    return os.environ.get(
        "CUDA_IMAGE",
        "localhost/odh-midstream-cuda-base:12.8-py312"
    )


@pytest.fixture(scope="session")
def python_container(python_image):
    """Session-scoped container runner for Python image.

    Container starts once at session start and stops at session end.
    All tests share the same running container.
    """
    runner = ContainerRunner(python_image)
    runner.start()
    yield runner
    runner.stop()


@pytest.fixture(scope="session")
def cuda_container(cuda_image):
    """Session-scoped container runner for CUDA image.

    Container starts once at session start and stops at session end.
    All tests share the same running container.
    """
    runner = ContainerRunner(cuda_image)
    runner.start()
    yield runner
    runner.stop()

Performance comparison:

| Approach | Runtime (~30 tests) | Container starts |
| --- | --- | --- |
| New container per test | ~60-90 seconds | 30 |
| Session container + exec | ~5-10 seconds | 1 |

Note: Because all tests share the same running container, avoid tests that modify global state. All current tests are read-only (checking environment variables, file existence, and podman inspect output), so this is safe.
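
If a future test does need to mutate container state, it can opt out of the shared container with a function-scoped fixture. A minimal sketch that could be added to conftest.py (the fixture name is illustrative, not part of the current design):

@pytest.fixture
def isolated_python_container(python_image):
    """Fresh container per test, for hypothetical tests that modify state."""
    runner = ContainerRunner(python_image)
    runner.start()
    yield runner
    runner.stop()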

Example Tests (test_common.py)

import pytest


@pytest.fixture(params=["python_container", "cuda_container"])
def container(request):
    """Parameterize to run same tests against both images."""
    return request.getfixturevalue(request.param)


# --- Smoke Tests ---

def test_python_version(container):
    result = container.run("python --version")
    assert result.returncode == 0
    assert "Python 3.12" in result.stdout


def test_pip_available(container):
    result = container.run("pip --version")
    assert result.returncode == 0


def test_uv_available(container):
    result = container.run("uv --version")
    assert result.returncode == 0


# --- User & Permission Tests ---

def test_user_id(container):
    result = container.run("id -u")
    assert result.returncode == 0
    assert result.stdout.strip() == "1001"


def test_group_id(container):
    result = container.run("id -g")
    assert result.returncode == 0
    assert result.stdout.strip() == "0"


def test_not_root(container):
    result = container.run("whoami")
    assert result.returncode == 0
    assert result.stdout.strip() != "root"


def test_workdir_writable(container):
    result = container.run("touch /opt/app-root/src/test && rm /opt/app-root/src/test")
    assert result.returncode == 0


# --- Configuration Tests ---

def test_pip_conf_exists(container):
    assert container.file_exists("/etc/pip.conf")


def test_pip_conf_valid(container):
    result = container.run("cat /etc/pip.conf")
    assert "[global]" in result.stdout


def test_uv_toml_exists(container):
    assert container.file_exists("/etc/uv/uv.toml")


def test_uv_config_file_env(container):
    assert container.get_env("UV_CONFIG_FILE") == "/etc/uv/uv.toml"


# --- Image Metadata Tests ---

def test_workdir(container):
    assert container.get_config("WorkingDir") == "/opt/app-root/src"


def test_user(container):
    assert container.get_config("User") == "1001"


# --- Environment Variable Tests ---

def test_home(container):
    assert container.get_env("HOME") == "/opt/app-root/src"


def test_path_contains_app_root(container):
    assert "/opt/app-root/bin" in container.get_env("PATH")


def test_pythondontwritebytecode(container):
    assert container.get_env("PYTHONDONTWRITEBYTECODE") == "1"


def test_pythonunbuffered(container):
    assert container.get_env("PYTHONUNBUFFERED") == "1"


def test_pip_no_cache_dir(container):
    assert container.get_env("PIP_NO_CACHE_DIR") == "1"


def test_uv_system_python(container):
    assert container.get_env("UV_SYSTEM_PYTHON") == "1"


# --- Security Tests ---

def test_shadow_not_readable(container):
    result = container.run("cat /etc/shadow")
    assert result.returncode != 0
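
Example Tests (test_python_image.py)

The directory structure above reserves test_python_image.py for Python-specific label tests. A hedged sketch, mirroring the CUDA label tests below and assuming the session-scoped python_container fixture from conftest.py:

# --- Python Image Label Tests ---

def test_accelerator_label(python_container):
    assert python_container.get_labels().get("com.opendatahub.accelerator") == "cpu"


def test_python_version_label(python_container):
    assert python_container.get_labels().get("com.opendatahub.python") == "3.12"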

Example Tests (test_cuda_image.py)

# --- CUDA Environment Tests ---

def test_cuda_version(cuda_container):
    assert cuda_container.get_env("CUDA_VERSION").startswith("12.8")


def test_nvidia_visible_devices(cuda_container):
    assert cuda_container.get_env("NVIDIA_VISIBLE_DEVICES") == "all"


def test_cuda_in_path(cuda_container):
    assert "/usr/local/cuda/bin" in cuda_container.get_env("PATH")


# --- CUDA Toolkit Tests ---

def test_nvcc_exists(cuda_container):
    result = cuda_container.run("which nvcc")
    assert result.returncode == 0
    assert "/usr/local/cuda" in result.stdout


def test_cuda_dir_exists(cuda_container):
    assert cuda_container.dir_exists("/usr/local/cuda")


# --- CUDA Library Tests ---

def test_libcudart_present(cuda_container):
    result = cuda_container.run("ldconfig -p | grep libcudart")
    assert result.returncode == 0


def test_libcublas_present(cuda_container):
    result = cuda_container.run("ldconfig -p | grep libcublas")
    assert result.returncode == 0


def test_libcudnn_present(cuda_container):
    result = cuda_container.run("ldconfig -p | grep libcudnn")
    assert result.returncode == 0


# --- CUDA Label Tests ---

def test_cuda_version_label(cuda_container):
    assert "com.nvidia.cuda.version" in cuda_container.get_labels()


def test_accelerator_label(cuda_container):
    assert cuda_container.get_labels().get("com.opendatahub.accelerator") == "cuda"

Running Tests

Local Development

# Build image first
./scripts/build.sh python

# Install test dependencies
pip install -r requirements-test.txt

# Run tests for Python image
pytest tests/test_common.py tests/test_python_image.py -v

# Run tests for CUDA image
./scripts/build.sh cuda
pytest tests/test_common.py tests/test_cuda_image.py -v

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| PYTHON_IMAGE | Python image to test | localhost/odh-midstream-python-base:3.12-ubi9 |
| CUDA_IMAGE | CUDA image to test | localhost/odh-midstream-cuda-base:12.8-py312 |

CI Integration

GitHub Actions Workflow

on:
  pull_request:

jobs:
  build-python:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Python image
        run: ./scripts/build.sh python
      - name: Install test dependencies
        run: pip install -r requirements-test.txt
      - name: Run tests
        run: pytest tests/test_common.py tests/test_python_image.py -v

  build-cuda:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build CUDA image
        run: ./scripts/build.sh cuda
      - name: Install test dependencies
        run: pip install -r requirements-test.txt
      - name: Run tests
        run: pytest tests/test_common.py tests/test_cuda_image.py -v

Future: GPU Testing

When self-hosted GPU runners are available:

@pytest.mark.gpu
def test_nvidia_smi(cuda_container):
    result = cuda_container.run("nvidia-smi")
    assert result.returncode == 0
    assert "CUDA Version" in result.stdout