@cgwalters
Last active January 9, 2026 21:20
Container Root Directory Handling: A Deep Investigation

Assisted-by: OpenCode (Opus 4.5)

Executive Summary

OCI container layer tars may or may not include a root directory entry (./ or /). This is a known specification gap in the OCI image-spec. When root entries exist, container runtimes ignore them - both Podman and Docker explicitly skip root directory entries during extraction. The mode difference (0555 vs 0755) comes from hardcoded defaults used when creating the extraction directory before extraction begins:

| Runtime | Root Mode | Root Mtime | Honors Tar Root Entry? |
|---|---|---|---|
| Podman/Buildah (containers/storage) | 0555 | Extraction time | No (always skipped) |
| Docker/containerd | 0755 | Extraction time | No (always skipped) |

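Bottom line: Including a ./ entry in layer tars has no effect on the mode, ownership, or mtime of the extracted root directory. Both runtimes skip the entry; containers/storage only records it in an xattr when ForceMask is enabled.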

Why This Matters for Verified Filesystems

For use cases requiring fully verified/reproducible filesystems (e.g., composefs with fs-verity), the root directory metadata is included in the filesystem digest. Since container runtimes:

  1. Ignore any root entry in the tar
  2. Use different hardcoded defaults (0555 vs 0755)
  3. Set non-deterministic mtime (extraction time)

...we cannot rely on what the container runtime produces. Instead, root metadata must be derived deterministically from the image content itself, independent of extraction environment.
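One way to read "derived deterministically from the image content" is sketched below: use the root entry of the layer tar when one exists, and otherwise fall back to fixed values. This is a minimal sketch, not a description of what any runtime or composefs does. The names RootMeta, deriveRootMeta, and buildLayer are hypothetical, and the fallback values (0755, 0/0, Unix epoch) are an assumed policy.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"path"
	"time"
)

// RootMeta is a hypothetical holder for the root directory attributes
// that would feed into a filesystem digest.
type RootMeta struct {
	Mode     int64
	UID, GID int
	Mtime    time.Time
	FromTar  bool // true when the values came from a tar root entry
}

// fallbackRoot is an ASSUMED policy (0755, root:root, Unix epoch).
var fallbackRoot = RootMeta{Mode: 0o755, Mtime: time.Unix(0, 0).UTC()}

// deriveRootMeta scans one uncompressed layer tar and returns the metadata
// of its root entry ("./", "." or "/"), or the fixed fallback if the layer
// has none. The result does not depend on extraction time or runtime.
func deriveRootMeta(layer io.Reader) (RootMeta, error) {
	meta := fallbackRoot
	tr := tar.NewReader(layer)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return meta, nil
		}
		if err != nil {
			return RootMeta{}, err
		}
		name := path.Clean(hdr.Name)
		if hdr.Typeflag == tar.TypeDir && (name == "." || name == "/") {
			meta = RootMeta{
				Mode:    hdr.Mode & 0o7777,
				UID:     hdr.Uid,
				GID:     hdr.Gid,
				Mtime:   hdr.ModTime.UTC(),
				FromTar: true,
			}
		}
	}
}

// buildLayer creates a tiny in-memory layer for the demo; withRoot controls
// whether a "./" directory entry is written first.
func buildLayer(withRoot bool) io.Reader {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	var hdrs []*tar.Header
	if withRoot {
		hdrs = append(hdrs, &tar.Header{Name: "./", Typeflag: tar.TypeDir, Mode: 0o755,
			ModTime: time.Date(2025, 12, 29, 0, 0, 0, 0, time.UTC)})
	}
	hdrs = append(hdrs, &tar.Header{Name: "etc/", Typeflag: tar.TypeDir, Mode: 0o755})
	for _, h := range hdrs {
		if err := tw.WriteHeader(h); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return &buf
}

func main() {
	for _, withRoot := range []bool{true, false} {
		meta, err := deriveRootMeta(buildLayer(withRoot))
		if err != nil {
			panic(err)
		}
		fmt.Printf("root entry in tar=%v -> mode=%04o uid=%d gid=%d mtime=%s\n",
			meta.FromTar, meta.Mode, meta.UID, meta.GID, meta.Mtime.Format(time.RFC3339))
	}
}
```

Real layer blobs are usually gzip or zstd compressed, so a decompressing reader would have to wrap the blob before it is passed to deriveRootMeta.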

Test Environment

All empirical testing was performed on Fedora CoreOS:

| Component | Version |
|---|---|
| FCOS | 43.20251214.3.0 |
| Docker | 29.0.4 (containerd 2.1.5, runc 1.4.0) |
| Podman | 5.7.1 |
| containers-common | 0.64.2-1.fc43 |
| skopeo | 1.21.0 |

Popular Base Image Analysis

Raw OCI layer blobs were fetched with skopeo copy docker://IMAGE oci:DIR and then inspected directly:

Images WITH ./ Root Entry

| Image | Build System | Root Metadata |
|---|---|---|
| debian:latest (trixie) | debuerreotype 0.17 | drwxr-xr-x 0/0 2025-12-29 00:00 |
| debian:bookworm | debuerreotype 0.17 | drwxr-xr-x 0/0 |
| debian:bullseye | debuerreotype 0.17 | drwxr-xr-x 0/0 |
| opensuse/leap | KIWI 10.2.33 | drwxr-xr-x 0/0 |

Images WITHOUT Root Entry

| Image | First Entry | Build System |
|---|---|---|
| quay.io/fedora/fedora:43 | .profile | buildkit ADD |
| docker.io/library/ubuntu:latest | bin (symlink) | Docker ADD |
| docker.io/library/alpine:latest | bin/ | buildkit ADD |
| registry.access.redhat.com/ubi10/ubi | afs/ | Unknown |
| docker.io/library/rockylinux:9 | afs/ | buildkit ADD |
| docker.io/library/almalinux:9 | afs/ | buildkit ADD |
| docker.io/library/centos:7 | anaconda-post.log | Docker ADD |
| gcr.io/distroless/static | usr/ | bazel rules_docker |
| cgr.dev/chainguard/static | bin | apko |

Key Observations

  1. Only a minority of the surveyed images include a root entry (4 of the 13 tags listed above) - the Debian tags built with debuerreotype and openSUSE Leap
  2. Build tool determines presence, not the distribution
  3. Historical change in Debian: Older versions (stretch, buster) did NOT have ./; the switch to debuerreotype introduced it
  4. It doesn't matter: Runtimes ignore the entry anyway
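The survey above can be repeated against any layer with a few lines of Go. The following is a minimal sketch that assumes an already decompressed tar stream; inspectLayer and makeTar are hypothetical helper names, and the two sample layers only imitate the shape of the Debian and Alpine rows.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"path"
)

// inspectLayer returns the name of the first tar entry and whether any
// entry denotes the root directory itself ("./", "." or "/").
func inspectLayer(layer io.Reader) (string, bool, error) {
	first, hasRoot := "", false
	tr := tar.NewReader(layer)
	for i := 0; ; i++ {
		hdr, err := tr.Next()
		if err == io.EOF {
			return first, hasRoot, nil
		}
		if err != nil {
			return "", false, err
		}
		if i == 0 {
			first = hdr.Name
		}
		if name := path.Clean(hdr.Name); name == "." || name == "/" {
			hasRoot = true
		}
	}
}

// makeTar builds an in-memory tar whose entries are all directories with
// the given names; it exists only to feed the demo.
func makeTar(names ...string) io.Reader {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, n := range names {
		hdr := &tar.Header{Name: n, Typeflag: tar.TypeDir, Mode: 0o755}
		if err := tw.WriteHeader(hdr); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return &buf
}

func main() {
	layers := map[string]io.Reader{
		"debian-like": makeTar("./", "bin/", "etc/"),
		"alpine-like": makeTar("bin/", "etc/"),
	}
	for label, layer := range layers {
		first, hasRoot, err := inspectLayer(layer)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: first entry %q, root entry present: %v\n", label, first, hasRoot)
	}
}
```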

Empirical Extraction Results

Testing with docker.io/library/debian:latest which HAS a ./ root entry:

Raw TAR Metadata (from OCI blob)

drwxr-xr-x 0/0   0 2025-12-29 00:00:00 ./
  • Mode: 0755 (drwxr-xr-x)
  • UID/GID: 0/0 (root:root)
  • Mtime: 2025-12-29 00:00:00 UTC

Docker/containerd Extraction

Path: /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/14/fs
Mode: 0755 (drwxr-xr-x)
Mtime: 2026-01-09 17:10:10 (extraction time)
UID/GID: 0/0

Podman Rootful Extraction

Path: /var/lib/containers/storage/overlay/14aa62b.../diff
Mode: 0555 (dr-xr-xr-x)  ← DIFFERS FROM TAR
Mtime: 2026-01-09 17:10:37 (extraction time)
UID/GID: 0/0

Podman Rootless Extraction

Path: /var/home/core/.local/share/containers/storage/overlay/14aa62b.../diff
Mode: 0555 (dr-xr-xr-x)  ← SAME AS ROOTFUL
Mtime: 2026-01-09 17:10:57 (extraction time)
UID/GID: 1000/1000 (user-namespaced)

Summary Table

| Runtime | Mode | Mtime | UID/GID | Honors Tar Root? |
|---|---|---|---|---|
| Raw tar | 0755 | 2025-12-29 | 0/0 | N/A (baseline) |
| Docker | 0755 | extraction time | 0/0 | No (coincidentally matches default) |
| Podman rootful | 0555 | extraction time | 0/0 | No |
| Podman rootless | 0555 | extraction time | 1000/1000 | No |

Critical insight: Docker's 0755 matches the tar only by coincidence. containerd hardcodes 0755 when it creates the snapshot directory and never reads the mode from the tar.


Code Analysis

containers/storage (Podman/Buildah)

Root entry is explicitly skipped during tar extraction.

Default Permissions

drivers/overlay/overlay.go:51:

const defaultPerms = os.FileMode(0o555)

Diff Directory Creation (before extraction)

drivers/overlay/overlay.go:1111-1112:

diff := path.Join(dir, "diff")
if err := idtools.MkdirAs(diff, forcedSt.Mode, forcedSt.IDs.UID, forcedSt.IDs.GID); err != nil {

Root Entry Skip Logic

pkg/archive/archive.go:1152-1180:

path := filepath.Join(dest, hdr.Name)
rel, err := filepath.Rel(dest, path)
if err != nil {
    return err
}
if rel == "." {
    rootHdr = hdr    // Save for potential xattr storage
}

// ... later ...

if fi, err := os.Lstat(path); err == nil {
    if fi.IsDir() && hdr.Name == "." {
        continue     // ROOT ENTRY SKIPPED HERE
    }
}

Flow

  1. Overlay driver creates diff directory with 0o555 (or forceMask)
  2. Tar extraction begins
  3. Root entry (./ or .) is detected and skipped because directory already exists
  4. If ForceMask is enabled, the original tar metadata is saved to the user.containers.override_stat xattr but is not applied to the filesystem

Docker/containerd

Root entry is explicitly skipped with a debug log message.

Snapshot Directory Creation

plugins/snapshots/overlay/overlay.go:539:

if err := os.Mkdir(filepath.Join(td, "fs"), 0755); err != nil {

Root Entry Skip Logic

pkg/archive/tar.go:261-263:

path := filepath.Join(ppath, filepath.Join("/", base))
if path == root {
    log.G(ctx).Debugf("file %q ignored: resolved to root", hdr.Name)
    continue   // ROOT ENTRY EXPLICITLY SKIPPED
}

Flow

  1. Snapshotter creates fs directory with 0755
  2. Tar extraction begins
  3. If entry resolves to root path, it's logged and completely ignored
  4. No xattr fallback, no post-processing - metadata is lost
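Both flows reduce to the same pattern: create the root with a fixed default, then skip any tar entry that resolves to it. The sketch below models that pattern in memory rather than touching the filesystem, so it is not a copy of either code base. The names extractModes and sampleLayer are illustrative, and the 0700 root entry is a made-up value chosen so that the skip is visible in the output.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"io/fs"
	"path"
)

// extractModes models layer extraction: the root is created up front with
// rootDefault, and any tar entry that resolves to the root is skipped.
// It returns the resulting permission bits keyed by cleaned path.
func extractModes(rootDefault fs.FileMode, layer io.Reader) (map[string]fs.FileMode, error) {
	modes := map[string]fs.FileMode{".": rootDefault} // pre-created root
	tr := tar.NewReader(layer)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return modes, nil
		}
		if err != nil {
			return nil, err
		}
		name := path.Clean(hdr.Name)
		if name == "." || name == "/" {
			continue // root entry: directory already exists, metadata dropped
		}
		modes[name] = fs.FileMode(hdr.Mode).Perm()
	}
}

// sampleLayer builds an in-memory layer with a hypothetical 0700 root entry.
func sampleLayer() io.Reader {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, h := range []*tar.Header{
		{Name: "./", Typeflag: tar.TypeDir, Mode: 0o700},
		{Name: "etc/", Typeflag: tar.TypeDir, Mode: 0o755},
	} {
		if err := tw.WriteHeader(h); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return &buf
}

func main() {
	// 0555 stands in for containers/storage, 0755 for containerd.
	for _, def := range []fs.FileMode{0o555, 0o755} {
		modes, err := extractModes(def, sampleLayer())
		if err != nil {
			panic(err)
		}
		fmt.Printf("default %04o -> root %04o, etc %04o\n", def, modes["."], modes["etc"])
	}
}
```

In both runs the root keeps the pre-created default and the 0700 from the tar never appears, which is the behaviour measured in the extraction results earlier.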

OCI Specification Analysis

Current Specification Text

From manifest.md (description of the layers field):

"The final filesystem layout MUST match the result of applying the layers to an empty directory. The ownership, mode, and other attributes of the initial empty directory are unspecified."

This was explicitly added by PR #408.

Relevant Issues and PRs

| Repository | Issue/PR | Status | Title |
|---|---|---|---|
| opencontainers/image-spec | #970 | Open | layer: clarify attributes for implied directories |
| opencontainers/image-spec | #737 | Open | behavior around parent directory needs clarification |
| opencontainers/image-spec | #408 | Merged | manifest: Explicitly unspecified attributes for the initial layer directory |
| containers/storage | #2194 | Closed | chunked: handle creating root directory |
| containers/storage | #1931 | Closed | archive: always fix mode for root dir with ForceMask |
| containers/storage | #1799 | Closed | overlay: use the default mode for the root directory |
| containers/storage | #937 | Closed | The mergedDir has different permission mode on two hosts |
| moby/moby | #41261 | Open | "chmod 555 /" within docker build not working correctly |

Key Discussion Points from PR #970

PR #970 proposes normative text:

When applying a layer, implementations MUST create any parent directories 
implied by an entry's path, even if it is otherwise absent from the archive. 
Attributes of the created parent directories MUST be set as follows:
* mtime is set to the Unix epoch (0)
* uid is set to 0
* gid is set to 0
* mode is set to 0755
* xattrs are empty
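To make the proposed rule concrete, here is a minimal sketch of how an implementation might find the implied parent directories and assign them the attributes listed above. The names DirAttrs, impliedAttrs, and impliedParents are hypothetical. Whether the root directory itself counts as an implied parent is exactly the open question, and this sketch assumes it does not.

```go
package main

import (
	"fmt"
	"path"
	"sort"
	"time"
)

// DirAttrs holds the attributes the proposed text assigns to implied
// parent directories; xattrs are omitted because the proposal leaves
// them empty.
type DirAttrs struct {
	Mode     uint32
	UID, GID int
	Mtime    time.Time
}

var impliedAttrs = DirAttrs{Mode: 0o755, UID: 0, GID: 0, Mtime: time.Unix(0, 0).UTC()}

// impliedParents returns, in sorted order, every parent directory that the
// entry paths require but that has no entry of its own in the archive.
// ASSUMPTION: the root ("." or "/") is never reported as implied.
func impliedParents(entries []string) []string {
	present := make(map[string]bool, len(entries))
	for _, e := range entries {
		present[path.Clean(e)] = true
	}
	missing := map[string]bool{}
	for _, e := range entries {
		for dir := path.Dir(path.Clean(e)); dir != "." && dir != "/"; dir = path.Dir(dir) {
			if !present[dir] {
				missing[dir] = true
			}
		}
	}
	out := make([]string, 0, len(missing))
	for d := range missing {
		out = append(out, d)
	}
	sort.Strings(out)
	return out
}

func main() {
	entries := []string{"etc/", "etc/passwd", "usr/lib/os-release"}
	for _, dir := range impliedParents(entries) {
		fmt.Printf("%s: mode=%04o uid=%d gid=%d mtime=%s\n", dir,
			impliedAttrs.Mode, impliedAttrs.UID, impliedAttrs.GID,
			impliedAttrs.Mtime.Format(time.RFC3339))
	}
}
```

With the sample entries, etc/ is present in the archive, while usr and usr/lib are implied by usr/lib/os-release and would receive the fixed attributes.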

Discussion highlights:

  • Moby/Docker deliberately excludes . from parent directory creation
  • umoci (used by SUSE) uses 0755, root UIDs, and Unix epoch timestamps
  • containerd has complex logic but still skips root
  • Concern about shared layers: permissions can vary based on image pull order

Real-World Bugs from This Ambiguity

  • ubuntu:jammy-20240427 lacks root directory record, causing ForceMask issues
  • chmod 555 / doesn't persist through docker build (moby#41261)
  • Different permission modes (0755 vs 0555) on different hosts (storage#937)
  • mkdirat /: no such file or directory with zstd:chunked (storage#2191)

Handling Comparison Table

| Aspect | containers/storage | containerd |
|---|---|---|
| Root detection | rel == "." after filepath.Clean | path == root after resolution |
| Skip condition | If dir exists AND entry is "." | Always if resolves to root |
| Metadata preservation | Via xattr if ForceMask (not applied) | None |
| Default root perms | 0555 (defaultPerms) | 0755 (snapshotter default) |
| "." vs "./" handling | Both → "." | Both → "." |
| Logging | None | Debug log |
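The two root-detection rows can be compared side by side with path/filepath alone. This is a simplified sketch that assumes Unix-style paths, not the upstream code: the function names are hypothetical, and the containerd variant leaves out the step in which the excerpt above splits the name into ppath and base.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// isRootStorageStyle follows the shape of the containers/storage check:
// join the entry name onto dest and test whether the relative path is ".".
func isRootStorageStyle(dest, name string) bool {
	rel, err := filepath.Rel(dest, filepath.Join(dest, name))
	return err == nil && rel == "."
}

// isRootContainerdStyle follows the shape of the containerd check: anchor
// the name at "/" and test whether the result resolves to root itself.
func isRootContainerdStyle(root, name string) bool {
	return filepath.Join(root, filepath.Join("/", name)) == filepath.Clean(root)
}

func main() {
	const root = "/tmp/rootfs" // illustrative extraction directory
	for _, name := range []string{"./", ".", "/", "bin/", "etc/passwd"} {
		fmt.Printf("%-12q storage-style=%v containerd-style=%v\n",
			name, isRootStorageStyle(root, name), isRootContainerdStyle(root, name))
	}
}
```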

