@ruario
Last active January 18, 2026 19:56
# SCO cpio (newc) 9 EiB file size extension

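The cpio newc format stores each file's size in an 8-hex-digit field (c_filesize), so an individual file is limited to just under 4 GiB. SCO extended the newc format to support much larger files: up to 2^63 − 1 bytes, which is just under 8 EiB (about 9.2 EB).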

TL;DR

  • SCO archives are normal newc unless a file entry is too large.
  • For 'large files', the full size is stored as a second NUL‑terminated string in the name field.
  • For 'large files', c_filesize does not contain a usable size and is ignored.

Note: All of the following was worked out by creating archives with various cpio implementations and examining them with hexdump. No source code was consulted.

How SCO encodes large file sizes

For files larger than the standard newc limit, the value stored in c_filesize does not represent the real size and should be ignored. The true size is stored as an extra string in the name field.

The newc name field is defined as exactly c_namesize bytes long. The first NUL‑terminated string inside it is the filename. This extension to the newc format places an additional NUL‑terminated string after it to encode the real file size.

SCO sets c_namesize large enough to include:

  • the actual filename, ending with \0
  • a second string of the form size=<16-hex-digit>
  • a trailing \0
  • any required padding

Example: <filename>\0size=0000000153e59000\0

The <16-hex-digit> value (after size=) encodes the full file size in hexadecimal. SCO cpio and Heirloom pax/cpio interpret it as a signed 64-bit integer, giving a practical maximum of 2^63 − 1 bytes, which is just under 8 EiB (about 9.2 EB).
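
As a rough illustration of the layout described above, the following Python sketch builds a name field of this shape. The function name build_name_field and the file name big.img are hypothetical, and lowercase hex digits are assumed only because the example above uses them. Alignment padding is left out.

```python
def build_name_field(filename: bytes, size: int) -> bytes:
    """Return <filename>\\0size=<16-hex-digit>\\0 (hypothetical helper)."""
    # The value is read back as a signed 64-bit integer, so keep it in range.
    if not 0 <= size < 2**63:
        raise ValueError("size must fit in a signed 64-bit integer")
    # %016x zero-pads the size to exactly 16 lowercase hex digits.
    return filename + b"\0" + b"size=%016x" % size + b"\0"


field = build_name_field(b"big.img", 0x153E59000)
print(field)       # b'big.img\x00size=0000000153e59000\x00'
print(len(field))  # 30, the byte count before any padding is considered
```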

How to detect an SCO extended size entry

A newc reader should treat an entry as using the SCO extension if the name field contains a second NUL‑terminated string of the form size=<16-hex-digit> (where the value after size= is valid hex).
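
A minimal sketch of that check in Python, assuming the reader has already read the c_namesize bytes of the name field into a bytes object. The function name has_sco_size is hypothetical, and accepting both upper- and lowercase hex digits is an assumption.

```python
import re

# Second string must be exactly "size=" followed by 16 hex digits.
SIZE_STRING = re.compile(rb"size=[0-9A-Fa-f]{16}")


def has_sco_size(name_field: bytes) -> bool:
    """Return True if the name field carries a second size= string."""
    parts = name_field.split(b"\0")
    # At least three parts means the second string is itself NUL-terminated.
    if len(parts) < 3:
        return False
    return SIZE_STRING.fullmatch(parts[1]) is not None


print(has_sco_size(b"big.img\0size=0000000153e59000\0"))  # True
print(has_sco_size(b"small.txt\0"))                       # False
```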

How to find the true size and filename

When a size= string is present:

  1. Parse the <16-hex-digit> value as a signed 64-bit integer and use that value as the real file size.
  2. Ignore the c_filesize field entirely.
  3. Use only the first NUL‑terminated string as the filename when extracting.

Note: If no size= string is present, treat the entry as a normal newc file and use c_filesize as usual.
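
The steps above could be sketched in Python roughly as follows. The function name resolve_entry is hypothetical, the name field is assumed to be available as raw bytes, and rejecting a negative size is a choice made for this sketch rather than observed behaviour of any tool.

```python
import re

SIZE_STRING = re.compile(rb"size=([0-9A-Fa-f]{16})")


def resolve_entry(name_field: bytes, c_filesize: int) -> tuple[bytes, int]:
    """Return (filename, size) for one entry (hypothetical helper)."""
    parts = name_field.split(b"\0")
    filename = parts[0]  # only the first string is the filename
    match = SIZE_STRING.fullmatch(parts[1]) if len(parts) >= 3 else None
    if match is None:
        # No size= string: ordinary newc entry, c_filesize is the size.
        return filename, c_filesize
    # 16 hex digits are 8 bytes; read them as a signed 64-bit integer.
    raw = bytes.fromhex(match.group(1).decode("ascii"))
    size = int.from_bytes(raw, "big", signed=True)
    if size < 0:
        raise ValueError("negative size in size= string")  # sketch's own choice
    return filename, size  # c_filesize is ignored here


print(resolve_entry(b"big.img\0size=0000000153e59000\0", 0))
print(resolve_entry(b"small.txt\0", 1234))
```

The first call returns the filename big.img with a size of 0x153e59000 bytes; the second falls back to the c_filesize value passed in.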

This file size extension is only used if required

SCO archives are ordinary newc archives by default. The extension is only used when a file’s size exceeds what standard newc can represent.

This is similar to how libarchive’s restricted pax (rpax) mode works:

  • If a file fits within ustar limits, write plain ustar.
  • If a file exceeds ustar limits, add pax extensions.

Likewise, here:

  • If a file fits within the 32-bit newc size field, write a normal newc entry.
  • If a file exceeds that limit, add a NUL‑separated size=<hex> string and store a placeholder in c_filesize.
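
A writer's decision could look something like the Python sketch below. The names plan_entry, NEWC_MAX_SIZE and PLACEHOLDER are hypothetical. In particular, the placeholder value 0 is purely an assumption for illustration, since the actual value SCO stores in c_filesize is not specified here.

```python
NEWC_MAX_SIZE = 0xFFFFFFFF  # largest value eight hex digits can hold
PLACEHOLDER = 0             # assumed placeholder, not taken from SCO output


def plan_entry(filename: bytes, size: int) -> tuple[bytes, int]:
    """Return (name_field, c_filesize value) for one file (hypothetical)."""
    if size <= NEWC_MAX_SIZE:
        # Fits: plain newc entry, the name field is just the filename.
        return filename + b"\0", size
    # Too large: append the size= string and store a placeholder size.
    return filename + b"\0" + b"size=%016x" % size + b"\0", PLACEHOLDER


print(plan_entry(b"small.txt", 1234))
print(plan_entry(b"big.img", 2**32))
```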

Tools that understand this extension

SCO cpio and Heirloom pax/cpio can both create and extract archives that use this extension.
