The cpio newc format only supports individual files up to a 4 GiB limit. SCO
extended the newc format to support much larger files (up to just under
8 EiB, about 9.2 EB).
- SCO archives are normal newc unless a file entry is too large.
- For 'large files', the full size is stored as a second NUL‑terminated string in the name field.
- For 'large files', c_filesize does not contain a usable size and is ignored.
Note: All of the following was worked out by creating archives with
various cpio implementations and examining them in hexdump. No source code
was used.
For files larger than the standard newc limit, the value stored in
c_filesize does not represent the real size and should be ignored. The true
size is stored as an extra string in the name field.
The newc name field is defined as exactly c_namesize bytes long. The first
NUL‑terminated string inside it is the filename. This extension to the newc
format places an additional NUL‑terminated string after it to encode the real
file size.
SCO sets c_namesize large enough to include:
- the actual filename, ending with \0
- a second string of the form size=<16-hex-digit>
- a trailing \0
- any required padding
Example: <filename>\0size=0000000153e59000\0
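A minimal Python sketch of how a writer might assemble such a name field. The function name build_sco_name_field is hypothetical, and the sketch leaves out padding, since how padding is counted is not settled here. Lowercase hex is used only because the example above uses it.

```python
def build_sco_name_field(filename: bytes, real_size: int) -> bytes:
    """Return filename, NUL, 'size=<16 hex digits>', NUL (no padding added)."""
    if b"\0" in filename:
        raise ValueError("filename must not contain NUL bytes")
    if not 0 <= real_size < 2**63:
        raise ValueError("size must fit in a signed 64-bit integer")
    # Second string: 'size=' followed by the size as 16 zero-padded hex digits.
    return filename + b"\0" + b"size=%016x" % real_size + b"\0"


name_field = build_sco_name_field(b"big.img", 0x153E59000)
print(name_field)       # b'big.img\x00size=0000000153e59000\x00'
print(len(name_field))  # 30
```

The length printed here covers both strings and their terminators, which is the part of c_namesize described in the list above.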
The <16-hex-digit> value (after size=) encodes the full file size. SCO
cpio and Heirloom pax/cpio interpret it as a signed 64-bit integer, giving a
practical maximum of just under 8 EiB (about 9.2 EB).
A newc reader should treat an entry as using the SCO extension if the name
field contains a second NUL‑terminated string of the form size=<16-hex-digit>
(where the value after size= is valid hex).
When a size= string is present:
- Parse the <16-hex-digit> value as a signed 64-bit integer and use that value as the real file size.
- Ignore the c_filesize field entirely.
- Use only the first NUL‑terminated string as the filename when extracting.
Note: If no size= string is present, treat the entry as a normal newc
file and use c_filesize as usual.
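The reader rules above could be sketched in Python roughly as follows. The function name parse_name_field is hypothetical, and the sketch assumes the caller has already read exactly c_namesize bytes of name field and decoded c_filesize to an integer. How a negative decoded size should be handled is not covered by these notes, so the sketch simply returns it.

```python
import re

# A string of the form size=<16 hex digits> and nothing else.
_SIZE_RE = re.compile(rb"size=([0-9A-Fa-f]{16})")


def parse_name_field(name_field: bytes, c_filesize: int) -> tuple[bytes, int]:
    """Return (filename, size) for one entry."""
    # The first NUL-terminated string is always the filename.
    filename, sep, rest = name_field.partition(b"\0")
    if sep:
        # Look for a second NUL-terminated string after the filename.
        second, sep2, _padding = rest.partition(b"\0")
        match = _SIZE_RE.fullmatch(second) if sep2 else None
        if match:
            value = int(match.group(1), 16)
            if value >= 2**63:  # reinterpret the bits as signed 64-bit
                value -= 2**64
            return filename, value  # c_filesize is ignored
    # No size= string: treat as a normal newc entry.
    return filename, c_filesize
```

For example, parse_name_field(b"big.img\0size=0000000153e59000\0", 0) yields the filename b"big.img" with size 0x153e59000, while a plain b"small.txt\0" falls back to the c_filesize value passed in.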
SCO archives are ordinary newc archives by default. The extension is only
used when a file’s size exceeds what standard newc can represent.
This is similar to how libarchive’s restricted pax (rpax) mode works:
- If a file fits within ustar limits, write plain ustar.
- If a file exceeds ustar limits, add pax extensions.
Likewise, here:
- If a file fits within the 32 bit newc size field, write a normal newc entry.
- If a file exceeds that limit, add a NUL‑separated size=<hex> string and store a placeholder in c_filesize.
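As an illustration of that decision on the writing side, here is a short Python sketch. The names size_fields, NEWC_MAX_SIZE and PLACEHOLDER_FILESIZE are hypothetical. The placeholder value of 0 is an assumption, since these notes do not record what value SCO cpio actually stores. Padding is left out.

```python
NEWC_MAX_SIZE = 0xFFFFFFFF  # largest value that fits in 8 hex digits
PLACEHOLDER_FILESIZE = 0    # assumption: the real placeholder value is not documented here


def size_fields(filename: bytes, size: int) -> tuple[bytes, bytes]:
    """Return (c_filesize hex field, name field without padding)."""
    if size <= NEWC_MAX_SIZE:
        # Fits in plain newc: normal entry, name field is just the filename.
        return b"%08x" % size, filename + b"\0"
    # Too large: placeholder in c_filesize, real size after the filename.
    return (b"%08x" % PLACEHOLDER_FILESIZE,
            filename + b"\0size=%016x\0" % size)
```

With this sketch, an archive containing only small files never gets a size= string, which matches the "ordinary newc by default" behaviour described above.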
SCO cpio and Heirloom pax/cpio can create and extract cpio files of this style.