The question of how to cache HTTP range requests — particularly for cloud-native geospatial formats like COG, PMTiles, FlatGeobuf, and cloud-optimized GeoParquet — keeps coming up. Brandon Liu's "How many ranges can you fit in one request" is a good treatment of the multi-range packing problem. But there's a mature, battle-tested system that already handles much of this at the client level, and its design choices are instructive even for people building entirely different stacks: GDAL's /vsicurl/ virtual filesystem.
When GDAL reads a cloud-optimized file via /vsicurl/ (or its cloud-specific variants /vsis3/, /vsigs/, /vsiaz/), it performs range request management internally. The behaviour is controlled by a set of configuration options that most users never touch, but that encode a lot of hard-won knowledge about how to read efficiently over HTTP.
Range merging and multiplexing. GDAL_HTTP_MERGE_CONSECUTIVE_RANGES merges nearby byte ranges into a single request, avoiding the overhead of many small HTTP round-trips. GDAL_HTTP_MULTIPLEX (defaulting to YES on HTTP/2 connections) enables HTTP/2 multiplexing so multiple range requests travel over a single connection. The related GDAL_HTTP_VERSION option controls which HTTP version is negotiated — defaulting to 1.1 in most environments, but 2TLS on some cloud VMs like Google Compute Engine.
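The merging idea is simple enough to sketch. This is not GDAL's actual implementation (that lives in its C++ VSI layer), and the max_gap tolerance here is a hypothetical parameter, not a real GDAL option — it stands in for whatever gap GDAL is willing to over-fetch to avoid an extra round-trip:

```python
def merge_ranges(ranges, max_gap=4096):
    """Merge nearby (offset, length) byte ranges into fewer requests.

    A simplified sketch of what GDAL_HTTP_MERGE_CONSECUTIVE_RANGES
    enables; max_gap is an illustrative tolerance, not a GDAL option.
    """
    merged = []
    for start, length in sorted(ranges):
        end = start + length
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it, accepting
            # that the gap bytes will be fetched and discarded.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]
```

Two reads 50 bytes apart collapse into one request; a read megabytes away stays separate. The trade-off is the same one the chunk-size options encode: bytes in the gap are wasted bandwidth, but a round-trip saved.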
Block size tuning. CPL_VSIL_CURL_CHUNK_SIZE controls the read block size (default 16 KB). Increasing it reduces request count at the cost of potentially fetching unused bytes. GDAL_INGESTED_BYTES_AT_OPEN controls how many bytes are read in a single GET at file opening, which matters for cloud-optimized GeoTIFFs with large headers — reading more up front avoids extra round-trips during metadata parsing.
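As a concrete starting point, a configuration for a high-latency link reading large COGs might look like the following. The values are illustrative, not recommendations — tune them for your workload:

```shell
# Illustrative values for a high-latency link; do not copy verbatim.
export CPL_VSIL_CURL_CHUNK_SIZE=1048576        # 1 MB blocks instead of the 16 KB default
export GDAL_INGESTED_BYTES_AT_OPEN=32768       # pull a 32 KB header in the opening GET
export CPL_VSIL_CURL_CACHE_SIZE=134217728      # 128 x the chunk size (see caching below)
```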
Two-tier caching. GDAL maintains two separate caches. VSI_CACHE / VSI_CACHE_SIZE control a per-file-handle block cache (defaulting to 25 MB) that is discarded when the handle is closed. CPL_VSIL_CURL_CACHE_SIZE controls a separate global LRU cache (defaulting to 16 MB) shared across all downloaded content. The global cache is the more interesting one: content persists after a file handle is closed and can be reused when the same file is reopened, for the lifetime of the process or until VSICurlClearCache() is called. When increasing the chunk size, the recommendation is to set the global cache to roughly 128× the chunk size.
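The global cache's behaviour — fixed-size blocks keyed by file and offset, LRU eviction, surviving handle close — can be sketched in a few lines. This is an illustrative model of the design, not GDAL's implementation:

```python
from collections import OrderedDict

class GlobalBlockCache:
    """Process-wide LRU of downloaded blocks, in the spirit of
    CPL_VSIL_CURL_CACHE_SIZE. Keys are (url, block_index), so cached
    content outlives any single file handle. Illustrative sketch only.
    """
    def __init__(self, max_bytes=16 * 1024 * 1024, block_size=16 * 1024):
        self.max_bytes = max_bytes
        self.block_size = block_size
        self._blocks = OrderedDict()  # (url, block_index) -> bytes

    def get(self, url, block_index):
        key = (url, block_index)
        if key in self._blocks:
            self._blocks.move_to_end(key)  # mark as most recently used
            return self._blocks[key]
        return None

    def put(self, url, block_index, data):
        self._blocks[(url, block_index)] = data
        self._blocks.move_to_end((url, block_index))
        # Evict least-recently-used blocks once over budget.
        while sum(len(b) for b in self._blocks.values()) > self.max_bytes:
            self._blocks.popitem(last=False)

    def clear(self):
        """Analogous to calling VSICurlClearCache()."""
        self._blocks.clear()
```

Reopening the same URL hits this cache instead of the network, which is exactly why it is the more interesting of the two tiers for repeated reads within one process.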
Retry logic. GDAL_HTTP_MAX_RETRY and GDAL_HTTP_RETRY_DELAY handle transient failures on HTTP 429, 502, 503, and 504, which matters enormously on unreliable networks.
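The shape of that retry loop — bounded attempts, retry only on the transient status codes, a growing delay between tries — is worth making explicit. A sketch under the assumption of exponential backoff (GDAL's exact delay schedule may differ; the parameter names mirror the config options but this is not GDAL code):

```python
import time

# Transient status codes worth retrying, per the text above.
RETRIABLE = {429, 502, 503, 504}

def fetch_with_retry(do_request, max_retry=3, retry_delay=1.0,
                     sleep=time.sleep):
    """do_request() returns (status, body). Retries transient failures
    up to max_retry times, doubling the delay between attempts."""
    delay = retry_delay
    for attempt in range(max_retry + 1):
        status, body = do_request()
        if status not in RETRIABLE:
            return status, body
        if attempt < max_retry:
            sleep(delay)
            delay *= 2  # exponential backoff between attempts
    return status, body  # retries exhausted: surface the last failure
```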
The /vsicurl/ layer understands HTTP 206 partial content responses natively and can parse multipart range responses — the same mechanism Liu's post discusses for packing multiple ranges into one HTTP round-trip.
- GDAL Virtual File Systems documentation — comprehensive reference for /vsicurl/, /vsis3/, /vsigs/, and friends
- GDAL Configuration Options index — full listing of GDAL_HTTP_*, VSI_CACHE*, CPL_VSIL_CURL_*, and everything else
Consider two ends of the connectivity spectrum. A user with fast, reliable internet benefits from aggressive range merging, HTTP/2 multiplexing, and a generous local block cache — essentially what GDAL already does out of the box with the right config. A custom client would replicate this: batch range requests, use multipart range responses where the server supports them, and cache blocks keyed by URL plus byte-range.
A user with slow, unreliable internet needs an adaptive approach. GDAL doesn't do adaptive request sizing, but the pattern could be layered on top: start with conservative single-range requests, widen the batch size as successful responses come back, and back off on failures. This is TCP slow-start logic applied at the application layer. The retry configuration options already handle the transient-failure case, but there's no built-in mechanism for dynamically adjusting request strategy based on observed throughput.
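That slow-start analogy translates directly into code. A minimal sketch of the hypothetical adaptive layer described above — not a GDAL feature, just the grow-on-success, shrink-on-failure policy made concrete:

```python
class AdaptiveBatcher:
    """TCP-slow-start-style sizing for range request batches.

    Not a GDAL feature: a sketch of the adaptive layer the text
    describes, meant to sit on top of a /vsicurl/-like client.
    """
    def __init__(self, min_ranges=1, max_ranges=64):
        self.min_ranges = min_ranges
        self.max_ranges = max_ranges
        self.window = min_ranges  # ranges packed into the next request

    def on_success(self):
        # Double the batch while responses keep arriving cleanly.
        self.window = min(self.window * 2, self.max_ranges)

    def on_failure(self):
        # Back off multiplicatively on timeout or error.
        self.window = max(self.window // 2, self.min_ranges)
```

A real implementation would also want to watch observed throughput, not just success and failure, but the skeleton is the same.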
The client-side story is well-served by GDAL's existing infrastructure. The harder, less-solved problem is proxy-level caching.
Range requests and HTTP caching interact awkwardly. A naive caching proxy sees Range: bytes=1000-2000 and Range: bytes=1500-2500 as completely different requests with no overlap. To cache effectively, a proxy needs block-aligned caching: decomposing arbitrary range requests into fixed-size blocks, caching those blocks independently, and assembling responses from cached blocks. This is exactly what GDAL's VSI cache does in-process at the client, and the same principle needs to be replicated at the proxy level.
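The decomposition step is a few lines of arithmetic. Using the two overlapping requests from above, block alignment makes the shared bytes visible to any cache layer (a sketch of the principle, with an illustrative block size):

```python
def aligned_blocks(start, end, block_size=16 * 1024):
    """Decompose an inclusive byte range [start, end] into the
    fixed-size aligned blocks that cover it, each returned as
    (block_start, block_end) inclusive offsets. A cache at any
    layer can then key entries by (url, block_start)."""
    first = (start // block_size) * block_size
    last = (end // block_size) * block_size
    return [(b, b + block_size - 1)
            for b in range(first, last + 1, block_size)]
```

With a 1024-byte block size, bytes=1000-2000 and bytes=1500-2500 both touch the block starting at offset 1024 — so the second request is a partial cache hit instead of a miss.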
NGINX's proxy_cache combined with the slice module can do block-aligned caching of range requests, which is probably the most practical path for a CDN or regional proxy layer. Varnish can be configured similarly, but in either case the setup is deliberate; it doesn't happen by default.
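The NGINX shape of that setup looks roughly like the fragment below. This is an illustrative sketch, not a production config: the upstream hostname and the ranges_cache zone (which must be declared elsewhere with proxy_cache_path) are placeholders, and the 1 MB slice size is a tuning choice, not a recommendation:

```nginx
location / {
    proxy_pass https://upstream-object-store;   # placeholder upstream
    slice 1m;                                   # block size for cached slices
    proxy_set_header Range $slice_range;        # rewrite the client Range per slice
    proxy_cache ranges_cache;                   # zone declared via proxy_cache_path
    proxy_cache_key $uri$slice_range;           # cache per (path, slice)
    proxy_cache_valid 200 206 1h;
    proxy_http_version 1.1;
}
```

The slice directive is doing exactly the block decomposition described above, with $slice_range as the normalized, cache-friendly Range header.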
The architectural insight is that block-aligned caching is the same problem at every layer of the stack. GDAL solves it in-process with its two-tier cache. A regional proxy solves it with NGINX slice or equivalent. A CDN solves it the same way again. The block size at each layer can differ, but the principle is identical: normalize arbitrary byte ranges into aligned blocks, cache the blocks, assemble on read.
PMTiles is worth a specific mention because its directory structure is designed for efficient range access — tile-index lookups and tile data fetches are inherently well-suited to range request batching and block caching. QGIS's PMTiles support goes through GDAL's virtual filesystem, so all of the above VSI configuration applies directly. The format's two-level directory design means that caching the root directory and leaf directory blocks eliminates most of the metadata round-trips, leaving only the tile data fetches.
- Brandon Liu: "How many ranges can you fit in one request"
- Even Rouault's GDAL documentation on virtual file systems and network configuration — there are many knobs directly relevant to this problem that are not widely known