The authoritative DNS service solves two distinct but related problems:
User-facing DNS management: Datum Cloud customers own domain names (e.g., example.com) and want
Datum to serve authoritative DNS for them. Users create a Domain resource to claim ownership, a
DNSZone resource to declare a hosted zone, and DNSRecordSet resources to manage records. Datum's
infrastructure then serves live DNS responses for those zones.
Infrastructure DNS bootstrapping: Datum's own infrastructure (datumdomains.net,
datumproxy.net, datum-cloud.net, prism.*.datum.net) needs authoritative nameservers. This is
handled by a separate, static system (datum-auth-dns) consisting of KnotDNS + HickoryDNS instances
managed via DNSEndpoint CRs and ExternalDNS RFC2136 updates.
The two systems share the same physical nameserver fleet but serve different purposes: the dns-operator system is the dynamic, Kubernetes-API-driven path for customer zones; the datum-auth-dns system is the static infrastructure path for Datum's own zones.
| Component | Role |
|---|---|
| network-services-operator | Watches user Domain resources; verifies domain ownership (TXT/HTTP/DNSZone); fetches RDAP/WHOIS registration metadata; creates DNSRecordSet resources for Gateway hostnames |
| dns-operator (control-plane role: --role=replicator) | Watches DNSZone and DNSRecordSet resources across all project control planes (via Milo multi-cluster discovery); replicates them into the downstream cluster |
| dns-operator (downstream/agent role) | Runs in the downstream cluster; reconciles DNSZone / DNSRecordSet objects into PowerDNS via its API; manages DNSZoneClass / DNSZone / DNSRecordSet lifecycle |
| PowerDNS Auth 5.1 | Authoritative DNS server; uses the LMDB backend; configured to expand ALIAS records via an in-pod recursor |
| PowerDNS Recursor 5.1 | In-pod sidecar used exclusively by PowerDNS for ALIAS record expansion (forwards to 1.1.1.1/8.8.8.8); listens on 127.0.0.1:5300 |
| LightningStream | Synchronizes LMDB state between the authoritative dns-operator writer and the read-only PowerDNS replicas; uses GCS (S3-compatible API) as the shared object store |
| GCS bucket (via Crossplane) | Central object store for LMDB snapshots; provisioned by Crossplane storage.gcp.upbound.io/v1beta1 Bucket |
| external-dns-webhook | Reads DNSEndpoint and Gateway HTTPRoute resources; translates them into DNSRecordSet objects in the Milo control plane for Datum's own infrastructure zones |
| external-dns (infra system) | Reads Gateway routes; syncs DNS records to GCP Cloud DNS for the cluster's own *.staging.env.datum.net / *.production.env.datum.net hostnames |
| KnotDNS (datum-auth-dns) | Authoritative NS for datumproxy.net, datum-cloud.net, prism.*.datum.net infrastructure zones; updated via RFC2136 |
| HickoryDNS (datum-auth-dns) | Serves only the ns4.* addresses; exists to support the Prossimo memory-safety initiative by demonstrating a Rust DNS server in production |
| ExternalDNS RFC2136 sidecars | Paired with each KnotDNS/HickoryDNS pod; watch DNSEndpoint CRs and push updates via RFC2136 NSUPDATE |
| Milo control plane (milo-apiserver) | Provides per-project Kubernetes-compatible API servers; DNS CRDs (DNSZone, DNSRecordSet) are installed into it and are the source of truth for customer DNS configuration |
| Redis (optional) | Shared cache for RDAP/WHOIS registry lookup results and rate-limit state across network-services-operator replicas |
```
1. User creates DNSZone + DNSRecordSet in their project control plane
   (Milo per-project apiserver, namespace = project namespace)

2. dns-operator (control-plane, --role=replicator) is watching all project
   control planes via Milo multi-cluster discovery
   -> Discovers new DNSZone, looks up its DNSZoneClass (e.g., datum-external-global-dns)
   -> Replicates DNSZone + all associated DNSRecordSets into the downstream cluster
      (datum-dns-system namespace)

3. dns-operator (downstream/agent) reconciles in the downstream cluster
   -> Calls PowerDNS API to create the zone in LMDB
   -> Writes each record set to LMDB via the PowerDNS API
   -> Sets status.Accepted=True, status.Programmed=True on DNSZone/DNSRecordSet
   -> Writes back nameservers (from DNSZoneClass.spec.nameServerPolicy.static)
      to DNSZone.status.nameservers

4. LightningStream detects the LMDB change (schema_tracks_changes: true)
   -> Uploads a new LMDB snapshot to the GCS bucket

5. Each PowerDNS DaemonSet pod's lightningstream container (in receive mode)
   polls the GCS bucket and downloads the latest snapshot to its local LMDB volume

6. PowerDNS reads zone data from the shared LMDB file
   -> DNS queries for the zone are now answered live on port 53

7. The GCS bucket has dual access:
   - Primary SA (objectAdmin)    - the dns-operator writer
   - Secondary SA (objectViewer) - the read-only edge replicas
```
```mermaid
sequenceDiagram
    actor User
    participant Milo as Milo API Server<br/>(project control plane)
    participant Replicator as dns-operator<br/>(replicator)
    participant Downstream as Downstream Cluster<br/>API Server
    participant Agent as dns-operator<br/>(agent)
    participant PDNS as PowerDNS<br/>HTTP API :8082
    participant LSWriter as LightningStream<br/>(writer, primary SA)
    participant GCS as GCS Bucket<br/>datum-lightningstream
    participant LSReader as LightningStream<br/>(receiver, secondary SA)
    participant LMDB as /lmdb/db<br/>(emptyDir per pod)

    User->>Milo: kubectl create DNSZone + DNSRecordSet
    Milo-->>Replicator: watch event (DNSZone created)
    Replicator->>Downstream: replicate DNSZone + DNSRecordSet
    Downstream-->>Agent: watch event (DNSZone created)
    Agent->>PDNS: POST /api/v1/servers/localhost/zones (create zone)
    Agent->>PDNS: PUT /api/v1/.../zones/{zone}/records (write records)
    PDNS->>LMDB: write zone + records (LMDB append)
    Agent->>Milo: patch DNSZone status (Accepted=True, nameservers=[...])
    LMDB-->>LSWriter: schema_tracks_changes detects write
    LSWriter->>GCS: upload LMDB delta snapshot + update_marker
    loop each edge DaemonSet pod
        LSReader->>GCS: poll update_marker
        GCS-->>LSReader: new marker detected
        LSReader->>GCS: download delta snapshot
        LSReader->>LMDB: apply delta to local /lmdb/db
    end
    Note over LMDB: PowerDNS now serves live answers<br/>from memory-mapped LMDB
```
```
1. User creates Domain resource with spec.domainName = "example.com"

2. DomainReconciler (network-services-operator) runs:
   a. Validates eTLD+1 via publicsuffix
   b. Checks for existing DNSZone referencing this Domain
      -> If DNSZone exists, is Accepted+Programmed, and its status.nameservers
         overlap with Domain.status.nameservers from RDAP -> marks Verified=True
         via VerifiedDNSZone path (no TXT/HTTP challenge needed)
   c. If no DNSZone, generates a UUID verification token:
      -> DNS path:  TXT record _datum-custom-hostname.<domainname>
      -> HTTP path: GET http://<domain>/.well-known/datum-custom-hostname-challenge/<uid>
   d. Retries on backoff (5s -> 1m -> 5m with 25% jitter)
   e. Concurrently runs RDAP/WHOIS lookup via registrydata.Client
      -> Populates status.registration (registrar, expiry, DNSSEC)
      -> Populates status.nameservers (from RDAP NS delegation)

3. Once Verified=True, Gateway controller can proceed to create DNS records
```
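The VerifiedDNSZone fast path (step 2b) reduces to a set comparison. A minimal sketch, assuming nameservers are compared case-insensitively with trailing dots stripped (the function and normalization details here are illustrative, not the operator's actual code):

```python
def normalize(ns: str) -> str:
    # DNS names are case-insensitive, and status entries may carry a trailing dot.
    return ns.strip().rstrip(".").lower()

def nameservers_overlap(rdap_ns: list[str], zone_ns: list[str]) -> bool:
    """True if any nameserver reported by RDAP for the Domain also appears
    in DNSZone.status.nameservers -- evidence the user already delegated."""
    return bool({normalize(n) for n in rdap_ns} & {normalize(n) for n in zone_ns})

# Example: the registrant delegated one label to a Datum nameserver.
rdap = ["NS1.DATUMDOMAINS.NET.", "ns9.other-provider.com"]
zone = ["ns1.datumdomains.net.", "ns2.datumdomains.net.",
        "ns3.datumdomains.net.", "ns4.datumdomains.net."]
print(nameservers_overlap(rdap, zone))  # True -> VerifiedDNSZone, no challenge needed
```

Any overlap suffices because registrars often report a partial NS set; requiring an exact match would spuriously fail verification.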
```mermaid
flowchart TD
    A[User creates Domain<br/>spec.domainName = example.com] --> B[Validate eTLD+1<br/>via publicsuffix]
    B --> C{DNSZone exists for<br/>this domain AND<br/>Accepted+Programmed?}
    C -->|Yes| D[Compare DNSZone.status.nameservers<br/>vs RDAP nameservers]
    D --> E{Nameservers overlap?}
    E -->|Yes| F[VerifiedDNSZone=True<br/>Fastest path: no challenge needed]
    E -->|No| G[Fall through to TXT/HTTP]
    C -->|No| G
    G --> H[Generate UUID verification token]
    H --> I{Try TXT record<br/>_datum-custom-hostname.example.com}
    I -->|Found| J[VerifiedDNS=True]
    I -->|Not found| K{Try HTTP challenge<br/>/.well-known/datum-custom-hostname-challenge/UUID}
    K -->|200 OK| L[VerifiedHTTP=True]
    K -->|Fail| M[Backoff: 5s → 1m → 5m<br/>±25% jitter<br/>Retry]
    M --> I
    F --> N[RDAP/WHOIS lookup<br/>populates status.registration<br/>status.nameservers]
    J --> N
    L --> N
    N --> O[Verified=True<br/>Gateway controller can create DNSRecordSets]
```
```
1. User creates Gateway with hostname "api.example.com" on their project control plane

2. GatewayReconciler (network-services-operator) processes the Gateway:
   a. Lists Domains in the same namespace
   b. For "api.example.com" -> checks zones [example.com]
   c. Finds Domain "example.com" with VerifiedDNSZone=True
   d. Finds DNSZone for "example.com"
   e. Determines record type:
      - Apex domain (example.com)   -> ALIAS record
      - Subdomain (api.example.com) -> CNAME record
   f. Creates DNSRecordSet named "{gateway-name}-{sha256(hostname)[:8]}"
      pointing hostname -> GatewayDNSAddress (words+entropy subdomain under TargetDomain)
   g. Sets owner reference on the DNSRecordSet (for GC when Gateway is deleted)

3. dns-operator picks up the new DNSRecordSet -> programs into PowerDNS

4. LightningStream replicates to all edge pods -> live DNS
```
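Steps 2e and 2f above are mechanical enough to sketch. The following is a hypothetical rendering, not the operator's Go code; the function names are invented, and only the naming scheme ({gateway-name}-{sha256(hostname)[:8]}) and the apex/subdomain rule come from the text:

```python
import hashlib

def record_type_for(hostname: str, zone_apex: str) -> str:
    # An apex name cannot carry a CNAME (RFC 1034), so the apex gets an
    # ALIAS record; any name below the apex gets a plain CNAME.
    return "ALIAS" if hostname.rstrip(".") == zone_apex.rstrip(".") else "CNAME"

def recordset_name(gateway_name: str, hostname: str) -> str:
    # "{gateway-name}-{sha256(hostname)[:8]}": stable per hostname, so
    # re-reconciling the same Gateway is idempotent.
    digest = hashlib.sha256(hostname.encode()).hexdigest()[:8]
    return f"{gateway_name}-{digest}"

print(record_type_for("example.com", "example.com"))      # ALIAS
print(record_type_for("api.example.com", "example.com"))  # CNAME
print(recordset_name("my-gw", "my-gw" and "api.example.com"))
```

Hashing the hostname into the object name keeps the DNSRecordSet name valid and unique even for hostnames that are too long or contain labels unusable in Kubernetes object names.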
```
1. Admin commits a DNSEndpoint CR to infra Git repo
   (e.g., datumproxy.net_dnsendpoint.yaml with A/AAAA glue records for ns1-ns4)

2. FluxCD applies the DNSEndpoint CR to the cluster

3. ExternalDNS sidecar in the KnotDNS pod watches DNSEndpoint CRs
   -> Sends RFC2136 NSUPDATE to the local knotd on 127.0.0.1:1053
   -> KnotDNS updates its zone in memory

4. HickoryDNS ExternalDNS sidecar does the same for its pod (ns4 only)

5. DNS clients query ns1-ns4.datumproxy.net -> hit the LoadBalancer IPs
   -> Routed to a KnotDNS or HickoryDNS pod via Cilium BGP
```
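For reference, a DNSEndpoint CR of the kind committed in step 1 follows the standard ExternalDNS schema. This is an illustrative sketch — the name, namespace, and addresses below are placeholders (RFC 5737/3849 documentation ranges), not values from the actual repo:

```yaml
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: datumproxy-net-glue        # illustrative name
  namespace: datum-auth-dns
spec:
  endpoints:
    - dnsName: ns1.datumproxy.net
      recordType: A
      recordTTL: 300
      targets: ["203.0.113.1"]     # placeholder address
    - dnsName: ns1.datumproxy.net
      recordType: AAAA
      recordTTL: 300
      targets: ["2001:db8::1"]     # placeholder address
```

The RFC2136 sidecar turns each endpoints entry into an NSUPDATE against the local knotd.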
This section traces a UDP DNS query for a customer-managed record from the moment it leaves the client's resolver to the moment the response is returned.
The PowerDNS DaemonSet is exposed through a Kubernetes Service of type LoadBalancer in the
datum-managed-auth-dns namespace. The Service carries a Cilium IPAM annotation that pins four
specific IPv4 addresses and four specific IPv6 addresses:
```
# apps/dns-operator/downstream/edge/service.yaml
lbipam.cilium.io/ips:
  67.14.160.128, 67.14.161.128, 67.14.162.128, 67.14.163.128   (IPv4)
  2607:ed40:0:8000::1, 2607:ed40:1:8000::1, ...                (IPv6)
```
These IPs correspond to the published nameservers in the production DNSZoneClass:
```yaml
# apps/dns-operator/downstream/production/dnszoneclass.yaml
nameServerPolicy:
  mode: Static
  static:
    servers:
      - ns1.datumdomains.net.   # -> 67.14.160.128
      - ns2.datumdomains.net.   # -> 67.14.161.128
      - ns3.datumdomains.net.   # -> 67.14.162.128
      - ns4.datumdomains.net.   # -> 67.14.163.128
```
Cilium's BGP control plane (bgpControlPlane: enabled: true in
infrastructure/cilium/base/cilium-values.yaml) advertises these LoadBalancer IPs upstream via
eBGP. The CiliumBGPAdvertisement resource named auth-dns selects Services whose
app.kubernetes.io/part-of label is datum-auth-dns or datum-managed-auth-dns:
```yaml
# infrastructure/bgp/edge/auth-dns-advertisement.yaml
spec:
  advertisements:
    - advertisementType: "Service"
      service:
        aggregationLengthIPv4: 24   # aggregate to /24
        aggregationLengthIPv6: 44   # aggregate to /44
        addresses:
          - LoadBalancerIP
      selector:
        matchExpressions:
          - key: app.kubernetes.io/part-of
            operator: In
            values:
              - datum-auth-dns
              - datum-managed-auth-dns
```
Each worker node in an edge cluster has a CiliumBGPClusterConfig generated per-node with sessions
to two IPv4 and two IPv6 peers at NetActuate (ASN 36236), using the cluster's local ASN (e.g., 33438
for us-central-1-charlie):
```json
# infrastructure/bgp/edge/generated/clusters/us-central-1-charlie/bgp-worker-2de7846f-dfw.json
"localASN": 33438,
"peers": [
  { "peerASN": 36236, "peerAddress": "209.177.156.100" },
  { "peerASN": 36236, "peerAddress": "209.177.156.254" },
  { "peerASN": 36236, "peerAddress": "2607:f740:100::f99" },
  { "peerASN": 36236, "peerAddress": "2607:f740:100::fa1" }
]
```
The BGP peer config (infrastructure/bgp/edge/peer-config.yaml) uses:
- Hold time: 90 seconds, keepalive: 30 seconds
- Graceful restart enabled with 120-second restart time
- Dual-stack (IPv4 unicast + IPv6 unicast) families
Since every worker node in an edge cluster advertises the same /24 (IPv4) or /44 (IPv6) prefix, Internet traffic is routed to the nearest edge cluster by the upstream AS (anycast via BGP). Within the cluster, Cilium ECMP spreads traffic across multiple nodes.
A companion DaemonSet (cilium-bgp-route-reconciler, RECONCILE_INTERVAL_SECONDS=30) runs on
every node with hostNetwork: true. Every 30 seconds it calls cilium bgp routes advertised and
uses ip route add local <prefix> dev lo table local to install a local route for each advertised
prefix. This ensures the node's kernel accepts packets destined for the LoadBalancer IP without
dropping them at the PREROUTING hook before Cilium's eBPF programs can handle them.
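The reconciler's core transformation is small: advertised prefixes in, ip route commands out. A minimal sketch, assuming only the command shape quoted above (the function name is invented; the real DaemonSet shells out to cilium bgp routes advertised to obtain the prefix list):

```python
def build_local_route_cmds(advertised_prefixes: list[str]) -> list[str]:
    """For each prefix this node advertises over BGP, emit the command that
    installs a `local`-type route on `lo` in the kernel's `local` table, so
    the kernel delivers matching packets locally instead of forwarding or
    dropping them as non-local."""
    return [f"ip route add local {p} dev lo table local" for p in advertised_prefixes]

print(build_local_route_cmds(["67.14.160.0/24", "2607:ed40:0:8000::/44"]))
```

Running these every 30 seconds is idempotent in effect: re-adding an existing route fails with EEXIST, which a reconciler can simply ignore.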
Cilium is configured with:
```yaml
loadBalancer:
  mode: dsr               # Direct Server Return
  dsrDispatch: geneve
  serviceTopology: true
routingMode: tunnel
tunnelProtocol: geneve
kubeProxyReplacement: true
```
DSR mode means the node that receives the packet from the upstream BGP peer answers directly without
hairpinning back through a central load balancer node. serviceTopology: true causes Cilium to
prefer pods on the local node when they exist (trafficDistribution: PreferClose in the Service
spec confirms this preference).
The incoming UDP packet on port 53 is intercepted by Cilium's XDP or TC eBPF program before it
reaches the kernel's normal network stack. Cilium's kube-proxy replacement identifies the destination
address as a LoadBalancer VIP, performs DNAT, and selects a backend Pod endpoint. Because the
DaemonSet runs one pod per non-control-plane node and trafficDistribution: PreferClose is set,
Cilium will select the pod on the same node if one is healthy there.
The DaemonSet pod does not run with hostNetwork: true — it uses a standard pod network
namespace. Port 53 is declared as a named container port:
```yaml
ports:
  - containerPort: 53
    name: dns
    protocol: UDP
  - containerPort: 53
    name: dns-tcp
    protocol: TCP
```
PowerDNS requires NET_BIND_SERVICE to bind a port below 1024 inside the container, which is
explicitly granted:
```yaml
securityContext:
  runAsUser: 953
  runAsGroup: 953
  capabilities:
    drop: ["ALL"]
    add: ["NET_BIND_SERVICE"]
```
After Cilium's DNAT the packet is delivered into the pod's network namespace and received by the
PowerDNS process on 0.0.0.0:53 / [::]:53 (both IPv4 and IPv6, as configured by
local-address=0.0.0.0,:: in pdns.conf).
PowerDNS Auth 5.1 is configured as a pure authoritative server with no recursion or zone transfer support:
```
# apps/dns-operator/downstream/edge/pdns.conf
primary=no
secondary=no
disable-axfr=yes
```
Zone discovery caches are disabled entirely so that zone additions are visible immediately from LMDB without requiring a cache flush:
```
zone-cache-refresh-interval=0
zone-metadata-cache-ttl=0
```
All DNS data is served from LMDB:
```
load-modules=liblmdbbackend.so
launch=lmdb
lmdb-filename=/lmdb/db
lmdb-shards=1
lmdb-random-ids=yes
lmdb-flag-deleted=yes
lmdb-map-size=1000        # megabytes
lmdb-lightning-stream=yes
```
The lmdb-lightning-stream=yes flag activates LightningStream-compatible operation: PowerDNS uses
the LMDB file in a read-only or append-limited fashion and relies on LightningStream to manage the
full file. lmdb-flag-deleted=yes means deleted records are flagged rather than physically removed
so LightningStream can track tombstones across replicas.
Query processing flow inside PowerDNS:
- PowerDNS receives the UDP datagram and parses the DNS message.
- It looks up the zone name in the LMDB backend by walking up the owner name hierarchy until it finds a zone apex that matches.
- It looks up the requested record type (QTYPE) within that zone.
- If the record exists and is a normal type (A, AAAA, TXT, MX, etc.), PowerDNS builds the answer section and returns it immediately.
- If the record type is CNAME, PowerDNS returns the CNAME and, depending on the query, may follow it.
- If the record is an ALIAS type (used for apex domains), PowerDNS triggers ALIAS expansion (step 4 below).
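The zone lookup in the second step is a longest-suffix match over the owner-name hierarchy. A minimal sketch of that walk (illustrative only; PowerDNS does this inside the LMDB backend, not with Python sets):

```python
def find_zone_apex(qname, zones):
    """Walk up the owner name label by label, starting from the full query
    name, and return the first (i.e., longest) matching zone apex."""
    labels = qname.rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])  # strip one leading label per step
        if candidate in zones:
            return candidate
    return None  # REFUSED: no zone we are authoritative for

zones = {"example.com", "example.org"}
print(find_zone_apex("a.b.api.example.com", zones))  # example.com
print(find_zone_apex("api.example.net", zones))      # None
```

Starting from the full query name guarantees that a more specific delegated zone (e.g., sub.example.com, if hosted) would win over its parent.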
The PowerDNS server-id is set per-pod to $(NODE_NAME)/$(POD_NAME), which is visible in
CHAOS-class TXT queries for id.server and in log output; this is useful for debugging which replica answered.
When PowerDNS encounters an ALIAS record (the Datum equivalent of ANAME/CNAME-at-apex), it must resolve the target hostname to A/AAAA records so it can return a synthesized answer to the A/AAAA query. It cannot use the system resolver, because that could create a circular dependency with zones it is itself authoritative for.
Instead, PowerDNS is configured to forward ALIAS expansion queries to the in-pod recursor:
```
# pdns.conf
resolver=127.0.0.1:5300
expand-alias=yes
```
The recursor listens only on loopback and only accepts queries from loopback (hardened to prevent misuse):
# recursor.conf
incoming:
listen:
- "127.0.0.1:5300"
- "[::1]:5300"
allow_from:
- "127.0.0.1/32"
- "::1/128"
The recursor is not authoritative for any zone; it exists purely as a forwarder for ALIAS expansion. All queries are forwarded to Cloudflare and Google public resolvers:
```yaml
forward_zones_recurse:
  - zone: "."
    forwarders:
      - "1.1.1.1"
      - "8.8.8.8"
```
So for an ALIAS record pointing example.com. at
exciting-word-12ab.prism.global.datum-cloud.net., the path is:
```
Client resolves A? for example.com
  -> PowerDNS finds ALIAS record pointing at exciting-word-12ab.prism.global.datum-cloud.net.
  -> PowerDNS sends A? for that target to 127.0.0.1:5300
  -> Recursor forwards to 1.1.1.1 or 8.8.8.8
  -> Gets A records for the canonical name
  -> PowerDNS synthesizes an A response for example.com with those addresses
  -> Returns to client
```
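The key property of this synthesis is that the owner name in the answer stays the queried apex; only the addresses come from the canonical target. A hypothetical sketch with the recursor stubbed out as a callable (names and addresses here are placeholders, not PowerDNS internals):

```python
def expand_alias(qname, alias_target, resolve):
    """Synthesize A records for an ALIAS: resolve the target (standing in
    for the 127.0.0.1:5300 recursor hop) and re-own the addresses under
    the originally queried name."""
    addresses = resolve(alias_target)  # recursor -> 1.1.1.1 / 8.8.8.8
    return [(qname, "A", addr) for addr in addresses]

# Stubbed upstream resolver returning documentation-range addresses.
fake_resolve = lambda name: ["198.51.100.7", "198.51.100.8"]
answer = expand_alias("example.com.",
                      "exciting-word-12ab.prism.global.datum-cloud.net.",
                      fake_resolve)
print(answer)  # [('example.com.', 'A', '198.51.100.7'), ('example.com.', 'A', '198.51.100.8')]
```

From the client's perspective no CNAME chasing happens at all: the apex simply appears to have A records.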
The recursor's resource allocation is deliberately generous (4Gi memory limit, 2 CPU) to handle its recursive resolution work, while the auth server itself needs far less (1Gi memory).
Prometheus metrics from the recursor are scraped on port 8083 by the PodMonitor.
PowerDNS builds the DNS response packet with the appropriate answer, authority, and additional sections, sets the AA (Authoritative Answer) bit, and sends the UDP datagram back. In DSR mode the response goes directly from the pod back to the client without passing through the ingress node again.
The data PowerDNS reads is kept current by LightningStream. In the edge DaemonSet pods the lightningstream container runs in receive mode:
```yaml
args: ["--config", "/etc/lightningstream/lightningstream.yaml",
       "--minimum-pid", "50", "receive"]
```
--minimum-pid 50 tells LightningStream to wait until the system has had at least 50 PIDs
allocated (a proxy for "other processes in the pod have started") before beginning sync. This
prevents it from downloading a snapshot before PowerDNS has opened the LMDB file.
LightningStream is configured to watch two LMDB databases on the same /lmdb/ volume:
```yaml
# lightningstream.yaml (edge configmap)
lmdbs:
  main:
    path: /lmdb/db
    options:
      no_subdir: true
      create: true
    schema_tracks_changes: true
  shard:
    path: /lmdb/db-0
    options:
      no_subdir: true
      create: true
    schema_tracks_changes: true
```
schema_tracks_changes: true means LightningStream relies on schema-level change tracking rather
than polling the entire LMDB for changes, which is the efficient mode for PowerDNS's LMDB backend.
Storage is GCS accessed via its S3-compatible API:
```yaml
storage:
  type: s3
  options:
    endpoint_url: https://storage.googleapis.com/
    bucket: datum-lightningstream   # production
    use_update_marker: true
```
use_update_marker: true causes LightningStream to write a small marker object to the bucket after
each snapshot upload. Receivers poll for this marker to detect when a new snapshot is available
without having to list all objects on every check.
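The receiver side of the marker protocol can be sketched in a few lines. This is an illustrative model, not LightningStream's actual implementation; the storage object here is a stand-in for the GCS bucket:

```python
def poll_for_update(storage, last_marker, apply_delta):
    """One receiver iteration: fetch the small marker object first, and only
    when it changed download and apply the (much larger) snapshot/delta."""
    marker = storage.get("update_marker")
    if marker == last_marker:
        return last_marker, False              # nothing new; skip the download
    apply_delta(storage.get("latest_delta"))   # apply to the local /lmdb/db
    return marker, True

# Illustrative in-memory "bucket":
bucket = {"update_marker": "v2", "latest_delta": b"delta-bytes"}
applied = []
marker, changed = poll_for_update(bucket, "v1", applied.append)
print(marker, changed, applied)  # v2 True [b'delta-bytes']
```

The point of the marker is cost: a poll is one small GET instead of a bucket listing, so frequent polling across many edge pods stays cheap.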
The edge pods use the secondary GCS service account (objectViewer only), provisioned by ExternalSecrets from GCP Secret Manager:
```yaml
# apps/dns-operator/downstream/edge/external-secret.yaml
spec:
  secretStoreRef:
    name: gcp-secret-store
    kind: ClusterSecretStore
  target:
    name: s3-credentials
  dataFrom:
    - extract:
        key: dns-s3-credentials-secondary
```
The primary (objectAdmin) account is used only by the dns-operator agent's LightningStream instance
(the StatefulSet in datum-dns-system) which writes new snapshots after PowerDNS zones are
programmed through the API.
LightningStream polling interval: LightningStream does not expose a configurable poll interval
in this config file; instead it uses the update marker and change notification approach. In practice,
after the dns-operator agent writes a new zone or record via the PowerDNS API, LightningStream
detects the LMDB change via schema_tracks_changes, uploads the snapshot to GCS (with an update
marker), and receivers detect the marker and download the delta. The end-to-end propagation from API
write to all edge pods seeing the change is typically sub-minute in normal GCS-connected conditions.
LightningStream metrics are exposed on port 8500 (lmdb-metrics) and scraped by the
PodMonitor alongside the PowerDNS API metrics on 8082 and recursor metrics on 8083.
PowerDNS Auth does not enable a packet cache in this deployment. The zone-cache-refresh-interval=0
and zone-metadata-cache-ttl=0 settings eliminate internal metadata caches. This means every query
results in a direct LMDB lookup, which is the correct behavior for a low-latency memory-mapped
database — LMDB reads are effectively in-process memory accesses (the OS page cache holds the
mapped pages). The absence of a packet cache ensures that record changes propagated by LightningStream
are visible immediately without cache staleness.
The recursor does maintain its own internal cache for recursive lookups (standard PowerDNS Recursor behavior), but this only affects ALIAS expansion targets — not the authoritative records themselves. The recursor's cache TTL is governed by the TTLs returned by upstream resolvers for the target hostnames.
The default TTL for records within a zone is 300 seconds, inherited from
DNSZoneClass.spec.defaults.defaultTTL: 300. Per-record TTLs can override this (e.g., the NS glue
records for datumdomains.net use ttl: 300 explicitly in the static DNSRecordSet manifests).
Clients' recursive resolvers will cache responses for those TTL durations, so in practice the
propagation delay visible to an end user is: LightningStream sync time + client resolver cache TTL.
```
Internet client (resolver)
  | UDP port 53 to 67.14.160.128 (ns1.datumdomains.net)
  v
NetActuate upstream router (ASN 36236)
  | BGP ECMP across edge clusters advertising the /24
  v
Edge node (Datum AS 33438, Cilium BGP peer)
  | kernel receives packet; lo has local route for 67.14.160.128 (via reconciler)
  | Cilium XDP/TC eBPF intercepts; DNAT to pod IP; DSR configured
  v
Pod: datum-managed-auth-dns-<node> (namespace: datum-managed-auth-dns)
  +-- container: pdns (ports 53/udp, 53/tcp, 8082/tcp)
  |     | reads from /lmdb/db via LMDB mmap
  |     | for ALIAS: queries 127.0.0.1:5300
  |     v
  +-- container: pdns-recursor (port 5300/tcp+udp loopback only, 8083/tcp metrics)
  |     | forwards to 1.1.1.1 / 8.8.8.8
  |     v
  +-- container: lightningstream (port 8500/tcp metrics)
        | polls GCS bucket (storage.googleapis.com / datum-lightningstream)
        | downloads LMDB deltas, applies to /lmdb/db
        v
      emptyDir volume: /lmdb (shared by pdns + lightningstream)
```
```mermaid
sequenceDiagram
    participant Client as DNS Client<br/>(resolver)
    participant NetActuate as NetActuate<br/>ASN 36236
    participant Cilium as Cilium eBPF<br/>(edge node)
    participant PDNS as PowerDNS<br/>container :53
    participant LMDB as /lmdb/db<br/>(mmap)
    participant Recursor as pdns-recursor<br/>127.0.0.1:5300
    participant Upstream as Upstream Resolver<br/>1.1.1.1 / 8.8.8.8

    Client->>NetActuate: UDP query A? example.com<br/>dst=67.14.160.128:53
    Note over NetActuate: BGP ECMP selects<br/>nearest edge cluster
    NetActuate->>Cilium: forward packet to edge node
    Note over Cilium: XDP/TC intercepts; DNAT to pod IP<br/>DSR mode: response goes direct to client
    Cilium->>PDNS: deliver UDP datagram to pod :53
    PDNS->>LMDB: lookup zone for example.com (mmap read)
    LMDB-->>PDNS: zone found
    alt Normal record (A, AAAA, TXT, MX, CNAME, ...)
        PDNS->>LMDB: lookup QTYPE records (mmap read)
        LMDB-->>PDNS: records returned
        PDNS-->>Client: DNS response (AA bit set, DSR direct)
    else ALIAS record (apex domain)
        PDNS->>LMDB: lookup ALIAS target hostname (mmap read)
        LMDB-->>PDNS: ALIAS → exciting-word-12ab.prism.global.datum-cloud.net.
        PDNS->>Recursor: A? exciting-word-12ab.prism.global.datum-cloud.net<br/>UDP 127.0.0.1:5300
        Recursor->>Upstream: recursive query (forwards all via ".")
        Upstream-->>Recursor: A records for canonical name
        Recursor-->>PDNS: resolved A/AAAA records
        PDNS-->>Client: synthesized A response for example.com (AA bit set, DSR direct)
    end
```
| Resource | Scope | Purpose |
|---|---|---|
| DNSZoneClass | Cluster | Defines a class of DNS backend (controller name, nameserver policy, TTL defaults). Example: datum-external-global-dns with PowerDNS controller and 4 static nameservers |
| DNSZone | Namespaced | A hosted zone. References a DNSZoneClass. Status populated with nameservers, recordCount, DomainRef. Has selectable fields on spec.domainName and status.domainRef.name |
| DNSRecordSet | Namespaced | One DNS record type within a zone. References a DNSZone. Supports A, AAAA, ALIAS, CNAME, TXT, MX, SRV, CAA, NS, SOA, PTR, TLSA, HTTPS, SVCB. Has selectable fields on spec.dnsZoneRef.name and spec.recordType |
| Resource | Scope | Purpose |
|---|---|---|
| Domain | Namespaced | Represents a domain name a user wants to claim. Tracks ownership verification state (TXT/HTTP/DNSZone), RDAP/WHOIS registration metadata, nameserver delegation. Immutable spec.domainName |
| HTTPProxy | Namespaced | High-level L7 proxy abstraction that creates a Gateway + HTTPRoute underneath |
| Gateway (Gateway API) | Namespaced | Extended by network-services-operator, which programs DNSRecordSet resources for each hostname whose domain has VerifiedDNSZone=True |
| Resource | Scope | Purpose |
|---|---|---|
| DNSEndpoint | Namespaced | Used by the datum-auth-dns path; contains static A/AAAA/CNAME records that ExternalDNS RFC2136 sidecars apply to KnotDNS/HickoryDNS |
| System | How Used |
|---|---|
| GCP Cloud DNS | Used by the external-dns infrastructure system for cluster-owned zones (*.staging.env.datum.net, *.production.env.datum.net). Workload Identity authenticated |
| RDAP (openrdap/rdap) | registrydata.Client queries RDAP providers (bootstrapped per TLD) to fetch domain registration metadata, expiry, registrar, nameservers. Rate-limited with token bucket + block windows |
| WHOIS (domainr/whois) | Fallback when the RDAP bootstrap has no TLD match. Queries the IANA bootstrap, then the registry/registrar WHOIS host |
| GCS (via S3 API) | LightningStream uses GCS HMAC keys (S3-compatible) as the central replication bus for PowerDNS LMDB state. Crossplane provisions the bucket and HMAC keys |
| Milo / per-project control planes | dns-operator (control-plane role) uses multi-cluster runtime (sigs.k8s.io/multicluster-runtime) with a Milo provider to discover all project control planes and watch their DNS resources |
| PowerDNS API | dns-operator agent directly calls the local PowerDNS HTTP API (port 8082) to create/update zones and records in LMDB |
| NetActuate (ASN 36236) | BGP upstream provider at each edge PoP; peers with Cilium on each worker node to receive LoadBalancer IP prefix advertisements |
| cert-manager CSI driver | TLS for the dns-operator webhook (csi.cert-manager.io); also used for client cert auth to the Milo control plane |
| Redis (optional) | Shared cache for RDAP/WHOIS results and rate-limit state across network-services-operator replicas |
```
clusters/{env}/apps/dns-operator.yaml
  -> apps/dns-operator/control-plane/{staging|production}/
     dependsOn: [victoria-metrics, milo-apiserver]
     sourceRef: GitRepository flux-system
     Kustomizations deployed:
       dns-operator-manager (--role=replicator; watches Milo projects)
       dns-operator-core-control-plane-resources (installs CRDs into Milo apiserver)

clusters/{env}/apps/dns-operator-downstream.yaml
  -> apps/dns-operator/downstream/{staging|production}/
     Kustomizations deployed:
       dns-operator-agent (manages PowerDNS; sources from OCIRepository)
       [edge DaemonSet deployed via apps/dns-operator/downstream/edge/]

clusters/{env}/infrastructure/dns-operator-storage.yaml
  -> apps/dns-operator/storage/{staging|production}/
     Deploys Crossplane GCS Bucket + ServiceAccounts + HMAC keys for LightningStream

clusters/{env}/apps/datum-auth-dns.yaml (staging only; edge has its own)
  -> apps/datum-auth-dns/{staging|edge}/
     dependsOn: [victoria-metrics]
     Deploys KnotDNS + HickoryDNS DaemonSets + ExternalDNS sidecars

clusters/{env}/apps/external-dns-webhook.yaml (staging only)
  -> apps/external-dns-webhook/staging/
     dependsOn: [dns-operator]
     Deploys ExternalDNS with Datum webhook provider

channels/edge/stable/infrastructure/cilium-bgp-announcements/
  -> infrastructure/bgp/edge/ + generated/clusters/${cluster}
     Deploys CiliumBGPPeerConfig, CiliumBGPAdvertisement (auth-dns, downstream-gateway,
     edge-services), CiliumLoadBalancerIPPool, per-node CiliumBGPClusterConfig,
     and the cilium-bgp-route-reconciler DaemonSet
```
| Image | Source |
|---|---|
| ghcr.io/datum-cloud/dns-operator-kustomize | Kustomize bundle for dns-operator (both control-plane and agent paths) |
| ghcr.io/datum-cloud/external-dns-webhook | Custom ExternalDNS webhook provider |
| powerdns/pdns-auth-51 | PowerDNS authoritative server |
| powerdns/pdns-recursor-51:5.1.9 | PowerDNS recursor (ALIAS expansion sidecar) |
| powerdns/lightningstream:main | LightningStream LMDB sync agent |
| Namespace | Contents |
|---|---|
| datum-dns-system | dns-operator manager + agent, LMDB secrets |
| datum-managed-auth-dns (edge) | PowerDNS + Recursor + LightningStream DaemonSet pods |
| datum-auth-dns | KnotDNS + HickoryDNS DaemonSets |
| external-dns-webhook | ExternalDNS + webhook provider |
| external-dns | Infrastructure ExternalDNS (GCP Cloud DNS) |
LightningStream for horizontal scale without a shared database. PowerDNS uses LMDB (a memory-mapped file) as its backend. The standard problem with LMDB in multi-pod deployments is that it cannot be shared across nodes. LightningStream solves this by having exactly one writer (the dns-operator agent, via the PowerDNS API) and N readers (edge pods). The agent writes to LMDB, LightningStream uploads deltas to GCS, and every edge pod's LightningStream container receives those deltas and applies them locally. This avoids a shared database entirely.
ALIAS record support via in-pod recursor. PowerDNS's ALIAS record type (ANAME-style) expands
the target hostname to A/AAAA records on the fly. To do this, PowerDNS needs a resolver. A
co-located pdns-recursor container listens on 127.0.0.1:5300, accepting only loopback traffic,
and forwards recursion to Cloudflare/Google. The authoritative server is configured with
resolver=127.0.0.1:5300 and expand-alias=yes. This means apex domains can point to a CDN
hostname without needing the client to do CNAME chasing.
DSR (Direct Server Return) with anycast BGP. Cilium is configured with
loadBalancer.mode: dsr and dsrDispatch: geneve. Combined with per-node BGP adjacencies to
NetActuate, each edge node receives and directly answers DNS packets for its local pod without
hairpinning. The cilium-bgp-route-reconciler DaemonSet (running every 30 seconds) ensures the
local kernel routing table has a local route for each advertised prefix on lo, which is required
for the node to accept packets destined for LoadBalancer IPs that Cilium's eBPF intercepts.
Zone cache completely disabled. zone-cache-refresh-interval=0 and
zone-metadata-cache-ttl=0 disable PowerDNS's internal zone metadata caches. This is intentional:
since LMDB is memory-mapped, lookups are already in-process memory accesses, and zone cache staleness
would delay visibility of records just programmed by LightningStream. The tradeoff is slightly more
LMDB reads per query (still O(1) by design).
No packet cache. There is no cache-ttl or query-cache-ttl setting in pdns.conf, so
PowerDNS's packet cache is off by default. This means every DNS query causes an LMDB read. For a
memory-mapped database on modern hardware this is extremely fast (sub-microsecond for pages already
in the OS page cache), and it eliminates the possibility of serving stale data after a record update
propagates via LightningStream.
Domain verification before DNS programming. The Gateway DNS controller will not create a
DNSRecordSet for a hostname unless its apex Domain resource has VerifiedDNSZone=True. The
fastest verification path is the DNSZone path: if the user has already delegated to Datum's
nameservers (matching Domain.status.nameservers vs DNSZone.status.nameservers), no TXT record or
HTTP challenge is needed. The controller reads nameservers from RDAP/WHOIS to make this
determination.
Multi-cluster reconciliation via multicluster-runtime + Milo. The dns-operator control-plane
component uses sigs.k8s.io/multicluster-runtime with the Milo provider. This means the operator
dynamically discovers all project control planes (per-tenant Kubernetes-compatible API servers) and
opens watch connections to each. A single operator instance handles all projects. In staging and
production the discovery.mode is milo; in the single-cluster server config it is single.
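The two modes might be expressed in the operator's config file roughly as follows. The exact schema is an assumption; only the discovery.mode key and its milo/single values are taken from the description above.

```yaml
# dns-operator manager config (hypothetical excerpt)
discovery:
  mode: milo      # staging/production: discover every project control plane
# mode: single    # single-cluster setups: watch only the local API server
```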
Two separate DNS stacks for infrastructure. The datum-auth-dns system (KnotDNS + HickoryDNS)
serves Datum's own infrastructure zones (ns1-ns4 glue records, prism-internal zones). This is
deliberately kept separate from the customer DNS stack (PowerDNS + dns-operator). The infrastructure
zones are managed by DNSEndpoint CRs and RFC2136 updates, not the dns-operator API, because they
predate it and have simpler, static record sets. HickoryDNS is deployed alongside KnotDNS
specifically to support the Prossimo memory-safety initiative by running ns4 on a Rust DNS
implementation.
Conflict detection for DNSRecordSets. When the Gateway controller creates a DNSRecordSet, it
checks whether an existing record with the same hostname annotation is already owned by a different
manager (labelManagedBy). If a conflict is detected, the hostname condition is set to
DNSRecordReasonConflict rather than silently overwriting.
GatewayDNSAddress uses words + entropy. The canonical hostname assigned to a Gateway (the
CNAME/ALIAS target) is not a sequential name. It is generated by
words.WordsAndEntropy(suffix, gatewayUID) — a deterministic but human-readable address derived from
the gateway's UUID, scoped under a configured TargetDomain. This prevents hostname enumeration.
Dual GCS credentials with least-privilege. Two GCS service accounts are provisioned by
Crossplane: one with roles/storage.objectAdmin (used by the dns-operator agent's LightningStream
writer) and one with roles/storage.objectViewer (used by all edge pod LightningStream receivers).
This ensures that a compromised edge pod cannot modify the DNS data that other pods read.
@startuml Datum Authoritative DNS - Container Diagram
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Container.puml
LAYOUT_WITH_LEGEND()
title Datum Cloud Authoritative DNS Service — Container Diagram
Person(user, "Datum User", "Creates domains, zones, record sets, gateways")
Person(infra_admin, "Infra Admin", "Manages Datum's own DNS zones via GitOps")
System_Boundary(control_plane, "Control Plane Cluster") {
Container(milo_apiserver, "Milo API Server", "Kubernetes-compatible API server", "Per-tenant project control planes; hosts DNSZone, DNSRecordSet, Domain, Gateway CRDs")
Container(network_services_operator, "network-services-operator", "Go / controller-runtime", "Reconciles Domain (RDAP/WHOIS verification), DNSRecordSet creation for Gateway hostnames")
Container(dns_operator_manager, "dns-operator (replicator)", "Go / multicluster-runtime", "Discovers all project control planes via Milo; replicates DNSZone + DNSRecordSet to downstream cluster")
Container(redis, "Redis", "Redis", "Optional shared cache for RDAP/WHOIS lookup results and rate-limit state")
ContainerDb(milo_etcd, "Milo etcd", "etcd", "Stores DNSZone, DNSRecordSet, Domain, Gateway objects for all projects")
}
System_Boundary(downstream_cluster, "Downstream / Edge Cluster") {
Container(dns_operator_agent, "dns-operator (agent)", "Go / controller-runtime", "Watches local DNSZone + DNSRecordSet; programs PowerDNS via HTTP API; runs LightningStream writer")
Container(powerdns, "PowerDNS Auth 5.1", "C++ DNS server (LMDB backend)", "Serves authoritative DNS on :53 (UDP+TCP); no packet cache; LMDB reads are memory-mapped; ALIAS expansion via in-pod recursor")
Container(pdns_recursor, "PowerDNS Recursor 5.1", "C++ DNS recursor", "Listens on 127.0.0.1:5300 loopback only; forwards to 1.1.1.1/8.8.8.8; used solely for ALIAS expansion")
Container(lightningstream_writer, "LightningStream (writer)", "Go LMDB sync — in dns-operator agent pod", "Detects LMDB changes via schema_tracks_changes; uploads delta snapshots to GCS with update marker")
Container(lightningstream_reader, "LightningStream (receiver)", "Go LMDB sync — in each DaemonSet pod", "Polls GCS update marker; downloads delta snapshots; applies to local /lmdb/db emptyDir volume")
Container(bgp_reconciler, "cilium-bgp-route-reconciler", "Bash DaemonSet (hostNetwork)", "Every 30s: queries cilium bgp routes; installs local kernel routes on lo for advertised prefixes")
ContainerDb(lmdb, "LMDB emptyDir", "Memory-mapped file per pod", "Per-pod local DNS zone data; written by lightningstream_reader; read by PowerDNS via mmap")
}
System_Boundary(infra_dns_stack, "Infrastructure Auth DNS (datum-auth-dns)") {
Container(knotdns, "KnotDNS", "C DNS server", "Serves ns1-ns3 for datumproxy.net, datum-cloud.net, prism zones; RFC2136 updated")
Container(hickorydns, "HickoryDNS", "Rust DNS server", "Serves ns4 only; Prossimo memory-safety initiative")
Container(extdns_rfc2136, "ExternalDNS (RFC2136)", "Go sidecar per pod", "Watches DNSEndpoint CRs; pushes updates via RFC2136 NSUPDATE to local knotd/hickory")
}
System_Boundary(infra_external_dns, "Infrastructure ExternalDNS") {
Container(extdns_gcp, "ExternalDNS (GCP)", "Go HelmRelease", "Watches Gateway routes; syncs *.staging/production.env.datum.net to GCP Cloud DNS")
Container(extdns_webhook, "external-dns-webhook", "Go webhook provider HelmRelease", "Translates Gateway HTTPRoute / DNSEndpoint into DNSRecordSet objects for Datum infra zones")
}
System_Ext(gcs, "Google Cloud Storage", "Stores LightningStream LMDB delta snapshots; bucket datum-lightningstream; S3-compatible endpoint storage.googleapis.com")
System_Ext(gcp_cloud_dns, "GCP Cloud DNS", "Hosts cluster-level infrastructure zones")
System_Ext(rdap, "RDAP Providers", "Per-TLD RDAP endpoints (Verisign, IANA, etc.) — rate-limited, cached")
System_Ext(whois, "WHOIS Providers", "IANA bootstrap + registry WHOIS servers — fallback to RDAP")
System_Ext(dns_resolvers, "Cloudflare / Google DNS", "1.1.1.1, 8.8.8.8 — upstream resolvers used by recursor for ALIAS expansion")
System_Ext(netactuate, "NetActuate (ASN 36236)", "BGP upstream at each PoP; peers with Cilium per-node; receives /24 and /44 prefix advertisements for anycast")
' User interactions
Rel(user, milo_apiserver, "Creates Domain, DNSZone, DNSRecordSet, Gateway", "kubectl / API")
Rel(infra_admin, milo_etcd, "Commits DNSEndpoint CRs via GitOps", "FluxCD Git")
' Control plane internal
Rel(network_services_operator, milo_apiserver, "Watches Domain, DNSZone, Gateway; writes DNSRecordSet", "k8s watch/patch")
Rel(network_services_operator, rdap, "RDAP domain lookup (cached, rate-limited)", "HTTPS")
Rel(network_services_operator, whois, "WHOIS fallback (cached, rate-limited)", "TCP/43")
Rel(network_services_operator, redis, "Cache RDAP/WHOIS results and rate-limit state", "Redis protocol")
Rel(dns_operator_manager, milo_apiserver, "Discovers project control planes; watches DNSZone + DNSRecordSet", "k8s watch multicluster")
Rel(dns_operator_manager, dns_operator_agent, "Replicates DNSZone + DNSRecordSet into downstream cluster", "k8s create/update")
' Downstream — write path
Rel(dns_operator_agent, powerdns, "Creates/updates zones and records", "HTTP :8082 PowerDNS API")
Rel(lightningstream_writer, gcs, "Uploads LMDB delta snapshots + update marker", "GCS S3 API (primary objectAdmin SA)")
' Downstream — read path
Rel(lightningstream_reader, gcs, "Polls update marker; downloads deltas", "GCS S3 API (secondary objectViewer SA)")
Rel(lightningstream_reader, lmdb, "Applies deltas to local LMDB file", "filesystem write")
Rel(powerdns, lmdb, "Reads zone + record data", "mmap read (in-process; no I/O for pages already resident)")
Rel(powerdns, pdns_recursor, "ALIAS expansion queries", "UDP/TCP 127.0.0.1:5300 loopback")
Rel(pdns_recursor, dns_resolvers, "Recursive resolution for ALIAS targets", "UDP/TCP :53")
' BGP and networking
Rel(bgp_reconciler, netactuate, "Installs local kernel routes so node accepts LB VIP packets; Cilium peers advertise /24 /44 prefixes", "Cilium BGP + iproute2")
' Infrastructure DNS
Rel(extdns_rfc2136, milo_etcd, "Watches DNSEndpoint CRs", "k8s watch")
Rel(extdns_rfc2136, knotdns, "RFC2136 NSUPDATE", "TCP loopback :1053")
Rel(extdns_rfc2136, hickorydns, "RFC2136 NSUPDATE", "TCP loopback")
Rel(extdns_gcp, gcp_cloud_dns, "Upsert DNS records", "GCP DNS API")
Rel(extdns_webhook, milo_apiserver, "Writes DNSRecordSet for infrastructure zones", "k8s create/update")
@enduml

Operator source code:
/Users/aar/src/datum-cloud/network-services-operator/api/v1alpha/domain_types.go
/Users/aar/src/datum-cloud/network-services-operator/internal/controller/domain_controller.go
/Users/aar/src/datum-cloud/network-services-operator/internal/controller/gateway_dns_controller.go
/Users/aar/src/datum-cloud/network-services-operator/internal/registrydata/DESIGN.md
/Users/aar/src/datum-cloud/network-services-operator/internal/config/config.go
DNS operator API types (Go module cache):
/Users/aar/go/pkg/mod/go.miloapis.com/dns-operator@v0.5.1/api/v1alpha1/dnszone_types.go
/Users/aar/go/pkg/mod/go.miloapis.com/dns-operator@v0.5.1/api/v1alpha1/dnsrecordset_types.go
/Users/aar/go/pkg/mod/go.miloapis.com/dns-operator@v0.5.1/api/v1alpha1/dnszoneclass_types.go
Infrastructure repo — edge DNS serving:
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/edge/daemonset.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/edge/pdns.conf
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/edge/recursor.conf
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/edge/lightningstream.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/edge/service.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/edge/external-secret.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/edge/podmonitor.yaml
Infrastructure repo — BGP and networking:
/Users/aar/src/datum-cloud/infra/infrastructure/bgp/edge/auth-dns-advertisement.yaml
/Users/aar/src/datum-cloud/infra/infrastructure/bgp/edge/peer-config.yaml
/Users/aar/src/datum-cloud/infra/infrastructure/bgp/edge/cilium-bgp-local-routes-daemonset.yaml
/Users/aar/src/datum-cloud/infra/infrastructure/bgp/edge/tools/reconcile-cilium-bgp-routes.sh
/Users/aar/src/datum-cloud/infra/infrastructure/cilium/base/cilium-values.yaml
/Users/aar/src/datum-cloud/infra/infrastructure/bgp/edge/generated/clusters/us-central-1-charlie/bgp-worker-2de7846f-dfw.json
Infrastructure repo — control plane and downstream operator:
/Users/aar/src/datum-cloud/infra/apps/dns-operator/control-plane/base/manager.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/production/manager-kustomization-patch.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/production/dnszoneclass.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/downstream/production/dnsrecordsets.yaml
/Users/aar/src/datum-cloud/infra/apps/dns-operator/storage/base/gcs-lightningstream.yaml
Infrastructure repo — datum-auth-dns (static infra zones):
/Users/aar/src/datum-cloud/infra/apps/datum-auth-dns/README.md
/Users/aar/src/datum-cloud/infra/apps/datum-auth-dns/base/knotdns/knot.conf
/Users/aar/src/datum-cloud/infra/apps/datum-auth-dns/edge/zones/datumproxy.net_dnsendpoint.yaml
Infrastructure repo — cluster entrypoints:
/Users/aar/src/datum-cloud/infra/clusters/staging/apps/dns-operator.yaml
/Users/aar/src/datum-cloud/infra/clusters/staging/apps/dns-operator-downstream.yaml
/Users/aar/src/datum-cloud/infra/clusters/production/apps/dns-operator.yaml
/Users/aar/src/datum-cloud/infra/clusters/production/apps/dns-operator-downstream.yaml
/Users/aar/src/datum-cloud/infra/channels/edge/stable/infrastructure/cilium-bgp-announcements/cilium-bgp-announcements.yaml