@thedudeabidesai
Last active February 9, 2026 21:22
Securing Your OpenClaw Deployment: A practical security guide through Child & First Principles lenses 🎳

Deploying & Securing OpenClaw on Hetzner

A Complete Production Guide: Secure From Step One

Guide version: 2.0 (February 7, 2026) | Last reviewed: 2026-02-07 | Lines: ~1125 | Grade: Multi-model audited (Opus 4.6, Codex 5.3, Grok 3)

Based on Brad Barbin's original Hetzner deployment gist. Security hardening from a real production audit by The Dude 🎳.

Platform: Written for Hetzner VPS (Ubuntu 24.04/22.04 or Debian 12), but the security principles and Docker-based deployment apply to any Linux host. macOS-specific notes are called out where relevant.


How to Read This Guide

Every major section has three parts:

  • 🧒 Child Lens: A simple analogy. If you can't explain it to a kid, you don't understand it.
  • 🔬 First Principles Lens: What's actually at risk. No security theater.
  • Commands: Copy-paste ready.

This isn't theoretical. Every security item came from auditing a real OpenClaw deployment, including a backup system that had been silently failing for days.


Architecture Overview

Laptop ──SSH tunnel──▶ Hetzner VPS (127.0.0.1:18789) ──▶ Docker container (:18789)
                            │
Tailnet devices ────────────┘ (host-level Tailscale proxy)

Key rules:

  • Gateway listens on container port 18789; this never changes
  • Docker publishes to host port ${OPENCLAW_GATEWAY_PORT}; this can vary
  • Host binding stays loopback-only (127.0.0.1) unless you explicitly need remote exposure
  • Access via SSH tunnel or Tailscale; never expose directly to the internet

This guide cross-checks the live repo:

  • Dockerfile uses Bun and runs as USER node
  • Default image CMD is node openclaw.mjs gateway --allow-unconfigured
  • docker-compose.yml keeps container gateway port fixed at 18789

Prerequisites

  • Hetzner VPS (Ubuntu 24.04/22.04 or Debian 12)
  • Root SSH access
  • Domain/TLS optional (recommended if exposing beyond loopback/tailnet)
  • OpenClaw repo available on the host

1. Provision and Baseline Hardening

🧒 Child Lens: Before you put anything in your new house, you lock the doors, install smoke detectors, and check the windows. Don't move in first and secure later.

🔬 First Principles Lens: A fresh VPS has SSH open to the internet and no firewall. Every minute it's exposed unpatched is a minute attackers can probe it. Baseline hardening reduces the attack surface before you install anything worth stealing.

SSH in and update

This is the only time you SSH as root. After creating the deploy user below, all subsequent commands use deploy@YOUR_VPS_IP with sudo.

ssh root@YOUR_VPS_IP

apt-get update
apt-get -y upgrade
apt-get install -y --no-install-recommends \
  ca-certificates curl gnupg ufw fail2ban unattended-upgrades jq

Automatic security updates

dpkg-reconfigure -plow unattended-upgrades

Firewall: before anything else

ufw default deny incoming
ufw default allow outgoing
ufw allow OpenSSH
ufw --force enable
ufw status verbose

⚠️ The SSH rule matters. Enabling the firewall without allowing SSH first means you just locked yourself out. There is no "undo" button from outside. We've seen deployment guides skip this step.

SSH hardening (recommended)

First, create a non-root deploy user (all remaining commands use this user via sudo):

adduser deploy
usermod -aG sudo deploy
# Copy your SSH key to the new user
mkdir -p /home/deploy/.ssh
cp ~/.ssh/authorized_keys /home/deploy/.ssh/
chown -R deploy:deploy /home/deploy/.ssh
chmod 700 /home/deploy/.ssh && chmod 600 /home/deploy/.ssh/authorized_keys

Then disable password auth and root login in /etc/ssh/sshd_config:

PasswordAuthentication no
PermitRootLogin no
AllowUsers deploy

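Then check the config with sshd -t and restart the service with systemctl restart ssh. The unit is named ssh on Ubuntu and Debian; sshd is only an alias and may be missing on Ubuntu 24.04.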

Test before disconnecting! Open a second terminal and verify ssh deploy@YOUR_VPS_IP works before closing your root session. If it fails, you still have the root session to fix it.
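Before restarting SSH, it can help to confirm that the three directives are actually in effect in the file you edited. The helper below is a minimal sketch, not part of OpenClaw or OpenSSH: `sshd_setting` and `check_sshd_hardening` are hypothetical names, and the parser only reads a single file (it ignores Include files and Match blocks). It relies on the fact that sshd uses the first value it finds for each keyword.

```shell
#!/bin/bash
# Hypothetical helper: print the first value set for a keyword in an
# sshd_config-style file (sshd uses the first value it encounters).
sshd_setting() {
    local file="$1" key="$2"
    awk -v k="$key" '
        /^[[:space:]]*#/ { next }                      # skip commented lines
        tolower($1) == tolower(k) { print $2; exit }   # first match wins
    ' "$file"
}

# Returns 0 only if all three hardening directives from this guide are set.
check_sshd_hardening() {
    local file="$1"
    [ "$(sshd_setting "$file" PasswordAuthentication)" = "no" ] || return 1
    [ "$(sshd_setting "$file" PermitRootLogin)" = "no" ] || return 1
    [ "$(sshd_setting "$file" AllowUsers)" = "deploy" ] || return 1
}
```

Usage: `check_sshd_hardening /etc/ssh/sshd_config && echo "hardening directives present"`. For the authoritative view, including drop-in files, compare against the output of `sshd -T`.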

Key types: Use Ed25519 keys (ssh-keygen -t ed25519). RSA works but Ed25519 is shorter, faster, and has no known weaknesses. Changing the SSH port (e.g., 2222) reduces log noise from bots but is not a security measure, so don't rely on it.

Time: 5 minutes. Impact: Massive.

fail2ban configuration

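fail2ban was installed above but needs activation. Enable the SSH jail (on Debian 12, which has no /var/log/auth.log by default, you may also need backend = systemd in the jail):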

cat > /etc/fail2ban/jail.local <<'EOF'
[sshd]
enabled = true
port = ssh
maxretry = 5
bantime = 3600
findtime = 600
EOF
systemctl enable --now fail2ban
fail2ban-client status sshd  # verify it's running

Host intrusion detection (optional but recommended)

For detecting unauthorized file changes on the host:

apt install -y aide
aideinit  # generates initial database (takes a few minutes)
# Run daily check via cron:
echo '0 3 * * * root /usr/bin/aide --check' > /etc/cron.d/aide-check

🔬 First Principles Lens: fail2ban rate-limits brute-force attempts; AIDE detects if someone modifies system files after gaining access. Together they cover both the "getting in" and "already in" attack phases.


2. Install Docker Engine (Signed Repository)

🧒 Child Lens: When you install an app, you want to make sure it came from the real store, not a fake one. Using Docker's signed repo is like checking the store's ID badge.

🔬 First Principles Lens: curl | sh downloads and executes in one step. HTTPS provides transport integrity, but you never inspect what you're running, so a compromised server or CDN can serve you malware that you execute blindly. With a GPG-signed apt repo, the package manager verifies that each package hasn't been tampered with before installing it, and you can inspect what you're getting first.

install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg

. /etc/os-release
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu ${VERSION_CODENAME} stable" \
  > /etc/apt/sources.list.d/docker.list

apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

docker --version
docker compose version

For Debian, replace ubuntu in the repo URL with debian.


3. Clone OpenClaw and Prepare Persistent Directories

🧒 Child Lens: You're building a house (the container) on a foundation (the host). The stuff you care about (photos, documents) lives in the foundation, not the house. If the house burns down, you rebuild it. The foundation stays.

🔬 First Principles Lens: Containers are ephemeral. Any state not on a mounted volume is lost on recreate. File ownership must match the container's runtime user (node, UID 1000). Permission 700 means only the owner can read the directory, so other users on the host can't peek at your config or secrets.

git clone https://github.com/openclaw/openclaw.git
cd openclaw

mkdir -p /home/deploy/.openclaw /home/deploy/.openclaw/workspace
chown -R 1000:1000 /home/deploy/.openclaw /home/deploy/.openclaw/workspace
chmod 700 /home/deploy/.openclaw /home/deploy/.openclaw/workspace

The 1000:1000 ownership matches USER node in the container image. Verify after building: docker compose run --rm openclaw-gateway id should print uid=1000(node) gid=1000(node).
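The host-side half of that check can be scripted too. This is a small sketch, assuming GNU stat (as shipped on Ubuntu and Debian); `check_dir_secure` is a hypothetical helper name, not an OpenClaw command.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if DIR exists, has exactly the expected
# octal mode, and is owned by the expected numeric UID.
check_dir_secure() {
    local dir="$1" want_mode="$2" want_uid="$3"
    [ -d "$dir" ] || return 1
    [ "$(stat -c '%a' "$dir")" = "$want_mode" ] || return 1
    [ "$(stat -c '%u' "$dir")" = "$want_uid" ] || return 1
}
```

Usage: `check_dir_secure /home/deploy/.openclaw 700 1000 || echo "fix ownership or mode before starting the container"`.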

Git secrets audit: do this now

Before you start committing anything to this repo, set up your .gitignore:

cat >> .gitignore <<'EOF'
# Secrets: never commit these
auth-profiles.json
*.env
.env
discord-history/
EOF

Audit for any secrets already in history:

git log --all --diff-filter=A --name-only --pretty=format: | sort -u | grep -iE 'token|secret|key|password|auth|\.env'

If you find anything, scrub it:

# Install the tool
pip install git-filter-repo  # or: apt install git-filter-repo

# Remove a file from all history
git filter-repo --path auth-profiles.json --invert-paths --force

⚠️ git filter-repo rewrites ALL commit hashes. Existing clones and forks will diverge. Only use on repos you fully control, and force-push after.

What we actually found: A discord-history/ directory with message dumps committed to a workspace repo. Scrubbed it from all history. The content wasn't catastrophic, but the habit is: next time it could be API keys.
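To keep the habit from recurring, the same filename pattern used in the history audit can be applied to staged files before each commit. The function below is a sketch with a hypothetical name (`flag_secret_paths`); it reads paths on stdin and prints the suspicious ones. The pattern is deliberately broad, so expect false positives (for example, any path containing "key").

```shell
#!/bin/bash
# Hypothetical helper: read file paths on stdin and print those matching the
# audit pattern from this guide. Always exits 0 so it is safe in pipelines.
flag_secret_paths() {
    grep -iE 'token|secret|key|password|auth|\.env' || true
}
```

Usage: `git diff --cached --name-only | flag_secret_paths`. If this prints anything, review those files before committing.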


4. Create .env with Strict Permissions

🧒 Child Lens: Your .env file is like a keychain with all your house keys, car keys, and safe combination on it. You don't leave it on the front porch. You keep it in your pocket, and only you can reach it.

🔬 First Principles Lens: The .env file contains bearer credentials. Anyone who reads it IS you from the provider's perspective. chmod 600 means only the file owner can read it. Never commit it. Never paste its contents in Discord or Slack: those are cloud services with message history, search indexing, and admin access you don't control.

cat > .env <<'ENV'
OPENCLAW_IMAGE=openclaw:hetzner
OPENCLAW_GATEWAY_TOKEN=   # Generate below
OPENCLAW_GATEWAY_BIND=lan  # Binds gateway to 0.0.0.0 INSIDE the container. Safe because Docker restricts the host side to 127.0.0.1 (see compose). On bare metal without Docker, use "loopback" instead!

# Host-side published ports only
OPENCLAW_GATEWAY_PORT=18789
OPENCLAW_BRIDGE_PORT=18790

OPENCLAW_CONFIG_DIR=/home/deploy/.openclaw
OPENCLAW_WORKSPACE_DIR=/home/deploy/.openclaw/workspace

# Optional provider secrets
# CLAUDE_AI_SESSION_KEY=
# CLAUDE_WEB_SESSION_KEY=
# CLAUDE_WEB_COOKIE=
ENV

chmod 600 .env

Generate a gateway token:

openssl rand -hex 32

Paste it into .env as OPENCLAW_GATEWAY_TOKEN.
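Pasting by hand is easy to get wrong, and it leaves the token in your terminal scrollback. The sketch below writes the value into the file directly. `set_env_var` is a hypothetical helper, not an OpenClaw command; it assumes simple KEY=VALUE lines, moves the key to the end of the file, and drops any inline comment on the replaced line.

```shell
#!/bin/bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing any existing
# KEY= line, and leave the file at mode 600.
set_env_var() {
    local file="$1" key="$2" value="$3" tmp
    tmp="$(mktemp "${file}.XXXXXX")" || return 1
    chmod 600 "$tmp"
    # Copy every line except the existing KEY= line, then append the new one.
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Rename over the original so readers never see a half-written file.
    mv "$tmp" "$file"
}
```

Usage: `set_env_var .env OPENCLAW_GATEWAY_TOKEN "$(openssl rand -hex 32)"`. The same call works for the rotation procedure later in this guide.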

Secret handling rules

  • Never commit .env
  • Keep .env mode 600
  • Rotate all leaked provider/session secrets immediately
  • Hand off secrets via encrypted channels only (Signal, iMessage), never Discord/Slack

5. Compose File (Hardened)

🧒 Child Lens: The compose file is your house's blueprint. It says where the doors are (ports), what rooms connect to what (volumes), and who's allowed in (bindings). A bad blueprint means unlocked doors facing the street.

🔬 First Principles Lens: Docker's default port publishing binds to 0.0.0.0, meaning every network interface. On a VPS with a public IP, that means your gateway is exposed to the entire internet. Binding to 127.0.0.1 restricts access to localhost only. Combined with token auth and SSH tunneling, this creates defense in depth.

services:
  openclaw-gateway:
    image: ${OPENCLAW_IMAGE:-openclaw:local}
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      HOME: /home/node
      TERM: xterm-256color
      OPENCLAW_GATEWAY_TOKEN: ${OPENCLAW_GATEWAY_TOKEN}
      OPENCLAW_GATEWAY_BIND: ${OPENCLAW_GATEWAY_BIND:-lan}
      CLAUDE_AI_SESSION_KEY: ${CLAUDE_AI_SESSION_KEY}
      CLAUDE_WEB_SESSION_KEY: ${CLAUDE_WEB_SESSION_KEY}
      CLAUDE_WEB_COOKIE: ${CLAUDE_WEB_COOKIE}
    volumes:
      - ${OPENCLAW_CONFIG_DIR}:/home/node/.openclaw
      - ${OPENCLAW_WORKSPACE_DIR}:/home/node/.openclaw/workspace
    ports:
      - "127.0.0.1:${OPENCLAW_GATEWAY_PORT:-18789}:18789"
      - "127.0.0.1:${OPENCLAW_BRIDGE_PORT:-18790}:18790"
    init: true
    restart: unless-stopped
    security_opt:
      - no-new-privileges:true
    # Note: Docker applies default seccomp + AppArmor profiles automatically.
    # For stricter hardening, add a custom seccomp profile as another
    # security_opt entry:
    #   - seccomp=/path/to/custom-seccomp.json
    # See: https://docs.docker.com/engine/security/seccomp/
    mem_limit: "1g"
    pids_limit: 256
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    command:
      [
        "node",
        "dist/index.js",
        "gateway",
        "--bind",
        "${OPENCLAW_GATEWAY_BIND:-lan}",
        "--port",
        "18789"
      ]

Why this matters:

  • Container always listens on 18789. Changing OPENCLAW_GATEWAY_PORT only affects the host mapping.
  • 127.0.0.1 binding = not reachable from the internet.
  • mem_limit / pids_limit prevent runaway processes from killing the VPS.
  • Never mount the Docker socket (/var/run/docker.sock) into the container. It's equivalent to root on the host.

Repo delta: The upstream docker-compose.yml omits the build: section, does not bind to 127.0.0.1, and has no resource limits. This guide adds all three. If you deploy pre-built images, you may remove the build: section. Keep the 127.0.0.1 binding and the resource limits either way; firewall rules alone are not enough, because Docker-published ports bypass UFW (see section 8).
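Once the container is running, the loopback binding can be checked rather than assumed. The filter below is a sketch with a hypothetical name (`find_exposed_ports`). It assumes the "HOST_IP:HOST_PORT->CONTAINER_PORT/proto" mapping format that `docker ps` prints, and it is conservative: anything not bound to 127.0.0.1 is flagged, including IPv6 loopback.

```shell
#!/bin/bash
# Hypothetical helper: read port mappings on stdin (comma-separated, as
# printed by docker ps) and print any published mapping whose host side is
# not 127.0.0.1. Exits 1 if at least one such mapping was found.
find_exposed_ports() {
    tr ',' '\n' | sed 's/^[[:space:]]*//' | awk '
        /->/ && $0 !~ /^127\.0\.0\.1:/ { print; bad = 1 }
        END { exit bad }
    '
}
```

Usage: `docker ps --format '{{.Ports}}' | find_exposed_ports || echo "WARNING: a port is published beyond loopback"`.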


6. Hardened Dockerfile

🧒 Child Lens: When you download a game, you want to know it's the real game and not a virus wearing a game costume. Checksums are like checking the game's fingerprint against a trusted list.

🔬 First Principles Lens: Supply chain attacks target the build pipeline. Pinning versions and verifying SHA256 checksums ensures you get exactly the binary you expect, not a compromised one from a hijacked release. releases/latest is a mutable pointer; an attacker who compromises the repo can redirect it.

Hardening additions over repo default: SHELL directive for safer pipe handling, pinned Bun version (upstream uses latest), optional binary installation with SHA256 verification. If you don't need custom skill binaries, the repo Dockerfile works as-is.

FROM node:22-bookworm

SHELL ["/bin/bash", "-o", "pipefail", "-c"]

# Install Bun to a shared path (not /root, which is inaccessible to USER node)
ARG BUN_VERSION=1.2.22
# Download the pinned Bun release binary directly (no install script, no curl|bash pipe).
# Note: this pins the version but does not verify a checksum, and it assumes an x64 host.
RUN BUN_URL="https://github.com/oven-sh/bun/releases/download/bun-v${BUN_VERSION}/bun-linux-x64.zip" \
    && curl -fsSL -o /tmp/bun.zip "$BUN_URL" \
    && unzip -o /tmp/bun.zip -d /tmp/bun-extract \
    && mv /tmp/bun-extract/bun-linux-x64/bun /usr/local/bin/bun \
    && chmod +x /usr/local/bin/bun \
    && rm -rf /tmp/bun.zip /tmp/bun-extract \
    && bun --version
ENV PATH="/usr/local/bin:${PATH}"

RUN corepack enable
WORKDIR /app

# Optional OS packages for skill binaries
ARG OPENCLAW_DOCKER_APT_PACKAGES=""
RUN if [ -n "$OPENCLAW_DOCKER_APT_PACKAGES" ]; then \
      apt-get update && \
      DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends $OPENCLAW_DOCKER_APT_PACKAGES && \
      apt-get clean && \
      rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*; \
    fi

# +---------------------------------------------------------+
# | SHA256 VERIFICATION: OPTIONAL vs REQUIRED               |
# |                                                         |
# | Core build (no skill binaries): SHA256 args NOT needed. |
# | Just omit the --build-arg flags and the RUN blocks      |
# | below become no-ops.                                    |
# |                                                         |
# | If you ADD skill binaries (gog, goplaces, wacli):       |
# | SHA256 verification is MANDATORY. Provide all three     |
# | --build-arg SHA256 values or the build will skip them.  |
# | Never deploy unverified third-party binaries.           |
# +---------------------------------------------------------+

# Optional: pinned binaries with SHA256 verification
ARG GOG_VERSION=0.6.3
ARG GOG_SHA256
ARG GOPLACES_VERSION=0.4.1
ARG GOPLACES_SHA256
ARG WACLI_VERSION=0.5.2
ARG WACLI_SHA256

# If you don't need skill binaries (gog, goplaces, wacli), skip the --build-arg flags
# and the blocks below do nothing. If you DO need them, provide all three SHA256 args.
RUN if [ -n "$GOG_SHA256" ]; then \
      curl -fsSL -o /tmp/gog.tar.gz "https://github.com/steipete/gog/releases/download/v${GOG_VERSION}/gog_Linux_x86_64.tar.gz" && \
      echo "${GOG_SHA256}  /tmp/gog.tar.gz" | sha256sum -c - && \
      tar -xzf /tmp/gog.tar.gz -C /usr/local/bin && \
      chmod +x /usr/local/bin/gog && \
      rm -f /tmp/gog.tar.gz; \
    fi

RUN if [ -n "$GOPLACES_SHA256" ]; then \
      curl -fsSL -o /tmp/goplaces.tar.gz "https://github.com/steipete/goplaces/releases/download/v${GOPLACES_VERSION}/goplaces_Linux_x86_64.tar.gz" && \
      echo "${GOPLACES_SHA256}  /tmp/goplaces.tar.gz" | sha256sum -c - && \
      tar -xzf /tmp/goplaces.tar.gz -C /usr/local/bin && \
      chmod +x /usr/local/bin/goplaces && \
      rm -f /tmp/goplaces.tar.gz; \
    fi

RUN if [ -n "$WACLI_SHA256" ]; then \
      curl -fsSL -o /tmp/wacli.tar.gz "https://github.com/steipete/wacli/releases/download/v${WACLI_VERSION}/wacli_Linux_x86_64.tar.gz" && \
      echo "${WACLI_SHA256}  /tmp/wacli.tar.gz" | sha256sum -c - && \
      tar -xzf /tmp/wacli.tar.gz -C /usr/local/bin && \
      chmod +x /usr/local/bin/wacli && \
      rm -f /tmp/wacli.tar.gz; \
    fi

COPY package.json pnpm-lock.yaml pnpm-workspace.yaml .npmrc ./
COPY ui/package.json ./ui/package.json
COPY patches ./patches
COPY scripts ./scripts
RUN pnpm install --frozen-lockfile

COPY . .
RUN OPENCLAW_A2UI_SKIP_MISSING=1 pnpm build
ENV OPENCLAW_PREFER_PNPM=1
RUN pnpm ui:build

ENV NODE_ENV=production
RUN chown -R node:node /app

USER node

CMD ["node", "openclaw.mjs", "gateway", "--allow-unconfigured"]

Obtain checksums by downloading releases and computing locally:

curl -fsSL -o /tmp/gog.tar.gz \
  "https://github.com/steipete/gog/releases/download/v0.6.3/gog_Linux_x86_64.tar.gz"
sha256sum /tmp/gog.tar.gz
# Use output as GOG_SHA256 build arg
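The comparison the Dockerfile performs can also be run on the host, for example to re-check a tarball against a hash you recorded earlier. This is a minimal sketch; `verify_sha256` is a hypothetical wrapper around the same `sha256sum -c` call the Dockerfile uses.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if FILE hashes to EXPECTED.
verify_sha256() {
    local file="$1" expected="$2"
    # sha256sum -c expects "<hash>  <path>" (two spaces) on stdin;
    # --status suppresses output and reports via the exit code only.
    echo "${expected}  ${file}" | sha256sum -c --status -
}
```

Usage: `verify_sha256 /tmp/gog.tar.gz "$GOG_SHA256" || echo "checksum mismatch: do not build"`. A hash computed from a download is only as trustworthy as that download, so record it once from a release you have reason to trust and compare later downloads against it.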

7. Build and Run

🧒 Child Lens: You've drawn the blueprint and bought the materials. Now you actually build the house and turn on the lights.

🔬 First Principles Lens: Every dependency you pull is a trust boundary. --no-cache forces a full rebuild so stale layers can't mask a compromised upstream. Verifying checksums post-build closes the loop: you trusted the hash at build time, now confirm the binary matches at runtime. The entire supply chain (base image, package manager, binaries) is only as strong as its weakest verified link. Checking logs immediately catches startup failures before you assume everything's fine.

# Core build (no skill binaries; omit SHA args entirely):
docker compose build --no-cache
docker compose up -d openclaw-gateway

# With skill binaries (replace with REAL checksums; do NOT use placeholders):
# The Dockerfile verifies the downloaded release tarballs, not the extracted binaries,
# so compute each checksum from the .tar.gz (see the sha256sum example in section 6).
# docker compose build --no-cache \
#   --build-arg GOG_SHA256=abc123... \
#   --build-arg GOPLACES_SHA256=def456... \
#   --build-arg WACLI_SHA256=789fed...

⚠️ Do not pass placeholder values like YOUR_GOG_SHA256. Non-empty placeholders trigger checksum validation and the build will fail. Either pass real checksums or omit the args entirely.

Verify binaries exist (only if you installed them):

# These only apply if you installed skill binaries; skip if you did a core-only build
docker compose exec openclaw-gateway which gog && echo "✅ gog" || echo "⏭️ gog not installed"
docker compose exec openclaw-gateway which goplaces && echo "✅ goplaces" || echo "⏭️ goplaces not installed"
docker compose exec openclaw-gateway which wacli && echo "✅ wacli" || echo "⏭️ wacli not installed"

Verify gateway is up:

docker compose logs -f openclaw-gateway

Integrity verification: lock it down now

# Use strict installs (fails if packages don't match lockfile)
# pnpm install --frozen-lockfile is already in the Dockerfile

# Audit for known vulnerabilities
docker compose exec openclaw-gateway pnpm audit

The honest gap: OpenClaw updates come via git pull. Git verifies integrity (SHA hashes on every commit) but not identity (commits aren't GPG-signed). Skills from clawhub have no signature verification currently. Lockfiles and pre-update review are the best local defenses until upstream adds signed releases.


8. Access Patterns

🧒 Child Lens: Your house is built and locked. Now you need a way to get in, but you want a secret tunnel, not a door facing the highway.

🔬 First Principles Lens: SSH tunnels encrypt traffic and require key authentication. Tailscale creates a WireGuard mesh with per-device identity. Both keep the gateway off the public internet. Direct internet exposure requires TLS + token + firewall: three things that must all work perfectly, all the time.

A) SSH tunnel (safest)

ssh -N -L 18789:127.0.0.1:18789 deploy@YOUR_VPS_IP

Open http://127.0.0.1:18789/ and enter your OPENCLAW_GATEWAY_TOKEN.

B) Tailnet access (recommended for remote devices)

Keep Docker published on loopback and expose via host-level Tailscale proxy. Do not publish gateway directly to the public internet.

# Install Tailscale (apt repo, NOT curl|sh, consistent with our Docker install approach)
# Detect distro automatically (works for Ubuntu 22.04/24.04 and Debian 12)
if [ ! -f /etc/os-release ]; then echo "ERROR: /etc/os-release not found, install Tailscale manually"; exit 1; fi
DISTRO=$(. /etc/os-release && echo "$ID")
CODENAME=$(. /etc/os-release && echo "$VERSION_CODENAME")
if [ -z "$DISTRO" ] || [ -z "$CODENAME" ]; then echo "ERROR: Could not detect distro/codename from /etc/os-release"; exit 1; fi
curl -fsSL "https://pkgs.tailscale.com/stable/${DISTRO}/${CODENAME}.noarmor.gpg" | tee /usr/share/keyrings/tailscale-archive-keyring.gpg >/dev/null
curl -fsSL "https://pkgs.tailscale.com/stable/${DISTRO}/${CODENAME}.tailscale-keyring.list" | tee /etc/apt/sources.list.d/tailscale.list
apt update && apt install -y tailscale
tailscale up

# Verify your Tailscale IP
tailscale ip -4

# Allow Tailscale traffic through UFW
ufw allow in on tailscale0

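With tailscale serve configured as described below, tailnet devices reach the gateway at the node's HTTPS MagicDNS URL. http://<tailscale-ip>:18789/ will not work while Docker publishes only to 127.0.0.1, and that is intentional.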

⚠️ Docker + UFW footgun: Do NOT change the Docker host bind from 127.0.0.1 to 0.0.0.0 to expose the port on Tailscale. Docker bypasses UFW rules for published container ports, so your gateway would be exposed to the public internet regardless of UFW settings. Instead, use tailscale serve:

tailscale serve --bg http://127.0.0.1:18789

This proxies traffic through your tailnet without changing Docker bindings. The container stays locked to localhost. Run tailscale serve status to see the URL it is serving.

Security note: Tailscale encrypts traffic between nodes with WireGuard, but encryption is not authorization. A compromised device on your tailnet can reach every service its ACLs allow, including this gateway. Treat your tailnet as a trusted network, not a zero-trust one. For defense in depth, the gateway still requires token auth regardless of network path.

Tailscale key management:

  • Prefer tagged reusable keys with explicit expiration
  • Track key expiry dates and rotate before expiration
  • After rotation, verify node connectivity and ACL enforcement

C) TLS reverse proxy (if you must expose publicly)

Use Caddy for automatic HTTPS with zero config. Install and create a Caddyfile:

apt install -y debian-keyring debian-archive-keyring apt-transport-https
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | tee /etc/apt/sources.list.d/caddy-stable.list
apt update && apt install -y caddy

# /etc/caddy/Caddyfile
openclaw.example.com {
    reverse_proxy 127.0.0.1:18789
}

ufw allow 443/tcp
systemctl enable --now caddy
systemctl reload caddy  # pick up the new Caddyfile if Caddy was already running

Caddy auto-provisions and renews Let's Encrypt TLS certificates. Keep the upstream on 127.0.0.1 β€” the proxy handles all public-facing traffic.


9. Token Rotation

🧒 Child Lens: If someone copies your house key and you never change the locks, they can come in forever and you'd never know. Change the locks regularly, and the copied key stops working.

🔬 First Principles Lens: API tokens are bearer credentials: anyone who has the token IS you. Tokens don't expire by default. The exposure window equals the token's lifetime. Rotation bounds that lifetime. A stolen key that stops working in 90 days is categorically different from one that works forever.

Gateway token rotation

# 1. Generate new token
NEW_TOKEN="$(openssl rand -hex 32)"

# 2. Update .env (OPENCLAW_GATEWAY_TOKEN=$NEW_TOKEN), keep mode 600

# 3. Restart gateway
docker compose up -d --force-recreate openclaw-gateway

# 4. Validate health
curl -sf -H "Authorization: Bearer $NEW_TOKEN" http://127.0.0.1:18789/ > /dev/null

# 5. Re-authenticate all clients with new token

# 6. Invalidate old token everywhere (shell history, password managers, notes)

Provider key rotation runbook

For each provider, document:

  1. Where to rotate: the dashboard URL
  2. Where the key lives: which config files (there may be multiple!)
  3. How to hand off: never paste keys in chat. Use encrypted channels.
  4. How to verify: what breaks if you got it wrong

Example rotation matrix:

Provider | Config Locations | Rotation Method
Anthropic | auth-profiles.json (2 entries) | console.anthropic.com → API Keys
OpenAI | env, auth.json | platform.openai.com → API Keys
ElevenLabs | openclaw.json (2 places) | elevenlabs.io → Profile
Twilio | openclaw.json (3 places) | console.twilio.com (24h grace period!)
xAI | openclaw.json | console.x.ai
Google/Gemini | openclaw.json + 2 skill configs | aistudio.google.com
Brave Search | openclaw.json (2 places) | api.search.brave.com
Backblaze B2 | ~/.config/restic/b2.env | backblaze.com → App Keys

Cadence: Quarterly (every 90 days). Set a recurring reminder. If you don't schedule it, it won't happen.
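A reminder is easier to keep if something checks it. The sketch below treats a file's modification time as a rough proxy for the last rotation. That is an assumption: any other edit to the file resets the clock. `days_since_modified` and `rotation_overdue` are hypothetical helper names, and the code relies on GNU stat and date.

```shell
#!/bin/bash
# Hypothetical helper: whole days since FILE was last modified.
days_since_modified() {
    local file="$1" now mtime
    now="$(date +%s)"
    mtime="$(stat -c '%Y' "$file")" || return 1
    echo $(( (now - mtime) / 86400 ))
}

# Succeeds if FILE is older than MAX_DAYS (default 90), i.e. rotation is overdue.
rotation_overdue() {
    local file="$1" max_days="${2:-90}"
    [ "$(days_since_modified "$file")" -gt "$max_days" ]
}
```

Usage: `rotation_overdue /home/deploy/openclaw/.env 90 && echo "gateway token rotation is overdue"`. The path is an example; point it at wherever your .env lives.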


10. Backups and Disaster Recovery

🧒 Child Lens: An alarm clock that's set but not plugged in doesn't wake you up. It looks right (the time is set, the alarm is on) but it's not actually working. You only find out when you oversleep.

🔬 First Principles Lens: A backup system has three parts: the scheduler (triggers it), the tool (creates it), and the verification (proves it worked). Most failures are silent. A backup you haven't verified is not a backup.

What to back up

  • /home/deploy/.openclaw (config, auth, state)
  • /home/deploy/.openclaw/workspace (workspace data)

Minimal nightly backup

#!/bin/bash
set -euo pipefail
export PATH="/usr/local/bin:/usr/bin:/bin"

BACKUP_DIR="/var/backups/openclaw"
mkdir -p "$BACKUP_DIR"

TS="$(date +%F-%H%M%S)"

if ! tar -C / -czf "${BACKUP_DIR}/openclaw-${TS}.tar.gz" home/deploy/.openclaw 2>&1; then
    echo "🚨 Backup tar creation failed at $(date)" >&2
    exit 1
fi

# Verify the tarball is readable
if ! tar -tzf "${BACKUP_DIR}/openclaw-${TS}.tar.gz" > /dev/null 2>&1; then
    echo "🚨 Backup tarball corrupt at $(date)" >&2
    exit 1
fi

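# The archive contains secrets: restrict it to its owner
chmod 600 "${BACKUP_DIR}/openclaw-${TS}.tar.gz"

# Retention: keep 14 days
find "$BACKUP_DIR" -type f -mtime +14 -delete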

systemd timer

# /etc/systemd/system/openclaw-backup.service
[Unit]
Description=OpenClaw Backup

[Service]
Type=oneshot
ExecStart=/usr/local/bin/openclaw-backup.sh

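# /etc/systemd/system/openclaw-backup.timer
[Unit]
Description=Nightly OpenClaw Backup

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target

# Then, from a shell: make the script executable and enable the timer
chmod 700 /usr/local/bin/openclaw-backup.sh
systemctl daemon-reload
systemctl enable --now openclaw-backup.timer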

For restic/B2 users: use absolute paths!

What we actually found (macOS deployment): A backup scheduled via macOS launchd to run daily at 3am. The plist was loaded, the script existed, the configuration looked correct. But the job exited with code 127: restic: command not found. launchd doesn't inherit your shell's PATH. The backup had been silently failing every night. Only one snapshot existed, from a manual run days earlier. (On Linux, systemd has a similar gotcha, so always use absolute paths in timer units.)

Always use absolute paths in scheduled scripts:

#!/bin/bash
# Linux:
export PATH="/usr/local/bin:/usr/bin:/bin"
RESTIC=/usr/local/bin/restic
# macOS: uncomment below instead
# export PATH="/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin"
# RESTIC=/opt/homebrew/bin/restic

$RESTIC backup ~/.openclaw/workspace \
  --exclude '.git' \
  --exclude 'node_modules' \
  --tag openclaw-workspace

$RESTIC forget \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 6 \
  --prune

Verify periodically

restic snapshots          # Are new ones appearing?
restic check              # Repository integrity

Recovery test (required)

  • Restore to a fresh VPS
  • Start with same .env
  • Confirm channels/auth/session state are intact
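Between full restore drills, a lighter check can confirm that a tarball unpacks and contains what you expect. This is a sketch only and does not replace restoring to a fresh VPS. `verify_backup_contains` is a hypothetical helper; the path you pass is relative to the archive root, matching how the nightly script stores home/deploy/.openclaw.

```shell
#!/bin/bash
# Hypothetical restore check: unpack a backup tarball into a scratch directory
# and confirm that an expected path exists inside it.
verify_backup_contains() {
    local tarball="$1" expected_path="$2" scratch rc=0
    scratch="$(mktemp -d)" || return 1
    tar -xzf "$tarball" -C "$scratch" 2>/dev/null || rc=1
    # Only look for the path if extraction itself succeeded.
    [ "$rc" -eq 0 ] && [ -e "${scratch}/${expected_path}" ] || rc=1
    rm -rf "$scratch"
    return "$rc"
}
```

Usage: `verify_backup_contains /var/backups/openclaw/<your-archive>.tar.gz home/deploy/.openclaw || echo "backup is missing the config directory"`.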

The irony: We had just added backup failure alerting 10 minutes before discovering the backup was already broken. If we'd had monitoring earlier, we'd have caught it days ago. Order matters.


11. Health Monitoring

🧒 Child Lens: Your house has smoke detectors. They don't prevent fires. They make sure you KNOW there's a fire so you can act. Without them, a small problem becomes a big problem while you're asleep.

🔬 First Principles Lens: Every automated system can fail silently. The cost of a failure is proportional to how long it goes undetected. Monitoring doesn't prevent failures; it bounds the detection time.

What to monitor

Check | Threshold | Why
Disk space | >85% warning, >95% critical | Full disk = no logs, no backups, cascading failures
Gateway process | Not running | If it's down, everything's down
Free memory | <50MB | OOM kills are silent and random
Last backup age | >36 hours | Catches silent backup failures
Container restarts | Repeated | Crash loop indicates config or resource problem
Auth cooldowns | Rate-limited profiles | You're burning through quota

Health monitor script

#!/bin/bash
STATUS="healthy"
ALERTS=""

# Disk
DISK_PCT=$(df -h / | awk 'NR==2 {gsub(/%/,""); print $5}')
if [ "$DISK_PCT" -gt 95 ]; then
    STATUS="critical"; ALERTS+="🚨 Disk ${DISK_PCT}% full\n"
elif [ "$DISK_PCT" -gt 85 ]; then
    STATUS="warning"; ALERTS+="⚠️ Disk ${DISK_PCT}% full\n"
fi

# Gateway container
if ! docker compose ps --status running openclaw-gateway | grep -q "openclaw-gateway"; then
    STATUS="critical"; ALERTS+="🚨 Gateway container not running!\n"
fi

# Memory ("available" column; never downgrade an earlier critical status)
FREE_MB=$(free -m | awk '/Mem:/ {print $7}')
if [ "$FREE_MB" -lt 50 ]; then
    [ "$STATUS" = "critical" ] || STATUS="warning"; ALERTS+="⚠️ Low memory: ${FREE_MB}MB available\n"
fi

# Backup age
# Newest tarball mtime in epoch seconds (GNU stat and xargs -r, as on Ubuntu/Debian)
LAST_BACKUP=$(find /var/backups/openclaw -name '*.tar.gz' 2>/dev/null | xargs -r stat --format='%Y' 2>/dev/null | sort -rn | head -1)
if [ -n "$LAST_BACKUP" ]; then
    NOW=$(date +%s)
    HOURS_AGO=$(( (NOW - LAST_BACKUP) / 3600 ))
    if [ "$HOURS_AGO" -gt 36 ]; then
        [ "$STATUS" = "critical" ] || STATUS="warning"; ALERTS+="⚠️ Backup stale: ${HOURS_AGO}h old\n"
    fi
else
    [ "$STATUS" = "critical" ] || STATUS="warning"; ALERTS+="⚠️ No backups found!\n"
fi

# Gateway health probe (never downgrade an earlier critical status)
if ! curl -sf -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" http://127.0.0.1:18789/ > /dev/null 2>&1; then
    [ "$STATUS" = "critical" ] || STATUS="warning"; ALERTS+="⚠️ Health probe failed\n"
fi

echo "status:$STATUS"
if [ -n "$ALERTS" ]; then
    echo -e "$ALERTS"

    # Discord webhook alerting
    if [ -n "$DISCORD_WEBHOOK_URL" ]; then
        PAYLOAD=$(echo -e "$ALERTS" | jq -Rs '{content: ("🚨 **OpenClaw Health Alert**\n" + .)}')
        curl -sf -X POST -H "Content-Type: application/json" -d "$PAYLOAD" "$DISCORD_WEBHOOK_URL"
    fi

    # ntfy.sh alternative (lightweight, no setup)
    if [ -n "$NTFY_TOPIC" ]; then
        echo -e "$ALERTS" | curl -sf -d @- "https://ntfy.sh/${NTFY_TOPIC}"
    fi
fi
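When several checks feed one STATUS variable, a later warning should never overwrite an earlier critical. One way to guarantee that is to route every status change through a small comparison function. The helpers below are a sketch with hypothetical names (`severity_rank`, `escalate_status`, `classify_disk`); the disk thresholds are the ones from the monitoring table.

```shell
#!/bin/bash
# Map a status name to a numeric severity so statuses can be compared.
severity_rank() {
    case "$1" in
        critical) echo 2 ;;
        warning)  echo 1 ;;
        *)        echo 0 ;;
    esac
}

# Print whichever of the two statuses is more severe.
escalate_status() {
    local current="$1" new="$2"
    if [ "$(severity_rank "$new")" -gt "$(severity_rank "$current")" ]; then
        echo "$new"
    else
        echo "$current"
    fi
}

# Classify disk usage: >95% critical, >85% warning, otherwise healthy.
classify_disk() {
    local pct="$1"
    if [ "$pct" -gt 95 ]; then echo critical
    elif [ "$pct" -gt 85 ]; then echo warning
    else echo healthy
    fi
}
```

Usage: `STATUS="$(escalate_status "$STATUS" "$(classify_disk "$DISK_PCT")")"`. Each check then proposes a status, and only the most severe one survives.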

Cron it

The health script needs OPENCLAW_GATEWAY_TOKEN to probe the gateway. Set it directly in the cron file rather than sourcing .env (Docker .env files aren't guaranteed shell-safe), and run chmod 600 /etc/cron.d/openclaw-health because the file now holds a secret:

# /etc/cron.d/openclaw-health
# Paste the token between the quotes. Keep comments on their own lines:
# cron does not allow trailing comments on variable assignments.
SHELL=/bin/bash
OPENCLAW_GATEWAY_TOKEN=""
LOG_DIR=/home/deploy/openclaw/logs
*/30 * * * * deploy mkdir -p $LOG_DIR && cd /home/deploy/openclaw && /usr/local/bin/openclaw-health.sh 2>&1 | grep -v "^status:healthy$" >> $LOG_DIR/openclaw-alerts.log

Log hygiene

  • Gateway logs may contain request metadata; review them before shipping externally
  • Log rotation: ✅ handled by the json-file logging driver in compose (max-size: 10m, max-file: 3)
  • If using log aggregation, ensure transport is encrypted and destination is access-controlled

The principle: Good monitoring is silent when everything's fine and loud when something's wrong. If it's noisy, you'll ignore it. If it's silent, you'll forget it exists.
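
One way to keep alerts from turning into noise is to notify only when the alert text changes between runs. The sketch below is a minimal illustration under stated assumptions: the helper name should_notify and the state file path are hypothetical and not part of the health script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of openclaw-health.sh): succeed only when
# there is a non-empty alert whose text differs from the previous run.
# Usage: should_notify "<alert text>" "<state file>"
should_notify() {
  local alerts="$1" state_file="$2" previous="" current
  current=$(printf '%s' "$alerts" | cksum)        # checksum of this run's alerts
  [ -f "$state_file" ] && previous=$(cat "$state_file")
  printf '%s\n' "$current" > "$state_file"        # remember for the next run
  [ -n "$alerts" ] && [ "$current" != "$previous" ]
}

# Example wiring inside the health script (state path is an assumption):
# if should_notify "$ALERTS" /var/tmp/openclaw-health.last; then
#     ...send the Discord or ntfy notification...
# fi
```

The trade-off is that a persistent problem notifies once and then stays quiet until something changes, so you may still want a periodic reminder for long-running failures.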


12. Update Strategy

🧒 Child Lens: Before you install the update, check what's in the box. Don't just click "update all" and hope.

🔬 First Principles Lens: Every update is a trust decision. git pull verifies integrity (SHA hashes) but not identity (no GPG signatures). Reviewing changes before applying them is the best local defense.

Pre-update safety script

#!/bin/bash
echo "=== Current Version ==="
git log --oneline -1

echo "=== Upstream Changes ==="
git fetch origin
git log --oneline HEAD..origin/main

echo "=== Package Audit ==="
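# pnpm audit exits non-zero when it finds issues, so report on the exit code instead of masking it
if docker compose exec -T openclaw-gateway pnpm audit; then
    echo "No known vulnerabilities"
else
    echo "Audit reported vulnerabilities or could not run; review the output above"
fi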

Update procedure

git pull --ff-only
docker compose build
docker compose up -d openclaw-gateway
curl -sf -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" http://127.0.0.1:18789/

The rule: review before you update. Never blind-update.
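
To make the review step harder to skip, the list of changed files can be filtered for the ones that affect what gets built and installed. This is a sketch only: the helper name flag_sensitive_changes and the file patterns (Dockerfile, compose file, pnpm-lock.yaml, package.json) are assumptions about the repo layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads changed file paths on stdin and prints the ones
# that deserve a closer look before updating. The pattern list is an
# assumption; adjust it to the files your deployment actually builds from.
flag_sensitive_changes() {
  grep -E '(^|/)(Dockerfile|docker-compose\.ya?ml|pnpm-lock\.yaml|package\.json)$' || true
}

# Example (run from the repo after `git fetch origin`):
# git diff --name-only HEAD..origin/main | flag_sensitive_changes
```

An empty result does not mean the update is safe; it only means the build inputs named in the pattern did not change.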

Rollback procedure

Tag known-good states before updating so you can revert cleanly:

# Before updating, tag the current working state
git tag "good-$(date +%F)"

# Update
git pull --ff-only
docker compose build
docker compose up -d openclaw-gateway

# If something breaks, roll back
docker compose down
git checkout good-2026-02-07   # your last known-good tag
docker compose build
docker compose up -d openclaw-gateway

Keep the last 3 known-good images to avoid re-building on rollback:

# List OpenClaw images by date
docker images openclaw --format "{{.ID}} {{.CreatedAt}}" | sort -k2 -r

# Remove all but the 3 most recent (sorted by creation time, newest first)
docker images openclaw --format "{{.CreatedAt}}\t{{.ID}}" | sort -r | tail -n +4 | awk '{print $NF}' | xargs -r docker rmi

# Clean up dangling layers
docker image prune -f

Repo delta: The upstream repo does not provide GPG-signed releases or reproducible builds. Skills from clawhub have no signature verification. Until upstream addresses this, lockfiles + pre-update review are your best defense.
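
One caveat on the image-retention commands above: rebuilding under the same tag leaves the previous image untagged, so docker images openclaw no longer lists it and docker image prune removes it. Retention therefore only works if each known-good image gets its own tag before the rebuild. The sketch below shows the selection logic as a dry run; the helper name select_stale_images and the good-<date> tag scheme are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "CreatedAt<TAB>ID" lines on stdin and prints the
# IDs of everything except the newest KEEP distinct images (default 3).
select_stale_images() {
  local keep="${1:-3}"
  # Newest first, drop repeated IDs (one image can carry several tags),
  # then skip the first KEEP entries.
  sort -r | awk -F'\t' '!seen[$2]++ {print $2}' | tail -n "+$((keep + 1))"
}

# Before rebuilding, give the current image its own tag so it is not left
# dangling (the tag name is an assumption):
# docker tag openclaw:hetzner "openclaw:good-$(date +%F)"

# Dry run: list what the cleanup would remove, without removing anything:
# docker images openclaw --format '{{.CreatedAt}}\t{{.ID}}' | select_stale_images 3
```

Reviewing the dry-run output before piping it to docker rmi keeps a bad sort or an unexpected timestamp format from deleting the image you wanted to roll back to.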


13. Secrets at Rest

🧒 Child Lens: Cash in a locked drawer is protected as long as nobody picks the lock. But if someone breaks in, the drawer lock is the only thing left. A safe inside the drawer adds a layer, but only if the combination isn't taped underneath.

🔬 First Principles Lens: File permissions (chmod 600) protect against other local users. They don't protect against a compromised process running as your user, physical theft of an unencrypted disk, or unencrypted backups.

Decision framework

| Scenario | Recommendation |
| --- | --- |
| Hetzner VPS | LUKS at provisioning, file permissions tight, rotate keys quarterly |
| Always-on home server | Disk encryption OFF (accept risk for auto-boot), file permissions tight, rotate quarterly |
| Office/colo server | Full disk encryption ON, accept manual reboots |
| Belt-and-suspenders | Full disk + age/sops + rotation |

What to encrypt

  • LUKS (full disk): Protects against physical theft. Hetzner offers it at provisioning time. Use it.
  • File-level (age/sops): Diminishing returns. The decryption key must be accessible at runtime, which creates a circular dependency.
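
Since file permissions carry most of the weight here, it helps to check them mechanically rather than by eye. The following is a minimal sketch; the helper name is_private_file and the example .env path are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE grants no access to group/other.
# Tries GNU stat first, then the BSD/macOS form.
is_private_file() {
  local file="$1" mode
  mode=$(stat -c '%a' "$file" 2>/dev/null || stat -f '%Lp' "$file" 2>/dev/null) || return 1
  case "$mode" in
    ?00) return 0 ;;   # e.g. 600 or 400: owner-only
    *)   return 1 ;;
  esac
}

# Example (path is an assumption; point it at your real .env):
# is_private_file /home/deploy/openclaw/.env || echo "WARNING: .env is readable by other users"
```

This only confirms the permission bits. It says nothing about a compromised process running as the owning user, which is the gap the First Principles note describes.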

14. Image Provenance

🧒 Child Lens: If someone hands you a mystery box and says "trust me, it's the right thing", you'd want to at least check the label matches what you ordered.

🔬 First Principles Lens: If building locally, the source is the trust anchor. For pre-built images, verify the digest and scan for vulnerabilities.

# Pin images by digest in production
# image: openclaw@sha256:abc123...

# Scan for vulnerabilities
docker scout cves openclaw:hetzner
# or
trivy image openclaw:hetzner
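
A quick way to confirm that digest pinning is actually in place is to test each image reference against the name@sha256:<digest> form. This is an illustrative sketch; the helper name is_digest_pinned and the compose file name in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if an image reference is pinned by a
# full sha256 digest (name@sha256:<64 hex chars>).
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Example: flag compose image lines that are not pinned. The compose file
# name is an assumption.
# grep -E '^[[:space:]]*image:' docker-compose.yml | while read -r _ ref; do
#     is_digest_pinned "$ref" || echo "not pinned: $ref"
# done
```

The check is purely syntactic: it confirms a digest is present, not that the digest refers to the image you intended.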

15. Persistence Source of Truth

| Container Path | Host Path | Source |
| --- | --- | --- |
| /home/node/.openclaw | ${OPENCLAW_CONFIG_DIR} | Volume mount |
| /home/node/.openclaw/workspace | ${OPENCLAW_WORKSPACE_DIR} | Volume mount |
| /usr/local/bin/* | N/A | Image build |

Container can be recreated safely if and only if host volumes are intact.

Data integrity considerations:

  • Volume corruption: Docker volumes use the host filesystem β€” if the host disk corrupts, volumes corrupt too. This is why offsite backups (Section 10) are non-negotiable.
  • Upgrade migrations: Before any OpenClaw version upgrade, snapshot the volumes: tar -czf openclaw-pre-upgrade-$(date +%s).tar.gz /home/deploy/.openclaw/. If the new version changes data formats, you have a clean rollback point.
  • Ownership drift: If you rebuild the container with a different UID, volume permissions break. Always verify with docker compose exec openclaw-gateway id after rebuilds.
  • Never store state inside the container that isn't on a mounted volume. docker compose down && docker compose up -d must be a no-op for your data.
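
A pre-upgrade snapshot is only a rollback point if the archive can be read back. The sketch below wraps the tar command from the list above with a verification step; the helper name snapshot_dir and the destination directory in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: archive SRC into DEST_DIR, verify the archive is
# readable, and print its path. Prints nothing if either step fails.
snapshot_dir() {
  local src="$1" dest_dir="$2" archive
  archive="$dest_dir/openclaw-pre-upgrade-$(date +%s).tar.gz"
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  tar -tzf "$archive" > /dev/null || return 1   # listing fails on a corrupt archive
  printf '%s\n' "$archive"
}

# Example (source path follows the one used above; destination is an assumption):
# snapshot_dir /home/deploy/.openclaw /var/backups/openclaw
```

Listing the archive catches truncated or corrupt files, but it is not a substitute for an actual restore test on a fresh host.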

16. Accepted Risks: Document Everything

🧒 Child Lens: A to-do list isn't just for things you're going to do. It's also for things you've decided NOT to do, and why.

🔬 First Principles Lens: Security is a spectrum of trade-offs. Documenting accepted risks is not negligence; it's engineering. The dangerous position is having unexamined risks, not having documented ones.

Template

### [Risk Name]: [Current State]

**Risk:** What could go wrong.
**Why accepted:** Why this trade-off makes sense.
**Mitigations:** What you're doing instead.
**Revisit when:** Conditions that would change the decision.

Common accepted risks for OpenClaw deployments

  1. No GPG on updates: Can't verify commit authorship. Upstream gap. Bounded by pre-update review script.
  2. No age/sops file encryption: Circular dependency at runtime. Bounded by file permissions + token rotation.
  3. Full disk encryption off (home server variant): Physical theft exposes disk. Bounded by token rotation + location.

17. Common Pitfalls

  • Changing container gateway port away from 18789 in compose command
  • Using releases/latest binary URLs in production
  • Installing binaries manually inside a running container (lost on recreate)
  • Exposing gateway publicly without firewall + TLS + token
  • Leaving stale tokens unrotated after team/user changes
  • Scheduling backups without verifying they actually run
  • Using npm install instead of npm ci / pnpm install --frozen-lockfile
  • Piping secrets through Discord or Slack
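
Some of these pitfalls can be caught with a simple text check before a build. As one example, the sketch below scans a file for releases/latest download URLs; the helper name no_latest_urls and the Dockerfile name in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every line of FILE that references a
# "releases/latest" URL (with line numbers); succeed only if none are found.
no_latest_urls() {
  local file="$1"
  [ -r "$file" ] || return 2        # unreadable file: report as an error, not a pass
  if grep -n 'releases/latest' "$file"; then
    return 1                        # unpinned download found; matches printed above
  fi
  return 0
}

# Example (file name is an assumption):
# no_latest_urls Dockerfile || echo "Pin these downloads to a version and checksum"
```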

Security Scorecard

Use this after deployment. Every box should be checked or have a documented reason why not.

Baseline (do these or don't deploy)

  • Firewall enabled, default deny incoming
  • SSH key-only auth (password auth disabled)
  • .env exists with chmod 600
  • OPENCLAW_GATEWAY_TOKEN set (not empty)
  • Docker ports bound to 127.0.0.1
  • Persistent dirs owned by UID 1000, mode 700
  • security_opt: no-new-privileges set in compose
  • Docker socket NOT mounted into container

Supply Chain

  • Docker installed from signed apt repo (not curl | sh)
  • Skill binaries pinned to version + SHA256 checksum
  • Lockfiles present and used (--frozen-lockfile)
  • No releases/latest URLs in Dockerfile
  • .gitignore covers auth-profiles.json, *.env, secrets

Operations

  • Backup automation running and verified (check for real output!)
  • Backup retention policy configured
  • Recovery tested on a fresh VPS at least once
  • Health monitoring active (disk, process, memory, backup age)
  • Log rotation configured
  • Token rotation runbook documented
  • Quarterly rotation reminder set

Network

  • Gateway not exposed to public internet (SSH tunnel or Tailscale)
  • If TLS-terminated: certs auto-renew, HTTPS enforced, proxy ↔ gateway on loopback
  • Tailscale keys tracked with expiry dates (if used)

Documentation

  • Accepted risks documented with rationale
  • Provider rotation matrix filled in
  • Pre-update review script in place
  • This checklist reviewed on every major update

Verify Everything Works

Run this checklist after initial deployment or any major update. You can run it manually or use the automated script below.

# 1. Health endpoint responds
# Only report success if the probe actually succeeded
curl -sf -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" http://127.0.0.1:18789/ > /dev/null \
  && echo "✅ Gateway reachable" || echo "❌ Gateway unreachable"

# 2. Container runs as non-root
docker compose exec openclaw-gateway id
# Expect: uid=1000(node) gid=1000(node), NOT root

# 3. Firewall rules correct
ufw status verbose
# Expect: default deny incoming, SSH allowed, no 18789 open to public

# 4. Backup cycle works
/usr/local/bin/openclaw-backup.sh && echo "✅ Backup succeeded"
ls -lh /var/backups/openclaw/ | head -3

# 5. Channel connectivity (Discord, etc.)
# Send a test message through your configured channel and confirm delivery

# 6. Volumes have correct ownership
ls -la /home/deploy/.openclaw/
# Expect: owned by 1000:1000, mode 700

Automated Smoke Test

Save as ops/smoke-test.sh and run after every deployment or update:

#!/usr/bin/env bash
set -euo pipefail
source .env 2>/dev/null || true

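PASS=0; FAIL=0
# Run a check command quietly and tally the result.
# Plain assignments are used for the counters because ((PASS++)) returns a
# non-zero status when the old value is 0, which would abort under set -e.
check() {
  if eval "$2" >/dev/null 2>&1; then
    echo "✅ $1"; PASS=$((PASS + 1))
  else
    echo "❌ $1"; FAIL=$((FAIL + 1))
  fi
}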

check "Gateway health" \
  'curl -sf -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" http://127.0.0.1:18789/'

check "Non-root container" \
  '[ "$(docker compose exec -T openclaw-gateway id -u)" = "1000" ]'

check "UFW active" \
  'ufw status | grep -q "Status: active"'

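# docker inspect prints HostIp and HostPort on separate lines, so ask compose
# for the published address of container port 18789 directly.
check "18789 bound to localhost only" \
  'docker compose port openclaw-gateway 18789 2>/dev/null | grep -q "^127\.0\.0\.1:"'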

check "Volume ownership" \
  '[ "$(stat -c %u /home/deploy/.openclaw 2>/dev/null || stat -f %u /home/deploy/.openclaw)" = "1000" ]'

check "Log rotation configured" \
  'docker inspect $(docker compose ps -q openclaw-gateway) 2>/dev/null | grep -q max-size'

echo ""
echo "Results: $PASS passed, $FAIL failed"
[ "$FAIL" -eq 0 ] || exit 1

Make it executable and run it:

chmod +x ops/smoke-test.sh
./ops/smoke-test.sh

Troubleshooting

🧒 Child Lens: When the lights go out, you check the breaker box; you don't rewire the whole house. Start with the most obvious cause.

🔬 First Principles Lens: Most failures are configuration errors, not bugs. Logs tell you what happened; network tools tell you what's reachable. Check both before changing anything.

| Symptom | Diagnosis | Fix |
| --- | --- | --- |
| Container won't start | `docker compose logs -f openclaw-gateway` and look for the crash reason | Fix config/env, then `docker compose up -d` |
| Gateway unreachable | `ss -tlnp \| grep 18789` to see whether it is listening. Check UFW: `ufw status` | Verify bind address in .env, check `docker compose ps` |
| Auth failures (401) | Token mismatch between client and server | Compare $OPENCLAW_GATEWAY_TOKEN in .env vs what the client sends. Restart after changes: `docker compose up -d --force-recreate` |
| Backup failures | Check exit code: `echo $?` after a manual run. Common: disk full, wrong permissions | `df -h` for space, `ls -la /var/backups/openclaw/` for perms |
| Discord/channel disconnects | Gateway lost websocket connection | `docker compose restart openclaw-gateway`, then check logs for rate limits |
| OOM killed | `dmesg \| grep -i oom` | Increase mem_limit in compose or reduce workload |
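
When chasing a 401, comparing tokens by eye tends to leak them into terminals and chat. A short fingerprint lets you compare both sides without displaying the secret. This is a sketch; the helper name token_fingerprint is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a short SHA-256 fingerprint of a token so two
# copies can be compared without displaying the secret itself.
token_fingerprint() {
  if command -v sha256sum > /dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -c1-12
  else
    printf '%s' "$1" | shasum -a 256 | cut -c1-12   # macOS/BSD fallback
  fi
}

# Example: run on the server and on the client, then compare the output.
# token_fingerprint "$OPENCLAW_GATEWAY_TOKEN"
```

Matching fingerprints mean both sides hold the same token; a mismatch often comes down to a stray newline or quote picked up when the token was pasted.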

Philosophy

Security isn't a destination. It's a practice β€” like brushing your teeth. You do it regularly, you do it honestly, and you document what you skip and why.

The three most dangerous words in security are "it should work." Check. Verify. Test the backup by restoring from it. Run the health monitor and see if it actually alerts. Try to read your own secrets from a different user account.

Trust, but verify. Then verify again.


Based on Brad Barbin's Hetzner deployment gist and a real security audit conducted February 2026. All scripts referenced here live in the ops/ directory of the OpenClaw workspace.
