Title: Feat/add back dokploy preview envs Author: Elroy Fernandes Size: +1397 / -453 across 11 files, 55 commits
This PR restores Dokploy-based preview environments for PRs. Each PR gets an isolated stack (API, dashboard, Postgres, Redis, LocalStack, Inngest) deployed to a worker server, with Traefik routing from a manager node. It includes dynamic port allocation for multi-PR coexistence, a stale preview cleanup cron, and a rewritten lambda deployer that uses the real deploy-lambda.sh scripts.
preview-deploy.ymluseshttps://dokploy.dev.obvious.aipreview-cleanup-stale.ymluseshttps://app.dokploy.com- These are almost certainly different servers. The cleanup will either fail or delete resources on the wrong instance.
- PRs 7045 and 8045 would both get port offset 45, causing conflicts if deployed to the same worker.
- With an active monorepo this is a real risk. Consider using
PR_NUMBER % 10000or a registry-based allocation.
preview-deploy.ymlline ~771: The "Verify API health" step exits 0 even when health checks fail after 5 minutes. This means the PR comment will show "deployed" even if the environment is broken. At minimum this should be a warning annotation, and arguably a hard failure.
- Line 1076:
RUN bun install— the AGENTS.md lockfile rule says "Never run barebun installin a worktree." While this is Docker, the same risk applies: the lockfile could produce different resolutions. Should bebun install --frozen-lockfile.
- Using
:latestmeans builds are non-reproducible and could break without code changes. - This is a Pro image requiring
LOCALSTACK_AUTH_TOKEN. If the token expires or is misconfigured, the entire preview stack fails with no clear error message. Should pin to a specific version (e.g.,localstack/localstack-pro:3.4).
- The stale cleanup workflow assumes
createdAtresets on redeploy: "if someone pushes a commit, preview gets redeployed and createdAt resets". Verify this with Dokploy's API — many systems only setcreatedAtonce and useupdatedAtfor subsequent changes.
- The Traefik YAML config in the deploy workflow (~lines 697-733) is assembled via
jqstring concatenation. This is fragile — a misplaced\nor changed variable can silently produce invalid YAML. Consider using a template file instead.
- Line 442: This will collapse the substituted env vars, which is fine for parsing but could cause confusion when debugging. More importantly, if
__DATABASE_URL_LINE__etc. are intentionally empty (no migrations case), the blank line removal is needed — but it's subtle. Worth a comment.
- In the cleanup job, the "Remove Traefik routing config" step runs unconditionally, which is fine. But the compose deletion step silently passes if the compose wasn't found ("may have been manually deleted"). If cleanup was already done, the Traefik cleanup is redundant but harmless — just noting the asymmetry.
- Lines 1637 and 1662:
--debugwas removed fromdocker buildcommands. This is intentional for cleaner output but worth confirming — debug output can be valuable when investigating build failures in CI.
11. The PR has 55 commits with many "fix:" incremental iterations (fix dockerfile, fix lambda builds, fix dokploy, etc.). This should be squashed before merge for a clean history.
12. scripts/deploy-lambdas-localstack.sh switches from #!/bin/sh to #!/bin/bash and uses bash-specific features (arrays, ${BASH_SOURCE[0]}). This is fine since the Dockerfile uses bash, but note the compatibility change.
13. The User-Agent: Mozilla/5.0 header on Dokploy API calls is unusual — is this required to bypass some WAF/rate limit? Worth documenting why.
14. LAMBDA_RUNTIME is hardcoded to nodejs22.x in deploy-lambda.sh (line 479) but was nodejs20.x in the old inline script. Confirm all functions are compatible with Node 22.
15. The env template removes INNGEST_EVENT_KEY and INNGEST_SIGNING_KEY from project var references, and hardcodes them empty in the compose file. The reasoning (Dokploy project vars override defaults) is sound and well-documented in comments.
- Branch name validation against injection — properly sanitized with regex
- JSON payload construction via
jq --argto prevent injection via branch names - Concurrency groups on the deploy workflow — prevents parallel deploys for the same PR
- Migration detection — smart differentiation between branches that need local Postgres vs. shared staging DB
- Container Lambda deferral — deploying memory-intensive ECR pushes last to avoid destabilizing LocalStack
- Retry logic in lambda deployer with health checks between deploys
- Pre-pulling Lambda runtime image to avoid cold start timeouts
- NODE_ENV build-time vs runtime separation — well-documented tradeoff
Not ready to merge as-is. The DOKPLOY_URL mismatch (#1) is a clear bug. The port collision risk (#2) and unpinned Pro image (#5) should be addressed. The rest are manageable follow-ups. After fixing the critical items, squash the 55 commits and this should be good to go.