Jack Peterson 2026-02-17 Version 1.0
Agentic development introduces unprecedented speed and creative leverage into modern software engineering. AI agents can now generate infrastructure, refactor codebases, scaffold applications, and accelerate experimentation at a pace previously unattainable. However, without clearly defined boundaries, the same autonomy that enables rapid progress can introduce significant operational, security, and governance risk.
This document defines a guardrail-based approach to agentic development. It establishes explicit boundaries between experimentation and production, clarifies where human authority must remain non-delegable, and outlines progressive controls that evolve as projects mature from prototype to deployed service. Particular emphasis is placed on infrastructure control, credential isolation, standardized architecture, enforceable gates, and non-negotiable testing requirements.
The objective is not to limit engineering creativity, but to maximize it safely. By separating exploratory velocity from production responsibility and by retaining human approval over irreversible actions, organizations can safely harness AI-powered engineering acceleration while preserving reliability, accountability, and long-term system integrity.
Agentic development introduces a spectrum of execution models:
- Full Delegation – Tasks are handed entirely to an AI agent, which plans, generates, modifies, and potentially deploys code with minimal human intervention.
- Collaborative Orchestration – AI agents generate proposals and implementations, but a human reviews changes at a granular level and authorizes movement between stages.
Both modes are powerful. Both modes can also royally screw up everything you're working hard to improve, in unpredictable ways and when you least expect it.
The failure point is not AI capability or even the model(s). The failure occurs when AI-powered tools are granted unfettered access to resources beyond reasonable boundaries and make a mistake, whether through direct guidance, automatic trust of commands, or automagically as a side effect.
AI agents are excellent at:
- Iteration
- Pattern recognition
- Refactoring
- Scaffolding
- Documentation drafting
- Test generation
They are not accountable for:
- Production outages
- Regulatory exposure
- Data loss
- Infrastructure misconfiguration
- Vendor lock-in decisions
Humans own the consequences because AI agents lack moral and ethical agency.
Therefore, mature agentic development defines explicit lines where autonomy stops.
Infrastructure-level changes must always require explicit human approval.
This includes:
- Destructive production database schema changes
- Destructive data operations
- Infrastructure provisioning or teardown
- IAM / permission changes
- Deployment pipeline configuration changes
- Network and security boundary changes
AI may:
- Generate Terraform modules
- Refactor infrastructure-as-code
- Analyze plan output
- Suggest cost optimizations
- Run `terraform validate`
- Run `terraform plan`
AI must not be allowed to (without appropriate approval / review):
- Execute destructive state changes
- Modify production state without human review
- Run `terraform apply`
- Run `terraform destroy`
The distinction is simple:
- AI may simulate.
- AI may propose.
Only humans should execute irreversible infrastructure changes in order to mitigate blast radius.
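This simulate/propose boundary can be enforced mechanically rather than behaviorally. Below is a minimal sketch of an allowlist wrapper, assuming the agent is configured to invoke `guard_terraform` instead of the real `terraform` binary; the function name and the exact subcommand lists are illustrative assumptions, not a prescribed tool.

```shell
# Read-only terraform subcommands pass through; anything that can change
# state is denied by default and must go through the human-approved pipeline.
guard_terraform() {
  case "$1" in
    validate|plan|fmt|show|output)
      command terraform "$@"    # simulate / propose: allowed
      ;;
    *)
      # apply, destroy, import, state, taint, ... all land here
      echo "BLOCKED: 'terraform $1' requires human approval via the pipeline" >&2
      return 1
      ;;
  esac
}
```

Deny-by-default matters here: a new or unrecognized subcommand is blocked until a human explicitly allowlists it.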
One subtle but critical risk emerges with operations engineers.
An experienced DevOps or platform engineer often has:
- Administrator-level cloud credentials
- Production database access
- CI/CD pipeline override permissions
- State backend access
If such an engineer uses AI tools in a local environment where those credentials are active, an AI agent may technically have the authority to execute destructive commands — even if unintentionally.
This creates a dangerous asymmetry:
The AI does not understand the power of the credentials it is operating under. Because the agent runs as a process with the same level of access as the user at the keyboard, any credentials on the machine (e.g., ~/.aws/credentials) may as well be assumed handed directly to that process.
Therefore:
- High-privilege credentials must be isolated from wherever agentic processes run.
- AI agents should operate with scoped, least-privilege credentials.
- Production credentials should never be exposed to autonomous agents.
- Infrastructure execution should occur only through controlled pipelines with explicit approval gates or manually by Ops Engineers assisting with an experimental project.
Ops engineers must exercise additional discipline (such as removing, re-keying, or swapping credentials from the computer) when working with AI agents.
The more powerful your credentials, the more intentional your risk mitigation boundaries must be.
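As a small concrete illustration of that discipline: a child process launched with a scrubbed environment does not inherit credential variables exported in the parent shell. (Credential files such as ~/.aws/credentials still need to be moved, re-keyed, or pointed elsewhere separately; the variable value below is a fake placeholder.)

```shell
# Pretend an admin credential is exported in the engineer's shell.
export AWS_ACCESS_KEY_ID="AKIAFAKEADMINKEY"

# env -i starts the child with an empty environment, so the credential
# variable never reaches a process launched this way.
leaked=$(env -i sh -c 'printf "%s" "${AWS_ACCESS_KEY_ID:-unset}"')
echo "$leaked"   # prints: unset
```

Launching agent tooling through this kind of scrubbed environment, with only a scoped least-privilege profile explicitly re-added, is one way to make the isolation the default rather than an act of memory.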
In 2025, an AI coding agent deleted a company’s production database during a code freeze, later describing its actions as a “catastrophic error in judgment.”
The issue was not that AI made a mistake. Spend any amount of time with AI tools and it becomes readily apparent that they make all kinds of decisions; some of those decisions happen to be useful.
The issue was that the corporate system architecture allowed it to.
If an agent can:
- Modify production state by itself
- Ignore instructions (it will)
- Perform destructive operations
Then governance is missing.
Guardrails are architectural decisions, not behavioral hopes.
Agentic systems empower two primary profiles.

The experienced engineer:
- Deep language fluency
- Architectural understanding
- Makes surgical or broad changes, and can fully evaluate and understand what the proposed changes mean
- Uses AI as a multiplier

The ambitious innovator:
- Strong ideas
- Limited framework or infrastructure expertise
- Rapid prototyper
- AI-heavy workflow
Both are valuable. We must maximize creative experimentation.
But experimentation must not compromise production integrity.
- Creativity should be frictionless.
- Production-impacting modifications must be predictable.
Ambitious innovators often move fast by:
- Experimenting and deploying applications using multiple vendors
- Trying new managed services
- Testing SaaS integrations
- Exploring alternative cloud products
This is healthy during feasibility testing.
However, once a project moves beyond proof-of-concept and demonstrates viability, it must be brought into standardized company infrastructure.
That means migrating experimental projects to:
- Approved cloud providers
- Approved networking models
- Centralized IAM
- Observability integration
- Cost monitoring
- Backup and recovery policies
- Compliance alignment
- SDLC adherence
Guardrails and policies should be implemented to the degree that the business impact of a service justifies them.
Innovation can begin anywhere. “Production” must live within company standards.
Agentic development should accelerate experimentation, not lead to fragmented, ungovernable infrastructure.
Versioned artifacts become the north star.
Usable software ecosystems standardize quality enforcement tools:
- Unit testing frameworks
- Integration testing
- Static analysis
- Linting
- Type systems
- Build validation mechanisms
Architectural integrity requires:
- Domain-driven design (where appropriate)
- Clear separation of business and technical concerns
- Encapsulation and externalization of business logic
- Explicit dependency boundaries
- Clear deployment models
But tools alone are not enough. AI agents require explicit steering from accountable stakeholders.
That steering comes from:
- Architectural decision records (ADRs)
- Coding standards
- Security policies
- Infrastructure constraints
- Approved vendor lists
- Deployment rules
- Directory conventions
- Service boundary documentation
Formal documentation, coherent tests, and enforceable rules ultimately become the steering wheel for software projects.
Without them, agents improvise. With them, agents (generally) align to intent.
If documentation is vague, outdated, or inaccessible, the agent will fill gaps with probabilistic guesses based on public patterns, not the company's standards.
Machine-readable steering documents should be:
- Version-controlled
- Stored in-repo
- Accessible to agents
- Structured clearly
- Updated as architecture evolves
Well-defined documentation transforms AI from improviser to compliant contributor.
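Keeping steering documents version-controlled and in-repo also makes their presence enforceable. A lightweight sketch of a CI step that fails when they are missing, with the function name and document paths as illustrative assumptions rather than a prescribed layout:

```shell
# Returns non-zero if any required steering document is absent from the
# repository; the list of documents is supplied by the caller.
check_steering_docs() {
  missing=0
  for doc in "$@"; do
    if [ ! -e "$doc" ]; then
      echo "MISSING steering document: $doc" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example CI invocation (paths are hypothetical):
# check_steering_docs docs/adr AGENTS.md CODING_STANDARDS.md
```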
Rigidity must increase as risk increases.

Experimental / prototype stage:
- Lightweight linting
- Minimal required unit tests
- Fast iteration allowed
- Warnings instead of hard failures (where appropriate)

Pre-production stage:
- Compilation required
- Unit tests mandatory for new features
- Integration tests where relevant
- Style and static analysis enforced

Production stage:
- All gates pass
- Security scanning required
- Observability integrated
- Infrastructure changes human-approved
- Deployment sign-off required
Progressive hardening allows for maximum creativity early and stability later. Nobody has a perfect memory of what they wrote six months ago; gates protect both people and agentically generated code from resource-wasting (and resource-destroying) mistakes.
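The staged gates above can be sketched as a single dispatch, where the stage names and the checks behind them stand in for a project's real toolchain:

```shell
# Prints the checks enforced at each maturity stage; unknown stages are
# rejected rather than silently passed through.
run_gates() {
  case "$1" in
    prototype)
      echo "lint (warn only), minimal unit tests" ;;
    pre-production)
      echo "compile, unit tests, integration tests, static analysis" ;;
    production)
      echo "all gates, security scan, observability, human sign-off" ;;
    *)
      echo "unknown stage: $1" >&2
      return 1 ;;
  esac
}
```

Making the stage an explicit, checked input (rather than an unstated assumption) is what keeps "prototype-level" laxity from quietly leaking into production.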
Regardless of stage:
- Code must compile.
- Basic unit tests must exist (but may not be comprehensive).
- A defined testing process must be in place.
Testing may be run by:
- AI agents
- CI/CD pipelines
- Human QA
- Hybrid review
But it must exist.
An untested agent is guessing. A tested agent is iterating within constraints.
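A minimal sketch of that non-negotiable baseline, where the build and test commands are placeholders supplied by the project and the function name is an assumption:

```shell
# Runs the project's build and test commands in order; fails fast if either
# one fails, regardless of whether an agent, a pipeline, or a human triggered it.
baseline_gate() {
  build_cmd="$1"
  test_cmd="$2"
  sh -c "$build_cmd" || { echo "baseline gate: build failed" >&2; return 1; }
  sh -c "$test_cmd"  || { echo "baseline gate: tests failed" >&2; return 1; }
  echo "baseline gate passed"
}

# Usage (commands are placeholders):
# baseline_gate "make build" "make test"
```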
But beware: the agent (or human) may delete those very guardrails in the same repository it’s working in. That in-repo policy change is just a file modification like any other.
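One mitigation, sketched under the assumption of a git workflow and with illustrative guardrail paths, is a CI step that lists guardrail files changed relative to a protected base ref, so any modification to the policy itself is routed to mandatory human review:

```shell
# Lists guardrail files modified relative to the given base ref; a non-empty
# result should trigger mandatory human review in CI. The guarded paths
# (AGENTS.md, docs/adr) are illustrative assumptions.
guardrail_changes() {
  base_ref="$1"
  git diff --name-only "$base_ref" -- AGENTS.md docs/adr
}

# Example CI usage:
# if [ -n "$(guardrail_changes origin/main)" ]; then
#   echo "guardrail files changed; human approval required" >&2
#   exit 1
# fi
```

Branch-protection or code-owner rules on these paths accomplish the same thing at the hosting layer; the point is that the check lives outside the file the agent can edit.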
An effective guardrail posture means:
- Production service monitoring
- Automated application rollback capability
- Properly scoped permissions
- Human escalation paths
- Explicit infrastructure execution control
AI may:
- Propose
- Draft
- Refactor
- Validate
- Simulate
AI may not:
- Execute irreversible production changes
- Modify infrastructure state autonomously
- Bypass approval gates
Humans retain final authority over:
- Production data
- Infrastructure state
- Security posture
- Cost exposure
- Compliance impact
- Experiment boldly.
- Standardize deliberately.
- Authorize infrastructure changes explicitly.
- Never delegate irreversible responsibility to a system that does not bear the consequences.