The traditional developer experience is built around dashboards, documentation, and manual workflows -- the developer is the executor. In an agentic paradigm, the developer becomes the architect: defining intent, reviewing outcomes, and steering autonomous agents that do the building. Three forces make this inevitable:
- Context windows are now large enough to hold entire subsystems, making multi-file reasoning feasible.
- Tool-use and function-calling have matured -- models can reliably invoke external APIs, run terminals, edit files, and iterate against feedback loops (tests, linters, type-checkers).
- The cost curve is collapsing -- separating planning (expensive, high-intelligence model) from execution (fast, cheap model) makes agentic workflows economically viable at scale.
Cursor is the first IDE to treat the agent as a first-class citizen rather than a bolt-on copilot. Its architecture -- the agent harness (user messages + tools + instructions) -- is explicitly designed so that as frontier models improve, developers get better results without changing their workflow.
```mermaid
graph TB
    subgraph layer1 [Layer 1: Context Engineering]
        Rules["Rules (.cursor/rules/)"]
        AgentsMD["AGENTS.md (nested)"]
        TeamRules["Team Rules (Dashboard)"]
        UserRules["User Rules (Global)"]
    end
    subgraph layer2 [Layer 2: Dynamic Knowledge]
        Skills["Agent Skills (.cursor/skills/)"]
        MCP["MCP Servers (mcp.json)"]
        SemanticIdx["Semantic Search Index"]
    end
    subgraph layer3 [Layer 3: Agent Modes]
        PlanMode["Plan Mode (Shift+Tab)"]
        AgentMode["Agent Mode (Default)"]
        AskMode["Ask Mode (Read-only)"]
        DebugMode["Debug Mode"]
    end
    subgraph layer4 [Layer 4: Automation and Hooks]
        Hooks["Hooks (hooks.json)"]
        Commands["Commands (.cursor/commands/)"]
        StopHook["Stop Hook (Grind Loop)"]
    end
    subgraph layer5 [Layer 5: Parallelism and Scale]
        Worktrees["Git Worktrees"]
        BestOfN["Best-of-N (Multi-model)"]
        CloudAgents["Cloud Agents"]
    end
    subgraph layer6 [Layer 6: Distribution]
        Plugins["Plugins (.cursor-plugin/)"]
        Marketplace["Cursor Marketplace"]
    end
    layer1 --> layer2
    layer2 --> layer3
    layer3 --> layer4
    layer4 --> layer5
    layer5 --> layer6
```
Context is the single most important lever. Without it, agents hallucinate. With it, they build with precision.
Project Rules (.cursor/rules/*.mdc) -- Persistent, version-controlled instructions scoped by glob patterns. Four types:
- **Always Apply** -- every session (e.g., coding standards)
- **Apply Intelligently** -- agent-decided based on description
- **Apply to Specific Files** -- glob-matched (e.g., `**/*.tsx`)
- **Apply Manually** -- invoked via `@my-rule`
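A glob-scoped rule of the third type might look like this (a minimal sketch: the frontmatter fields follow Cursor's `.mdc` format, and the specific standards are invented for illustration):

```markdown
---
description: React component conventions
globs: ["**/*.tsx"]
alwaysApply: false
---

- Props interface at the top of the file, named exports only
- Use Tailwind utility classes; no inline style objects
- See @components/Button.tsx for the canonical component pattern
```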
Nested AGENTS.md -- Simpler alternative. Place in project root and subdirectories for directory-scoped instructions. More specific files take precedence.
```text
project/
  AGENTS.md              # Global: "Use TypeScript, follow repo pattern"
  frontend/
    AGENTS.md            # "Use Tailwind, Framer Motion for animations"
    components/
      AGENTS.md          # "Props interface at top, named exports"
  backend/
    AGENTS.md            # "Use zod validation, export types from schemas"
```
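As a concrete (invented) example, the `frontend/AGENTS.md` above might expand into:

```markdown
# Frontend

- Use Tailwind utility classes for styling; Framer Motion for animations
- One component per file under components/; named exports only
- Before finishing a task, run the frontend test suite (see the root
  AGENTS.md for the exact command)
```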
Team Rules -- Centrally managed from the Cursor Dashboard. Can be enforced (cannot be disabled by team members) for compliance.
Best Practice: Rules should be concise (<500 lines), reference files with @filename.ts instead of copying content, and be added only when the agent makes the same mistake repeatedly.
This layer gives agents on-demand capabilities beyond what's in the codebase. Skills teach agents how to do things; MCP gives them access to things.
Agent Skills (.cursor/skills/ or .agents/skills/) -- Portable, version-controlled packages with SKILL.md files. Unlike Rules (always loaded), Skills are loaded progressively, when the agent determines they are relevant. Skills can include:
- `scripts/` -- executable code the agent can run
- `references/` -- additional docs loaded on demand
- `assets/` -- templates, config files
Skill 1: Troubleshoot CI Pipeline Failures
```text
.cursor/skills/
  troubleshoot-ci/
    SKILL.md
    scripts/
      fetch-ci-logs.sh
      parse-failures.py
    references/
      CI_RUNBOOK.md
    assets/
      known-flaky-tests.json
```
SKILL.md contents:
```markdown
---
name: troubleshoot-ci
description: Diagnose and fix CI pipeline failures. Use when a CI/CD
  pipeline has failed, tests are flaky, or builds are broken in GitLab
  CI, GitHub Actions, or similar systems.
---

# Troubleshoot CI Pipeline

## When to Use
- A CI pipeline has failed and the developer asks "why did CI fail?"
- Flaky tests are blocking merges
- Pipeline timeouts or resource exhaustion

## Instructions
1. **Gather evidence**: Run `scripts/fetch-ci-logs.sh <pipeline-id>`
   to pull the latest failed job logs. If using GitLab MCP, call the
   pipeline jobs API to get logs directly.
2. **Classify the failure** into one of:
   - Compilation/build error -- read the error output, find the file
     and line, propose a fix
   - Test failure -- check `assets/known-flaky-tests.json` first; if
     the test is known-flaky, suggest a re-run; otherwise diagnose
   - Infrastructure failure (OOM, timeout, runner issue) -- check
     resource limits in the CI config and suggest increases
   - Dependency failure -- check lock files, registry availability
3. **Cross-reference** with `references/CI_RUNBOOK.md` for
   team-specific remediation steps (e.g., how to retrigger, who to page).
4. **Fix and verify**: Apply the fix, then instruct the developer to
   push and monitor the next pipeline run.
5. If the root cause is a flaky test, update
   `assets/known-flaky-tests.json` with the test name and date.
```

Skill 2: Debug Build Failures
```text
.cursor/skills/
  debug-build/
    SKILL.md
    scripts/
      clean-build.sh
      dependency-tree.sh
    references/
      BUILD_PATTERNS.md
```
SKILL.md contents:
```markdown
---
name: debug-build
description: Diagnose and resolve build failures including compilation
  errors, dependency conflicts, and configuration issues. Use when
  builds fail locally or in CI, or when dependency resolution breaks.
---

# Debug Build Failures

## When to Use
- Local or CI build fails with compilation errors
- Dependency version conflicts or resolution failures
- Webpack/Vite/esbuild/tsc errors after package upgrades
- Docker image build failures

## Instructions
1. **Reproduce locally**: Run the project's build command (check
   AGENTS.md or .cursor/rules for the correct command). Capture the
   full error output.
2. **Parse the error**:
   - TypeScript: Look for the TS error code (e.g., TS2345), find the
     file:line, read surrounding context, and fix the type mismatch
   - Docker: Identify the failing layer, check if it is a missing
     dependency, wrong base image, or COPY path issue
   - Native/compiled: Check compiler version, missing headers, or
     linker errors
3. **Dependency conflicts**: Run `scripts/dependency-tree.sh` to
   visualize the dependency graph. Look for duplicate or incompatible
   versions. Consult `references/BUILD_PATTERNS.md` for team-approved
   resolution strategies (e.g., `overrides`, `resolutions`).
4. **Clean build**: If the error is a stale cache, run
   `scripts/clean-build.sh`, which removes node_modules, dist, .next,
   __pycache__, and similar artifacts, then rebuilds.
5. After fixing, run the build again to verify. If in CI, push and
   monitor.
```

Skill 3: Troubleshoot CD / Deployment Failures
```text
.cursor/skills/
  troubleshoot-cd/
    SKILL.md
    scripts/
      deploy.sh
      rollback.sh
      validate-deploy.py
    references/
      DEPLOYMENT_RUNBOOK.md
      INFRA_TOPOLOGY.md
    assets/
      env-config-template.json
```
SKILL.md contents:
```markdown
---
name: troubleshoot-cd
description: Diagnose and resolve deployment failures across staging
  and production environments. Use when deployments fail, health checks
  don't pass, or rollbacks are needed. Covers AWS ECS/EKS, Kubernetes,
  and Terraform-based deployments.
---

# Troubleshoot CD / Deployment

## When to Use
- A deployment to staging or production has failed
- Health checks are failing after deploy
- Terraform plan/apply errors
- Need to perform an emergency rollback

## Instructions
1. **Identify the deployment target**: Read
   `references/INFRA_TOPOLOGY.md` to understand the environment
   topology (which services, which regions, which orchestrator).
2. **Gather deployment logs**: Use the AWS MCP or GitLab MCP to pull
   deployment logs. For ECS, check task stopped reasons. For K8s,
   check pod events and container logs.
3. **Classify the failure**:
   - Image pull failure -- check ECR/registry permissions, image tag
   - Health check failure -- verify the health endpoint, check env
     vars against `assets/env-config-template.json`
   - Terraform error -- read the plan diff, check for state drift or
     resource conflicts
   - Permission/IAM error -- check the role trust policy and attached
     policies via AWS MCP
4. **Rollback if needed**: Run `scripts/rollback.sh <environment>` to
   revert to the last known-good deployment. Follow the rollback
   procedure in `references/DEPLOYMENT_RUNBOOK.md`.
5. **Fix forward**: Once the root cause is identified, apply the fix,
   run `scripts/validate-deploy.py <environment>` for pre-flight checks,
   then deploy with `scripts/deploy.sh <environment>`.
```

Skill 4: Investigate Production Incidents with Observability Data
```text
.cursor/skills/
  investigate-incident/
    SKILL.md
    scripts/
      fetch-metrics.sh
      correlate-events.py
    references/
      INCIDENT_PLAYBOOK.md
      SERVICE_DEPENDENCIES.md
```
SKILL.md contents:
```markdown
---
name: investigate-incident
description: Investigate production incidents by correlating logs,
  metrics, and traces from Datadog and Splunk. Use when there is a
  production alert, elevated error rate, latency spike, or
  customer-reported issue.
---

# Investigate Production Incident

## When to Use
- PagerDuty/Datadog alert fires
- Error rate or latency spikes in a service
- Customer reports an issue that needs root cause analysis

## Instructions
1. **Establish timeline**: Ask the developer for the approximate start
   time. Use the Datadog MCP to query metrics for the affected service
   over that window (error rate, p99 latency, CPU/memory).
2. **Pull logs**: Use the Splunk MCP (or Datadog Logs MCP) to search
   for error-level logs in the affected service within the time
   window. Look for stack traces, error codes, and upstream failures.
3. **Trace the request path**: Consult
   `references/SERVICE_DEPENDENCIES.md` for the service dependency
   graph. Check upstream and downstream services for correlated
   failures.
4. **Correlate with deployments**: Use the GitLab MCP to check whether
   any deployment happened just before the incident start time. If yes,
   this is likely a regression -- switch to the `troubleshoot-cd`
   skill for rollback.
5. **Propose a fix or mitigation**: Based on the evidence, either:
   - Propose a code fix (with the file and line identified from logs)
   - Suggest a config change (feature flag, env var, scaling)
   - Recommend a rollback with `scripts/rollback.sh`
6. Follow `references/INCIDENT_PLAYBOOK.md` for post-incident steps
   (write-up, timeline, action items).
```

Skill 5: Debug Docker and Container Issues
```text
.cursor/skills/
  debug-containers/
    SKILL.md
    scripts/
      inspect-container.sh
      check-resources.sh
    references/
      DOCKER_PATTERNS.md
```
SKILL.md contents:
```markdown
---
name: debug-containers
description: Debug Docker build failures, container runtime issues,
  and orchestration problems in ECS/EKS/K8s. Use when containers
  crash, fail to start, or exhibit resource issues.
---

# Debug Docker and Container Issues

## When to Use
- Dockerfile build fails
- Container exits with a non-zero code (CrashLoopBackOff in K8s)
- OOMKilled or resource limit issues
- Networking or service discovery failures

## Instructions
1. **Build failures**: Read the Dockerfile and the build output.
   Common issues: missing build args, wrong base image architecture
   (amd64 vs arm64), COPY paths that don't exist in the build context.
2. **Runtime crashes**: Run `scripts/inspect-container.sh <container>`
   to get the last 100 lines of logs and the exit code. Check that the
   entrypoint is correct and env vars are set.
3. **Resource issues**: Run `scripts/check-resources.sh` to compare
   configured limits vs actual usage. If OOMKilled, recommend
   increasing memory limits or investigating memory leaks.
4. **Networking**: Check that the container is listening on the
   expected port, security groups allow traffic, and service discovery
   (DNS/envoy/service mesh) is configured correctly.
5. Consult `references/DOCKER_PATTERNS.md` for team-standard
   Dockerfile patterns (multi-stage builds, layer caching, .dockerignore).
```

MCP Servers (.cursor/mcp.json) -- Connect the agent to external systems. More than 100 integrations are available. Three transport types:
- `stdio` -- local processes (single user)
- `SSE` and `Streamable HTTP` -- remote servers (multi-user, OAuth)
Recommended MCP Stack for a Production DevEx Platform:
```mermaid
graph LR
    subgraph vcs [Version Control]
        GitLab["GitLab MCP"]
        GitHub["GitHub MCP"]
    end
    subgraph cloud [Cloud Infrastructure]
        AWS["AWS MCP"]
        AWSBilling["AWS Billing MCP"]
        AWSDocs["AWS Docs MCP"]
    end
    subgraph observability [Observability]
        Datadog["Datadog MCP"]
        Splunk["Splunk MCP"]
        Sentry["Sentry MCP"]
    end
    subgraph pm [Project Management]
        Linear["Linear MCP"]
        Slack["Slack MCP"]
    end
    subgraph infra [Infrastructure]
        Terraform["Terraform MCP"]
        Docker["Docker MCP"]
    end
    Agent["Cursor Agent"] --> vcs
    Agent --> cloud
    Agent --> observability
    Agent --> pm
    Agent --> infra
```
GitLab MCP -- DevSecOps platform integration. Gives the agent access to merge requests, pipelines, CI job logs, issues, and repository data. Essential for teams using GitLab CI/CD.
```json
{
  "mcpServers": {
    "gitlab": {
      "url": "https://your-gitlab-instance.com/api/v4/mcp"
    }
  }
}
```

Use cases: "Why did the pipeline fail on MR !456?", "Show me the diff for the last merge to main", "Create an MR with these changes"
AWS MCP -- Access AWS services through natural language. Covers EC2, ECS, Lambda, S3, IAM, CloudFormation, and more. Combine with the AWS Billing MCP for cost analysis and AWS Documentation MCP for up-to-date service docs.
```json
{
  "mcpServers": {
    "aws": {
      "command": "uvx",
      "args": ["mcp-proxy-for-aws@latest",
               "https://aws-mcp.us-east-1.api.aws/mcp",
               "--metadata", "AWS_REGION=us-west-2"]
    },
    "aws-docs": {
      "command": "uvx",
      "args": ["awslabs.aws-documentation-mcp-server@latest"]
    },
    "aws-billing": {
      "command": "uvx",
      "args": ["awslabs.billing-cost-management-mcp-server@latest"]
    }
  }
}
```

Use cases: "Check why ECS tasks are failing in staging", "What IAM permissions does this Lambda need?", "How much did us-east-1 EC2 cost last month?", "Show me the AWS docs for ECS task networking"
Datadog MCP -- Query metrics, logs, traces, and monitors. The agent can investigate latency spikes, error rate increases, and infrastructure alerts without leaving the IDE.
```json
{
  "mcpServers": {
    "datadog": {
      "url": "https://mcp.datadoghq.com/mcp",
      "headers": {
        "DD-API-KEY": "${env:DD_API_KEY}",
        "DD-APPLICATION-KEY": "${env:DD_APP_KEY}"
      }
    }
  }
}
```

Use cases: "Show me p99 latency for the payments service over the last hour", "Pull error logs from the order-service for the last 30 minutes", "What monitors are currently alerting?"
Splunk MCP -- Search logs, run SPL queries, and analyze security events. Particularly valuable for teams using Splunk for centralized log management or SIEM.
```json
{
  "mcpServers": {
    "splunk": {
      "command": "npx",
      "args": ["mcp-remote", "https://your-splunk.example.com/mcp"],
      "env": {
        "SPLUNK_TOKEN": "${env:SPLUNK_TOKEN}"
      }
    }
  }
}
```

Use cases: "Search Splunk for 500 errors in the auth-service in the last 2 hours", "Run an SPL query for failed login attempts by IP", "Correlate this error with recent deployment events"
How Skills and MCP Work Together:
The key design insight is that MCP provides deterministic tool integration (API calls, data fetching), while Skills provide adaptive context and workflows (domain knowledge, multi-step procedures). They are complementary:
- The `troubleshoot-ci` skill tells the agent *how* to diagnose a CI failure (the methodology, the classification, the team-specific runbook)
- The GitLab MCP gives the agent *access* to the actual pipeline logs and job data
- The `investigate-incident` skill teaches the agent how to correlate signals across services
- The Datadog and Splunk MCPs give the agent access to the actual metrics and logs
Without Skills, the agent has tools but no methodology. Without MCP, the agent has methodology but no data. Together, they create agents that can reason about production systems the way a senior engineer would.
| Mode | When to Use | Key Capability |
|---|---|---|
| Plan | Complex features, unclear requirements | Creates reviewable plan before coding |
| Agent | Implementation, refactoring | Autonomous multi-file editing |
| Ask | Learning, onboarding, exploration | Read-only, no changes |
| Debug | Regressions, race conditions, memory leaks | Hypothesis generation + log instrumentation |
The Plan-then-Execute Pattern is the most impactful workflow:
1. Activate Plan Mode (`Shift+Tab`).
2. Use a powerful model (e.g., Claude Opus) to generate a detailed plan with file paths, function signatures, and logic.
3. Review and edit the plan (saved as markdown in `.cursor/plans/`).
4. Execute with a faster model (e.g., Sonnet) -- it follows the plan as a diligent builder.
5. If the result is wrong, revert to checkpoint, refine the plan, and re-execute.
Debug Mode is uniquely powerful for hard-to-reproduce bugs: it instruments code with logging, asks you to reproduce the bug, analyzes runtime evidence, then makes a targeted fix -- rather than guessing.
This is where agentic DevEx becomes truly autonomous.
Custom Commands (.cursor/commands/*.md) -- Reusable workflows triggered with / in chat:
- `/pr` -- commit, push, create PR with `gh`
- `/review` -- run linters, flag issues, summarize
- `/fix-issue [number]` -- fetch issue from GitHub, find code, implement fix, open PR
- `/deploy-staging` -- run tests, build, push to staging
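Under the hood, a command is just a markdown prompt file. A sketch of what `/pr` could contain (the wording is illustrative; only the `gh`-based flow comes from the list above):

```markdown
<!-- .cursor/commands/pr.md -->
1. Stage and commit the current changes with a conventional-commit message.
2. Push the current branch to origin.
3. Open a pull request with `gh pr create`, using the diff to write the
   title and a bulleted summary of the changes.
4. Never push directly to main.
```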
Hooks (.cursor/hooks.json) -- Scripts that run at defined stages of the agent loop. Two types:
- Command-based: Shell scripts receiving JSON via stdin, returning JSON via stdout
- Prompt-based: LLM-evaluated natural language conditions (policy enforcement without code)
Key hook events:
- `sessionStart` / `sessionEnd` -- inject context, run cleanup
- `beforeShellExecution` / `afterShellExecution` -- gate risky commands
- `beforeReadFile` / `afterFileEdit` -- run formatters, scan for secrets
- `preToolUse` / `postToolUse` -- generic tool lifecycle
- `stop` -- the grind loop: return a `followup_message` to keep the agent iterating until tests pass or a scratchpad says "DONE"
- `subagentStart` / `subagentStop` -- control Task tool execution
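As an illustration of a command-based hook, here is a sketch of the decision logic behind a `beforeShellExecution` guardrail. The response fields (`permission`, `userMessage`) and the blocklist itself are assumptions; consult the hooks documentation for the real payload schema:

```typescript
// Sketch of a beforeShellExecution guardrail. Field names like
// "permission" are assumptions, not the official schema.
const BLOCKED: RegExp[] = [
  /\brm\s+-rf\s+[~/]/,        // recursive deletes at root or home
  /\bgit\s+push\s+--force\b/, // force pushes
  /\bterraform\s+apply\b/,    // infra changes need a human
];

function gateCommand(
  command: string,
): { permission: "allow" | "deny"; userMessage?: string } {
  const hit = BLOCKED.find((pattern) => pattern.test(command));
  return hit
    ? { permission: "deny", userMessage: `Blocked by policy: ${command}` }
    : { permission: "allow" };
}

// Wiring sketch: in the real hook, the event JSON arrives on stdin and
// the decision is written to stdout, e.g.
//   const event = JSON.parse(await Bun.stdin.text());
//   console.log(JSON.stringify(gateCommand(event.command)));
```

Keeping the decision in a pure function like `gateCommand` makes the policy unit-testable independently of the stdin/stdout plumbing.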
The Grind Loop Pattern -- Use a stop hook to create agents that iterate autonomously until a verifiable goal is met:
```json
{
  "version": 1,
  "hooks": {
    "stop": [{ "command": "bun run .cursor/hooks/grind.ts" }]
  }
}
```

The script checks whether a scratchpad contains "DONE" or the maximum number of iterations has been reached. If not, it returns a `followup_message` to continue. This is TDD on autopilot.
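The decision logic inside such a grind script might look like this. Only the stop-hook contract (return a `followup_message` to keep iterating, return nothing to stop) comes from the text; the scratchpad convention and message wording are illustrative:

```typescript
// Hypothetical core of a grind-loop stop hook. The scratchpad
// convention ("DONE" marker) and the message text are illustrative.
interface StopDecision {
  followup_message?: string;
}

function decideNext(
  scratchpad: string,
  iteration: number,
  maxIterations: number,
): StopDecision {
  // Terminate when the agent has declared success or the safety cap hits.
  if (scratchpad.includes("DONE") || iteration >= maxIterations) {
    return {};
  }
  // Otherwise, push the agent back into the loop with explicit next steps.
  return {
    followup_message:
      "Not done yet. Re-run the test suite, fix the first failing test, " +
      "and append DONE to the scratchpad only when all tests pass.",
  };
}
```

The `maxIterations` cap matters: without it, a mis-specified goal can keep an agent (and your token budget) spinning indefinitely.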
Git Worktrees -- Each parallel agent runs in an isolated worktree with its own files. Configured via .cursor/worktrees.json for dependency installation and environment setup.
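The `worktrees.json` schema is not spelled out here, so treat the field names below as placeholders; the point is simply that each fresh worktree declares how to install dependencies and bootstrap its environment before an agent starts working in it:

```json
{
  "setup": [
    "npm ci",
    "cp .env.example .env"
  ]
}
```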
Best-of-N -- Run the same prompt across multiple models simultaneously. Compare results side-by-side. Cursor suggests which solution is best. Especially valuable for hard problems.
Cloud Agents -- Delegate tasks to cloud-hosted agents that run in sandboxes. Start from cursor.com/agents, the editor, or even your phone. They clone the repo, create a branch, work autonomously, and open a PR when finished. Trigger from Slack with @Cursor.
Package everything into a Plugin (.cursor-plugin/plugin.json) that bundles:
- Rules, Skills, Agents, Commands, Hooks, MCP Servers
Distribute through the Cursor Marketplace (manually reviewed for security). Multi-plugin repositories are supported via marketplace.json.
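A plugin manifest is a small JSON file describing the bundle. The field names below are illustrative rather than the official schema; check the Marketplace documentation before publishing:

```json
{
  "name": "acme-devex",
  "version": "1.0.0",
  "description": "Acme's rules, skills, commands, hooks, and MCP servers",
  "publisher": "acme-platform-team"
}
```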
This is how you scale agentic DevEx across an organization: encode your team's architectural patterns, compliance requirements, deployment workflows, and domain knowledge into a plugin that every developer gets automatically.
Week 1-2: Foundation
- Create `.cursor/rules/` with coding standards, architecture patterns, and key commands
- Add nested `AGENTS.md` for directory-specific guidance
- Set up 2-3 essential MCP integrations (GitHub, Linear/Jira, your database)
Week 3-4: Workflows
- Build custom commands for your top 5 repeatable workflows (`/pr`, `/review`, `/test`, `/deploy`, `/fix-issue`)
- Create Agent Skills for domain-specific tasks (deployment, data migrations, API design)
- Implement `afterFileEdit` hooks for auto-formatting
Week 5-6: Autonomy
- Implement the grind loop (`stop` hook) for TDD workflows
- Set up `beforeShellExecution` hooks for security guardrails
- Configure worktrees for parallel agent execution
Week 7-8: Scale
- Package as a Plugin for team distribution
- Set up Cloud Agents for async task delegation
- Establish Team Rules on the Cursor Dashboard for org-wide compliance
- Context over prompting -- Well-structured Rules, Skills, and MCP integrations eliminate the need for repetitive prompt engineering
- Plan before execute -- Separate architectural thinking (expensive model) from implementation (fast model) to get better results at lower cost
- Verifiable goals -- Use typed languages, linters, and tests to give agents clear success criteria; use the grind loop to iterate automatically
- Progressive complexity -- Start with Rules, add Skills when you need dynamic capabilities, add Hooks when you need automation, add Plugins when you need distribution
- Human as architect -- The developer defines intent, reviews plans, and approves outcomes; the agent handles the implementation details