@kvenkatrajan
Created February 26, 2026 00:12
Analysis: Why Azure skills invoke unreliably in Claude Sonnet vs Opus, and how to fix skill descriptions

Why Azure Skills Invoke Unreliably in Claude Sonnet vs Opus 4.6

Analysis of skill description patterns across 24 Azure skills, with concrete recommendations for improving invocation reliability on Sonnet.


5 Root Causes (in order of impact)

1. Description Length Overload

Descriptions are too long and dense for Sonnet's skill-selection heuristics. Example from azure-prepare:

Default entry point for Azure application development. Invoke for ANY app work
related to Azure: creating, building, updating, migrating, or modernizing apps.
Analyzes your project and prepares it for Azure deployment by generating
infrastructure code (Bicep/Terraform), azure.yaml, and Dockerfiles.
USE FOR: create an app, build a web app, create API, create frontend, create
backend, add a feature, build a service, develop a project, migrate my app...

That's ~100 words in the description field alone. Across 24 skills, Sonnet has to parse roughly 2,000+ words of skill descriptions before deciding which one to invoke. Opus handles this gracefully; Sonnet's attention gets diluted.

Fix: Cap descriptions at roughly 35-45 words. Move keyword lists into the body.
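
A length cap like this is easy to check mechanically. The sketch below is only an illustration, not part of any Azure skills tooling: `word_count` and `over_budget` are hypothetical helper names, words are counted by whitespace splitting, and the default limit of 45 is an assumption you would adjust to whatever cap you settle on.

```python
def word_count(description: str) -> int:
    """Count whitespace-separated words in a skill description."""
    return len(description.split())


def over_budget(descriptions: dict[str, str], max_words: int = 45) -> dict[str, int]:
    """Return {skill_name: word_count} for descriptions longer than max_words."""
    counts = {name: word_count(text) for name, text in descriptions.items()}
    return {name: n for name, n in counts.items() if n > max_words}


# Hypothetical input: skill names mapped to their description strings.
descriptions = {
    "short-skill": "Execute deployment to Azure using azd up or azd deploy.",
    "long-skill": " ".join(["keyword"] * 60),
}

# Only the 60-word description exceeds the default 45-word limit.
print(over_budget(descriptions))
```

Running a check like this over all 24 descriptions would show which skills need trimming first.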

2. Overlapping Trigger Keywords Create Ambiguity

Several skills compete for the same user phrases:

| User says | Competing skills |
| --- | --- |
| "deploy to Azure" | azure-prepare, azure-deploy, azure-validate |
| "Azure Functions" | azure-prepare, azure-deploy, azure-ai |
| "authentication" | azure-identity-py, azure-identity-dotnet, entra-app-registration |
| "storage" | azure-storage, azure-prepare |

Opus can reason through overlap ("deploy to Azure" + no existing plan = azure-prepare). Sonnet tends to either pick the wrong one or pick none when multiple skills seem equally relevant.

Fix: Make each skill's lead sentence contain a UNIQUE discriminator that no other skill shares.
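
One way to find collisions before rewriting is to invert the trigger lists and look for phrases claimed by more than one skill. This is a rough sketch under stated assumptions: `find_shared_triggers` is a hypothetical helper, the trigger lists are made up for illustration rather than copied from the real skill files, and matching is exact after lowercasing, so near-duplicates such as "deploy app" vs "deploy my app" would not be caught.

```python
from collections import defaultdict


def find_shared_triggers(triggers: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each trigger phrase claimed by more than one skill to those skills."""
    owners = defaultdict(set)
    for skill, phrases in triggers.items():
        for phrase in phrases:
            # Normalize case and surrounding whitespace before comparing.
            owners[phrase.strip().lower()].add(skill)
    return {p: sorted(s) for p, s in owners.items() if len(s) > 1}


# Illustrative data only, not the real skill definitions.
triggers = {
    "azure-prepare": ["deploy to Azure", "build a web app"],
    "azure-deploy": ["deploy to Azure", "run azd up"],
    "azure-validate": ["Deploy to Azure", "check my config"],
}

print(find_shared_triggers(triggers))
```

Any phrase that shows up in the result is a candidate for removal from all but one skill.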

3. "DO NOT USE FOR" Anti-Patterns Don't Work on Sonnet

Most skills have DO NOT USE FOR: ... clauses. These require negation reasoning -- Sonnet must read "DO NOT USE FOR: Function apps" and correctly suppress activation. Sonnet is significantly weaker at processing negative constraints than Opus.

Worse, the "DO NOT USE FOR" sections introduce the very keywords that trigger the wrong skill. When azure-ai says DO NOT USE FOR: Function apps/Functions, Sonnet might actually key on "Functions" and activate the skill.

Fix: Remove all "DO NOT USE FOR" clauses entirely. Route positively, not negatively.
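
Removing the clauses can be scripted. The following is a minimal sketch, assuming the negative clause is the last part of the description, as it is in the examples here; `strip_negative_clause` and `has_negative_clause` are hypothetical names, and the `before` string is a shortened, made-up description rather than a real one.

```python
import re

# Matches the negative clause and everything after it, across line breaks.
NEGATIVE_CLAUSE = re.compile(r"\s*DO NOT USE FOR:.*", re.DOTALL)


def strip_negative_clause(description: str) -> str:
    """Drop a trailing 'DO NOT USE FOR: ...' clause from a description.

    Assumes the clause comes last; anything following it is removed too.
    """
    return NEGATIVE_CLAUSE.sub("", description).rstrip()


def has_negative_clause(description: str) -> bool:
    """Report whether a description still contains a negative clause."""
    return "DO NOT USE FOR:" in description


# Made-up description for illustration.
before = (
    "Integrate Azure AI services. "
    "USE FOR: vector search, speech-to-text. "
    "DO NOT USE FOR: Function apps (use azure-functions)."
)
after = strip_negative_clause(before)
print(after)
```

Note that this only deletes the clause. Any routing information it carried (for example "use azure-functions") has to be expressed positively in the other skill's description.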

4. Inconsistent Description Structure

The skills use 3 different formats:

  • azure-identity-py: Triggers: "keyword", "keyword"
  • azure-ai: USE FOR: x, y, z. DO NOT USE FOR: a, b
  • azure-deploy: Prose sentence + USE FOR + DO NOT USE FOR

Sonnet benefits from pattern consistency -- when every description follows the same template, the model learns the pattern and parses it more reliably.

Fix: Use the same template for every description.
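
To see how mixed the formats are, you can tally which section labels each description uses. This is a sketch only: `labels_used` and `format_census` are hypothetical helpers, the label list is an assumption based on the formats observed in this analysis, and the sample descriptions are invented.

```python
from collections import Counter

# "DO NOT USE FOR:" is checked first so its "USE FOR:" suffix is not double-counted.
LABELS = ("DO NOT USE FOR:", "USE FOR:", "Triggers:", "WHEN:")


def labels_used(description: str) -> tuple[str, ...]:
    """Return the section labels present in a description, in LABELS order."""
    found = []
    remaining = description
    for label in LABELS:
        if label in remaining:
            found.append(label)
            # Blank out matches so shorter labels are not found inside longer ones.
            remaining = remaining.replace(label, " ")
    return tuple(found)


def format_census(descriptions: dict[str, str]) -> Counter:
    """Count how many skills use each combination of labels."""
    return Counter(labels_used(text) for text in descriptions.values())


# Invented sample descriptions covering the observed formats.
descriptions = {
    "skill-a": 'Python auth. Triggers: "azure-identity", "credential".',
    "skill-b": "USE FOR: x, y. DO NOT USE FOR: a, b.",
    "skill-c": "Deploys apps. USE FOR: x. DO NOT USE FOR: y.",
}

census = format_census(descriptions)
print(census)
```

More than one key in the census means the skill set is not yet on a single template.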

5. Missing Intent Hierarchy

There's no explicit routing logic that tells the model "when in doubt, start here." azure-prepare tries to be the default entry point but competes with 23 other descriptions. Sonnet doesn't reliably infer hierarchical priority.

Fix: Add an explicit priority signal ("Use this FIRST") to azure-prepare, the default entry point.
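
If you adopt an explicit priority marker, it is worth checking that exactly one skill carries it. A small sketch, assuming the marker is the literal phrase "Use this FIRST" that this analysis proposes for azure-prepare; `entry_points` is a hypothetical helper and the descriptions are abbreviated.

```python
def entry_points(descriptions: dict[str, str], marker: str = "Use this FIRST") -> list[str]:
    """List the skills whose description carries the priority marker."""
    return sorted(name for name, text in descriptions.items() if marker in text)


# Abbreviated descriptions for illustration.
descriptions = {
    "azure-prepare": (
        "Create, build, or scaffold applications for Azure. "
        "Use this FIRST before validate or deploy."
    ),
    "azure-deploy": "Execute deployment to Azure using azd up or azd deploy.",
}

found = entry_points(descriptions)
if len(found) != 1:
    print("expected exactly one default entry point, found:", found)
else:
    print("default entry point:", found[0])
```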


Recommended Description Template

A concrete formula that works better on Sonnet:

description: >-
  [ACTION VERB] [UNIQUE_DOMAIN]. [One clarifying sentence].
  WHEN: [3-5 distinctive trigger phrases unique to THIS skill].
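
Generating descriptions from structured parts is one way to keep every skill on this template. The sketch below is an assumption about how that could look, not an existing tool: `build_description` is a hypothetical helper, and it enforces the 3-5 trigger range from the template above.

```python
def build_description(lead: str, clarifier: str, triggers: list[str]) -> str:
    """Render a description as '<lead>. <clarifier>. WHEN: "a", "b", ...'."""
    if not 3 <= len(triggers) <= 5:
        raise ValueError("expected 3-5 trigger phrases, got %d" % len(triggers))
    quoted = ", ".join(f'"{t}"' for t in triggers)
    # Strip any trailing period so each sentence ends with exactly one.
    return f"{lead.rstrip('.')}. {clarifier.rstrip('.')}. WHEN: {quoted}."


description = build_description(
    lead="Execute deployment to Azure using azd up or azd deploy",
    clarifier="Final step after app is already prepared and validated",
    triggers=["run azd up", "deploy my app", "push to production", "go live"],
)
print(description)
```

The output is a single line; a folded YAML scalar (`>-`) wraps it across lines without changing the value.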

Before/After Examples

azure-deploy

Before (89 words):

description: >-
  Execute deployment to Azure. Final step after preparation and validation.
  Runs azd up, azd deploy, or infrastructure provisioning commands.
  USE FOR: run azd up, run azd deploy, execute deployment, provision
  infrastructure, push to production, go live, ship it, deploy web app...
  DO NOT USE FOR: creating or building apps (use azure-prepare), validating
  before deploy (use azure-validate).

After (35 words):

description: >-
  Execute deployment to Azure using azd up or azd deploy. Final step after
  app is already prepared and validated.
  WHEN: "run azd up", "deploy my app", "push to production", "go live".

azure-prepare

Before (98 words):

description: >-
  Default entry point for Azure application development. Invoke for ANY app
  work related to Azure: creating, building, updating, migrating, or
  modernizing apps. Analyzes your project and prepares it for Azure deployment
  by generating infrastructure code (Bicep/Terraform), azure.yaml, and
  Dockerfiles.
  USE FOR: create an app, build a web app, create API...
  DO NOT USE FOR: only validating an already-prepared app (use azure-validate)...

After (38 words):

description: >-
  Create, build, or scaffold applications for Azure. Generates Bicep,
  Terraform, azure.yaml, and Dockerfiles. Use this FIRST before validate
  or deploy.
  WHEN: "build a web app", "create an API", "add a database", "migrate to Azure".

azure-ai

Before (68 words):

description: >-
  Use for Azure AI: Search, Speech, OpenAI, Document Intelligence...
  USE FOR: AI Search, query search, vector search...
  DO NOT USE FOR: Function apps/Functions (use azure-functions)...

After (34 words):

description: >-
  Integrate Azure AI services: AI Search, Speech, OpenAI, and Document
  Intelligence into applications.
  WHEN: "vector search", "speech-to-text", "OCR", "Azure OpenAI",
  "hybrid search", "transcribe audio".

azure-identity-py

Before:

description: |
  Azure Identity SDK for Python authentication. Use for DefaultAzureCredential,
  managed identity, service principals, and token caching.
  Triggers: "azure-identity", "DefaultAzureCredential", "authentication",
  "managed identity", "service principal", "credential".

After:

description: >-
  Authenticate Python apps with Azure using azure-identity SDK. Covers
  DefaultAzureCredential, managed identity, and service principals.
  WHEN: Python code with "from azure.identity", "DefaultAzureCredential",
  "pip install azure-identity".

azure-storage

Before:

description: >-
  Azure Storage Services including Blob Storage, File Shares, Queue Storage,
  Table Storage, and Data Lake. Provides object storage, SMB file shares,
  async messaging, NoSQL key-value, and big data analytics capabilities...
  USE FOR: blob storage, file shares, queue storage...
  DO NOT USE FOR: SQL databases, Cosmos DB (use azure-prepare)...

After:

description: >-
  Manage Azure Storage: Blob, File Shares, Queue, Table, and Data Lake.
  Upload, download, and configure storage accounts and access tiers.
  WHEN: "upload blob", "storage account", "file share", "access tier",
  "data lake".

azure-messaging

Before:

description: >-
  Troubleshoot and resolve issues with Azure Messaging SDKs for Event Hubs
  and Service Bus. Covers connection failures, authentication errors, message
  processing issues, and SDK configuration problems.
  USE FOR: event hub SDK error, service bus SDK issue, messaging connection
  failure, AMQP error, event processor host issue, message lock lost...
  DO NOT USE FOR: creating Event Hub or Service Bus resources...

After:

description: >-
  Troubleshoot Azure Event Hubs and Service Bus SDK issues. Covers AMQP
  errors, connection failures, message lock, and checkpoint problems.
  WHEN: "event hub error", "service bus not receiving", "AMQP connection
  failed", "message lock lost".

Summary of Changes

| Problem | Fix | Why it helps Sonnet |
| --- | --- | --- |
| Descriptions too long (60-100 words) | Cap at 35-45 words | Less text to parse across 24 skills |
| DO NOT USE FOR clauses | Remove entirely | Eliminates negation reasoning and keyword contamination |
| USE FOR keyword lists | Replace with WHEN: trigger phrases in quotes | Quoted phrases are more distinctive than loose keywords |
| Inconsistent format | Use same template everywhere | Pattern consistency aids selection |
| Overlapping triggers | Lead with unique action verb + domain | First sentence should uniquely identify the skill |
| No routing hierarchy | Add "Use this FIRST" to azure-prepare | Explicit priority signal for the default entry point |

Core Principle

Sonnet selects skills by fast pattern matching on the first ~20 words, not by deep reasoning over 100-word descriptions. Front-load the signal, eliminate noise, and make each skill's identity unmistakable in its opening phrase.
