Skip to content

Instantly share code, notes, and snippets.

@harupy
Last active February 20, 2026 07:05
Show Gist options
  • Select an option

  • Save harupy/d6631a5680eb19ea2fd4d7488aa32e2f to your computer and use it in GitHub Desktop.

Select an option

Save harupy/d6631a5680eb19ea2fd4d7488aa32e2f to your computer and use it in GitHub Desktop.
GitHub Copilot Code Review: Comment Classification Analysis (mlflow/mlflow)

GitHub Copilot Code Review: Comment Classification Analysis

Analysis of the Copilot code review (copilot-pull-request-reviewer) dynamic workflow on mlflow/mlflow.

  • Source: CCR Agent: Comment stored log lines from the Run Autofind Agent step in the Agent job
  • Sample: 270 comments from 100 completed workflow runs
  • Date: 2026-02-20

Severity

3 levels:

Severity Count %
moderate 164 61%
nit 60 22%
critical 46 17%

Comment Type

19 distinct types observed:

Type Count Critical Moderate Nit
bug 60 27 33
test_coverage 41 1 40
api_design 37 12 25
maintainability 25 10 15
documentation 19 12 7
naming 12 8 4
best_practices 12 1 6 5
security 11 1 9 1
spelling 10 10
discrepancy_with_pr_description 10 3 7
operational_implications 7 1 5 1
error_message 6 5 1
codebase_conventions 5 1 4
concurrency 4 4
accessibility 4 3 1
performance 3 2 1
custom 2 2
future_dates 1 1
other 1 1

Key Observations

  • bug is the most common type and the one most frequently rated critical (27 out of 46 total critical comments).
  • test_coverage is almost exclusively moderate (40 out of 41).
  • spelling is always nit.
  • accessibility has the highest critical rate — 3 out of 4 comments are critical.
  • api_design is the second most common source of critical comments (12).
  • Rare/edge-case types include custom, future_dates, and other.

Schema

Each stored comment has the following fields:

{
  "comment_id": "001",
  "comment_type": "bug",
  "file_location": "path/to/file.py",
  "start_line": 42,
  "end_line": 42,
  "comment_content": "The review comment text...",
  "guideline_id": "1000002",
  "fixed": false,
  "severity": "moderate"
}

How to Reproduce

# 1. List completed runs (workflow ID 201861839)
gh api "repos/mlflow/mlflow/actions/workflows/201861839/runs?per_page=100&status=completed" \
  --jq '.workflow_runs[].id'

# 2. For each run, find the Agent job
gh api "repos/mlflow/mlflow/actions/runs/<RUN_ID>/jobs" \
  --jq '.jobs[] | select(.name == "Agent") | .id'

# 3. Grep the job logs for stored comments
gh api "repos/mlflow/mlflow/actions/jobs/<JOB_ID>/logs" | grep "Comment stored"
#!/usr/bin/env bash
# Fetch "Comment stored" logs from Copilot code review workflow runs
# and analyze comment_type and severity distributions.
set -euo pipefail
REPO="mlflow/mlflow"
WORKFLOW_ID=201861839
PER_PAGE=100 # max allowed by GitHub API
OUTPUT_DIR="/tmp/copilot-review-logs"
COMMENTS_FILE="$OUTPUT_DIR/all_comments.jsonl"
mkdir -p "$OUTPUT_DIR"
> "$COMMENTS_FILE"
echo "==> Fetching workflow run IDs (up to $PER_PAGE)..."
RUN_IDS=$(gh api "repos/$REPO/actions/workflows/$WORKFLOW_ID/runs?per_page=$PER_PAGE&status=completed" \
--jq '.workflow_runs[].id')
TOTAL=$(echo "$RUN_IDS" | wc -l | tr -d ' ')
echo "==> Found $TOTAL completed runs"
COUNT=0
RUNS_WITH_COMMENTS=0
for RUN_ID in $RUN_IDS; do
COUNT=$((COUNT + 1))
# Find the Agent job ID for this run
AGENT_JOB_ID=$(gh api "repos/$REPO/actions/runs/$RUN_ID/jobs" \
--jq '.jobs[] | select(.name == "Agent") | .id' 2>/dev/null || true)
if [[ -z "$AGENT_JOB_ID" ]]; then
echo " [$COUNT/$TOTAL] Run $RUN_ID — no Agent job found, skipping"
continue
fi
# Fetch logs and extract "Comment stored" lines
STORED=$(gh api "repos/$REPO/actions/jobs/$AGENT_JOB_ID/logs" 2>/dev/null \
| grep "Comment stored" || true)
if [[ -z "$STORED" ]]; then
echo " [$COUNT/$TOTAL] Run $RUN_ID — 0 comments"
continue
fi
NUM_COMMENTS=$(echo "$STORED" | wc -l | tr -d ' ')
RUNS_WITH_COMMENTS=$((RUNS_WITH_COMMENTS + 1))
echo " [$COUNT/$TOTAL] Run $RUN_ID — $NUM_COMMENTS comment(s)"
# Extract the JSON part (everything after the "Comment stored" tab) and append
echo "$STORED" | sed 's/.*Comment stored\t//' >> "$COMMENTS_FILE"
done
TOTAL_COMMENTS=$(wc -l < "$COMMENTS_FILE" | tr -d ' ')
echo ""
echo "========================================"
echo "RESULTS: $TOTAL_COMMENTS comments from $RUNS_WITH_COMMENTS runs (out of $TOTAL total)"
echo "========================================"
if [[ "$TOTAL_COMMENTS" -eq 0 ]]; then
echo "No comments found."
exit 0
fi
echo ""
echo "--- comment_type distribution ---"
jq -r '.comment_type' "$COMMENTS_FILE" | sort | uniq -c | sort -rn
echo ""
echo "--- severity distribution ---"
jq -r '.severity' "$COMMENTS_FILE" | sort | uniq -c | sort -rn
echo ""
echo "--- comment_type x severity ---"
jq -r '"\(.comment_type)\t\(.severity)"' "$COMMENTS_FILE" | sort | uniq -c | sort -rn
echo ""
echo "--- sample comments per (comment_type, severity) ---"
jq -r '"\(.comment_type)\t\(.severity)\t\(.file_location):\(.start_line)\t\(.comment_content)"' "$COMMENTS_FILE" \
| awk -F'\t' '
{
key = $1 "\t" $2
if (!(key in seen)) {
seen[key] = 1
printf "\n[%s | %s]\n file: %s\n comment: %.200s\n", $1, $2, $3, $4
}
}'
echo ""
echo "Raw comments saved to: $COMMENTS_FILE"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment