| name | description |
|---|---|
| eagle-eye-screenshot-review | Deep screenshot QA orchestrator for visual + a11y regressions. Use after recording Paparazzi screenshots; launches subagents per test class and requires a11y parity. |
Run a structured post-screenshot review across all newly generated Paparazzi images and enforce accessibility screenshot parity.
- Immediately after `recordPaparazziRelease`
- Before opening a PR with screenshot changes
- When UI semantics/content descriptions changed
- When visual regressions are reported in CI
- Always run after screenshot recording.
- Fail if a11y screenshots are missing for any new/updated visual screenshot test.
- Do not sign off until visual + a11y checks are complete.
- If a11y tests are missing, implement them, record snapshots, and re-run the review.
- When a11y semantics issues are found, invoke `a11y-review` before final recommendations.
- Newly generated screenshots in `src/test/snapshots/images/`
- Corresponding screenshot test classes in `src/test/java/.../*ScreenshotTest.kt`
- A11y screenshot classes in `src/test/java/.../*A11yScreenshotTest.kt`
- UI state sources (view states/sealed classes/feature flags/boolean branches) used by the composable under test
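The file patterns above make the a11y parity rule checkable from file names alone. The following is a minimal sketch, assuming each visual test class `FooScreenshotTest.kt` is paired with a `FooA11yScreenshotTest.kt`; that pairing convention and the function name `missing_a11y_tests` are illustrative assumptions, not part of the skill.

```python
from pathlib import PurePosixPath

VISUAL_SUFFIX = "ScreenshotTest.kt"
A11Y_SUFFIX = "A11yScreenshotTest.kt"


def missing_a11y_tests(test_files):
    """Return visual screenshot test files that have no matching a11y test file.

    Assumes (hypothetically) that FooScreenshotTest.kt pairs with
    FooA11yScreenshotTest.kt by file name.
    """
    names = {PurePosixPath(f).name for f in test_files}
    missing = []
    for f in test_files:
        name = PurePosixPath(f).name
        # Skip files that are not visual screenshot tests (including a11y tests).
        if not name.endswith(VISUAL_SUFFIX) or name.endswith(A11Y_SUFFIX):
            continue
        base = name[: -len(VISUAL_SUFFIX)]
        if base + A11Y_SUFFIX not in names:
            missing.append(f)
    return sorted(missing)
```

A non-empty result corresponds to the hard-failure case: the review cannot pass until the listed classes have a11y counterparts.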
Act as an orchestrator and launch subagents to review screenshot groups.
- Gather changed screenshot PNGs (focus on files generated this run).
- Group by screenshot test class prefix in filename.
- Launch up to 4 subagents in parallel (one group per subagent).
- Each subagent must inspect:
  - Visual screenshots (LIGHT/DARK, device variants)
  - Matching a11y screenshots (if present)
  - Related test/composable source when needed to validate semantics intent
  - Reachable UI state definitions to validate screenshot state coverage
- Merge subagent findings into one severity-sorted report.
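The grouping and batching steps can be sketched as follows. This is only an illustration: it assumes each PNG file name contains the test class name as a token ending in `ScreenshotTest`, and it folds a11y screenshots into the matching visual group so one subagent sees both. The file-name shape and the helper names are assumptions.

```python
import re
from collections import defaultdict

MAX_PARALLEL = 4  # "up to 4 subagents in parallel"
# Assumed file-name shape: some token in the name ends with "ScreenshotTest".
CLASS_TOKEN = re.compile(r"([A-Za-z0-9]+ScreenshotTest)")


def group_by_test_class(png_names):
    """Group screenshot file names by test class, merging a11y into visual groups."""
    groups = defaultdict(list)
    for name in png_names:
        match = CLASS_TOKEN.search(name)
        key = match.group(1) if match else "UNGROUPED"
        # FooA11yScreenshotTest is reviewed together with FooScreenshotTest.
        key = key.replace("A11yScreenshotTest", "ScreenshotTest")
        groups[key].append(name)
    return {k: sorted(v) for k, v in groups.items()}


def batches(groups, size=MAX_PARALLEL):
    """Split group names into batches of at most `size` parallel subagents."""
    keys = sorted(groups)
    return [keys[i:i + size] for i in range(0, len(keys), size)]
```

Each batch is launched in parallel, and the next batch starts once the previous one has reported or been scheduled for retry.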
If a subagent times out (for example during image captioning), the orchestrator must not stop or skip coverage.
- Detect timeout/hang/failure from subagent output.
- Re-run that group in smaller chunks:
  - split by state first (e.g., `withImageSrc`, `withCreatorKitRemoved`)
  - if needed, split again by theme/device variant
  - if still needed, process one screenshot per subagent task
- Continue retries until every targeted screenshot has been reviewed or an explicit blocker is documented.
- Merge partial results from all retries into one final report and clearly mark retries performed.
- Do not mark PASS while any intended screenshot remains unreviewed.
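One way to picture the retry policy is as a queue of progressively smaller chunks. In the sketch below, the `Shot` record, the `review` callback, and the use of `TimeoutError` as the failure signal are all assumptions made for illustration; a real orchestrator would detect failure from subagent output.

```python
from collections import defaultdict
from typing import NamedTuple


class Shot(NamedTuple):
    filename: str
    state: str    # e.g. "withImageSrc"
    variant: str  # theme or device variant, e.g. "DARK"


def split_chunk(chunk, level):
    """Level 1: split by state. Level 2: by state and variant. Level 3+: one shot each."""
    if level >= 3:
        return [[s] for s in chunk]
    key = (lambda s: s.state) if level == 1 else (lambda s: (s.state, s.variant))
    buckets = defaultdict(list)
    for s in chunk:
        buckets[key(s)].append(s)
    return list(buckets.values())


def review_with_retries(group, review):
    """Run review(chunk); on TimeoutError, retry in smaller chunks.

    Returns (findings, retries, blockers). A shot becomes a blocker only
    after it has failed on its own, so nothing is skipped silently.
    """
    findings, blockers, retries = [], [], 0
    queue = [(list(group), 0)]
    while queue:
        chunk, level = queue.pop(0)
        try:
            findings.extend(review(chunk))
        except TimeoutError:
            if level >= 3:  # already one screenshot per task
                blockers.extend(s.filename for s in chunk)
                continue
            retries += 1
            queue.extend((c, level + 1) for c in split_chunk(chunk, level + 1))
    return findings, retries, blockers
```

The returned retry count and blocker list map directly onto the requirements to mark retries in the final report and to document explicit blockers.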
Subagent output format:
- `severity`: `critical | major | minor`
- `screenshot`: filename
- `issue`: short description
- `evidence`: what is visible in screenshot/a11y panel
- `recommended_fix`: actionable patch direction
Final orchestrator report format (required):
- For every finding, include clickable markdown links to:
  - problematic screenshot file
  - related screenshot test file
  - related source/composable file (when applicable)
- Use this format:
  - `severity | [screenshot](absolute-path-to-png) | [test](absolute-path-to-test-file) | issue | evidence | recommended_fix`
- If multiple screenshots are affected, include all as separate markdown links in the same row.
- If no findings, still include links to reviewed test files and representative screenshots in the PASS summary.
- Include a state coverage table per screenshot group:
  - `state | represented_in_visual (yes/no) | represented_in_a11y (yes/no) | evidence`
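As a rough sketch of how merged findings could be turned into the required rows, the snippet below sorts by severity and emits one row per finding, with every affected screenshot as a separate link. The `Finding` type and function names are hypothetical, and any paths passed in are placeholders.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


@dataclass
class Finding:
    severity: str
    screenshots: list   # absolute paths to affected PNGs
    test_file: str      # absolute path to the screenshot test file
    issue: str
    evidence: str
    recommended_fix: str


def render_row(f):
    """Render one finding in the required pipe-separated format."""
    shots = " ".join(f"[screenshot]({p})" for p in f.screenshots)
    return " | ".join(
        [f.severity, shots, f"[test]({f.test_file})", f.issue, f.evidence, f.recommended_fix]
    )


def render_report(findings):
    """Merge findings into a single severity-sorted report body."""
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    return "\n".join(render_row(f) for f in ordered)
```

Because Python's sort is stable, findings of equal severity keep the order in which the subagents reported them.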
- Contrast issues (text/icon/background contrast)
- Touch target sizing (aim for >= 48dp interactive targets)
- Clipped text, clipped icons, clipped containers
- Unexpected truncation/overlap/wrapping artifacts
- Insufficient spacing or inconsistent padding rhythm
- Misalignment and layout jitter between states/themes
- Incorrect disabled-state affordances (looks enabled when disabled, or vice versa)
- RPL/design-system consistency (prefer existing RPL components/patterns)
- Theme parity (LIGHT vs DARK visual correctness)
- Empty/placeholder/image-loading artifacts
- Build a state matrix from component inputs and branching logic (sealed states, enums, booleans, feature flags, and key combinations).
- Verify every reachable/important state has screenshot coverage in visual tests.
- Verify every reachable/important visual state has corresponding a11y screenshot coverage.
- Treat unrepresented states as coverage defects even if no visual regression is currently visible.
- If a state is intentionally excluded, require explicit rationale in the report.
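The state matrix can be pictured as the Cartesian product of the component's inputs, with coverage defects being the combinations that have neither a screenshot nor a documented exclusion. This is a minimal sketch with hypothetical input names; real components will usually need to prune unreachable combinations first.

```python
from itertools import product


def build_state_matrix(inputs):
    """inputs maps each input name to its possible values; returns all combinations."""
    names = sorted(inputs)
    return [dict(zip(names, combo)) for combo in product(*(inputs[n] for n in names))]


def uncovered_states(matrix, covered, excluded=()):
    """States that have no screenshot and no explicit exclusion rationale."""
    return [s for s in matrix if s not in covered and s not in excluded]
```

Running `uncovered_states` once against the visual tests and once against the a11y tests gives the two yes/no columns of the state coverage table.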
- Missing a11y screenshots for changed visual tests (hard failure)
- Focus-target correctness: label/role must be on the node TalkBack actually focuses (container vs child chosen intentionally)
- A11y panel entries match actually enabled actions
- Action-availability parity: `contentDescription`, semantics actions, and `clickable(enabled=...)` must be gated by the same condition
- Disabled actions are not exposed as actionable semantics
- Content descriptions are meaningful for action icons
- Decorative icons are hidden (`contentDescription = null`)
- Role/action labels are coherent with interaction behavior
- Interactive role correctness: interactive elements must use an interactive role (for example `Role.Button`) and must not be exposed with non-interactive roles such as `role=Image` when acting like a button
- No contradictory semantics (e.g., shows "Edit" but action disabled)
- No nested parent+child clickables that represent the same action
- Semantics modifier intent is correct: prefer `redditClearAndSetSemantics` in app code; use `mergeDescendants` when child actions must remain focusable
- If semantics clearing/grouping hides child actions, required actions are re-exposed via custom actions
- Aggregate announcements include meaningful visual status indicators (badges/verification/state)
- Icon-only action labels describe user action (e.g., "Remove selfie image"), not glyph names (e.g., "Close icon")
- Account for `useUnmergedSemanticsTree` experiment behavior when validating semantics recommendations
Use a11y-review as the semantics source of truth when eagle-eye detects accessibility issues.
- During subagent review, if findings involve semantics/labels/roles/action exposure, run `a11y-review`.
- Validate fix direction against `a11y-review` focus-target and icon-label rules.
- Include in the final report:
  - what `a11y-review` rule was applied
  - whether screenshot/a11y output now matches that rule
- If a new recurring pattern is discovered, update both skills:
  - add detection rule in `eagle-eye-screenshot-review`
  - add implementation guidance in `a11y-review`
If a screenshot/a11y panel shows an action label (e.g., "Edit selfie image") while that action is disabled/unavailable in that state, report as major and require code fix.
If an interactive/focusable action is announced with an incorrect non-interactive role (for example `Edit selfie image {role=Image}`), report as major and require role semantics fix.
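The two heuristics above could be applied per a11y panel entry roughly as follows. This sketch assumes every entry has the shape `label {role=X}` seen in the example, and that the caller already knows from the source whether the action is enabled in that state; both are assumptions, and `check_action_entry` is a hypothetical helper.

```python
import re

# Assumed entry shape, taken from the example: "Edit selfie image {role=Image}".
ENTRY = re.compile(r"^(?P<label>.*?)\s*\{role=(?P<role>\w+)\}$")
NON_INTERACTIVE_ROLES = {"Image"}


def check_action_entry(entry, action_enabled):
    """Check one a11y panel entry for an element that acts like a button.

    Returns a list of (severity, issue) tuples; empty when no heuristic fires.
    """
    text = entry.strip()
    match = ENTRY.match(text)
    label = match.group("label") if match else text
    role = match.group("role") if match else None
    findings = []
    if not action_enabled:
        findings.append(("major", f"'{label}' is announced while the action is disabled"))
    if role in NON_INTERACTIVE_ROLES:
        findings.append(("major", f"'{label}' is announced with non-interactive role={role}"))
    return findings
```

Entries that do not match the assumed shape are still checked against the disabled-action heuristic, using the whole entry text as the label.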
When visual screenshots exist but matching a11y screenshots do not:
- Create `*A11yScreenshotTest` using `PaparazziA11yTest` + `A11yConfig`.
- Mirror key states from the visual screenshot test.
- For image-loading composables (`rememberGlidePainter`), use `MockPainterProvider` + `SizedBrushPainter` and call:
  - `paparazzi.redditSnapshot(mockPainterProvider = sizedMockPainterProvider)`
- Record snapshots again.
- Re-run screenshot tests.
- Re-run eagle-eye review.
When eagle-eye detects uncovered UI states:
- Add missing visual screenshot test case(s) for the uncovered state(s).
- Add matching a11y screenshot test case(s) for those same state(s).
- Record snapshots again.
- Re-run screenshot tests.
- Re-run eagle-eye review and confirm state coverage table is complete.
Only pass when all are true:
- Visual screenshots reviewed with no unresolved critical/major issues
- Every intended screenshot was actually reviewed (including timeout retries in smaller chunks)
- All reachable/important UI states are represented in visual screenshot tests (or explicitly justified)
- A11y screenshots exist for all changed visual screenshot tests
- All represented visual states have corresponding a11y state coverage (or explicitly justified)
- Any missing a11y tests were implemented and recorded
- Any missing visual/a11y state coverage was implemented and recorded
- Updated screenshot tests pass
- Findings summary is documented with explicit pass/fail per screenshot group
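The pass criteria amount to a conjunction evaluated per screenshot group. A minimal sketch, assuming the orchestrator has already reduced each group to a handful of counters (the `GroupResult` fields are hypothetical names for those counters, with justified exclusions already subtracted):

```python
from dataclasses import dataclass


@dataclass
class GroupResult:
    name: str
    unresolved_critical_or_major: int
    unreviewed_screenshots: int
    uncovered_visual_states: int  # not counting explicitly justified exclusions
    uncovered_a11y_states: int    # not counting explicitly justified exclusions
    missing_a11y_tests: int
    tests_pass: bool


def group_passes(r):
    """A group passes only when every criterion holds at once."""
    return (
        r.unresolved_critical_or_major == 0
        and r.unreviewed_screenshots == 0
        and r.uncovered_visual_states == 0
        and r.uncovered_a11y_states == 0
        and r.missing_a11y_tests == 0
        and r.tests_pass
    )


def summarize(results):
    """Return (overall_pass, per-group lines); an empty review never passes."""
    lines = [f"{r.name}: {'PASS' if group_passes(r) else 'FAIL'}" for r in results]
    overall = bool(results) and all(group_passes(r) for r in results)
    return overall, lines
```

The per-group lines provide the explicit pass/fail entry that the findings summary requires for each screenshot group.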
- Record: `./gradlew :module:impl:recordPaparazziRelease`
- Test: `./gradlew :module:impl:testReleaseUnitTest`