Coldaine/jules_screenshot_review.md

## jules_screenshot_review.md

      
    Raw
  

              jules_screenshot_review.md
            
          
    Jules Code Review: adb_vision Screenshot Backends

Date: 2026-03-03
Reviewer: Claude Opus 4.6
Scope: PRs #44 (scrcpy), #45 (u2), #46 (DroidCast), #47 (DroidCast) on branches pr-44-scrcpy, pr-45-u2, pr-46-droidcast2, pr-47-droidcast
Verdict: All four PRs are non-functional and would fail at runtime. None can be merged as-is.

Executive Summary

Jules was given four GitHub issues to implement screenshot backends for adb_vision/screenshot.py, a clean ADB+VLM module with zero ALAS dependency. Reference implementations exist in alas_wrapped/module/device/method/ (DroidCast: 345 lines, scrcpy: 237 lines, u2: 477 lines). Jules produced four PRs that share a consistent pattern of failure:


Each PR completely replaces screenshot.py instead of filling in the stubs. The existing file has a dispatcher (take_screenshot(), _resolve_backends()), four backend stubs, and the working _capture_screencap implementation. Jules deleted all of this and wrote a standalone file containing only the single backend function. This means every PR is mutually exclusive and destroys the plugin architecture.


Jules never read the reference implementations. Both DroidCast PRs reference DroidCast-debug-1.2.3.apk (does not exist; the repo has DroidCast_raw-release-1.0.apk), use the wrong Java class name, and use the wrong startup method. The scrcpy PR passes redundant -s serial to adb_run. The u2 PR uses a possibly-invalid screenshot endpoint format.


Jules did not read the code it was modifying. The _adb_run helper in server.py already prepends -s SERIAL to all commands. Three of four PRs pass -s serial again, which would produce adb -s SERIAL -s serial ... and fail.


Tests mock everything at the wrong level, verifying the mock wiring rather than the actual protocol behavior. The tests for the DroidCast PRs literally assert that the wrong APK name and wrong service class are called -- the tests "pass" but validate broken code.


Per-PR Detailed Critique

PR #44: scrcpy backend (pr-44-scrcpy)

Files changed: adb_vision/screenshot.py (complete rewrite, 164 lines), adb_vision/test_scrcpy.py (130 lines), adb_vision/pyproject.toml, adb_vision/__init__.py
Critical Bugs

1. Redundant -s serial on every adb_run call (FATAL)
await adb_run("-s", serial, "push", str(local_jar), device_jar, timeout=15.0)
The _adb_run helper in server.py (lines 110-135) already runs adb -s ADB_SERIAL *args. Passing "-s", serial as the first args produces:
adb -s 127.0.0.1:21513 -s 127.0.0.1:5555 push ...

This will fail with an ADB error. Every single adb_run call in this PR has this bug -- push, forward, forward --remove. The implementation cannot work.
2. Destroyed the plugin architecture (FATAL)
The original screenshot.py has 121 lines with a full dispatcher (take_screenshot(), _resolve_backends()) and four backend stubs. Jules deleted all of it and wrote only _capture_scrcpy. The server.py imports from screenshot import take_screenshot -- this import will now fail because take_screenshot no longer exists. The MCP server cannot start.
3. scrcpy v1.20 parameter mismatch
Jules' argument list:
"1.20", "info", "0", "20000000", "1", "-1", "true", "-", "false", "true", "0", "false", "false", codec_options, "-", "false"
Reference ScrcpyOptions.command_v120():
"1.20", "info", "1280", "20000000", "6", "-1", "true", "-", "false", "true", "0", "false", "false", codec_options, "-", "false"
Differences:

max_size: Jules passes "0" (no limit), reference passes "1280". While 0 technically works, it means scrcpy encodes at full device resolution instead of the expected 1280x720, wasting bandwidth and decode time.
max_fps: Jules passes "1", reference passes "6". Setting fps=1 is a reasonable choice for single-frame capture, but the codec_options are wrong.
codec_options: Jules passes "i-frame-interval=0", but the reference uses the full options string: "key_profile=1,key_level=4096,key_quality=100,key_bitrate_mode=0,key_i_frame_interval=0,key_color_format=12,key_capture_rate=6,key_bit_rate=20000000". Missing Baseline profile and constant-quality settings means the encoder may produce B-frames that require multiple frames to decode, defeating the purpose of "capture one frame."

4. Missing control socket connection
The scrcpy v1.20 protocol requires TWO connections to the abstract socket: first the video socket, then the control socket. The reference implementation (core.py lines 100-122) creates both. Jules creates only one connection. Whether the server hangs waiting for the second connection depends on whether control=true -- which Jules passes. With control=true and no second connection, the server may block indefinitely.
5. Port forwarding approach
Jules uses adb forward tcp:0 localabstract:scrcpy which is technically correct but different from the reference approach. The reference uses adb.create_connection(Network.LOCAL_ABSTRACT, "scrcpy") -- a direct ADB-level abstract socket connection without host port forwarding. Jules' approach works but adds latency and a port resource that needs cleanup.
6. Pushes the jar on every single screenshot call
The reference has an scrcpy_init() method that pushes the jar once, then scrcpy_ensure_running() only reconnects if needed. Jules pushes the jar, starts the server, connects, decodes one frame, kills the server, and removes the port forward -- every single time adb_screenshot is called. This would take 5-10 seconds per screenshot instead of the ~50ms the reference achieves by keeping the stream alive.
Test Issues

The test at test_scrcpy.py line 59 asserts:
adb_run.assert_any_call("-s", serial, "push", ANY, "/data/local/tmp/scrcpy-server.jar", timeout=15.0)
This validates the broken -s serial call pattern. The test passes but confirms the wrong behavior.

PR #45: uiautomator2 ATX HTTP backend (pr-45-u2)

Files changed: adb_vision/screenshot.py (complete rewrite, 124 lines), adb_vision/test_u2.py (143 lines)
Critical Bugs

1. Redundant -s serial on every adb_run call (FATAL)
Same bug as PR #44:
await adb_run("-s", serial, "forward", f"tcp:{local_port}", "tcp:7912", timeout=5.0)
Produces adb -s SERIAL -s serial forward .... Every adb_run call in the file has this bug.
2. Destroyed the plugin architecture (FATAL)
Same as PR #44. Only _capture_u2 exists; take_screenshot, _resolve_backends, and all other backends are deleted. The MCP server cannot import take_screenshot.
3. Wrong screenshot endpoint
screenshot_url = f"{url}/screenshot/0?format=png"
The ATX agent's /screenshot/0 endpoint returns JPEG by default. The ?format=png query parameter is not documented in the standard ATX agent HTTP API. The reference implementation uses self.u2.screenshot(format='raw') which calls the Python library's method that handles format negotiation internally. If the ATX agent ignores the format parameter (likely), this code will receive JPEG data, then the PNG magic byte check will fail:
if not content.startswith(b"\x89PNG"):
    raise RuntimeError(f"ATX agent on {serial} returned invalid image format (not PNG)")
This will always raise on standard ATX agent installations.
4. Hardcoded asset paths that do not exist
paths_to_check = [
    "alas_wrapped/bin/uiautomator2",
    "agent_orchestrator/.venv/lib/python3.12/site-packages/uiautomator2/assets"
]
Neither of these directories exist in the repo. The adb_vision/ module is supposed to have zero ALAS dependency, yet this code reaches into alas_wrapped/bin/ and an agent_orchestrator venv. The path uiautomator2/assets assumes a specific Python version (3.12) and a specific venv location. This is brittle guesswork.
5. atx-agent startup command blocks
await adb_run("-s", serial, "shell", "/data/local/tmp/atx-agent", "server", "--nouia", "-d", "--addr", "127.0.0.1:7912", timeout=10.0)
The -d flag daemonizes atx-agent, but adb shell may still block until the process backgrounds itself. With a 10-second timeout, this might work, but it also might timeout on slow emulators. The reference implementation handles this through the u2 Python library which manages the agent lifecycle more carefully.
6. No forward cleanup, silent failures everywhere
finally:
    # We don't remove the forward to allow reuse and faster subsequent captures.
    pass
While intentionally keeping the forward is a valid optimization, the implementation creates a NEW forward (with a new random port via _get_free_port()) on every single call. Over time, this leaks dozens of port forwards. The fallback code tries to find existing forwards, but it is racey and fragile.
Test Issues

The tests at test_u2.py assert the broken -s serial pattern:
self.adb_run.assert_any_call("-s", self.serial, "forward", "tcp:8888", "tcp:7912", timeout=5.0)
The mock patches requests.get but the actual code uses asyncio.to_thread(requests.get, ...). The patches happen to work because asyncio.to_thread calls the same function object, but this is testing mock wiring, not actual HTTP behavior.

PR #46: DroidCast backend (pr-46-droidcast2)

Files changed: adb_vision/screenshot.py (complete rewrite, 71 lines), adb_vision/test_server.py (124 lines, overwriting the existing test file name)
Critical Bugs

1. Wrong APK filename (FATAL)
apk_path = "alas_wrapped/bin/DroidCast/DroidCast-debug-1.2.3.apk"
This file does not exist. The actual file in the repository is:
alas_wrapped/bin/DroidCast/DroidCast_raw-release-1.0.apk

The install -r command will fail because the local file does not exist. Jules appears to have hallucinated this filename.
2. Wrong Java class and startup method (FATAL)
await adb_run(
    "shell", "am", "start-foreground-service",
    "-n", "com.rayworks.droidcast/.Main",
    "-a", "android.intent.action.MAIN",
    timeout=10.0
)
This is wrong in three ways:

Wrong class: com.rayworks.droidcast/.Main is the original DroidCast project. The repo uses DroidCast_raw which has class ink.mol.droidcast_raw.Main.
Wrong startup method: DroidCast_raw is NOT an Android service. It does not respond to am start-foreground-service. It is started via app_process:
CLASSPATH=/data/local/tmp/DroidCast_raw.apk app_process / ink.mol.droidcast_raw.Main


Fallback is also wrong: The fallback uses CLASSPATH=/data/local/tmp/DroidCast.dex (wrong file extension, wrong filename) and com.rayworks.droidcast.Main (wrong class).

3. Wrong HTTP endpoint
url = "http://localhost:53516/screenshot?format=png"
DroidCast_raw's /screenshot endpoint returns RGB565 raw bitmap, not PNG. To get PNG, you use /preview. The ?format=png query parameter is not part of DroidCast_raw's API. The reference code uses:

/preview for PNG (in screenshot_droidcast)
/screenshot for RGB565 raw bitmap (in screenshot_droidcast_raw)

Jules appears to have confused DroidCast (original, supports PNG at /preview) with DroidCast_raw (modified version, serves RGB565 at /screenshot), and then added a ?format=png query parameter that neither version supports.
4. Destroyed the plugin architecture (FATAL)
Same as other PRs. Only _capture_droidcast exists.
5. Uses requests library -- adds dependency
The adb_vision/ module aims for minimal dependencies. PR #46 adds requests as a dependency for a simple HTTP GET. The existing screencap backend uses zero HTTP libraries.
Test Issues

Tests in test_server.py validate the wrong APK name and wrong service call:
adb_run.assert_any_call("install", "-r", "alas_wrapped/bin/DroidCast/DroidCast-debug-1.2.3.apk", timeout=30.0)
adb_run.assert_any_call(
    "shell", "CLASSPATH=/data/local/tmp/DroidCast.dex",
    "app_process", "/", "com.rayworks.droidcast.Main",
    timeout=5.0
)
The tests pass, confirming the broken behavior is "working as implemented." This is worse than having no tests -- it creates false confidence.

PR #47: DroidCast backend (pr-47-droidcast)

Files changed: adb_vision/screenshot.py (complete rewrite, 70 lines), adb_vision/test_droidcast.py (98 lines)
Critical Bugs

This PR has every bug from PR #46, because it is essentially the same implementation with minor structural differences.
1. Wrong APK filename (FATAL)
apk_path = "alas_wrapped/bin/DroidCast/DroidCast-debug-1.2.3.apk"
Same hallucinated filename. Does not exist.
2. Wrong Java class and startup method (FATAL)
await adb_run(
    "shell", "am", "start-foreground-service",
    "-n", "com.rayworks.droidcast/.Main",
    "-a", "android.intent.action.MAIN",
    timeout=10.0
)
Same wrong class, wrong startup method. No app_process fallback in this version.
3. Wrong HTTP endpoint (FATAL)
url = f"http://localhost:{port}/screenshot?format=png"
Same /screenshot?format=png confusion.
4. Destroyed the plugin architecture (FATAL)
Same as all other PRs.
5. No -s serial bug here -- but only by accident
Unlike PRs #44 and #45, this PR does not pass -s serial to adb_run. This happens to be correct, but it is inconsistent with the other PRs, suggesting Jules did not have a systematic understanding of the _adb_run API.
6. Uses urllib.request -- different from PR #46
PR #46 uses requests, PR #47 uses urllib.request. Two PRs for the same backend, submitted by the same agent, using different HTTP libraries. This suggests no consistent design decisions.
Test Issues

Same pattern as PR #46. Tests validate the wrong APK and wrong startup command:
adb_run.assert_any_call("install", "-r", mock.ANY, timeout=30.0)
adb_run.assert_any_call("shell", "am", "start-foreground-service", "-n", "com.rayworks.droidcast/.Main", "-a", "android.intent.action.MAIN", timeout=10.0)

What Jules Got Right

To be fair:


Correct function signature. All four PRs implement async def _capture_X(*, adb_run, serial, adb_exe) -> str matching the expected keyword-only arguments and return type.


Async/await pattern. Jules correctly used asyncio.to_thread for blocking HTTP calls (PRs #45, #46) and asyncio.create_subprocess_exec for the scrcpy server process (PR #44). The async patterns are sound.


Idempotency attempts. PRs #46 and #47 both check whether the service is already running before attempting setup. This is a good design instinct.


Cleanup in finally blocks. PR #44 correctly uses try/finally to terminate the scrcpy server process and remove port forwards. PR #45 intentionally keeps the forward for reuse (with a comment explaining why).


base64 return format. All PRs correctly return base64.b64encode(data).decode("ascii").


scrcpy protocol structure. PR #44 correctly identifies the 69-byte header (1 dummy + 64 device name + 4 resolution), correctly uses PyAV's CodecContext for H.264 decode, and correctly converts frames to PNG. If the -s serial and plugin architecture bugs were fixed, the core decode logic would likely work.


Port forward cleanup. PR #44's finally block removes the port forward, preventing resource leaks.


Test structure. All PRs include tests with happy path, failure, and edge cases. The test organization is reasonable even if the assertions validate the wrong behavior.


Systemic Issues: What This Tells Us About Jules as a Coding Agent

1. Jules does not read existing code before modifying it

The most damaging pattern: every PR completely replaces screenshot.py instead of filling in the stubs. The existing file has 121 lines of working infrastructure (dispatcher, backend registry, screencap implementation). Jules deleted all of it. This means:

Jules did not read the file it was modifying
Jules did not understand it was filling in stubs within a larger system
Jules treated each task as "write a standalone file" rather than "integrate into existing architecture"

The server.py file has from screenshot import take_screenshot on line 29. Any agent that read server.py would know that take_screenshot must exist in screenshot.py. Jules deleted it.
2. Jules does not read reference implementations

The issue descriptions presumably pointed to the reference implementations in alas_wrapped/module/device/method/. If Jules had read droidcast.py (345 lines, heavily commented), it would have learned:

The APK is DroidCast_raw-release-1.0.apk, not DroidCast-debug-1.2.3.apk
The class is ink.mol.droidcast_raw.Main, not com.rayworks.droidcast.Main
The startup uses app_process, not am start-foreground-service
The /screenshot endpoint returns RGB565, not PNG

Instead, Jules appears to have hallucinated the APK name and used the original DroidCast project's API (which is a different project from the one in the repo).
3. Jules does not understand API contracts it is given

The _adb_run helper signature is documented in server.py:
async def _adb_run(*args: str, timeout: float = 10.0) -> bytes:
    """Run ``adb -s <serial> <args>`` as a non-blocking subprocess."""
The docstring literally says it prepends -s <serial>. Yet PRs #44 and #45 pass -s serial again. This suggests Jules either did not read the docstring or did not understand it. PR #47 gets this right, but inconsistently.
4. Jules creates duplicate/conflicting PRs

PRs #46 and #47 both implement the DroidCast backend. They have the same bugs (wrong APK, wrong class, wrong endpoint) but different HTTP libraries and slightly different structure. This suggests:

Jules was given the same issue twice and produced two independent attempts
Neither attempt learned from the other
There is no deduplication or self-review

5. Jules tests confirm broken behavior

Every test file validates the wrong behavior. For example, PR #47's test asserts:
adb_run.assert_any_call("shell", "am", "start-foreground-service", "-n", "com.rayworks.droidcast/.Main", ...)
This is an assertion that the code calls the wrong class with the wrong startup method. The test passes, creating false confidence. Jules' testing methodology is "mock everything and assert the mocks were called" rather than "verify the protocol is correct."
6. Jules does not consider merge conflicts

All four PRs completely rewrite the same file (adb_vision/screenshot.py). They are mutually exclusive. The first one merged would make the other three conflict. A competent implementation would fill in one stub per PR, leaving the rest of the file intact.
7. Jules has a hallucination problem with file paths

The filename DroidCast-debug-1.2.3.apk appears in both DroidCast PRs and does not correspond to any file in the repository. The actual file is DroidCast_raw-release-1.0.apk. This is a classic LLM hallucination -- Jules "knew" that DroidCast exists and that APKs have version-debug naming conventions, but fabricated the specific filename instead of checking the filesystem.

Recommendations for Actual Merge

None of these PRs should be merged. Here is what a correct implementation looks like:
1. Keep the existing screenshot.py dispatcher intact

Fill in the stubs. Do not delete take_screenshot(), _resolve_backends(), or _capture_screencap().
2. DroidCast backend (one PR, not two)

async def _capture_droidcast(*, adb_run: AdbRunFn, serial: str, adb_exe: str) -> str:
    # Use the actual APK: DroidCast_raw-release-1.0.apk
    # Push with: adb_run("push", str(local_apk), "/data/local/tmp/DroidCast_raw.apk")
    # Start with: adb_run("shell", "CLASSPATH=/data/local/tmp/DroidCast_raw.apk",
    #                      "app_process", "/", "ink.mol.droidcast_raw.Main")
    # Forward:    adb_run("forward", "tcp:53516", "tcp:53516")
    # Fetch PNG:  GET http://127.0.0.1:53516/preview  (NOT /screenshot)
    # OR raw:     GET http://127.0.0.1:53516/screenshot -> decode RGB565 -> encode PNG
3. scrcpy backend


Do NOT pass -s serial to adb_run
Consider keeping the server alive between calls (connection pooling)
Use the full codec_options from the reference for reliable single-frame capture
Create both video AND control socket connections (or pass control=false)

4. u2 backend


Do NOT pass -s serial to adb_run
Use /screenshot/0 and handle JPEG response (or use format=raw and decode)
Do not hardcode venv paths for asset discovery

5. All backends


Fill in the stub in the existing file, do not rewrite
Do not pass -s serial to adb_run (it already does this)
Include proper error handling and logging using the existing log object
One PR per backend, each touching only its own stub function


Appendix: File Inventory


File
Actual in repo
Jules' reference


DroidCast APK
alas_wrapped/bin/DroidCast/DroidCast_raw-release-1.0.apk
DroidCast-debug-1.2.3.apk (hallucinated)


scrcpy JAR
alas_wrapped/bin/scrcpy/scrcpy-server-v1.20.jar
Correct


DroidCast class
ink.mol.droidcast_raw.Main
com.rayworks.droidcast.Main (wrong project)


DroidCast remote path
/data/local/tmp/DroidCast_raw.apk
/data/local/tmp/DroidCast.dex (wrong extension)


DroidCast PNG endpoint
/preview
/screenshot?format=png (wrong)


DroidCast raw endpoint
/screenshot (returns RGB565)
N/A (confused with PNG)


DroidCast startup
CLASSPATH=... app_process / ink.mol.droidcast_raw.Main
am start-foreground-service -n com.rayworks.droidcast/.Main (wrong)
File	Actual in repo	Jules' reference
DroidCast APK	`alas_wrapped/bin/DroidCast/DroidCast_raw-release-1.0.apk`	`DroidCast-debug-1.2.3.apk` (hallucinated)
scrcpy JAR	`alas_wrapped/bin/scrcpy/scrcpy-server-v1.20.jar`	Correct
DroidCast class	`ink.mol.droidcast_raw.Main`	`com.rayworks.droidcast.Main` (wrong project)
DroidCast remote path	`/data/local/tmp/DroidCast_raw.apk`	`/data/local/tmp/DroidCast.dex` (wrong extension)
DroidCast PNG endpoint	`/preview`	`/screenshot?format=png` (wrong)
DroidCast raw endpoint	`/screenshot` (returns RGB565)	N/A (confused with PNG)
DroidCast startup	`CLASSPATH=... app_process / ink.mol.droidcast_raw.Main`	`am start-foreground-service -n com.rayworks.droidcast/.Main` (wrong)
No results found