On very large pages, the snapshot can fill the entire context #1329
I believe a good source of inspiration is Claude Code/Codex. There are two things I'm thinking of:
1. Grab the HTML instead. Unfortunately, when the browser_snapshot is this huge, the HTML is usually pretty big too (in my case 12m of HTML, sometimes more).
2. Take a browser_screenshot, grab the HTML and the browser_snapshot, and use a "Grep" tool or an HTML parser (or whatever fits how your agent is running) to pull out just what you need so that your MCP can do a browser action. Images should be much smaller.

Here's some slop Claude code that worked very well for me. I'm using openai/agents, but this can be adapted for any custom or alternative agent package.
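Separate from the code mentioned above, here is a minimal Python sketch of the "Grep" idea on its own: instead of handing the whole snapshot or HTML to the model, return only small windows of text around each match. The names `grep_page_text`, `window`, and `max_matches` are hypothetical and not part of Playwright MCP or openai/agents; the sketch assumes you already have the snapshot or HTML as a string.

```python
import re


def grep_page_text(text: str, pattern: str, window: int = 200, max_matches: int = 20) -> list[str]:
    """Return short excerpts of `text` around each regex match.

    Works on character offsets rather than lines, so it still helps when
    the HTML is minified into one enormous line. Excerpts may overlap
    when matches are close together.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    excerpts: list[str] = []
    for match in regex.finditer(text):
        # Keep `window` characters of context on each side of the match.
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        excerpts.append(text[start:end])
        # Hard cap so a very common pattern cannot flood the context.
        if len(excerpts) >= max_matches:
            break
    return excerpts


if __name__ == "__main__":
    # Made-up page text standing in for a very large HTML dump.
    page = "x" * 5000 + '<button id="checkout">Pay now</button>' + "y" * 5000
    for excerpt in grep_page_text(page, r"checkout", window=40):
        print(excerpt)
```

Exposed as a tool, this lets the agent ask for something like `checkout` and get back a few hundred characters instead of the full page. The cap on matches and the fixed window are the parts that actually bound context usage; the defaults here are guesses and would need tuning.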