Which tool sees better?

Same ugly page. Same prompt. 5 rounds. Three independent AI agents, each with a different observation tool. The only variable: how much the tool reveals about the page's visual state.

1 starting page · 3 tools · 5 rounds each · 0 human hints
Before: 20+ deliberate design problems (Comic Sans, invisible CTA, clashing colors, tiny text, no visual hierarchy).

After 5 rounds:
Experiment A: chrome-cdp-ex perceive
Experiment B: Playwright snapshot
Experiment C: Tool C snapshot

What each tool revealed

Playwright
Tool: browser_snapshot returns element roles and names only: no colors, no spacing, no font sizes, no layout.
Result: Clean white/blue theme. Proper hierarchy. Professional, but the agent relied on reading the CSS source to find visual issues.
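The role-and-name view is easy to picture with a mock tree (the page and names below are invented for illustration, not Playwright output): everything visual is already gone before the agent sees it.

```python
# Minimal sketch of what an accessibility-tree snapshot conveys:
# element roles and accessible names only -- no colors, spacing, or fonts.
# The tree below is a hypothetical page, not real browser_snapshot output.

def render_ax_tree(node, depth=0):
    """Flatten an accessibility node into indented 'role "name"' lines."""
    line = "  " * depth + f'{node["role"]} "{node.get("name", "")}"'
    lines = [line]
    for child in node.get("children", []):
        lines.extend(render_ax_tree(child, depth + 1))
    return lines

page = {
    "role": "document", "name": "Acme Landing",
    "children": [
        {"role": "heading", "name": "Welcome to Acme"},
        # The snapshot cannot say this CTA is invisible on the page:
        {"role": "button", "name": "Sign up"},
    ],
}

print("\n".join(render_ax_tree(page)))
```

From this view the "invisible CTA" problem is undetectable: the button is present and named, so the agent must go read the stylesheet to learn anything about its appearance.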
chrome-cdp-ex
Tool: perceive returns layout dimensions, background colors, font sizes, contrast hints, scroll position, and bounding coordinates; the agent knows what things look like.
Result: Polished dark SaaS theme. Google Fonts, gradient text, SVG icons, differentiated CTA styles, social proof, multi-column footer.
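To make the contrast-hint idea concrete, here is a minimal sketch. The record fields are assumptions about what a perceive-style payload might carry, not chrome-cdp-ex's actual schema, but the contrast math is the standard WCAG ratio.

```python
# Hypothetical shape of one observed element, plus the WCAG contrast
# ratio a "contrast hint" could be derived from. Field names are
# assumptions, not chrome-cdp-ex's real output format.
from dataclasses import dataclass

def _channel(c8):
    # sRGB 8-bit channel -> linear value (WCAG relative-luminance formula)
    c = c8 / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb):
    r, g, b = (_channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    lo, hi = sorted([luminance(fg), luminance(bg)])
    return (hi + 0.05) / (lo + 0.05)

@dataclass
class VisualRecord:
    selector: str
    bbox: tuple           # (x, y, width, height) in CSS pixels
    background: tuple     # RGB
    color: tuple          # RGB
    font_size_px: float

    @property
    def contrast_hint(self):
        ratio = contrast_ratio(self.color, self.background)
        # 4.5:1 is the WCAG AA threshold for body text
        return "ok" if ratio >= 4.5 else f"low ({ratio:.1f}:1)"

# Near-white text on a near-white background: the "invisible CTA" case.
cta = VisualRecord("#signup", (40, 300, 160, 44), (250, 250, 250), (240, 240, 240), 11.0)
print(cta.bbox, cta.font_size_px, cta.contrast_hint)
```

With a record like this, the invisible CTA and the tiny text are flagged in the observation itself; no trip through the stylesheet is needed.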
Tool C
Tool: snapshot returns an accessibility tree with element refs. Similar to Playwright but more compact (~400 vs ~3,500 tokens).
Result: Dark/white hybrid. Solid layout, icon placeholders, "Most Popular" badge. Clean but less refined.
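The token gap is mostly serialization density. A toy comparison of the same hypothetical tree in two encodings (neither string is either tool's real wire format):

```python
# Same two nodes, two encodings: verbose nested JSON vs one compact
# 'ref role "name"' line per element. Illustrative only.
import json

nodes = [
    {"ref": "e1", "role": "heading", "name": "Welcome to Acme"},
    {"ref": "e2", "role": "button", "name": "Sign up"},
]

# Verbose: pretty-printed JSON, every key spelled out per node.
verbose = json.dumps({"snapshot": {"children": nodes}}, indent=2)

# Compact: one line per node, refs still usable for follow-up actions.
compact = "\n".join(f'{n["ref"]} {n["role"]} "{n["name"]}"' for n in nodes)

print(len(verbose), "chars verbose vs", len(compact), "chars compact")
```

Fewer characters per element means more of the page fits in the agent's context window each round, even though both encodings carry the same role/name/ref information.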

Experiment Design

Starting file: identical challenge.html, no hint comments
Agent: independent Claude Code session per tool
Prompt: same structure; only tool names differ
Rounds: exactly 5 per agent
Source access: all agents can read the HTML source
Observation tool: the only variable
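The protocol can be sketched as a loop: same prompt template, five rounds per agent, only the observation tool swapped. run_agent_round is a hypothetical stand-in for driving one Claude Code session, not part of the actual setup.

```python
# Hedged sketch of the experiment loop. Tool names follow the write-up;
# run_agent_round is a hypothetical callback, not a real API.

TOOLS = ["playwright.browser_snapshot", "chrome-cdp-ex.perceive", "tool-c.snapshot"]
ROUNDS = 5
PROMPT = "Improve the visual design of challenge.html, observing the page with {tool}."

def run_experiment(run_agent_round):
    """run_agent_round(tool, round_no, prompt) -> result summary (hypothetical)."""
    results = {}
    for tool in TOOLS:
        for round_no in range(1, ROUNDS + 1):
            results[(tool, round_no)] = run_agent_round(
                tool, round_no, PROMPT.format(tool=tool)
            )
    return results

# Dry run with a stub agent: 3 tools x 5 rounds = 15 entries.
log = run_experiment(lambda tool, n, prompt: f"{tool} round {n}")
print(len(log))
```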

All three agents dramatically improved the page. The difference is in how quickly they identified visual issues and how much polish they achieved — the agent with richer visual context made more nuanced design decisions.