Update Ralph loop: replace Claude in Chrome with Playwright MCP for visual review

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 00:11:50 +00:00
parent 4324f06186
commit 8094f74800
3 changed files with 13 additions and 14 deletions
+9 -10
@@ -41,11 +41,10 @@ The sidebar uses CV-intuitive labels, NOT clinical jargon. But each view's conte
 6. **Run quality checks**: Execute the quality check commands listed in `IMPLEMENTATION_PLAN.md` under "Quality Checks". Fix any issues before proceeding.
-7. **Visual Review** (Tasks 1b-11 only — skip for non-visual tasks like Task 1, 12-15): After quality checks pass, verify your work visually in the browser using the Claude in Chrome browser tools:
-a. Call `tabs_context_mcp` to get available tabs (create if empty).
-b. Navigate to `http://localhost:5173` (dev server runs throughout the loop).
-c. **First load only**: The app plays a boot→ECG→login→PMR sequence (~15s). Use `computer` with `action: "wait", duration: 15` then take a screenshot. On subsequent navigations in the same tab, the app stays in PMR phase — no waiting needed.
-d. Navigate to the hash route for your task's view:
+7. **Visual Review** (Tasks 1b-11 only — skip for non-visual tasks like Task 1, 12-15): After quality checks pass, verify your work visually in the browser using the Playwright MCP browser tools:
+a. Navigate to `http://localhost:5173` using `mcp__playwright__browser_navigate`.
+b. **First load only**: The app plays a boot→ECG→login→PMR sequence (~15s). Use `mcp__playwright__browser_wait_for` with `time: 15` then take a snapshot. On subsequent navigations, the app stays in PMR phase — no waiting needed.
+c. Navigate to the hash route for your task's view:
 - Task 1b (Boot/ECG): Refresh page, screenshot during boot sequence, then again during ECG animation
 - Task 2 (Login): Refresh page, wait ~8s (after boot+ECG), screenshot the login screen
 - Task 3 (Banner): Any PMR view — review the patient banner at top
@@ -53,10 +52,10 @@ The sidebar uses CV-intuitive labels, NOT clinical jargon. But each view's conte
 - Task 5 (Layout/Breadcrumb): Any PMR view — review overall composition
 - Task 6: `#summary` | Task 7: `#experience` | Task 8: `#skills`
 - Task 9: `#achievements` | Task 10: `#projects` then `#education` | Task 11: `#contact`
-e. Take a screenshot (`computer` with `action: "screenshot"`) and compare against your reference file.
-f. Check specifically: colors match spec, correct font (Inter vs Geist Mono), proper spacing, `1px solid #E5E7EB` borders, 4px border-radius, layout alignment, NHS blue `#005EB8`.
-g. If discrepancies are found: fix them, re-run quality checks, take another screenshot to confirm.
-h. Note the visual review outcome in your progress.txt entry (step 10).
+d. Use `mcp__playwright__browser_snapshot` (accessibility tree) or `mcp__playwright__browser_take_screenshot` (visual) to capture the page, and compare against your reference file.
+e. Check specifically: colors match spec, correct font (Inter vs Geist Mono), proper spacing, `1px solid #E5E7EB` borders, 4px border-radius, layout alignment, NHS blue `#005EB8`.
+f. If discrepancies are found: fix them, re-run quality checks, take another screenshot to confirm.
+g. Note the visual review outcome in your progress.txt entry (step 10).
 8. **Commit your changes**: Stage and commit all changes with a descriptive message referencing the task you completed.
@@ -109,7 +108,7 @@ The sidebar uses CV-intuitive labels, NOT clinical jargon. But each view's conte
 - **ALWAYS read the "Design Guidance" section in the ref file before writing visual component code** — do NOT invoke /frontend-design at runtime (it's pre-baked into the ref files)
 - **Do NOT invoke the /frontend-design skill** — the design guidance is already embedded in each ref file. Invoking it at runtime will consume your context and stall the iteration.
-- **ALWAYS visually review visual components (Tasks 1b-11) in the browser** — use Claude in Chrome tools to screenshot and verify against the spec before committing
+- **ALWAYS visually review visual components (Tasks 1b-11) in the browser** — use Playwright MCP tools to screenshot and verify against the spec before committing
 - **Only work on ONE task per iteration**
 - **Always read progress.txt AND guardrails.md before starting** — previous iterations may have left important context
 - **If a task is blocked or unclear**, document why in progress.txt and move to the next unchecked item
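Reading aid for the visual review steps changed in this file: a minimal TypeScript sketch of the tool-call sequence the updated prompt describes. It is not part of the commit. `planVisualReview`, `TASK_ROUTES`, and `ReviewStep` are hypothetical names, and the `url` argument key for `mcp__playwright__browser_navigate` is an assumption; only the tool names, `time: 15`, the base URL, and the hash routes come from the diff.

```typescript
// Hypothetical helper, not part of the repository.
const BASE_URL = "http://localhost:5173"; // dev server address from the prompt

// Hash routes per task, copied from the task list in the prompt.
// Tasks with no entry here (e.g. 1b, 2, 3, 5) are reviewed on the
// boot sequence or on any PMR view, so they get no extra navigation.
const TASK_ROUTES: Record<string, string[]> = {
  "6": ["#summary"],
  "7": ["#experience"],
  "8": ["#skills"],
  "9": ["#achievements"],
  "10": ["#projects", "#education"],
  "11": ["#contact"],
};

interface ReviewStep {
  tool: string;
  args: Record<string, string | number>; // argument keys are assumptions
}

function planVisualReview(task: string, firstLoad: boolean): ReviewStep[] {
  const steps: ReviewStep[] = [
    { tool: "mcp__playwright__browser_navigate", args: { url: BASE_URL } },
  ];
  // First load only: wait out the ~15s boot→ECG→login→PMR sequence.
  if (firstLoad) {
    steps.push({ tool: "mcp__playwright__browser_wait_for", args: { time: 15 } });
  }
  const routes = TASK_ROUTES[task] ?? [];
  for (const hash of routes) {
    steps.push({
      tool: "mcp__playwright__browser_navigate",
      args: { url: `${BASE_URL}/${hash}` },
    });
    steps.push({ tool: "mcp__playwright__browser_take_screenshot", args: {} });
  }
  // Tasks without a hash route still capture the current view once.
  if (routes.length === 0) {
    steps.push({ tool: "mcp__playwright__browser_take_screenshot", args: {} });
  }
  return steps;
}
```

For Task 10 on a first load, `planVisualReview("10", true)` yields six steps: the initial navigation, the 15-second wait, then a navigate-and-screenshot pair for `#projects` and another for `#education`.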
+1 -1
@@ -83,7 +83,7 @@ Hard rules that MUST be followed in every iteration. Violating these will produc
 ## Visual Review Guardrails
 ### When: Completing any visual task
-**Rule:** After quality checks, open `http://localhost:5173` via Claude in Chrome tools, take a screenshot, and compare against the ref file spec. Fix visual discrepancies. If browser tools are unavailable, note in progress.txt and proceed.
+**Rule:** After quality checks, open `http://localhost:5173` via Playwright MCP tools (`mcp__playwright__browser_navigate`, `mcp__playwright__browser_take_screenshot`, `mcp__playwright__browser_snapshot`), take a screenshot, and compare against the ref file spec. Fix visual discrepancies. If browser tools are unavailable, note in progress.txt and proceed.
 **Why:** Code review alone cannot catch visual issues.
 ### When: Browser tools fail
+3 -3
@@ -35,7 +35,7 @@
 #>
 param(
-[string]$Model = "sonnet",
+[string]$Model = "opus",
 [string]$BranchName,
 [int]$MaxNoProgress = 3,
 [int]$MaxSameError = 3
@@ -143,14 +143,14 @@ if (Test-Path $progressFile) {
 Write-Host ""
 Write-Host "===== Ralph Wiggum Loop (Visualization Improvements) =====" -ForegroundColor Cyan
-Write-Host "Model: $Model (dynamic switching enabled) | Visual review: ON | Runs until COMPLETE" -ForegroundColor Cyan
+Write-Host "Model: $Model (dynamic switching enabled) | Visual review: Playwright MCP | Runs until COMPLETE" -ForegroundColor Cyan
 Write-Host "Circuit breakers: no-progress=$MaxNoProgress, same-error=$MaxSameError" -ForegroundColor Cyan
 if ($BranchName) { Write-Host "Branch: $BranchName" -ForegroundColor Cyan }
 if ($existingIterations -gt 0) { Write-Host "Previous iterations: $existingIterations" -ForegroundColor Cyan }
 Write-Host "===========================================" -ForegroundColor Cyan
 Write-Host ""
-# --- Dev Server (for visual review via Claude in Chrome) ---
+# --- Dev Server (for visual review via Playwright MCP) ---
 $devServerPort = 5173
 $devServerPid = $null