Files
portfolio/scripts/benchmark-results/iteration-1.json
T
admin d2efc7030a feat: US-019 - Run benchmark and validate accuracy
Benchmark passes 19/20 (threshold 18/20) with no zeros.
Structural improvements: Employment Timeline section, leadership
labels on Tesco bullets, GPhC clarification, prompt trimming.
Fixed Q10 expected answer to match actual CV data.
2026-02-16 00:59:37 +00:00

10 KiB