feat: US-016 - Enrich system prompt with full CV context

This commit is contained in:
2026-02-16 00:39:38 +00:00
parent 8cc7038942
commit 194f83f490
4 changed files with 193 additions and 27 deletions
+1 -1
View File
@@ -315,7 +315,7 @@
"Typecheck passes"
],
"priority": 16,
"passes": false,
"passes": true,
"notes": "The current system prompt uses buildEmbeddingTexts() which gives one paragraph per palette item — good for embeddings but too compressed for detailed Q&A. The enriched prompt should read more like a structured CV with full bullet points. Source content from References/CV_v4.md — read the file to extract all detail. Consider structuring as: ## Profile (personal statement), ## Career History (each role as ### with bullets), ## Education (each qualification), ## Projects (each project with tech and outcomes). Keep it well-structured with markdown headers — LLMs parse this better than flat text."
},
{
+26
View File
@@ -370,3 +370,29 @@
- For Node.js scripts, use a static URL for `HTTP-Referer` header (e.g., `'https://andycharlwood.co.uk'`) since `window.location` isn't available
- The benchmark script's `buildSystemPrompt()` should be kept in sync with `llm.ts` manually — if one changes, update the other (US-016/US-017 will modify the production prompt)
---
## 2026-02-16 - US-016
- Rewrote `buildSystemPrompt()` in `src/lib/llm.ts` with full CV context from `References/CV_v4.md`
- Replaced `buildEmbeddingTexts()` approach (one-paragraph-per-item) with structured CV format:
- Profile section with professional summary
- Career History with full achievement bullets per role, clinical specialties, methodology details
- Projects with tech stack and outcomes
- Education with grades, subjects, research topics, classifications
- Skills in compact format with years and proficiency
- NHS employment (May 2022+, all at Norfolk & Waveney ICB) explicitly distinguished from private sector (Tesco PLC)
- Clinical specialties listed under High-Cost Drugs role: rheumatology, ophthalmology (wet AMD, DMO, RVO), dermatology, gastroenterology, neurology, migraine
- dm+d integration details, switching algorithm methodology, tirzepatide commissioning context all included
- Mary Seacole Programme: 2018, 78%, NHS Leadership Academy
- A-Levels: Mathematics A*, Chemistry B, Politics C — Highworth Grammar School 20092011
- System prompt is 7,982 bytes (under 8KB limit)
- Removed `buildEmbeddingTexts` import from llm.ts (no longer needed)
- Mirrored identical prompt in `scripts/benchmark.ts` (with comment noting manual sync requirement)
- Removed `buildEmbeddingTexts` import from benchmark.ts
- Typecheck (0 errors), lint (0 errors), production build all pass
- Files changed: `src/lib/llm.ts`, `scripts/benchmark.ts`
- **Learnings for future iterations:**
- The structured CV format (markdown headers + bullets per role) is more effective for LLM Q&A than one-paragraph-per-palette-item — LLMs parse structured markdown better
- Item IDs are embedded in section headers (e.g., `### [exp-deputy-head-2024]`) rather than as line prefixes — cleaner format that still allows the model to reference IDs
- System prompt no longer depends on `buildEmbeddingTexts()` — the CV context is hardcoded. This means prompt content and embedding texts can diverge (prompt is optimised for Q&A, embeddings for semantic search)
- When the prompt is close to the 8KB limit, trim verbose connecting phrases and redundant qualifiers first — the specific facts and numbers are what matter for accuracy
---