merge

2026-02-15 23:20:24 +00:00
parent 4580ca9c84
commit 0fbbf9e46f
10 changed files with 1576 additions and 2 deletions
@@ -0,0 +1,276 @@
+{
+  "project": "Portfolio — Semantic Search & AI Chat",
+  "branchName": "ralph/semantic-search",
+  "description": "Replace Fuse.js command palette search with client-side semantic vector search (ONNX model), then add a Gemini Flash-powered AI chat widget.",
+  "userStories": [
+    {
+      "id": "US-001",
+      "title": "Install @xenova/transformers and add generate-embeddings script skeleton",
+      "description": "As a developer, I need the Transformers.js dependency installed and a runnable script scaffold so subsequent stories can generate and use embeddings.",
+      "acceptanceCriteria": [
+        "npm install @xenova/transformers",
+        "Create scripts/generate-embeddings.ts with a main() function that imports the pipeline from @xenova/transformers",
+        "Script loads the all-MiniLM-L6-v2 model and embeds a single test string, logging the vector length to confirm it works",
+        "Add npm script: \"generate-embeddings\": \"npx tsx scripts/generate-embeddings.ts\"",
+        "Running npm run generate-embeddings prints the vector length (384) and exits cleanly",
+        "Typecheck passes"
+      ],
+      "priority": 1,
+      "passes": true,
+      "notes": "Use @xenova/transformers (not @huggingface/transformers — the Xenova fork has better Node.js ONNX support). The model ID is 'Xenova/all-MiniLM-L6-v2'. Pipeline type is 'feature-extraction'. tsx is already available via npx for running TypeScript scripts."
+    },
+    {
+      "id": "US-002",
+      "title": "Build rich text representations for each palette item",
+      "description": "As a developer, I want each palette item to have a natural-language paragraph for embedding that captures deep context, not just the title.",
+      "acceptanceCriteria": [
+        "New function buildEmbeddingTexts() in src/lib/search.ts that returns Array<{ id: string, text: string }> for all palette items",
+        "Consultation items include: role, org, duration, history narrative, examination bullets, coded entry descriptions",
+        "Skill items include: name, category, frequency, proficiency percentage, years of experience",
+        "KPI items include: value, label, explanation, story context and outcomes",
+        "Investigation items include: name, methodology, tech stack list, results",
+        "Education items include: title, institution, type, research detail",
+        "Quick Action items include: title and subtitle (short text is fine)",
+        "Achievement items include: title, subtitle, and linked KPI story context if available",
+        "Each text is a readable natural-language paragraph, not a keyword dump",
+        "Typecheck passes"
+      ],
+      "priority": 2,
+      "passes": true,
+      "notes": "This function will be used by both the build script (to generate embeddings) and potentially by the chat widget (for context). Import the raw data files (consultations, skills, kpis, investigations, documents) to access the full data beyond what buildPaletteData() surfaces. The id must match the PaletteItem id so embeddings can be correlated."
+    },
+    {
+      "id": "US-003",
+      "title": "Generate and commit embeddings.json",
+      "description": "As a developer, I want the generate-embeddings script to produce a complete embeddings.json file using the rich text representations.",
+      "acceptanceCriteria": [
+        "scripts/generate-embeddings.ts imports buildEmbeddingTexts() from src/lib/search.ts",
+        "Script embeds each item's text using the all-MiniLM-L6-v2 model via @xenova/transformers pipeline",
+        "Outputs src/data/embeddings.json as an array of { id: string, embedding: number[] }",
+        "Each embedding is a 384-dimensional float array",
+        "Running npm run generate-embeddings regenerates the file successfully",
+        "The JSON file is valid and parseable",
+        "Typecheck passes"
+      ],
+      "priority": 3,
+      "passes": true,
+      "notes": "The pipeline returns a Tensor — use .tolist() or .data to extract the raw float array. Mean-pool across the token dimension (dim 1) to get a single 384-d vector per input. Process items sequentially to avoid OOM in Node. The output file will be ~200KB for ~40 items with 384 floats each."
+    },
+    {
+      "id": "US-004",
+      "title": "Preload ONNX model during boot sequence",
+      "description": "As a visitor, I want the semantic search model to download in the background during the boot/ECG/login phases so it's ready when I reach the dashboard.",
+      "acceptanceCriteria": [
+        "New src/lib/embedding-model.ts module that exports: initModel(), embedQuery(text: string), and isModelReady()",
+        "initModel() loads the all-MiniLM-L6-v2 pipeline from @xenova/transformers and stores it in a module-level variable",
+        "embedQuery() returns a Promise<number[]> (384-d vector) for a given text string",
+        "isModelReady() returns boolean indicating if the model has finished loading",
+        "initModel() is called in App.tsx useEffect on mount (during boot phase) — fire and forget, no await",
+        "If initModel() fails (network error, etc.), isModelReady() remains false — no error thrown or shown",
+        "Model is cached by @xenova/transformers in IndexedDB — subsequent page loads are near-instant",
+        "Boot/ECG/login animations are not affected by model loading",
+        "Typecheck passes"
+      ],
+      "priority": 4,
+      "passes": true,
+      "notes": "Use pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2') which auto-downloads and caches the ONNX model. The module-level pattern (let pipelineInstance = null) avoids React re-render issues. embedQuery should mean-pool the tensor output the same way as the build script. Wrap initModel() in a try/catch that silently swallows errors."
+    },
+    {
+      "id": "US-005",
+      "title": "Implement cosine similarity search module",
+      "description": "As a developer, I need a semantic search function that compares a query embedding against pre-computed item embeddings and returns ranked results.",
+      "acceptanceCriteria": [
+        "New src/lib/semantic-search.ts module",
+        "Exports semanticSearch(queryEmbedding: number[], embeddings: Array<{ id: string, embedding: number[] }>, threshold?: number): Array<{ id: string, score: number }>",
+        "Uses cosine similarity: dot(a,b) / (magnitude(a) * magnitude(b))",
+        "Results sorted by score descending",
+        "Optional threshold parameter filters out low-relevance results (default 0.3)",
+        "Exports loadEmbeddings() that imports embeddings.json and returns the parsed array",
+        "Typecheck passes"
+      ],
+      "priority": 5,
+      "passes": true,
+      "notes": "Keep the cosine similarity implementation simple — no libraries needed for 384-d vectors over ~40 items. The loadEmbeddings function can use a dynamic import or direct import of the JSON file (Vite handles JSON imports natively)."
+    },
+    {
+      "id": "US-006",
+      "title": "Integrate semantic search into command palette",
+      "description": "As a visitor, I want the command palette to use semantic search when available, falling back to Fuse.js otherwise.",
+      "acceptanceCriteria": [
+        "CommandPalette.tsx checks isModelReady() from embedding-model.ts",
+        "When model is ready and query is non-empty: call embedQuery(query), then semanticSearch() against loaded embeddings, then map result IDs back to PaletteItem objects",
+        "When model is NOT ready: use existing Fuse.js search (current behavior preserved exactly)",
+        "Search is debounced by ~200ms to avoid calling embedQuery on every keystroke",
+        "Results maintain existing groupBySection() grouping and section ordering",
+        "Existing keyboard navigation, action routing, and UI unchanged",
+        "Typecheck passes",
+        "Verify in browser: search 'data analysis' surfaces analytics-related roles/skills not just items with 'data' in title"
+      ],
+      "priority": 6,
+      "passes": true,
+      "notes": "The debounce is important — embedQuery takes ~20-50ms per call. Use a useRef + setTimeout pattern or a simple debounce hook. The mapping from semantic search results (id + score) back to PaletteItems should use a Map for O(1) lookup. Keep the Fuse.js imports and buildSearchIndex — they're the fallback path."
+    },
+    {
+      "id": "US-007",
+      "title": "Chat widget — floating button component",
+      "description": "As a visitor, I see a floating chat button at the bottom-right of the dashboard that I can click to open a chat panel.",
+      "acceptanceCriteria": [
+        "New src/components/ChatWidget.tsx component",
+        "Renders a 48px circular button, fixed position, bottom: 24px, right: 24px",
+        "Uses teal accent background (var(--accent)), white MessageCircle icon from lucide-react",
+        "Shadow: var(--shadow-md). Hover: var(--shadow-lg) + scale(1.05) transition",
+        "Button has a subtle entrance animation: fade + translateY(8px) → translateY(0), delayed ~1s after mount",
+        "Respects prefers-reduced-motion (no animation, just visible)",
+        "z-index above dashboard content but below command palette overlay (z-index 90)",
+        "onClick toggles an isOpen state (panel rendering comes in next story)",
+        "Mounted in DashboardLayout.tsx",
+        "Typecheck passes",
+        "Verify in browser using dev-browser skill"
+      ],
+      "priority": 7,
+      "passes": true,
+      "notes": "Use framer-motion for the entrance animation to match the rest of the app's motion patterns. The button should use font-ui for any text. On mobile (<640px), button is 40px and positioned bottom: 16px, right: 16px. The VITE_GEMINI_API_KEY env var check can wait until the Gemini integration story — for now just render the button unconditionally."
+    },
+    {
+      "id": "US-008",
+      "title": "Chat widget — panel UI with message display",
+      "description": "As a visitor, I want a chat panel that opens above the floating button where I can type questions and see responses.",
+      "acceptanceCriteria": [
+        "Chat panel renders when isOpen is true, positioned above the floating button (bottom: 88px, right: 24px)",
+        "Panel dimensions: 380px wide, max-height 480px, with overflow-y auto for messages",
+        "Header: title text ('Ask about Andy'), close button (X icon)",
+        "Message area: user messages right-aligned in teal-tinted bubbles, assistant messages left-aligned in light gray bubbles",
+        "Input area at bottom: text field with placeholder 'Ask me anything...', send button (Send icon)",
+        "Enter key submits message, Shift+Enter for newline",
+        "Panel entrance animation: scale(0.95) + opacity(0) → scale(1) + opacity(1), 200ms ease-out",
+        "Panel exit animation: reverse of entrance",
+        "Respects prefers-reduced-motion",
+        "Responsive: on mobile (<640px), panel is full-width (left: 0, right: 0, bottom: 0) with rounded top corners only",
+        "Messages are stored in component state as Array<{ role: 'user' | 'assistant', content: string }>",
+        "Submitting a message adds it to state and shows it in the UI (no API call yet — assistant response is a placeholder)",
+        "Typecheck passes",
+        "Verify in browser using dev-browser skill"
+      ],
+      "priority": 8,
+      "passes": true,
+      "notes": "Use the design system tokens: var(--surface) for panel bg, var(--border-light) for borders, var(--text-primary) for text, var(--accent) for user bubble bg at 10% opacity, font-ui for body text, font-geist for timestamps. The placeholder assistant response can be a static string like 'AI chat coming soon — this is a preview of the chat interface.' This lets us verify the full UI before wiring up Gemini."
+    },
+    {
+      "id": "US-009",
+      "title": "Chat widget — Gemini Flash integration",
+      "description": "As a visitor, I can ask natural language questions and get intelligent, streamed answers about Andy's experience.",
+      "acceptanceCriteria": [
+        "New src/lib/gemini.ts module that exports sendChatMessage(messages: ChatMessage[], cvContext: string): AsyncGenerator<string>",
+        "Calls Google Gemini Flash API (gemini-2.0-flash) using the REST API with fetch (no SDK needed)",
+        "API key sourced from import.meta.env.VITE_GEMINI_API_KEY",
+        "System prompt includes structured CV context built from buildEmbeddingTexts() output",
+        "System prompt instructs the model to answer questions about Andy's professional experience accurately and concisely",
+        "System prompt instructs the model to include relevant palette item IDs in its response as a JSON array at the end",
+        "Responses are streamed using the Gemini streaming endpoint",
+        "ChatWidget.tsx wires up real messages: on submit, calls sendChatMessage and streams tokens into the assistant message bubble",
+        "Loading state shown (typing indicator) while waiting for first token",
+        "If VITE_GEMINI_API_KEY is not set, chat button is still visible but panel shows 'Chat is currently unavailable' message",
+        "If API call fails, show error message in chat: 'Sorry, I couldn't process that. Please try again.'",
+        "Conversation history (last 10 messages) passed to API for multi-turn context",
+        "Typecheck passes"
+      ],
+      "priority": 9,
+      "passes": true,
+      "notes": "Gemini REST streaming endpoint: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=API_KEY. The response is SSE (server-sent events) — parse each 'data:' line as JSON and extract candidates[0].content.parts[0].text. The system prompt with CV context will be ~2-3K tokens — well within Gemini Flash limits. For the palette item IDs, instruct the model to end its response with a line like [ITEMS: id1, id2, id3] which can be parsed client-side."
+    },
+    {
+      "id": "US-010",
+      "title": "Chat widget — clickable portfolio item cards in responses",
+      "description": "As a visitor, I want AI chat responses to include clickable portfolio items so I can drill into relevant sections.",
+      "acceptanceCriteria": [
+        "After parsing the assistant response, extract referenced palette item IDs from the [ITEMS: ...] suffix",
+        "Render matched items as compact clickable cards below the answer text in the assistant bubble",
+        "Cards reuse icon/color mapping from CommandPalette (iconByType, iconColorStyles)",
+        "Cards show item title and subtitle in a compact horizontal layout",
+        "Clicking a card triggers the same action routing as command palette via handlePaletteAction in DashboardLayout",
+        "If no items are referenced or IDs don't match, no cards are shown (just the text answer)",
+        "Typecheck passes",
+        "Verify in browser using dev-browser skill"
+      ],
+      "priority": 10,
+      "passes": true,
+      "notes": "The action routing needs to flow from ChatWidget up to DashboardLayout. Add an onAction prop to ChatWidget (same pattern as CommandPalette). DashboardLayout passes handlePaletteAction to ChatWidget. Export iconByType and iconColorStyles from CommandPalette (or extract to a shared module) so ChatWidget can reuse them."
+    },
+    {
+      "id": "US-011",
+      "title": "Mobile full-screen chat panel",
+      "description": "As a mobile visitor, I want the chat panel to be a full-screen overlay so it's easy to use on small screens.",
+      "acceptanceCriteria": [
+        "Below md breakpoint (768px), chat panel renders as full-screen overlay using position: fixed; inset: 0 with 100dvh height",
+        "Full-screen mode has the existing header with close button (no visual change needed, just full-width)",
+        "Floating chat button is hidden (display: none or opacity: 0) while panel is open on mobile (<768px)",
+        "Above 768px, existing panel behavior is unchanged (380px wide, anchored bottom-right, max-height 480px)",
+        "Panel open/close animation still respects prefers-reduced-motion",
+        "Safe area insets applied via env(safe-area-inset-*) for notched devices",
+        "Input area stays pinned to bottom of screen on mobile",
+        "Typecheck passes",
+        "Verify in browser using dev-browser skill"
+      ],
+      "priority": 11,
+      "passes": true,
+      "notes": "The current ChatWidget already has some mobile handling (bottom-sheet style at <640px). This story changes the breakpoint to 768px (md) and makes it truly full-screen instead of 85vh. Use 100dvh (dynamic viewport height) to account for mobile browser chrome. The floating button visibility can be controlled by combining isOpen state with a CSS media query or a useMediaQuery hook. The <style> block with data-chat-panel attribute is the place to update responsive rules."
+    },
+    {
+      "id": "US-012",
+      "title": "Welcome message with suggested question chips",
+      "description": "As a visitor opening the chat, I see a friendly welcome message and clickable suggested questions so I know what to ask.",
+      "acceptanceCriteria": [
+        "When chat panel is open and conversation is empty, display welcome text: 'Hey! I'm here to help you learn more about Andy. What would you like to know?'",
+        "Welcome text is styled as an AI message bubble (left-aligned, light background, same styling as assistant messages)",
+        "Below the welcome bubble, show 2-3 clickable pill/chip buttons with suggested questions",
+        "Suggested questions: 'What's his NHS experience?', 'Tell me about his data skills', 'What projects has he built?'",
+        "Chips styled with: teal accent border, rounded-full, font-ui 12-13px, hover state (teal background tint)",
+        "Clicking a chip sends that question as a user message (same codepath as typing + Enter)",
+        "Welcome message and chips always visible when conversation is empty (persist across panel open/close)",
+        "Once any message is sent, the welcome/chips area is replaced by the conversation messages",
+        "Typecheck passes",
+        "Verify in browser using dev-browser skill"
+      ],
+      "priority": 12,
+      "passes": true,
+      "notes": "Replace the current empty-state text ('Ask me anything about Andy's experience, skills, or projects.') with the new welcome bubble + chips. The chips should call handleSubmit (or equivalent) with the chip text pre-filled — simplest approach is setInputValue(chipText) then immediately trigger submit. Check that the welcome state reappears if the user hasn't sent a message (messages.length === 0). The suggested questions could live in a const array at the top of ChatWidget for easy future editing."
+    },
+    {
+      "id": "US-013",
+      "title": "Self-host ONNX embedding model",
+      "description": "As a developer, I want the ONNX model files served from the same host as the site to eliminate dependency on Hugging Face CDN.",
+      "acceptanceCriteria": [
+        "Model files for Xenova/all-MiniLM-L6-v2 downloaded and placed in public/models/all-MiniLM-L6-v2/onnx/ (matching HF repo structure)",
+        "Required files present: model_quantized.onnx, tokenizer.json, tokenizer_config.json, config.json, and any other files the pipeline expects",
+        "src/lib/embedding-model.ts updated: configure @xenova/transformers env to use local model path (e.g., env.localModelPath or custom model URL pointing to /models/)",
+        "scripts/generate-embeddings.ts also updated to use the same local model path for consistency",
+        "Model files are NOT in .gitignore — they are committed as static assets",
+        "No network requests to huggingface.co in the browser network tab when semantic search is used",
+        "Semantic search still works correctly in the command palette after the change",
+        "Typecheck passes"
+      ],
+      "priority": 13,
+      "passes": true,
+      "notes": "Transformers.js uses env.localModelPath or env.remoteHost to control where models are fetched from. Setting env.localModelPath = '/models/' should make it look for files at /models/Xenova/all-MiniLM-L6-v2/onnx/model_quantized.onnx etc. The Vite public/ directory serves files at the root — so public/models/ becomes /models/ at runtime. For the build script (Node.js), use a file:// path or the local filesystem path instead. Download model files from https://huggingface.co/Xenova/all-MiniLM-L6-v2/tree/main — the quantized ONNX model is ~23MB. Check what files the pipeline actually requests by watching network tab before making this change."
+    },
+    {
+      "id": "US-014",
+      "title": "Update to Gemini 3 Flash Preview with model indicator",
+      "description": "As a developer, I want to use the latest free Gemini model, and as a visitor, I want to see what model powers the chat.",
+      "acceptanceCriteria": [
+        "Extract model name to a single constant (e.g., GEMINI_MODEL = 'gemini-3-flash-preview') used for both the API URL and display",
+        "GEMINI_API_BASE URL updated to use the new model constant",
+        "Review and tighten the system prompt — ensure it's well-structured, concise, and clear for the new model",
+        "Review the [ITEMS: ...] suffix instruction — ensure new model follows the format reliably",
+        "Small model indicator in chat panel header: 'Gemini 3 Flash' in font-geist, 11px, var(--text-tertiary)",
+        "Model indicator positioned right-aligned in the header bar or as a subtle line below the header",
+        "Streaming SSE parsing still works correctly with the new model endpoint",
+        "Typecheck passes",
+        "Verify in browser using dev-browser skill"
+      ],
+      "priority": 14,
+      "passes": true,
+      "notes": "The current API base is 'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash'. Change the model segment to 'gemini-3-flash-preview'. The API path structure (v1beta/models/{model}:streamGenerateContent) should be the same. Verify that gemini-3-flash-preview is the correct model ID — check Google AI Studio or the API docs. For the display name, use a human-friendly string like 'Gemini 3 Flash' (not the full model ID). The constant should be defined at the top of gemini.ts and exported for use in ChatWidget."
+    }
+  ]
+}
@@ -0,0 +1,315 @@
+# Progress Log — Semantic Search & AI Chat
+# Branch: ralph/semantic-search
+# Started: 2026-02-15
+
+## Codebase Patterns
+- `@xenova/transformers` pipeline with `pooling: 'mean'` and `normalize: true` returns a Tensor; use `Array.from(output.data as Float32Array)` to extract the 384-d vector
+- Scripts live in `scripts/` and run via `npx tsx` (tsx is not a project dep, npx fetches it)
+- tsconfig `include` only covers `src/` — scripts are type-checked by tsx at runtime, not by `tsc --noEmit`
+- Project uses `"type": "module"` in package.json
+- Palette item IDs: `exp-{consultation.id}`, `skill-{skill.id}`, `proj-{investigation.id}`, `ach-{0-3}`, `edu-{0-3}`, `action-{0-3}`
+- `buildEmbeddingTexts()` in `src/lib/search.ts` returns `Array<{ id: string, text: string }>` with IDs matching PaletteItem IDs — use this for both embedding generation and chat context
+- `src/data/embeddings.json` is an array of `{ id: string, embedding: number[] }` — 42 items, 384-d vectors, IDs match PaletteItem IDs. Vite imports JSON natively.
+- `src/lib/embedding-model.ts` exports `initModel()`, `embedQuery(text)`, `isModelReady()` — check `isModelReady()` before calling `embedQuery()`
+- `initModel()` is called fire-and-forget in `App.tsx` on mount — model loads during boot/ECG/login phases
+- ONNX model files self-hosted in `public/models/Xenova/all-MiniLM-L6-v2/` — `env.localModelPath = '/models/'`, `env.allowRemoteModels = false`, `env.useBrowserCache = false` eliminates HF CDN dependency
+- `src/lib/semantic-search.ts` exports `semanticSearch(queryEmbedding, embeddings, threshold?)` and `loadEmbeddings()` — embeddings are normalized so cosine similarity is dot(a,b)/(mag(a)*mag(b))
+- CommandPalette uses `semanticResults` state + debounced `useEffect` for async semantic search, falling back to Fuse.js when `isModelReady()` returns false or on any error
+- `loadEmbeddings()` and `paletteMap` (Map<id, PaletteItem>) are precomputed via `useMemo` — no re-computation on each search
+- ChatWidget is mounted in DashboardLayout alongside CommandPalette and DetailPanel — z-index 90 (below command palette z-1000)
+- `prefersReducedMotion` pattern: read `window.matchMedia` at module level, use in framer-motion variants to skip animation
+- ChatWidget stores messages as `Array<{ role: 'user' | 'assistant', content: string }>` — same shape as LLM message format, ready for Gemini integration
+- ChatWidget `isOpen` state controls both panel visibility and button icon (MessageCircle ↔ X) — panel rendering handled by AnimatePresence
+- `src/lib/gemini.ts` exports `sendChatMessage(messages)` (async generator), `isGeminiAvailable()`, `parseItemIds(text)`, `stripItemsSuffix(text)` — ChatMessage type is `{ role: 'user' | 'assistant', content: string }`
+- Gemini API uses SSE streaming: POST to `:streamGenerateContent?alt=sse&key=KEY`, parse `data:` lines as JSON, extract `candidates[0].content.parts[0].text`
+- System prompt built from `buildEmbeddingTexts()` — instructs model to end responses with `[ITEMS: id1, id2, id3]` for portfolio item linking
+- `isGeminiAvailable()` checks `import.meta.env.VITE_GEMINI_API_KEY` — when missing, chat panel shows "unavailable" message but button remains visible
+- Assistant messages store item IDs as `<!--ITEMS:id1,id2-->` HTML comment suffix for US-010 to parse — `getDisplayText()` strips this before rendering
+- Conversation history capped at 10 messages (`MAX_HISTORY`), metadata stripped before sending to API
+- Icon/color mappings (`iconByType`, `iconColorStyles`) live in `src/lib/palette-icons.ts` — shared between CommandPalette and ChatWidget
+- ChatWidget accepts optional `onAction?: (action: PaletteAction) => void` prop — same pattern as CommandPalette's `onAction`
+- `DashboardLayout` passes `handlePaletteAction` to both CommandPalette and ChatWidget for unified action routing
+- TopBar is `z-index: 100` (fixed), nav is `z-index: 99` (sticky) — mobile full-screen overlays need `z-index > 100` to appear above them
+- Inline `style={{ display: 'flex' }}` overrides Tailwind's `hidden` class — use `!important` modifier (`max-md:!hidden`) or move display to Tailwind classes to allow responsive hiding
+- ChatWidget mobile breakpoint is `md` (768px) — below this, panel is full-screen; above, it's 380px anchored bottom-right
+- `handleSubmit(overrideText?)` accepts optional text param — use this when programmatically sending messages (e.g., suggested question chips) to avoid stale `inputValue` state
+- `SUGGESTED_QUESTIONS` const array at top of ChatWidget — edit here to change welcome screen chip text
+- `GEMINI_MODEL` and `GEMINI_DISPLAY_NAME` exported from `src/lib/gemini.ts` — single source of truth for model ID and display name; update both when changing models
+
+---
+
+## 2026-02-15 - US-001
+- Installed `@xenova/transformers` (^2.17.2)
+- Created `scripts/generate-embeddings.ts` with main() that loads `Xenova/all-MiniLM-L6-v2` and embeds a test string
+- Added `"generate-embeddings"` npm script
+- Verified: outputs vector length 384 and exits cleanly
+- Typecheck passes
+- Files changed: `package.json`, `package-lock.json`, `scripts/generate-embeddings.ts`
+- **Learnings for future iterations:**
+  - `pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')` auto-downloads and caches the ONNX model (~23MB)
+  - First run takes a few seconds for model download; subsequent runs are near-instant from cache
+  - The pipeline's `pooling: 'mean'` and `normalize: true` options handle mean-pooling and L2 normalization in one step — no manual tensor manipulation needed
+  - `output.data` is a `Float32Array`; wrap in `Array.from()` for a plain number array
+---
+
+## 2026-02-15 - US-002
+- Added `buildEmbeddingTexts()` function to `src/lib/search.ts`
+- Imports all raw data files (consultations, skills, kpis, investigations, documents)
+- Generates natural-language paragraphs for each palette item type:
+  - Consultations: role, org, duration, history narrative, examination bullets, coded entry descriptions
+  - Skills: name, category, frequency, proficiency %, years of experience
+  - Achievements: title, subtitle, full KPI explanation + story context + outcomes
+  - Investigations: name, methodology, tech stack, results
+  - Education: title, type, institution, duration, classification, research detail, notes (from documents.ts)
+  - Quick Actions: title + subtitle
+- IDs match PaletteItem IDs (e.g. `exp-{id}`, `skill-{id}`, `ach-{i}`, `proj-{id}`, `edu-{i}`, `action-{i}`)
+- Typecheck and lint pass
+- Files changed: `src/lib/search.ts`
+- **Learnings for future iterations:**
+  - Education items in `buildPaletteData()` are hardcoded arrays (not iterated from `documents`), with ids `edu-0` through `edu-3`. The mapping to `documents.ts` entries is: edu-0→doc-mary-seacole, edu-1→doc-mpharm, edu-2→doc-alevels, edu-3→doc-gphc
+  - Achievement items are similarly hardcoded with ids `ach-0` through `ach-3`, each linked to a KPI id
+  - Quick action items are `action-0` through `action-3`
+  - `documents.ts` is imported but wasn't previously used in `search.ts` — now used for education embedding text
+---
+
+## 2026-02-15 - US-003
+- Updated `scripts/generate-embeddings.ts` to import `buildEmbeddingTexts()` and generate full embeddings
+- Script embeds all 42 palette items sequentially using `Xenova/all-MiniLM-L6-v2`
+- Outputs `src/data/embeddings.json` as `Array<{ id: string, embedding: number[] }>`
+- Each embedding is a 384-dimensional float array
+- File is ~453KB (42 items × 384 floats with pretty-printed JSON)
+- `npm run generate-embeddings` regenerates the file successfully
+- Typecheck and lint pass
+- Files changed: `scripts/generate-embeddings.ts`, `src/data/embeddings.json`
+- **Learnings for future iterations:**
+  - `import.meta.dirname` works in tsx/Node ESM scripts — use it instead of `__dirname` (which isn't available in ESM)
+  - `@/` path alias works in `npx tsx` scripts because tsx resolves tsconfig paths automatically
+  - The embeddings file is ~450KB with pretty-print; could be reduced with compact JSON but readability is preferred for now
+  - Processing 42 items takes ~10-15 seconds on first run (model cached after first download)
+---
+
+## 2026-02-15 - US-004
+- Created `src/lib/embedding-model.ts` with three exports: `initModel()`, `embedQuery()`, `isModelReady()`
+- Module-level `let extractor` pattern avoids React re-render issues
+- `initModel()` uses `loading` guard to prevent duplicate pipeline loads
+- `embedQuery()` uses same `pooling: 'mean'` and `normalize: true` as the build script
+- `initModel()` called fire-and-forget in `App.tsx` `useEffect([], [])` — runs during boot phase
+- Silent failure: try/catch swallows errors, `isModelReady()` stays false
+- Typecheck, lint, and build all pass
+- Files changed: `src/lib/embedding-model.ts` (new), `src/App.tsx`
+- **Learnings for future iterations:**
+  - `FeatureExtractionPipeline` type is exported from `@xenova/transformers` and can be used for the module-level variable
+  - The `loading` boolean guard prevents race conditions if `initModel()` is called multiple times (e.g., React strict mode double-mount)
+  - `initModel()` is intentionally not awaited — it's fire-and-forget so it doesn't block the boot animation
+  - Consumers should check `isModelReady()` before calling `embedQuery()` — it throws if model isn't loaded
+---
+
+## 2026-02-15 - US-005
+- Created `src/lib/semantic-search.ts` with cosine similarity search and embeddings loader
+- `semanticSearch()` computes cosine similarity, filters by threshold (default 0.3), returns sorted by score descending
+- `loadEmbeddings()` imports `embeddings.json` via Vite's native JSON import and returns typed array
+- Typecheck and lint pass (0 new warnings)
+- Files changed: `src/lib/semantic-search.ts` (new)
+- **Learnings for future iterations:**
+  - Vite handles JSON imports natively — `import data from '@/data/embeddings.json'` just works, no dynamic import needed
+  - Since embeddings are already L2-normalized (from pipeline's `normalize: true`), cosine similarity simplifies to just the dot product. However, the full formula is kept for correctness in case non-normalized vectors are ever used
+  - With only ~42 items and 384-d vectors, brute-force cosine similarity is fast enough — no need for approximate nearest neighbor libraries
+---
+
+## 2026-02-15 - US-006
+- Integrated semantic search into CommandPalette with Fuse.js fallback
+- When `isModelReady()` is true: debounces query by 200ms, calls `embedQuery()`, runs `semanticSearch()` against preloaded embeddings, maps result IDs back to PaletteItems via O(1) Map lookup
+- When model is NOT ready: uses existing Fuse.js search (behavior preserved exactly)
+- Results maintain `groupBySection()` grouping and section ordering
+- Existing keyboard navigation, action routing, and UI unchanged
+- Semantic results state is cleared when palette opens/closes and when query is empty
+- Error handling: any failure in embedQuery/semanticSearch silently falls back to Fuse.js
+- Typecheck, lint, and build all pass
+- Browser verified: Fuse.js fallback works correctly; ONNX model loads asynchronously during boot and activates semantic search when ready
+- Files changed: `src/components/CommandPalette.tsx`
+- **Learnings for future iterations:**
+  - Semantic search is async so it can't live in a `useMemo` — use `useState` + debounced `useEffect` pattern instead
+  - The `useRef + setTimeout` debounce pattern works well here: set `debounceRef.current = setTimeout(...)`, clear it in the cleanup function, and in early-return paths
+  - `isModelReady()` is a synchronous check — call it before setting up the debounce timeout to avoid unnecessary delays when model isn't loaded
+  - The ONNX model takes several seconds to load in the browser (downloads ~23MB first time, then cached in IndexedDB), so initial searches will always use Fuse.js fallback
+  - `loadEmbeddings()` is cheap (just returns the already-imported JSON) — safe to call in `useMemo` without performance concern
+---
+
+## 2026-02-15 - US-007
+- Created `src/components/ChatWidget.tsx` — floating chat button with toggle state
+- 48px circular button (40px on mobile <640px), fixed bottom-right, teal accent background, white MessageCircle icon
+- Entrance animation: fade + translateY(8px→0), 1s delay after mount, via framer-motion variants
+- Respects `prefers-reduced-motion` — skips animation, shows immediately
+- Hover: shadow-md → shadow-lg + scale(1.05), 150ms transition
+- z-index 90 (below command palette z-1000)
+- onClick toggles `isOpen` state, swaps icon between MessageCircle and X
+- Mounted in `DashboardLayout.tsx` alongside CommandPalette and DetailPanel
+- Typecheck, lint (0 errors), and build all pass
+- Browser verified: button visible at bottom-right, toggle works (Open chat ↔ Close chat)
+- Files changed: `src/components/ChatWidget.tsx` (new), `src/components/DashboardLayout.tsx`
+- **Learnings for future iterations:**
+  - Responsive sizing via Tailwind classes (`h-10 w-10 sm:h-12 sm:w-12`) works well with inline style for non-Tailwind properties (boxShadow, border-radius)
+  - `AnimatePresence` is already imported and ready for the panel animation in US-008
+  - The `isOpen` state lives in ChatWidget — US-008 will add the panel UI inside the same component
+  - Hover effects use `onMouseEnter/Leave` with direct style mutation (same pattern as other dashboard components)
+---
+
+## 2026-02-15 - US-008
+- Built chat panel UI inside `ChatWidget.tsx` with header, message area, and input
+- Panel opens above the floating button with scale+opacity entrance/exit animation via framer-motion `AnimatePresence`
+- Messages stored as `Array<{ role: 'user' | 'assistant', content: string }>` in component state
+- User messages right-aligned in teal-tinted bubbles (`var(--accent-light)` bg, `var(--accent-border)` border)
+- Assistant messages left-aligned in light gray bubbles (`var(--bg-dashboard)` bg, `var(--border-light)` border)
+- Message corner radii differ: user bubbles have small bottom-right radius, assistant bubbles small bottom-left (conversational feel)
+- Input area: textarea with Enter to submit, Shift+Enter for newline. Send button enabled/disabled based on input content
+- Empty state shows placeholder text when no messages yet
+- Auto-scrolls to latest message via `useRef` + `scrollIntoView`
+- Auto-focuses input when panel opens (200ms delay for animation)
+- Responsive: on mobile (<640px), panel is full-width bottom sheet with rounded top corners; on desktop, 380px wide positioned above the button
+- Panel entrance: scale(0.95)+opacity(0) → scale(1)+opacity(1), 200ms. Exit: reverse, 150ms
+- Respects `prefers-reduced-motion` — skips all animation
+- Close button in header triggers `setIsOpen(false)` (same as floating button toggle)
+- Submitting appends both user message and placeholder assistant response to state
+- Typecheck, lint (0 errors), and build all pass
+- Browser verified: panel opens/closes correctly, messages display, input works, Enter submits, close button works
+- Files changed: `src/components/ChatWidget.tsx`
+- **Learnings for future iterations:**
+  - `AnimatePresence` with `key` prop on the panel div is needed for exit animations to work
+  - Panel uses `transformOrigin: 'bottom right'` for natural scale animation from the button corner
+  - CSS-in-JS `<style>` tag with `data-chat-panel` attribute handles responsive width/height (Tailwind can't express max-height conditionally based on viewport width easily)
+  - `textarea` with `rows={1}` and `maxHeight: 80px` gives auto-growing feel; `resize: none` prevents manual resize
+  - The `ChatMessage` interface (`{ role, content }`) is ready to be extended for US-009 Gemini integration — same shape as typical LLM message format
+  - `onFocus/onBlur` border color transitions on the textarea give a polished input interaction
+---
+
+## 2026-02-15 - US-009
+- Created `src/lib/gemini.ts` — Gemini Flash streaming integration module
+  - `sendChatMessage(messages)` async generator that streams SSE tokens from Gemini 2.0 Flash
+  - `isGeminiAvailable()` checks for `VITE_GEMINI_API_KEY` env var
+  - `parseItemIds(text)` extracts `[ITEMS: id1, id2]` from response text
+  - `stripItemsSuffix(text)` removes the `[ITEMS: ...]` line for clean display
+  - System prompt built from `buildEmbeddingTexts()` output — full CV context (~42 items)
+  - Model instructed to answer concisely and append relevant palette item IDs
+- Rewired `ChatWidget.tsx` to use real Gemini API instead of placeholder responses
+  - Streaming: tokens progressively appear in assistant message bubble
+  - Typing indicator (Loader2 spinner + "Thinking...") shown while waiting for first token
+  - Input disabled during streaming, send button grayed out
+  - Error handling: API failures show "Sorry, I couldn't process that. Please try again."
+  - Missing API key: panel shows "Chat is currently unavailable", input area hidden
+  - Conversation history capped at 10 messages before sending to API
+  - Assistant messages store parsed item IDs as `<!--ITEMS:id1,id2-->` HTML comment (for US-010)
+  - Messages sent to API have metadata stripped to keep context clean
+- Typecheck, lint (0 errors), and build all pass
+- Files changed: `src/lib/gemini.ts` (new), `src/components/ChatWidget.tsx`
+- **Learnings for future iterations:**
+  - Gemini SSE format: `data:` prefix per line, JSON body with `candidates[0].content.parts[0].text`
+  - `system_instruction` field in Gemini request body sets the system prompt (not a message in `contents`)
+  - Gemini role mapping: `'assistant'` → `'model'` in the API's `contents` array
+  - Buffer-based SSE parsing handles chunk boundaries: split on `\n`, keep last incomplete line in buffer
+  - `buildEmbeddingTexts()` is a great source for structured CV context — natural language paragraphs per item
+  - The `<!--ITEMS:-->` HTML comment pattern is invisible when rendered but parseable by US-010 for item card display
+  - `useCallback` on `handleSubmit` with `[inputValue, isStreaming, messages]` deps is needed because it reads all three
+---
+
+## 2026-02-15 - US-010
+- Extracted `iconByType` and `iconColorStyles` from `CommandPalette.tsx` into shared `src/lib/palette-icons.ts`
+- Updated `CommandPalette.tsx` to import from the shared module (no behavioral change)
+- Added `onAction?: (action: PaletteAction) => void` prop to `ChatWidget` — same pattern as `CommandPalette`
+- `DashboardLayout.tsx` passes `handlePaletteAction` to `ChatWidget` (same handler used by CommandPalette)
+- ChatWidget builds a `paletteMap` (Map<id, PaletteItem>) via `useMemo` for O(1) item lookups
+- Added `getMessageItemIds()` to parse `<!--ITEMS:id1,id2-->` HTML comments from message content
+- Added `getMessageItems()` to resolve parsed IDs to PaletteItem objects via the map
+- Assistant message bubbles now render compact clickable item cards below text when items are referenced:
+  - Cards use same icon/color scheme from CommandPalette (22px icon + title + subtitle)
+  - Cards have hover highlight (`var(--accent-light)`) and trigger `onAction(item.action)` on click
+  - Cards only appear after streaming completes (when `<!--ITEMS:-->` metadata is in final content)
+  - If no items referenced or IDs don't match, no cards shown — just text
+- Typecheck, lint (0 errors), and build all pass
+- Files changed: `src/lib/palette-icons.ts` (new), `src/components/ChatWidget.tsx`, `src/components/CommandPalette.tsx`, `src/components/DashboardLayout.tsx`
+- **Learnings for future iterations:**
+  - Extracting shared constants to `src/lib/` is the right pattern — both `CommandPalette` and `ChatWidget` now use the same icon mappings without duplication
+  - `buildPaletteData()` is pure (no side effects) and idempotent — safe to call in `useMemo` with empty deps
+  - The `<!--ITEMS:-->` HTML comment regex `<!--ITEMS:([^>]*)-->` works reliably; `[^>]*` captures everything between the colons and closing
+  - Item card buttons use `fontFamily: 'inherit'` to pick up the panel's `font-ui` — without this, browser defaults apply
+  - The `overflow: 'hidden'` on the message bubble container is needed so the item cards section (with its own border-top) stays visually contained within the bubble's border-radius
+---
+
+## 2026-02-15 - US-011
+- Updated ChatWidget mobile breakpoint from `sm` (640px) to `md` (768px)
+- Changed mobile panel from 85vh bottom-sheet to full-screen overlay using `position: fixed; inset: 0` with `100dvh` height
+- Panel z-index on mobile bumped to 101 (`max-md:z-[101]`) to render above TopBar (z-100) and nav (z-99)
+- Floating chat button hidden on mobile when panel is open via `max-md:!hidden` Tailwind class
+  - Fixed specificity issue: inline `style={{ display: 'flex' }}` was overriding Tailwind's `hidden` — moved flex/centering to Tailwind classes (`flex items-center justify-center`)
+- Safe area insets applied via `env(safe-area-inset-*)` CSS on the `[data-chat-panel]` element for notched devices
+- Input area stays pinned to bottom via existing flex layout (flex-col container + flex-1 message area + flex-shrink-0 input)
+- Desktop behavior unchanged: 380px wide, anchored bottom-right, max-height 480px, floating button visible
+- Panel open/close animations still respect `prefers-reduced-motion`
+- Typecheck, lint (0 errors), and build all pass
+- Browser verified at 375×812 (mobile) and 1280×800 (desktop): full-screen overlay works, button hides/shows correctly, close button works
+- Files changed: `src/components/ChatWidget.tsx`
+- **Learnings for future iterations:**
+  - Inline `style` properties always override CSS classes — to allow Tailwind responsive utilities (like `max-md:hidden`) to work, move conflicting properties (especially `display`) to Tailwind classes instead
+  - Use `!important` modifier (`max-md:!hidden`) when competing with framer-motion's inline styles that can't be easily removed
+  - TopBar (`z-100`) and nav (`z-99`) sit above the chat panel's default `z-90` — mobile full-screen panels need `z-101+` to overlay properly
+  - `100dvh` (dynamic viewport height) is essential for mobile full-screen panels — it accounts for browser chrome (address bar, toolbar) unlike `100vh`
+  - The `[data-chat-panel]` CSS selector in the `<style>` block is the right place for responsive size rules since Tailwind can't conditionally set max-height based on viewport width
+---
+
+## 2026-02-15 - US-012
+- Replaced empty-state centered text with welcome bubble + suggested question chips
+- Welcome bubble styled as assistant message (left-aligned, `var(--bg-dashboard)` bg, `var(--border-light)` border)
+- Added `SUGGESTED_QUESTIONS` const array at module top for easy future editing
+- Three chips: "What's his NHS experience?", "Tell me about his data skills", "What projects has he built?"
+- Chips styled: rounded-full, teal accent border, teal hover tint, `font-ui` 12.5px
+- Clicking a chip calls `handleSubmit(questionText)` — same codepath as typing + Enter
+- Refactored `handleSubmit` to accept optional `overrideText` parameter (avoids stale state issue with `setInputValue` + immediate submit)
+- Wrapped send button `onClick` in arrow function to prevent passing MouseEvent as text argument
+- Welcome/chips visible when `messages.length === 0`, replaced by conversation once any message is sent
+- Typecheck passes (0 errors), lint passes (0 new errors/warnings)
+- Browser verified: welcome bubble displays correctly, chips render, clicking chip sends message and replaces welcome state
+- Files changed: `src/components/ChatWidget.tsx`
+- **Learnings for future iterations:**
+  - When refactoring a callback to accept optional parameters, wrap `onClick={handler}` as `onClick={() => handler()}` to prevent React from passing the SyntheticEvent as the first argument
+  - `SUGGESTED_QUESTIONS` as a module-level const is the simplest approach — easily editable, no data file needed for 3 items
+  - The `handleSubmit(overrideText?)` pattern avoids the stale-state problem: `setInputValue(text)` followed by immediate `handleSubmit()` would read the old `inputValue` since React batches state updates
+---
+
+## 2026-02-15 - US-013
+- Downloaded all-MiniLM-L6-v2 model files to `public/models/Xenova/all-MiniLM-L6-v2/`:
+  - `config.json`, `tokenizer.json`, `tokenizer_config.json`, `onnx/model_quantized.onnx` (~22MB)
+- Updated `src/lib/embedding-model.ts`:
+  - `env.localModelPath = '/models/'` — Vite serves `public/` at root
+  - `env.allowRemoteModels = false` — prevents any HF CDN fallback
+  - `env.useBrowserCache = false` — prevents stale Cache API entries from interfering
+- Updated `scripts/generate-embeddings.ts`:
+  - `env.localModelPath = resolve(import.meta.dirname, '..', 'public', 'models')` — absolute path for Node.js
+  - `env.allowRemoteModels = false`
+- Model files committed as static assets (not in .gitignore)
+- Browser verified: all 4 model files fetched from `localhost:5173/models/` with 200 OK, zero `huggingface.co` requests
+- Semantic search verified working: "data analysis" returns multi-category results (Core Skills, Active Projects, Achievements)
+- Build script (`npm run generate-embeddings`) still works with local model files
+- Typecheck passes (0 errors), lint passes (0 new errors/warnings)
+- Files changed: `src/lib/embedding-model.ts`, `scripts/generate-embeddings.ts`, `public/models/Xenova/all-MiniLM-L6-v2/` (new directory with 4 files)
+- **Learnings for future iterations:**
+  - `@xenova/transformers` env configuration: `env.localModelPath` sets the base path, `env.allowRemoteModels = false` prevents CDN fallback, `env.useBrowserCache = false` bypasses Browser Cache API
+  - The library constructs paths as `{localModelPath}/{modelId}/{filename}` — so `/models/` + `Xenova/all-MiniLM-L6-v2` + `/onnx/model_quantized.onnx`
+  - Browser Cache API can retain stale entries from previous HF CDN loads — setting `useBrowserCache = false` forces fresh fetches from the configured local path
+  - For Node.js scripts, use an absolute filesystem path for `localModelPath` (not a URL)
+  - The quantized ONNX model (`model_quantized.onnx`) is ~22MB — acceptable for a static asset since it's cached after first load
+---
+
+## 2026-02-15 - US-014
+- Extracted `GEMINI_MODEL` and `GEMINI_DISPLAY_NAME` constants in `src/lib/gemini.ts`
+- Updated `GEMINI_API_BASE` to use template literal with `GEMINI_MODEL` constant (`gemini-3-flash-preview`)
+- Tightened system prompt: restructured with markdown headers, more concise instructions, clearer `[ITEMS: ...]` format specification
+- Added model indicator to ChatWidget header: "Gemini 3 Flash" in `font-geist`, 11px, `var(--text-tertiary)`, right-aligned next to title
+- Imported `GEMINI_DISPLAY_NAME` in ChatWidget for the indicator text
+- Typecheck passes (0 errors), lint passes (0 new errors/warnings), build succeeds
+- Files changed: `src/lib/gemini.ts`, `src/components/ChatWidget.tsx`
+- **Learnings for future iterations:**
+  - `gemini-3-flash-preview` is the correct model ID for Gemini 3 Flash (confirmed via Google AI docs); Gemini 2.0 Flash deprecated, shutdown scheduled for March 31 2026
+  - The API path structure (`v1beta/models/{model}:streamGenerateContent?alt=sse&key=KEY`) is unchanged between Gemini 2 and 3
+  - Extracting both `GEMINI_MODEL` (for API URL) and `GEMINI_DISPLAY_NAME` (for UI) as separate constants keeps the API ID decoupled from the human-readable name
+  - System prompt with markdown headers (##) gives the model clearer section boundaries — improves instruction following for structured output like `[ITEMS: ...]`
+  - Pre-existing uncommitted change in `src/App.tsx` (boot→login phase skip) was excluded from the commit — always check `git diff --stat` and stage specific files
+---