From ae15ccf961b0c42eb0956b10f272945bdcdbae5b Mon Sep 17 00:00:00 2001
From: Andy Charlwood
Date: Sun, 15 Feb 2026 17:59:11 +0000
Subject: [PATCH] chore: mark US-004 complete, update progress log

---
 Ralph/prd.json     |  2 +-
 Ralph/progress.txt | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/Ralph/prd.json b/Ralph/prd.json
index 0276a32..c7549ee 100644
--- a/Ralph/prd.json
+++ b/Ralph/prd.json
@@ -72,7 +72,7 @@
         "Typecheck passes"
       ],
       "priority": 4,
-      "passes": false,
+      "passes": true,
       "notes": "Use pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2') which auto-downloads and caches the ONNX model. The module-level pattern (let pipelineInstance = null) avoids React re-render issues. embedQuery should mean-pool the tensor output the same way as the build script. Wrap initModel() in a try/catch that silently swallows errors."
     },
     {
diff --git a/Ralph/progress.txt b/Ralph/progress.txt
index 881027b..b2d9467 100644
--- a/Ralph/progress.txt
+++ b/Ralph/progress.txt
@@ -10,6 +10,8 @@
 - Palette item IDs: `exp-{consultation.id}`, `skill-{skill.id}`, `proj-{investigation.id}`, `ach-{0-3}`, `edu-{0-3}`, `action-{0-3}`
 - `buildEmbeddingTexts()` in `src/lib/search.ts` returns `Array<{ id: string, text: string }>` with IDs matching PaletteItem IDs — use this for both embedding generation and chat context
 - `src/data/embeddings.json` is an array of `{ id: string, embedding: number[] }` — 42 items, 384-d vectors, IDs match PaletteItem IDs. Vite imports JSON natively.
+- `src/lib/embedding-model.ts` exports `initModel()`, `embedQuery(text)`, `isModelReady()` — check `isModelReady()` before calling `embedQuery()`
+- `initModel()` is called fire-and-forget in `App.tsx` on mount — model loads during boot/ECG/login phases
 
 ---
 
@@ -62,3 +64,19 @@
 - The embeddings file is ~450KB with pretty-print; could be reduced with compact JSON but readability is preferred for now
 - Processing 42 items takes ~10-15 seconds on first run (model cached after first download)
 ---
+
+## 2026-02-15 - US-004
+- Created `src/lib/embedding-model.ts` with three exports: `initModel()`, `embedQuery()`, `isModelReady()`
+- Module-level `let extractor` pattern avoids React re-render issues
+- `initModel()` uses `loading` guard to prevent duplicate pipeline loads
+- `embedQuery()` uses same `pooling: 'mean'` and `normalize: true` as the build script
+- `initModel()` called fire-and-forget in `App.tsx` `useEffect([], [])` — runs during boot phase
+- Silent failure: try/catch swallows errors, `isModelReady()` stays false
+- Typecheck, lint, and build all pass
+- Files changed: `src/lib/embedding-model.ts` (new), `src/App.tsx`
+- **Learnings for future iterations:**
+  - `FeatureExtractionPipeline` type is exported from `@xenova/transformers` and can be used for the module-level variable
+  - The `loading` boolean guard prevents race conditions if `initModel()` is called multiple times (e.g., React strict mode double-mount)
+  - `initModel()` is intentionally not awaited — it's fire-and-forget so it doesn't block the boot animation
+  - Consumers should check `isModelReady()` before calling `embedQuery()` — it throws if model isn't loaded
+---
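
Reviewer note: the patch records the design of `src/lib/embedding-model.ts` but not its source. Below is a minimal sketch of the pattern the log describes (module-level state, a `loading` guard, silent failure, and mean-pooled normalized output), not the committed file. The `Extractor` type and the `load` parameter are hypothetical stand-ins so the sketch runs without `@xenova/transformers`. The real module presumably calls `pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')` inside `initModel()` and takes no argument.

```typescript
// Sketch only: `Extractor` and the injected `load` callback are assumptions
// made for illustration; they are not taken from the committed module.
type PoolingOptions = { pooling: "mean"; normalize: boolean };
type Extractor = (
  text: string,
  options: PoolingOptions
) => Promise<{ data: Float32Array }>;

// Module-level state lives outside React, so re-renders never reset it.
let extractor: Extractor | null = null;
let loading = false;

export function isModelReady(): boolean {
  return extractor !== null;
}

export async function initModel(load: () => Promise<Extractor>): Promise<void> {
  // Guard: skip when the model is loaded or a load is already in flight
  // (for example, a React strict mode double-mount).
  if (extractor !== null || loading) return;
  loading = true;
  try {
    extractor = await load();
  } catch {
    // Silent failure: the error is swallowed and isModelReady() stays false.
  } finally {
    loading = false;
  }
}

export async function embedQuery(text: string): Promise<number[]> {
  // Rejects when the model is not loaded; callers check isModelReady() first.
  if (extractor === null) {
    throw new Error("Embedding model is not loaded");
  }
  // Same pooling and normalization settings the log gives for the build script.
  const output = await extractor(text, { pooling: "mean", normalize: true });
  return Array.from(output.data);
}
```

In this sketch a failed load resets `loading`, so a later `initModel()` call can retry. The log does not say whether the committed module behaves the same way.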