# Progress Log — Reflex → Dash Migration

## Project Context

Migrating the HCD Analysis frontend from Reflex to Dash (Plotly) + Dash Mantine Components. Pipeline/analysis logic in `src/` is untouched, but shared utilities (data queries, figure construction) should be added TO `src/` so Dash callbacks call into them rather than duplicating code.

**Previous state**: Fully working Reflex app with pre-computed pathway architecture (SQLite), dual chart types (directory + indication), drug-aware indication matching. All pipeline work is done.

**New goal**: Replace Reflex with Dash for better control over layout, CSS, and component behavior. Add a dmc.Drawer-based "card browser" for drug/indication selection organized by clinical directorate.

## Key Data Patterns

### SQLite pathway_nodes table
- ~3,600 rows across 12 datasets (6 date filters × 2 chart types)
- Key columns: `parents, ids, labels, level, value, cost, costpp, cost_pp_pa, colour, first_seen, last_seen, first_seen_parent, last_seen_parent, average_spacing, trust_name, directory, drug_sequence, chart_type, date_filter_id`
- Level 0 = Root, Level 1 = Trust, Level 2 = Directory/Indication, Level 3 = Drug, Level 4+ = Pathway
- `chart_type`: "directory" or "indication"
- `date_filter_id`: "all_6mo" (default), "all_12mo", "1yr_6mo", "1yr_12mo", "2yr_6mo", "2yr_12mo"
- UNIQUE constraint: (date_filter_id, chart_type, ids)

### DimSearchTerm.csv (for card browser)
- Located at `data/DimSearchTerm.csv`
- Columns: Search_Term, CleanedDrugName (pipe-separated drug fragments), PrimaryDirectorate
- ~165 rows; some Search_Terms appear twice (e.g., "diabetes" under DIABETIC MEDICINE and OPHTHALMOLOGY)
- Drug fragments are UPPERCASE substrings matched against standardized drug names
- SEARCH_TERM_MERGE_MAP in `src/data_processing/diagnosis_lookup.py` merges asthma variants: {"allergic asthma": "asthma", "severe persistent allergic asthma": "asthma"}

### Data loading logic to extract
- `pathways_app/pathways_app.py` lines 407-488: `load_data()` — loads available drugs, directorates, indications, total records, last updated from SQLite
- `pathways_app/pathways_app.py` lines 490-642: `load_pathway_data()` — queries pathway_nodes with date_filter_id + chart_type + optional drug/directory filters
- `pathways_app/pathways_app.py` lines 769-920: `icicle_figure` — builds go.Icicle with 10-field customdata, NHS colorscale, texttemplate, hovertemplate

### CSS from 01_nhs_classic.html
- Lines 8-314 contain the full CSS (copy to dash_app/assets/nhs.css)
- Google Fonts: `Source Sans 3` weights 300,400,600,700,900
- CSS variables: `--nhs-blue: #005EB8`, `--nhs-dark-blue: #003087`, `--nhs-light-blue: #41B6E6`, etc.
- Key classes: `.top-header`, `.sidebar`, `.main`, `.kpi-row`, `.kpi-card`, `.filter-bar`, `.toggle-pill`, `.chart-card`, `.chart-tab`, `.page-footer`
- Remove `.icicle`, `.icicle__row`, `.icicle__cell`, `.lvl-*` classes — those are mock chart CSS, Plotly handles the real chart

### Dash-specific patterns
- State via `dcc.Store`: 3 stores (app-state, chart-data, reference-data)
- Callbacks: unidirectional flow (filter change → app-state → chart-data → UI components)
- DMC components: `dmc.MantineProvider` wraps everything, `dmc.Drawer` for card browser
- Pattern-matching callbacks: `{"type": "drug-chip", "index": drug_name}` for dynamic drug chip selection
- Assets auto-served from `dash_app/assets/` directory

### Database path from dash_app/
- From `dash_app/data/queries.py`: `Path(__file__).resolve().parents[2] / "data" / "pathways.db"`
- From `dash_app/data/card_browser.py`: same pattern for `data/DimSearchTerm.csv`

### Existing src/ code to build on (not duplicate)
- `src/visualization/plotly_generator.py` already has `create_icicle_figure(ice_df, title)` that takes a DataFrame with columns like `"First seen"`, `"Last seen (Parent)"` (with spaces). The Reflex AppState `icicle_figure` (pathways_app.py:769) takes list-of-dicts with keys like `first_seen`, `last_seen_parent` (underscores). For Dash, add a NEW function `create_icicle_from_nodes(nodes, title)` that accepts list-of-dicts. Don't modify the existing DataFrame-based function.
- `src/data_processing/database.py` has `DatabaseManager` class. Add standalone query functions here (or a new `pathway_queries.py`) so Dash and Reflex share the same SQL.
- The existing `create_icicle_figure` uses Viridis colorscale; the Reflex version uses NHS blue gradient. The Dash version should use the NHS blue gradient from pathways_app.py.

### Architecture decision: shared code in src/
- Pipeline/analysis logic is OFF LIMITS: pathway_pipeline.py, transforms.py, diagnosis_lookup.py (matching), pathway_analyzer.py, statistics.py, refresh_pathways.py
- Shared utilities are ENCOURAGED to add to src/: plotly_generator.py (new dict-based function), database.py (query functions)
- dash_app/data/queries.py should be a thin wrapper that calls into src/

## Iteration Log

## Iteration 1 — 2026-02-06

### Task: Phase 0 — Tasks 0.1 + 0.2 (Project Scaffolding + CSS Extraction)

### Why this task:
- This is the first iteration. Phase 0 scaffolding is the foundation everything else depends on.
- Tasks 0.1 and 0.2 are tightly coupled (CSS needs the assets dir from 0.1), so both done together.

### Status: COMPLETE

### What was done:
- Created `dash_app/` directory with subdirectories: `assets/`, `data/`, `components/`, `callbacks/`, `utils/`
- Created `__init__.py` in all packages
- Created `run_dash.py` entry point at project root
- Updated `pyproject.toml` with `dash>=2.14.0` and `dash-mantine-components>=0.14.0`
- Ran `uv sync` — installed Dash 4.0.0 and DMC 2.5.1 (newer than plan expected)
- Created `dash_app/app.py` with `MantineProvider` wrapper, 3 `dcc.Store` components (app-state, chart-data, reference-data), and placeholder layout
- Extracted CSS from `01_nhs_classic.html` into `dash_app/assets/nhs.css` (7.5KB)
- Added Google Fonts `@import` for Source Sans 3
- Removed mock icicle chart CSS (`.icicle`, `.icicle__row`, `.icicle__cell`, `.lvl-*`)
- Kept all real component CSS: header, sidebar, KPI, filter bar, chart card, footer, responsive

### Validation results:
- Tier 1 (Code): `python -c "from dash_app.app import app"` — OK, layout type is MantineProvider
- Tier 1 (App starts): `python run_dash.py` — serves at http://127.0.0.1:8050/
- CSS: nhs.css (7,491 bytes) found in Dash assets folder, auto-served

### Files changed:
- `pyproject.toml` — added dash + dmc dependencies
- `run_dash.py` — new entry point
- `dash_app/__init__.py` — new package
- `dash_app/app.py` — new app with MantineProvider + stores + placeholder
- `dash_app/assets/nhs.css` — extracted from 01_nhs_classic.html
- `dash_app/data/__init__.py` — empty package
- `dash_app/components/__init__.py` — empty package
- `dash_app/callbacks/__init__.py` — empty package
- `dash_app/utils/__init__.py` — empty package

### Committed: 1c3ece6 "feat: create dash_app skeleton with nhs.css and MantineProvider (Phase 0)"

### Patterns discovered:
- Dash 4.0.0 and DMC 2.5.1 installed (plan said 2.x and 0.14.x). The API is compatible — `MantineProvider`, `Drawer`, `Accordion`, `Chip` all available. DMC 2.x is based on Mantine v7.
- `dmc.MantineProvider(children=[...])` works for wrapping the layout in both DMC versions.
- `suppress_callback_exceptions=True` needed in Dash app since callbacks reference IDs not yet in layout during phased development.

### Next iteration should:
- Start Phase 1: Task 1.1 — Create shared data access functions
- Read `pathways_app/pathways_app.py` lines 407-642 to extract `load_data()` and `load_pathway_data()` logic
- Create shared functions in `src/data_processing/pathway_queries.py` (database.py is already used for DatabaseManager)
- Create thin wrapper `dash_app/data/queries.py` that calls the shared functions
- Verify with `python -c "from dash_app.data.queries import load_initial_data; print(load_initial_data())"`

### Blocked items:
- None

## Iteration 2 — 2026-02-06

### Task: Phase 1 — Task 1.1 (Create shared data access functions)

### Why this task:
- Phase 0 complete in iteration 1; Phase 1 is next in dependency order
- progress.txt from iteration 1 explicitly recommended this task
- All UI components and callbacks (Phases 2-5) depend on having data access working

### Status: COMPLETE

### What was done:
- Created `src/data_processing/pathway_queries.py` with two shared functions:
  - `load_initial_data(db_path)` — returns available drugs (42), directorates (14), indications (32), total_records, last_updated
  - `load_pathway_nodes(db_path, filter_id, chart_type, selected_drugs, selected_directorates)` — returns nodes list, unique_patients, total_drugs, total_cost, last_updated
- Both functions extracted directly from AppState methods in pathways_app.py (lines 407-642), with Reflex `self.*` references replaced by function parameters
- All return values are plain dicts/lists — JSON-serializable for dcc.Store
- Created thin wrapper `dash_app/data/queries.py` that resolves DB_PATH and delegates to shared functions
- Used separate file (pathway_queries.py) rather than adding to database.py because database.py is connection management (240 lines), queries are a distinct concern

### Validation results:
- Tier 1 (Code): `python -c "from dash_app.data.queries import load_initial_data"` — OK (requires uv run for .pth file)
- Tier 1 (App starts): `from dash_app.app import app` — OK, layout type is MantineProvider
- Tier 3 (Functional):
  - `load_initial_data()`: 42 drugs, 14 directorates, 32 indications, last_updated=2026-02-06T00:08:55
  - `load_pathway_data("all_6mo", "directory")`: 293 nodes, 11,118 patients, 39 drugs, £130.5M cost
  - `load_pathway_data("all_6mo", "indication")`: 438 nodes, 11,252 patients
  - `load_pathway_data("all_6mo", "directory", selected_drugs=["ADALIMUMAB"])`: 70 nodes (drug filter works)

### Files changed:
- `src/data_processing/pathway_queries.py` — NEW: shared query functions
- `dash_app/data/queries.py` — NEW: thin Dash wrapper with DB_PATH resolution
- `IMPLEMENTATION_PLAN.md` — Task 1.1 marked [x]

### Committed: (pending)

### Patterns discovered:
- `src/` is on sys.path only when using `uv run` (via .pth file created by setup_dev.py). Running `python` directly won't find `data_processing` module. Always use `uv run python` for testing.
- `total_records` from `pathway_refresh_log` returns 0 — the refresh log's `source_row_count` field appears empty despite `completed_at` having a value. This is cosmetic — the KPI can use `unique_patients` from chart-data instead.
- Drug filtering correctly includes nodes with NULL drug_sequence (root, trust, directory levels) alongside matching drug nodes. Root node patient count becomes 0 when drug filter is active — this matches Reflex behavior.
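
The drug-filter behaviour noted in the last pattern can be illustrated with a small in-memory SQLite sketch. It assumes a cut-down `pathway_nodes` table (five of the real columns), a made-up `ids` format, and substring matching on `drug_sequence` via `LIKE`; the actual SQL in `pathway_queries.py` may differ. `filter_nodes_by_drug` and `demo_connection` are illustrative names, not functions from the codebase.

```python
import sqlite3


def demo_connection():
    """Build an in-memory table with a hypothetical subset of pathway_nodes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pathway_nodes ("
        "date_filter_id TEXT, chart_type TEXT, ids TEXT, level INTEGER, "
        "drug_sequence TEXT, UNIQUE (date_filter_id, chart_type, ids))"
    )
    # Sample rows are invented for illustration; the real ids format is not shown in the log.
    rows = [
        ("all_6mo", "directory", "Root", 0, None),
        ("all_6mo", "directory", "Root/TrustA", 1, None),
        ("all_6mo", "directory", "Root/TrustA/DirX", 2, None),
        ("all_6mo", "directory", "Root/TrustA/DirX/ADALIMUMAB", 3, "ADALIMUMAB"),
        ("all_6mo", "directory", "Root/TrustA/DirX/ETANERCEPT", 3, "ETANERCEPT"),
        ("all_6mo", "indication", "Root", 0, None),
    ]
    conn.executemany("INSERT INTO pathway_nodes VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def filter_nodes_by_drug(conn, date_filter_id, chart_type, selected_drugs=None):
    """Return (ids, level, drug_sequence) rows for one dataset.

    When a drug filter is active, rows with NULL drug_sequence (root, trust,
    directory levels) are kept so the hierarchy above the drug nodes survives.
    """
    sql = (
        "SELECT ids, level, drug_sequence FROM pathway_nodes "
        "WHERE date_filter_id = ? AND chart_type = ?"
    )
    params = [date_filter_id, chart_type]
    if selected_drugs:
        # Assumption: a drug matches if its name appears anywhere in drug_sequence.
        drug_clauses = " OR ".join("drug_sequence LIKE ?" for _ in selected_drugs)
        sql += f" AND (drug_sequence IS NULL OR {drug_clauses})"
        params += [f"%{drug}%" for drug in selected_drugs]
    sql += " ORDER BY level, ids"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    for row in filter_nodes_by_drug(conn, "all_6mo", "directory", ["ADALIMUMAB"]):
        print(row)
```

Parameter placeholders are used for every user-supplied value, so the number of selected drugs only changes how many `?` markers are generated, never the SQL text around them.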

### Next iteration should:
- Start Task 1.2 — Build directorate card tree from DimSearchTerm.csv
- Create `dash_app/data/card_browser.py` with `build_directorate_tree()` and `get_all_drugs()`
- Read `data/DimSearchTerm.csv` to understand the data format
- Import SEARCH_TERM_MERGE_MAP from `data_processing.diagnosis_lookup` for asthma normalization
- Remember: drug fragments in CleanedDrugName are UPPERCASE substrings, not exact matches

### Blocked items:
- None
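
As a possible starting point for Task 1.2, here is a rough sketch of the card-tree build. It takes an open file object rather than resolving `data/DimSearchTerm.csv` itself, copies the merge map from the notes above instead of importing it, and assumes search terms should be lower-cased before merging. The sample rows, the `EXAMPLE DIRECTORATE` name, the `DRUG_*` fragments, and the `drugs_matching` helper are all invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Copied from the notes above; the real map lives in
# src/data_processing/diagnosis_lookup.py and should be imported from there.
SEARCH_TERM_MERGE_MAP = {
    "allergic asthma": "asthma",
    "severe persistent allergic asthma": "asthma",
}

# Invented sample rows using the three documented column names.
SAMPLE_CSV = """Search_Term,CleanedDrugName,PrimaryDirectorate
asthma,DRUG_A,EXAMPLE DIRECTORATE
allergic asthma,DRUG_A|DRUG_B,EXAMPLE DIRECTORATE
diabetes,DRUG_C,DIABETIC MEDICINE
diabetes,DRUG_D,OPHTHALMOLOGY
"""


def build_directorate_tree(csv_file):
    """Group search terms and their drug fragments by PrimaryDirectorate.

    Returns {directorate: {search_term: [fragment, ...]}} with plain dicts
    and sorted lists, so the result is JSON-serializable for dcc.Store.
    """
    tree = defaultdict(lambda: defaultdict(set))
    for row in csv.DictReader(csv_file):
        term = row["Search_Term"].strip().lower()
        term = SEARCH_TERM_MERGE_MAP.get(term, term)  # fold asthma variants
        directorate = row["PrimaryDirectorate"].strip()
        fragments = {
            frag.strip().upper()
            for frag in row["CleanedDrugName"].split("|")
            if frag.strip()
        }
        tree[directorate][term] |= fragments
    return {
        directorate: {term: sorted(frags) for term, frags in sorted(terms.items())}
        for directorate, terms in sorted(tree.items())
    }


def drugs_matching(fragments, drug_names):
    """Fragments are substrings, so match with `in` rather than equality."""
    return sorted(
        name for name in drug_names if any(frag in name.upper() for frag in fragments)
    )


if __name__ == "__main__":
    tree = build_directorate_tree(io.StringIO(SAMPLE_CSV))
    for directorate, terms in tree.items():
        print(directorate, terms)
```

Keying the tree by directorate first means a term such as "diabetes" that appears under two directorates stays as two separate cards instead of being merged into one.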