feat: remove Reflex, archive old app, update docs for Dash migration (Task 5.4)

- Remove reflex dependency from pyproject.toml
- Move pathways_app/ and rxconfig.py to archive/
- Update CLAUDE.md: Dash app structure, callback chain, run command
- All completion criteria validated (10/10 pass)
Andrew Charlwood
2026-02-06 14:35:43 +00:00
parent 7a4f9ce567
commit fe8642dfaf
11 changed files with 121 additions and 1080 deletions
+105 -70
@@ -11,7 +11,7 @@ NHS High-Cost Drug Patient Pathway Analysis Tool - a web-based application that
- **Pre-computed pathway architecture**: Treatment pathways pre-processed and stored in SQLite for instant filtering - **Pre-computed pathway architecture**: Treatment pathways pre-processed and stored in SQLite for instant filtering
- **GP diagnosis matching**: Patient indications matched from GP records using SNOMED cluster codes queried directly from Snowflake (~93% match rate) - **GP diagnosis matching**: Patient indications matched from GP records using SNOMED cluster codes queried directly from Snowflake (~93% match rate)
- Data pipeline: Snowflake → pre-computed SQLite pathway nodes (CSV/Parquet file loading retained for legacy compatibility) - Data pipeline: Snowflake → pre-computed SQLite pathway nodes (CSV/Parquet file loading retained for legacy compatibility)
- Interactive browser-based UI using Reflex framework - Interactive browser-based UI using Dash (Plotly) + Dash Mantine Components
- 6 pre-defined date filter combinations × 2 chart types = 12 pre-computed datasets with sub-50ms response times - 6 pre-defined date filter combinations × 2 chart types = 12 pre-computed datasets with sub-50ms response times
## Running the Application ## Running the Application
@@ -29,11 +29,11 @@ python -m data_processing.migrate
# Refresh pathway data from Snowflake (requires SSO auth) # Refresh pathway data from Snowflake (requires SSO auth)
python -m cli.refresh_pathways python -m cli.refresh_pathways
# Run the Reflex web application # Run the Dash web application
reflex run python run_dash.py
``` ```
The application requires Python 3.10+ and runs on http://localhost:3000 by default. The application requires Python 3.10+ and runs on http://localhost:8050 by default.
### CLI Commands ### CLI Commands
@@ -68,7 +68,7 @@ The refresh command:
2. Applies UPID, drug name, and directory transformations (~6 minutes) 2. Applies UPID, drug name, and directory transformations (~6 minutes)
3. For indication charts: queries GP records via SNOMED clusters (~9 minutes for 37K patients) 3. For indication charts: queries GP records via SNOMED clusters (~9 minutes for 37K patients)
4. Processes 6 date filter combinations × selected chart types 4. Processes 6 date filter combinations × selected chart types
5. Inserts pathway nodes to SQLite for fast Reflex filtering 5. Inserts pathway nodes to SQLite for fast Dash filtering
## Architecture ## Architecture
@@ -103,20 +103,41 @@ The refresh command:
│ │ └── statistics.py # Statistical calculation functions │ │ └── statistics.py # Statistical calculation functions
│ │ │ │
│ ├── visualization/ # Chart generation │ ├── visualization/ # Chart generation
│ │ └── plotly_generator.py # create_icicle_figure, save_figure_html │ │ └── plotly_generator.py # create_icicle_figure, create_icicle_from_nodes
│ │ │ │
│ └── cli/ # CLI tools │ └── cli/ # CLI tools
│ └── refresh_pathways.py # Data refresh command │ └── refresh_pathways.py # Data refresh command
├── pathways_app/ # Reflex web app (stays at root — framework requirement) ├── dash_app/ # Dash web application
│ ├── pathways_app.py # AppState + page components │ ├── app.py # Dash app, layout root, dcc.Store, register_callbacks
│ └── components/ # Layout and navigation components │ ├── assets/
│ │ └── nhs.css # NHS design system CSS (from 01_nhs_classic.html)
│ ├── data/
│ │ ├── queries.py # Thin wrapper calling src/data_processing/pathway_queries.py
│ │ └── card_browser.py # DimSearchTerm.csv → directorate tree for drawer
│ ├── components/
│ │ ├── header.py # Top header bar with data freshness indicator
│ │ ├── sidebar.py # Left navigation with drawer triggers
│ │ ├── kpi_row.py # 4 KPI cards (patients, drugs, cost, match rate)
│ │ ├── filter_bar.py # Chart type toggle pills + date filter dropdowns
│ │ ├── chart_card.py # Chart area with tabs + dcc.Graph + loading spinner
│ │ ├── drawer.py # dmc.Drawer with drug/trust chips + directorate cards
│ │ └── footer.py # Page footer
│ ├── callbacks/
│ │ ├── __init__.py # register_callbacks(app)
│ │ ├── filters.py # Reference data loading + filter state management
│ │ ├── chart.py # Pathway data loading + icicle chart rendering
│ │ ├── drawer.py # Drawer open/close + drug/trust selection
│ │ └── kpi.py # KPI card value updates
│ └── utils/
│ └── __init__.py
├── run_dash.py # Entry point: python run_dash.py
├── tests/ # Test suite (113 tests) ├── tests/ # Test suite (113 tests)
├── data/ # Reference data + SQLite DB ├── data/ # Reference data + SQLite DB
├── docs/ # Documentation ├── docs/ # Documentation
├── assets/ # Static assets (logo, favicon) ├── assets/ # Static assets (logo, favicon)
├── archive/ # Historical/deprecated ├── archive/ # Historical/deprecated (includes old Reflex app)
└── logs/ # Runtime logs └── logs/ # Runtime logs
``` ```
@@ -127,11 +148,11 @@ All imports use package names directly: `from core import ...`, `from data_proce
The application uses a pre-computed pathway architecture for performance: The application uses a pre-computed pathway architecture for performance:
**Architecture:** `Snowflake → Pathway Processing → SQLite (pre-computed) → Reflex (filter & view)` **Architecture:** `Snowflake → Pathway Processing → SQLite (pre-computed) → Dash (filter & view)`
**Key Benefits:** **Key Benefits:**
- **Performance**: Pathway calculation done once during data refresh, not on every filter change - **Performance**: Pathway calculation done once during data refresh, not on every filter change
- **Simplicity**: Reflex filters pre-computed data with simple SQL WHERE clauses - **Simplicity**: Dash callbacks filter pre-computed data with simple SQL WHERE clauses
- **Full Pathways**: Sequential treatment pathways (drug_0 → drug_1 → drug_2...) with statistics - **Full Pathways**: Sequential treatment pathways (drug_0 → drug_1 → drug_2...) with statistics
**Chart Types:** **Chart Types:**
@@ -203,7 +224,7 @@ Each node in `pathway_nodes` contains:
- `process_all_date_filters()` - Convenience function to process all 6 filters - `process_all_date_filters()` - Convenience function to process all 6 filters
**Data Loaders:** **Data Loaders:**
- `FileDataLoader` - Loads from CSV/Parquet files (used by legacy pipeline, not by Reflex app) - `FileDataLoader` - Loads from CSV/Parquet files (used by legacy pipeline, not by Dash app)
- Factory function `get_loader()` creates a `FileDataLoader` - Factory function `get_loader()` creates a `FileDataLoader`
**Snowflake Integration:** **Snowflake Integration:**
@@ -233,25 +254,51 @@ Refactored from the original 267-line `generate_graph()` function:
### Visualization Module (`visualization/`) ### Visualization Module (`visualization/`)
- **create_icicle_figure()** - Generate Plotly icicle chart figure - **create_icicle_figure(ice_df)** - Generate Plotly icicle chart from DataFrame (legacy/pipeline use)
- **create_icicle_from_nodes(nodes, title)** - Generate icicle chart from list-of-dicts (Dash use). Accepts JSON-serializable node dicts from `dcc.Store`. Uses NHS blue gradient colorscale, 10-field customdata, Source Sans 3 font.
- **save_figure_html()** - Save interactive HTML file - **save_figure_html()** - Save interactive HTML file
- **open_figure_in_browser()** - Open chart in default browser - **open_figure_in_browser()** - Open chart in default browser
### Reflex Application (`pathways_app/`) ### Shared Data Queries (`data_processing/pathway_queries.py`)
The `AppState` class manages all application state: Shared query functions used by both the Dash app and potentially other consumers:
- **Chart type**: `selected_chart_type` ("directory" or "indication"), toggled via `set_chart_type()` - **load_initial_data(db_path)** - Returns available drugs (42), directorates (14), indications (32), trusts (7), total_patients, last_updated
- **Computed vars**: `chart_hierarchy_label` (dynamic "Trust → Directorate → ..." or "Trust → Indication → ..."), `chart_type_label` - **load_pathway_nodes(db_path, filter_id, chart_type, selected_drugs, selected_directorates, selected_trusts)** - Returns pathway nodes, unique_patients, total_drugs, total_cost, last_updated. Parameterized SQL with optional drug/directorate/trust filters.
- Filter variables: dates, drugs, trusts, directories
- Reference data: available options loaded from pathway_nodes and CSV files
- Analysis state: running flag, status messages, chart data
- `load_data()` sources available drugs/directorates from `pathway_nodes` and `total_records` from `pathway_refresh_log.source_row_count`
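The `load_pathway_nodes()` entry above mentions parameterized SQL with optional drug/directorate/trust filters. A minimal sketch of that pattern follows. It assumes a simplified `pathway_nodes` table: `date_filter_id` and `chart_type` are named elsewhere in this document, but the other columns (`label`, `drug`, `trust`, `value`), the filter ids `f1`/`f2`, and the helpers `query_nodes` and `make_demo_db` are hypothetical and not taken from the project.

```python
import sqlite3


def query_nodes(conn, filter_id, chart_type, selected_drugs=None, selected_trusts=None):
    """Hypothetical helper: mandatory filters plus optional IN (...) filters."""
    sql = (
        "SELECT label, drug, trust, value FROM pathway_nodes "
        "WHERE date_filter_id = ? AND chart_type = ?"
    )
    params = [filter_id, chart_type]
    # Column names come from this fixed tuple, never from user input;
    # user-supplied values are only ever bound through "?" placeholders.
    for column, values in (("drug", selected_drugs), ("trust", selected_trusts)):
        if values:  # None or an empty list means "no filter on this column"
            placeholders = ", ".join("?" for _ in values)
            sql += f" AND {column} IN ({placeholders})"
            params.extend(values)
    sql += " ORDER BY label"
    cursor = conn.execute(sql, params)
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def make_demo_db():
    """Build an in-memory table with made-up rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pathway_nodes (date_filter_id TEXT, chart_type TEXT, "
        "label TEXT, drug TEXT, trust TEXT, value INTEGER)"
    )
    conn.executemany(
        "INSERT INTO pathway_nodes VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("f1", "directory", "A", "DRUG A", "TRUST 1", 10),
            ("f1", "directory", "B", "DRUG B", "TRUST 1", 5),
            ("f1", "directory", "C", "DRUG A", "TRUST 2", 7),
            ("f1", "indication", "D", "DRUG A", "TRUST 1", 3),
            ("f2", "directory", "E", "DRUG A", "TRUST 1", 1),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    for row in query_nodes(demo, "f1", "directory", selected_drugs=["DRUG A"]):
        print(row)
```

Building the `IN (...)` list from the number of selected values keeps every value bound as a parameter, so an empty selection simply drops that clause rather than producing invalid SQL.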
**Chart Type Toggle** (`chart_type_toggle()` component): ### Dash Application (`dash_app/`)
- Segmented control with "By Directory" and "By Indication" pill buttons
- Placed first in the filter strip before date filters **State Management** via 3 `dcc.Store` components:
- Switching reloads pathway data from SQLite filtered by `chart_type` - **app-state** (session): `chart_type`, `initiated`, `last_seen`, `date_filter_id`, `selected_drugs`, `selected_directorates`, `selected_trusts`
- Note: Directory filter only applies to directory charts (indication charts store Search_Terms in the directory column) - **chart-data** (memory): `nodes[]`, `unique_patients`, `total_drugs`, `total_cost`, `last_updated`
- **reference-data** (session): `available_drugs`, `available_directorates`, `available_indications`, `available_trusts`, `total_patients`, `last_updated`
**Callback Chain** (unidirectional):
```
Page Load → load_reference_data → reference-data store + header indicators
→ update_app_state → app-state store (default filters)
→ load_pathway_data → chart-data store
├→ update_kpis → KPI cards
└→ update_chart → dcc.Graph
Filter change → update_app_state → app-state → load_pathway_data → (chain above)
Drawer selection → all-drugs-chips/trust-chips → update_app_state → (chain above)
```
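One way to read the chain above is as a pipeline of plain functions, each taking one store's contents and producing the next. The sketch below models that flow without importing Dash. The function names mirror the callbacks listed above, but the bodies, the default filter values, `FAKE_NODES`, `fake_query`, and `run_chain` are assumptions for illustration only, not the app's code.

```python
# Assumed defaults for the app-state store; the real defaults are not given here.
DEFAULT_APP_STATE = {
    "chart_type": "directory",
    "initiated": None,
    "last_seen": None,
    "date_filter_id": None,
    "selected_drugs": [],
    "selected_directorates": [],
    "selected_trusts": [],
}

# Made-up nodes standing in for rows from pathway_nodes.
FAKE_NODES = [
    {"drug": "DRUG A", "patients": 120, "cost": 50000.0},
    {"drug": "DRUG B", "patients": 30, "cost": 12500.0},
]


def update_app_state(current, **changes):
    """Filter change -> new app-state contents (never mutates the input)."""
    return {**DEFAULT_APP_STATE, **(current or {}), **changes}


def fake_query(app_state):
    """Stand-in for pathway_queries.load_pathway_nodes(); totals are fake sums."""
    selected = app_state["selected_drugs"]
    nodes = [n for n in FAKE_NODES if not selected or n["drug"] in selected]
    return {
        "nodes": nodes,
        "unique_patients": sum(n["patients"] for n in nodes),
        "total_drugs": len({n["drug"] for n in nodes}),
        "total_cost": sum(n["cost"] for n in nodes),
    }


def load_pathway_data(app_state, query=fake_query):
    """app-state -> chart-data contents."""
    return query(app_state)


def update_kpis(chart_data):
    """chart-data -> formatted KPI strings."""
    return {
        "patients": f"{chart_data['unique_patients']:,}",
        "drugs": str(chart_data["total_drugs"]),
        "cost": f"{chart_data['total_cost']:,.0f}",
    }


def run_chain(app_state=None, **changes):
    """Run the chain once in the one direction the diagram describes."""
    state = update_app_state(app_state, **changes)
    chart_data = load_pathway_data(state)
    return state, chart_data, update_kpis(chart_data)


if __name__ == "__main__":
    state, data, kpis = run_chain(selected_drugs=["DRUG B"])
    print(kpis)
```

In Dash itself, each of these functions would presumably be registered with `@app.callback`, using `Input`/`Output` on the `data` property of the corresponding `dcc.Store`. Keeping each step a function of the previous store alone is what makes the chain unidirectional.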
**Key Components:**
- **Header** (`header.py`): NHS branding, data freshness indicator (patient count + relative time)
- **Sidebar** (`sidebar.py`): Navigation items with drawer trigger IDs for Drug Selection, Trust Selection, Indications
- **Filter Bar** (`filter_bar.py`): Chart type toggle pills (By Directory / By Indication) + date filter dropdowns
- **KPI Row** (`kpi_row.py`): 4 cards — Unique Patients, Drug Types, Total Cost, Indication Match Rate (~93%)
- **Chart Card** (`chart_card.py`): Icicle chart with `dcc.Loading` spinner, dynamic subtitle, tab row
- **Drawer** (`drawer.py`): `dmc.Drawer` with drug chips (`dmc.ChipGroup`), trust chips, directorate accordion with indication sub-items and drug fragment badges
- **Footer** (`footer.py`): NHS Norfolk and Waveney ICB branding
**Drawer Drug Browser:**
- "All Drugs" section: flat `dmc.ChipGroup` with 42 drugs from pathway_nodes level 3
- "Trusts" section: `dmc.ChipGroup` with 7 trusts
- "By Directorate" section: nested `dmc.Accordion` — 19 directorates → indications → drug fragment `dmc.Badge` items
- Clicking a drug fragment badge selects all full drug names containing that fragment (substring match)
- "Clear All Filters" button resets drug and trust selections
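The fragment-badge behaviour described above, where a fragment selects every full drug name containing it, could be sketched as follows. The source only states that the match is a substring match. The function name, the case-insensitive comparison, the choice to merge with the existing selection rather than replace it, and the sample drug list are all assumptions.

```python
def select_by_fragment(fragment, available_drugs, already_selected=()):
    """Return the selection after clicking a drug-fragment badge (hypothetical).

    Every available drug whose name contains `fragment` is added to the
    current selection; order is preserved and duplicates are skipped.
    """
    needle = fragment.strip().lower()
    merged = list(already_selected)
    if not needle:
        # An empty fragment would match everything, so treat it as a no-op.
        return merged
    for drug in available_drugs:
        if needle in drug.lower() and drug not in merged:
            merged.append(drug)
    return merged


if __name__ == "__main__":
    # Sample list for illustration only, not the app's 42 drugs.
    drugs = ["ADALIMUMAB", "ADALIMUMAB BIOSIMILAR", "RITUXIMAB"]
    print(select_by_fragment("adalimumab", drugs))
```

The returned list is the kind of value that could be written back to the `value` of the "All Drugs" `dmc.ChipGroup`, which then feeds `update_app_state` as in the callback chain.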
### Data Transformations (`data_processing/transforms.py`) ### Data Transformations (`data_processing/transforms.py`)
@@ -317,38 +364,33 @@ Core data transformation functions used by the pipeline:
└──────────────────────────────────────────┘ └──────────────────────────────────────────┘
-[Reflex App: reflex run]
-┌──────────────────────────────────────────┐
-│ Chart Type Toggle (segmented control)    │
-│ → "By Directory" | "By Indication"       │
-│ → Triggers set_chart_type() handler      │
-└──────────────────────────────────────────┘
-┌──────────────────────────────────────────┐
-│ AppState.load_pathway_data()             │
-│ → Query pathway_nodes WHERE              │
-│   date_filter AND chart_type             │
-│ → Apply drug/directory filters           │
-│ → recalculate_parent_totals()            │
-└──────────────────────────────────────────┘
-┌──────────────────────────────────────────┐
-│ AppState.icicle_figure                   │
-│ → Plotly icicle chart                    │
-│ → 10-field customdata structure          │
-│ → Full hover/text templates              │
-└──────────────────────────────────────────┘
-┌──────────────────────────────────────────┐
-│ Reflex UI (rx.plotly component)          │
-│ → <50ms filter response time             │
-│ → Treatment statistics in tooltips       │
-│ → Dynamic hierarchy label updates        │
-└──────────────────────────────────────────┘
+[Dash App: python run_dash.py]
+┌──────────────────────────────────────────┐
+│ Filter Bar + Drawer (toggle pills,       │
+│   date dropdowns, drug/trust chips)      │
+│ → Triggers update_app_state callback     │
+└──────────────────────────────────────────┘
+┌──────────────────────────────────────────┐
+│ load_pathway_data callback               │
+│ → Input: app-state dcc.Store             │
+│ → Calls pathway_queries.load_pathway_    │
+│   nodes() with filters                   │
+│ → Output: chart-data dcc.Store           │
+└──────────────────────────────────────────┘
+          ├──────────────────────────────┐
+          ▼                              ▼
+┌────────────────────┐  ┌──────────────────────┐
+│ update_kpis        │  │ update_chart         │
+│ → 4 KPI cards      │  │ → create_icicle_     │
+│ → formatted        │  │   from_nodes()       │
+│   counts/costs     │  │ → 10-field custom-   │
+└────────────────────┘  │   data + NHS blue    │
+                        │ → dcc.Graph figure   │
+                        └──────────────────────┘
``` ```
### Reference Data Files (`data/`) ### Reference Data Files (`data/`)
@@ -386,7 +428,7 @@ The `department_identification()` function has 5 levels of fallback:
3. Build `indication_df` mapping UPID → Search_Term (matched) or Directorate + " (no GP dx)" (unmatched) 3. Build `indication_df` mapping UPID → Search_Term (matched) or Directorate + " (no GP dx)" (unmatched)
4. Pass to `generate_icicle_chart_indication()` for pathway hierarchy building 4. Pass to `generate_icicle_chart_indication()` for pathway hierarchy building
**Data Source Fallback Chain** (for raw data loading, not used by Reflex app): **Data Source Fallback Chain** (for raw data loading, not used by Dash app):
1. Query cache for recent results 1. Query cache for recent results
2. Attempt Snowflake connection 2. Attempt Snowflake connection
3. Fall back to CSV/Parquet files 3. Fall back to CSV/Parquet files
@@ -491,26 +533,19 @@ The pre-computed pathway architecture introduces these changes:
- **Impact**: Data is as fresh as the last `python -m cli.refresh_pathways` run - **Impact**: Data is as fresh as the last `python -m cli.refresh_pathways` run
- **Benefit**: Sub-50ms filter response time vs multi-minute calculations - **Benefit**: Sub-50ms filter response time vs multi-minute calculations
### State Variables ### State Management (Dash)
- **Removed**: `start_date`, `end_date`, `set_start_date()`, `set_end_date()` - State lives in 3 `dcc.Store` components: `app-state`, `chart-data`, `reference-data`
- **Added**: `selected_initiated`, `selected_last_seen`, `date_filter_id` - Filter state: `chart_type`, `initiated`, `last_seen`, `date_filter_id`, `selected_drugs`, `selected_directorates`, `selected_trusts`
- **Added**: `selected_chart_type` ("directory" or "indication"), `chart_type_options` - Chart type toggle: "By Directory" / "By Indication" pills in filter bar
- **Added**: `set_chart_type()` - switches chart type and reloads data - Dynamic subtitle: "Trust → Directorate → Drug → Pathway" or "Trust → Indication → Drug → Pathway"
- **Added**: `chart_hierarchy_label`, `chart_type_label` - computed vars for dynamic UI text - Drug/trust selection via `dmc.ChipGroup` in right-side drawer
- **Added**: `load_pathway_data()` - queries pre-computed `pathway_nodes` filtered by `date_filter_id` AND `chart_type`
- **Added**: `recalculate_parent_totals()` - adjusts hierarchy after filtering
### Chart Type Toggle
- **New**: Segmented control ("By Directory" | "By Indication") in filter strip
- **Added**: `selected_chart_type` state variable, `set_chart_type()` handler
- **Added**: Dynamic hierarchy label ("Trust → Directorate → ..." or "Trust → Indication → ...")
- **Note**: Directory filter only applies to directory charts; for indication charts the `directory` column stores Search_Terms
### Icicle Chart ### Icicle Chart
- **Enhanced**: Now includes full 10-field customdata structure - Full 10-field customdata structure (value, colour, cost, costpp, first_seen, last_seen, first_seen_parent, last_seen_parent, average_spacing, cost_pp_pa)
- **Added**: Treatment statistics (average_spacing, cost_pp_pa) in hover tooltips - NHS blue gradient colorscale: Heritage Blue #003087 → Pale Blue #E3F2FD
- **Added**: First/last seen dates for drug nodes - Treatment statistics (average_spacing, cost_pp_pa) in hover tooltips
- **Added**: Indication chart uses `generate_icicle_chart_indication()` with Search_Term hierarchy - First/last seen dates for drug nodes
- `create_icicle_from_nodes()` in `src/visualization/plotly_generator.py` — shared function accepting list-of-dicts
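To make the 10-field customdata layout concrete, the sketch below turns node dicts into rows in the field order listed above. It assumes each node dict uses those field names as keys and that a missing field should become `None`; both are guesses about the node format. `build_customdata` is a hypothetical helper, not part of `plotly_generator.py`, and the direction of the two-stop colorscale (which end maps to low values) is also an assumption.

```python
# Field order as listed in the Icicle Chart notes above.
CUSTOMDATA_FIELDS = (
    "value",
    "colour",
    "cost",
    "costpp",
    "first_seen",
    "last_seen",
    "first_seen_parent",
    "last_seen_parent",
    "average_spacing",
    "cost_pp_pa",
)

# Plotly-style colorscale: a list of [position, colour] stops between 0 and 1.
# Heritage Blue at 0.0 and Pale Blue at 1.0 is an assumed orientation.
NHS_BLUE_COLORSCALE = [[0.0, "#003087"], [1.0, "#E3F2FD"]]


def build_customdata(nodes):
    """Return one 10-item row per node, in CUSTOMDATA_FIELDS order."""
    return [[node.get(field) for field in CUSTOMDATA_FIELDS] for node in nodes]


if __name__ == "__main__":
    # Made-up node for illustration.
    sample = [{"value": 10, "cost": 5000.0, "average_spacing": "28 days"}]
    for row in build_customdata(sample):
        print(row)
```

Because Plotly hover templates address customdata by position (for example `%{customdata[2]}` for `cost` in this ordering), keeping the field order in a single tuple avoids the template and the data drifting apart.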
## Development ## Development
+16 -16
@@ -248,12 +248,12 @@ Drawer selection → update_drug_selection → app-state store → load_pathway_
- **Checkpoint**: Header shows correct data freshness - **Checkpoint**: Header shows correct data freshness
### 5.4 Remove Reflex + final validation ### 5.4 Remove Reflex + final validation
- [ ] Remove `reflex` from `pyproject.toml` dependencies - [x] Remove `reflex` from `pyproject.toml` dependencies
- [ ] Delete or archive `pathways_app/` directory (move to `archive/`) - [x] Delete or archive `pathways_app/` directory (move to `archive/`)
- [ ] Delete `pathways_app/styles.py` and any Reflex-specific files - [x] Delete `pathways_app/styles.py` and any Reflex-specific files
- [ ] Update project `CLAUDE.md` to document Dash app structure, new run command, callback architecture - [x] Update project `CLAUDE.md` to document Dash app structure, new run command, callback architecture
- [ ] Verify: `python run_dash.py` starts cleanly, full end-to-end workflow works - [x] Verify: `python run_dash.py` starts cleanly, full end-to-end workflow works
- [ ] Verify: No Reflex imports anywhere in `dash_app/` - [x] Verify: No Reflex imports anywhere in `dash_app/`
- **Checkpoint**: Full application works, no Reflex remnants, CLAUDE.md updated - **Checkpoint**: Full application works, no Reflex remnants, CLAUDE.md updated
--- ---
@@ -261,16 +261,16 @@ Drawer selection → update_drug_selection → app-state store → load_pathway_
## Completion Criteria ## Completion Criteria
All tasks marked `[x]` AND: All tasks marked `[x]` AND:
- [ ] `python run_dash.py` starts cleanly at localhost:8050 - [x] `python run_dash.py` starts cleanly at localhost:8050
- [ ] Layout matches 01_nhs_classic.html (header, sidebar, KPIs, filter bar, chart card, footer) - [x] Layout matches 01_nhs_classic.html (header, sidebar, KPIs, filter bar, chart card, footer)
- [ ] Icicle chart renders with real SQLite data (pathway_nodes) - [x] Icicle chart renders with real SQLite data (pathway_nodes)
- [ ] Date filters + chart type toggle update chart correctly - [x] Date filters + chart type toggle update chart correctly
- [ ] dmc.Drawer opens, shows directorate cards with indications/drugs - [x] dmc.Drawer opens, shows directorate cards with indications/drugs
- [ ] Selecting a drug from drawer filters the chart - [x] Selecting a drug from drawer filters the chart
- [ ] "All Drugs" card allows selecting any drug across all contexts - [x] "All Drugs" card allows selecting any drug across all contexts
- [ ] "Clear Filters" resets all selections - [x] "Clear Filters" resets all selections
- [ ] KPIs update dynamically (patients, drugs, cost) - [x] KPIs update dynamically (patients, drugs, cost)
- [ ] No Reflex imports in `dash_app/` - [x] No Reflex imports in `dash_app/`
--- ---
-1
@@ -13,7 +13,6 @@ dependencies = [
"pillow>=10.0.0", "pillow>=10.0.0",
"plotly>=5.15.0", "plotly>=5.15.0",
"pyarrow>=20.0.0", "pyarrow>=20.0.0",
"reflex>=0.6.0",
"snowflake-connector-python>=3.0.0", "snowflake-connector-python>=3.0.0",
"tomli>=2.0.0", "tomli>=2.0.0",
] ]
Generated
-993
File diff suppressed because it is too large