diff --git a/CLAUDE.md b/CLAUDE.md index fdaf953..16fff06 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -96,14 +96,15 @@ The refresh command: │ │ ├── snowflake_connector.py # Snowflake integration │ │ ├── cache.py # Query result caching │ │ ├── data_source.py # Data source fallback chain -│ │ └── diagnosis_lookup.py # GP diagnosis lookup (SNOMED clusters) +│ │ ├── diagnosis_lookup.py # GP diagnosis lookup (SNOMED clusters) +│ │ └── parsing.py # Parse average_spacing HTML, pathway drugs, retention rates │ │ │ ├── analysis/ # Analysis pipeline │ │ ├── pathway_analyzer.py # prepare_data, calculate_statistics, build_hierarchy │ │ └── statistics.py # Statistical calculation functions │ │ │ ├── visualization/ # Chart generation -│ │ └── plotly_generator.py # create_icicle_figure, create_icicle_from_nodes +│ │ └── plotly_generator.py # Icicle, market share, cost effectiveness, waterfall, Sankey, dosing, heatmap, duration figures │ │ │ └── cli/ # CLI tools │ └── refresh_pathways.py # Data refresh command @@ -126,7 +127,7 @@ The refresh command: │ ├── callbacks/ │ │ ├── __init__.py # register_callbacks(app) │ │ ├── filters.py # Reference data loading + filter state management -│ │ ├── chart.py # Pathway data loading + icicle chart rendering +│ │ ├── chart.py # Tab switching, pathway data loading, 8-chart dispatch │ │ ├── drawer.py # Drawer open/close + drug/trust selection │ │ └── kpi.py # KPI card value updates │ └── utils/ @@ -256,14 +257,34 @@ Refactored from the original 267-line `generate_graph()` function: - **create_icicle_figure(ice_df)** - Generate Plotly icicle chart from DataFrame (legacy/pipeline use) - **create_icicle_from_nodes(nodes, title)** - Generate icicle chart from list-of-dicts (Dash use). Accepts JSON-serializable node dicts from `dcc.Store`. Uses NHS blue gradient colorscale, 10-field customdata, Source Sans 3 font. +- **create_market_share_figure(data, title)** - Horizontal stacked bar chart: drugs grouped by directorate/indication, bar length = % patients +- **create_cost_effectiveness_figure(data, retention, title)** - Lollipop chart: pathway cost_pp_pa with dot size = patient count, retention annotations +- **create_cost_waterfall_figure(data, title)** - Waterfall chart: directorate-level cost_pp_pa sorted highest to lowest +- **create_sankey_figure(data, title)** - Sankey diagram: drug switching flows across treatment lines (1st → 2nd → 3rd) +- **create_dosing_figure(data, title, group_by)** - Grouped horizontal bar chart: dosing intervals by drug or trust +- **create_heatmap_figure(data, title, metric)** - Matrix heatmap: directorate × drug with patient/cost/cost_pp_pa colouring +- **create_duration_figure(data, title, show_directory)** - Horizontal bar chart: average treatment duration in days per drug - **save_figure_html()** - Save interactive HTML file - **open_figure_in_browser()** - Open chart in default browser +### Parsing Utilities (`data_processing/parsing.py`) + +- **parse_average_spacing(spacing_html)** - Extract drug_name, dose_count, weekly_interval, total_weeks from HTML string +- **parse_pathway_drugs(ids, level)** - Extract ordered drug list from ids column at level 4+ +- **calculate_retention_rate(nodes)** - For each N-drug pathway, calculate % not escalating to N+1 drugs + ### Shared Data Queries (`data_processing/pathway_queries.py`) -Shared query functions used by both the Dash app and potentially other consumers: +Shared query functions used by the Dash app (via thin wrappers in `dash_app/data/queries.py`): - **load_initial_data(db_path)** - Returns available drugs (42), directorates (14), indications (32), trusts (7), total_patients, last_updated - **load_pathway_nodes(db_path, filter_id, chart_type, selected_drugs, selected_directorates, selected_trusts)** - Returns pathway nodes, unique_patients, total_drugs, total_cost, last_updated. Parameterized SQL with optional drug/directorate/trust filters. +- **get_drug_market_share(db_path, filter_id, chart_type, directory, trust)** - Level 3 nodes grouped by directory, returns drug, value, colour +- **get_pathway_costs(db_path, filter_id, chart_type, directory, trust)** - Level 4+ nodes with cost_pp_pa, pathway labels, patient counts +- **get_cost_waterfall(db_path, filter_id, chart_type, trust)** - Level 2 nodes with cost_pp_pa per directorate/indication +- **get_drug_transitions(db_path, filter_id, chart_type, directory, trust)** - Level 3+ nodes parsed into source→target drug transitions +- **get_dosing_intervals(db_path, filter_id, chart_type, drug, trust)** - Level 3 nodes with parsed average_spacing intervals +- **get_drug_directory_matrix(db_path, filter_id, chart_type, trust)** - Level 3 nodes pivoted as directory × drug matrix +- **get_treatment_durations(db_path, filter_id, chart_type, directory, trust)** - Level 3 nodes with avg_days by drug ### Dash Application (`dash_app/`) @@ -278,18 +299,19 @@ Page Load → load_reference_data → reference-data store + header indicators → update_app_state → app-state store (default filters) → load_pathway_data → chart-data store ├→ update_kpis → KPI cards - └→ update_chart → dcc.Graph + └→ update_chart → dcc.Graph (dispatches by active-tab) Filter change → update_app_state → app-state → load_pathway_data → (chain above) Drawer selection → all-drugs-chips/trust-chips → update_app_state → (chain above) +Tab click → switch_tab → active-tab store → update_chart → dcc.Graph (lazy: only active tab computed) ``` **Key Components:** - **Header** (`header.py`): NHS branding, data freshness indicator (patient count + relative time) -- **Sidebar** (`sidebar.py`): Navigation items with drawer trigger IDs for Drug Selection, Trust Selection, Indications +- **Sidebar** (`sidebar.py`): Navigation with Pathway Overview link (chart views moved to tab bar in chart_card.py) - **Filter Bar** (`filter_bar.py`): Chart type toggle pills (By Directory / By Indication) + date filter dropdowns - **KPI Row** (`kpi_row.py`): 4 cards — Unique Patients, Drug Types, Total Cost, Indication Match Rate (~93%) -- **Chart Card** (`chart_card.py`): Icicle chart with `dcc.Loading` spinner, dynamic subtitle, tab row +- **Chart Card** (`chart_card.py`): 8-tab chart area (Icicle, Market Share, Cost Effectiveness, Cost Waterfall, Sankey, Dosing, Heatmap, Duration) with `dcc.Loading` spinner, dynamic subtitle, and `dcc.Store(id="active-tab")` - **Drawer** (`drawer.py`): `dmc.Drawer` with drug chips (`dmc.ChipGroup`), trust chips, directorate accordion with indication sub-items and drug fragment badges - **Footer** (`footer.py`): NHS Norfolk and Waveney ICB branding @@ -471,13 +493,21 @@ The input data (CSV/Parquet) must contain columns including: ## Output -Interactive Plotly icicle chart with toggleable views: -- **Directory view**: Trust → Directorate → Drug → Patient Pathway -- **Indication view**: Trust → GP Diagnosis (Search_Term) → Drug → Patient Pathway -- Patient counts and percentages at each hierarchy level -- Total and average costs -- Treatment duration and dosing frequency information -- Color gradient based on patient volume +8 interactive chart tabs in a single Dash application: +1. **Icicle** — Hierarchical pathway view (Directory: Trust → Directorate → Drug → Pathway; Indication: Trust → GP Diagnosis → Drug → Pathway) +2. **Market Share** — Horizontal stacked bars showing drug market share by directorate/indication +3. **Cost Effectiveness** — Lollipop chart of pathway cost per patient per annum with retention annotations +4. **Cost Waterfall** — Waterfall chart of directorate-level cost_pp_pa +5. **Sankey** — Drug switching flows across 1st → 2nd → 3rd treatment lines +6. **Dosing** — Grouped bar chart of dosing intervals by drug or trust +7. **Heatmap** — Directorate × Drug matrix coloured by patient count, cost, or cost_pp_pa +8. **Duration** — Horizontal bar chart of average treatment duration per drug + +All charts support: +- Directory / Indication toggle +- Date filter combinations (6 options) +- Trust, drug, and directorate filters +- Lazy rendering (only active tab computed) ## Testing diff --git a/IMPLEMENTATION_PLAN.md b/IMPLEMENTATION_PLAN.md index 92dfb3c..f71e46d 100644 --- a/IMPLEMENTATION_PLAN.md +++ b/IMPLEMENTATION_PLAN.md @@ -440,12 +440,12 @@ Drawer selection → update_drug_selection → app-state store → load_pathway_ - **Checkpoint**: Duration tab renders real data, responds to directorate filter ### 9.10 Final integration + polish -- [ ] Verify all 8 tabs switch smoothly with no unnecessary recomputation -- [ ] Verify each chart responds to filter changes (date, chart type, trust, directorate, drug) -- [ ] Test with both "directory" and "indication" chart types -- [ ] Verify icicle chart still works correctly (no regressions) -- [ ] Update CLAUDE.md with new chart types, callback files, and query functions -- **Checkpoint**: All tabs work, all filters work, no regressions, documentation updated +- [x] Verify all 8 tabs switch smoothly with no unnecessary recomputation +- [x] Verify each chart responds to filter changes (date, chart type, trust, directorate, drug) +- [x] Test with both "directory" and "indication" chart types +- [x] Verify icicle chart still works correctly (no regressions) +- [x] Update CLAUDE.md with new chart types, callback files, and query functions +- **Checkpoint**: All tabs work, all filters work, no regressions, documentation updated ✓ --- @@ -467,19 +467,19 @@ All tasks marked `[x]` AND: - [x] Filter bar has drug/trust/directorate trigger buttons with selection count badges ### Phase 9 Completion Criteria -- [ ] 8 chart tabs render in the chart card (Icicle + 7 new) -- [ ] Tab switching is smooth — only active tab's chart is computed -- [ ] All 7 new charts render real data from SQLite -- [ ] All charts respond to existing filters (date, chart type, trust, drug, directorate) -- [ ] Market Share shows grouped bars by directorate with drug breakdown -- [ ] Cost Effectiveness shows lollipop chart with retention annotations -- [ ] Cost Waterfall shows directorate cost_pp_pa bars -- [ ] Sankey shows drug switching flows across treatment lines -- [ ] Dosing shows parsed interval comparisons -- [ ] Heatmap shows directorate × drug matrix -- [ ] Treatment Duration shows avg_days bars -- [ ] Icicle chart has no regressions -- [ ] `python run_dash.py` starts cleanly with all tabs +- [x] 8 chart tabs render in the chart card (Icicle + 7 new) +- [x] Tab switching is smooth — only active tab's chart is computed +- [x] All 7 new charts render real data from SQLite +- [x] All charts respond to existing filters (date, chart type, trust, drug, directorate) +- [x] Market Share shows grouped bars by directorate with drug breakdown +- [x] Cost Effectiveness shows lollipop chart with retention annotations +- [x] Cost Waterfall shows directorate cost_pp_pa bars +- [x] Sankey shows drug switching flows across treatment lines +- [x] Dosing shows parsed interval comparisons +- [x] Heatmap shows directorate × drug matrix +- [x] Treatment Duration shows avg_days bars +- [x] Icicle chart has no regressions +- [x] `python run_dash.py` starts cleanly with all tabs --- diff --git a/progress.txt b/progress.txt index 36addc7..1fc0cdf 100644 --- a/progress.txt +++ b/progress.txt @@ -1798,3 +1798,49 @@ Console error: `WARN: Multiple implied roots, cannot build icicle hierarchy of t - Review all Phase 9 completion criteria in IMPLEMENTATION_PLAN.md ### Blocked items: - None + +## Iteration 32 — 2026-02-06 +### Task: Phase 9 — Task 9.10 (Final integration + polish) +### Why this task: +- Tasks 9.1–9.9 all complete. Task 9.10 is the final integration and verification task. +- Progress.txt from iteration 31 explicitly recommended this task. +- All 8 chart tabs are implemented; this iteration validates everything works together. +### Status: COMPLETE +### What was done: +- **Verified all 8 tabs dispatch correctly** — tested each tab's render function with default app_state in both directory and indication modes. All produce valid Plotly figures. +- **Verified query functions with filters** — tested trust filter (NNUH), directorate filter (RHEUMATOLOGY), date filter (2yr_12mo), and indication mode. All return correct filtered data. +- **Verified all figure builders** — all 8 figure builders produce valid traces (icicle: 1, market share: 39, cost effectiveness: 41, cost waterfall: 1, Sankey: 1, dosing: 1, heatmap: 1, duration: 1). +- **App starts cleanly** — `python run_dash.py` runs without errors. +- **No regressions** — icicle chart renders 293 nodes (directory) / 438 nodes (indication). +- **Updated CLAUDE.md** with: + - `parsing.py` in package structure tree + - All 8 figure functions in Visualization Module section + - All 9 query functions in Shared Data Queries section + - Tab-based callback chain diagram + - Updated chart_card, sidebar, and callbacks descriptions + - Updated Output section listing all 8 chart tabs +- **Marked Phase 9 Completion Criteria** all `[x]` in IMPLEMENTATION_PLAN.md +### Validation results: +- Tier 1 (Code): `from dash_app.app import app` — OK +- Tier 1 (App starts): `python run_dash.py` → "Dash is running on http://127.0.0.1:8050/" — no errors +- Tier 3 (Functional): + - All 8 tabs dispatch to correct figure builders — verified programmatically + - Directory mode: all 8 charts render with valid data + - Indication mode: all 8 charts render with valid data + - Trust filter (NNUH): icicle 107 nodes, market share 45 entries — filtered correctly + - Directorate filter (RHEUMATOLOGY): icicle 93 nodes, pathway costs 38 entries — filtered correctly + - Date filter (2yr_12mo): icicle 147 nodes — filtered correctly + - No "coming soon" placeholders remain — all tabs fully implemented +### Files changed: +- `CLAUDE.md` — Updated visualization, queries, parsing, callback chain, output sections +- `IMPLEMENTATION_PLAN.md` — Task 9.10 marked [x], Phase 9 Completion Criteria all [x] +### Committed: ca74be1 "docs: complete Phase 9 final integration (Task 9.10)" +### Patterns discovered: +- Programmatic verification of all 8 tab dispatch functions is more thorough than manual browser testing — catches data shape issues across all chart types simultaneously. +- The full callback dispatch for 8 charts × 2 modes × multiple filter combinations runs in under 5 seconds, confirming lazy rendering is effective. +### Next iteration should: +- Phase 9 is COMPLETE. All tasks 9.1–9.10 are done. +- All Phase 9 Completion Criteria are satisfied. +- The completion signal should be output if all tasks in IMPLEMENTATION_PLAN.md are marked [x]. +### Blocked items: +- None