docs: update CLAUDE.md with indication chart architecture and CLI docs (Task 5.2)

Andrew Charlwood
2026-02-05 20:49:49 +00:00
parent 4884e0a8cc
commit fed909481e
3 changed files with 185 additions and 44 deletions
+136 -35
@@ -4,14 +4,15 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## Project Overview ## Project Overview
NHS High-Cost Drug Patient Pathway Analysis Tool - a web-based application that analyzes secondary care patient treatment pathways. It processes clinical activity data to visualize hierarchical treatment patterns (Trust → Directory/Specialty → Drug → Patient pathway) as interactive Plotly icicle charts. NHS High-Cost Drug Patient Pathway Analysis Tool - a web-based application that analyzes secondary care patient treatment pathways. It processes clinical activity data to visualize hierarchical treatment patterns as interactive Plotly icicle charts.
**Key Features:** **Key Features:**
- Multi-source data loading: CSV/Parquet files, SQLite database, Snowflake data warehouse - **Dual chart types**: Directory-based (Trust → Directory → Drug → Pathway) and Indication-based (Trust → GP Diagnosis → Drug → Pathway) views with toggle
- **Pre-computed pathway architecture**: Treatment pathways pre-processed and stored in SQLite for instant filtering - **Pre-computed pathway architecture**: Treatment pathways pre-processed and stored in SQLite for instant filtering
- GP diagnosis integration for indication validation via SNOMED clusters - **GP diagnosis matching**: Patient indications matched from GP records using SNOMED cluster codes queried directly from Snowflake (~93% match rate)
- Multi-source data loading: CSV/Parquet files, SQLite database, Snowflake data warehouse
- Interactive browser-based UI using Reflex framework - Interactive browser-based UI using Reflex framework
- 6 pre-defined date filter combinations with sub-50ms response times - 6 pre-defined date filter combinations × 2 chart types = 12 pre-computed datasets with sub-50ms response times
## Running the Application ## Running the Application
@@ -37,11 +38,17 @@ The application requires Python 3.10+ and runs on http://localhost:3000 by defau
**Refresh Pathway Data:** **Refresh Pathway Data:**
```bash ```bash
# Full refresh with default filters (all trusts, default drugs) # Full refresh — both chart types (directory + indication), all date filters
python -m cli.refresh_pathways python -m cli.refresh_pathways --chart-type all
# Directory charts only (faster, skips GP diagnosis lookup)
python -m cli.refresh_pathways --chart-type directory
# Indication charts only
python -m cli.refresh_pathways --chart-type indication
# Dry run (test without database changes) # Dry run (test without database changes)
python -m cli.refresh_pathways --dry-run -v python -m cli.refresh_pathways --chart-type all --dry-run -v
# Custom minimum patient threshold # Custom minimum patient threshold
python -m cli.refresh_pathways --minimum-patients 10 python -m cli.refresh_pathways --minimum-patients 10
@@ -50,11 +57,17 @@ python -m cli.refresh_pathways --minimum-patients 10
python -m cli.refresh_pathways --help python -m cli.refresh_pathways --help
``` ```
The `--chart-type` argument controls which pathway types are processed:
- `all` (default) — generates both directory and indication charts (~15 minutes)
- `directory` — directory-based charts only (~5 minutes)
- `indication` — indication-based charts only (~12 minutes, includes GP lookup)
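A minimal sketch of how the `--chart-type` argument could be parsed and expanded into the list of chart types to process. It assumes an `argparse`-based CLI; the helper names `build_parser` and `resolve_chart_types` are hypothetical, and the real `cli.refresh_pathways` module may be organised differently.

```python
import argparse

# Chart types named in this document.
CHART_TYPES = ("directory", "indication")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the examples above."""
    parser = argparse.ArgumentParser(prog="cli.refresh_pathways")
    parser.add_argument(
        "--chart-type",
        choices=("all", *CHART_TYPES),
        default="all",  # documented default: both chart types
    )
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("-v", dest="verbose", action="store_true")
    # The real default threshold is not stated here, so none is assumed.
    parser.add_argument("--minimum-patients", type=int, default=None)
    return parser


def resolve_chart_types(chart_type: str) -> list[str]:
    """Expand 'all' into every chart type; otherwise return the single choice."""
    return list(CHART_TYPES) if chart_type == "all" else [chart_type]
```

With this shape, `resolve_chart_types(args.chart_type)` gives the pipeline a plain list to loop over, so the `directory` run can skip the GP diagnosis lookup entirely.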
The refresh command: The refresh command:
1. Fetches activity data from Snowflake (656K+ records, ~7 seconds) 1. Fetches activity data from Snowflake (656K+ records, ~7 seconds)
2. Applies UPID, drug name, and directory transformations (~6 minutes) 2. Applies UPID, drug name, and directory transformations (~6 minutes)
3. Processes 6 date filter combinations (all_6mo, all_12mo, 1yr_6mo, etc.) 3. For indication charts: queries GP records via SNOMED clusters (~9 minutes for 37K patients)
4. Inserts pathway nodes to SQLite for fast Reflex filtering 4. Processes 6 date filter combinations × selected chart types
5. Inserts pathway nodes to SQLite for fast Reflex filtering
## Architecture ## Architecture
@@ -121,6 +134,15 @@ The application uses a pre-computed pathway architecture for performance:
- **Simplicity**: Reflex filters pre-computed data with simple SQL WHERE clauses - **Simplicity**: Reflex filters pre-computed data with simple SQL WHERE clauses
- **Full Pathways**: Sequential treatment pathways (drug_0 → drug_1 → drug_2...) with statistics - **Full Pathways**: Sequential treatment pathways (drug_0 → drug_1 → drug_2...) with statistics
**Chart Types:**
| Type | Hierarchy | Level 2 Source |
|------|-----------|----------------|
| `directory` | Trust → Directory → Drug → Pathway | Assigned directorate (5-level fallback) |
| `indication` | Trust → GP Diagnosis → Drug → Pathway | SNOMED cluster Search_Term from GP records |
For indication charts, ~93% of patients are matched to a GP diagnosis (Search_Term). Unmatched patients use their directorate as a fallback label (e.g., "RHEUMATOLOGY (no GP dx)").
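The fallback labelling described above can be sketched in plain Python. The function names and the dict-based inputs are assumptions made for illustration (the real pipeline works on DataFrames), and the Search_Term value in the usage note is a made-up example.

```python
from typing import Optional

# Suffix taken from the example label "RHEUMATOLOGY (no GP dx)".
NO_GP_SUFFIX = " (no GP dx)"


def indication_label(search_term: Optional[str], directorate: str) -> str:
    """Return the GP Search_Term if matched, else the directorate fallback label."""
    if search_term:
        return search_term
    return f"{directorate}{NO_GP_SUFFIX}"


def build_indication_map(
    directorates: dict[str, str], matches: dict[str, str]
) -> dict[str, str]:
    """Map each UPID to its level-2 label for the indication chart.

    directorates: UPID -> assigned directorate (hypothetical input shape)
    matches:      UPID -> Search_Term for patients with a GP match
    """
    return {
        upid: indication_label(matches.get(upid), directorate)
        for upid, directorate in directorates.items()
    }
```

For example, `build_indication_map({"A1": "RHEUMATOLOGY", "B2": "RHEUMATOLOGY"}, {"A1": "rheumatoid arthritis"})` labels `A1` with its Search_Term and `B2` with `"RHEUMATOLOGY (no GP dx)"`.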
**Date Filter Combinations:** **Date Filter Combinations:**
| ID | Initiated | Last Seen | Default | | ID | Initiated | Last Seen | Default |
|----|-----------|-----------|---------| |----|-----------|-----------|---------|
@@ -131,14 +153,18 @@ The application uses a pre-computed pathway architecture for performance:
| `2yr_6mo` | Last 2 years | Last 6 months | No | | `2yr_6mo` | Last 2 years | Last 6 months | No |
| `2yr_12mo` | Last 2 years | Last 12 months | No | | `2yr_12mo` | Last 2 years | Last 12 months | No |
Total pre-computed datasets: 6 date filters × 2 chart types = 12 datasets (~3,600 pathway nodes).
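As a quick illustration of the 6 × 2 arithmetic, the dataset keys can be enumerated with `itertools.product`. The ID `1yr_12mo` is inferred from the naming pattern of the other five IDs and should be treated as an assumption.

```python
from itertools import product

# Five of these IDs appear in this document; "1yr_12mo" is inferred
# from the naming pattern and is an assumption.
DATE_FILTER_IDS = [
    "all_6mo", "all_12mo",
    "1yr_6mo", "1yr_12mo",
    "2yr_6mo", "2yr_12mo",
]
CHART_TYPES = ["directory", "indication"]

# One pre-computed dataset per (date filter, chart type) pair.
DATASETS = list(product(DATE_FILTER_IDS, CHART_TYPES))
```

Each pair corresponds to one `(date_filter_id, chart_type)` slice of the `pathway_nodes` table.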
**Pathway Node Structure:** **Pathway Node Structure:**
Each node in `pathway_nodes` contains: Each node in `pathway_nodes` contains:
- Hierarchy: `parents`, `ids`, `labels`, `level` (0=Root, 1=Trust, 2=Directory, 3=Drug, 4+=Pathway) - Routing: `chart_type` ("directory" or "indication"), `date_filter_id`
- Hierarchy: `parents`, `ids`, `labels`, `level` (0=Root, 1=Trust, 2=Directory/Indication, 3=Drug, 4+=Pathway)
- Counts: `value` (patient count) - Counts: `value` (patient count)
- Costs: `cost`, `costpp`, `cost_pp_pa` (per patient per annum) - Costs: `cost`, `costpp`, `cost_pp_pa` (per patient per annum)
- Dates: `first_seen`, `last_seen`, `first_seen_parent`, `last_seen_parent` - Dates: `first_seen`, `last_seen`, `first_seen_parent`, `last_seen_parent`
- Statistics: `average_spacing`, `average_administered`, `avg_days` - Statistics: `average_spacing`, `average_administered`, `avg_days`
- Denormalized: `trust_name`, `directory`, `drug_sequence` (for efficient filtering) - Denormalized: `trust_name`, `directory`, `drug_sequence` (for efficient filtering)
- Unique constraint: `UNIQUE(date_filter_id, chart_type, ids)`
### Core Module (`core/`) ### Core Module (`core/`)
@@ -166,9 +192,14 @@ Each node in `pathway_nodes` contains:
- `DATE_FILTER_CONFIGS` - All 6 pre-defined date combinations - `DATE_FILTER_CONFIGS` - All 6 pre-defined date combinations
- `compute_date_ranges(config, max_date)` - Computes actual ISO dates from config - `compute_date_ranges(config, max_date)` - Computes actual ISO dates from config
- `fetch_and_transform_data()` - Snowflake fetch + UPID/drug/directory transformations - `fetch_and_transform_data()` - Snowflake fetch + UPID/drug/directory transformations
- Directory chart functions:
- `process_pathway_for_date_filter()` - Processes single date filter using `generate_icicle_chart()` - `process_pathway_for_date_filter()` - Processes single date filter using `generate_icicle_chart()`
- `extract_denormalized_fields()` - Parses `ids` column to extract trust, directory, drug_sequence - `extract_denormalized_fields()` - Parses `ids` column to extract trust, directory, drug_sequence
- `convert_to_records()` - Converts ice_df to list of dicts for SQLite insertion - Indication chart functions:
- `process_indication_pathway_for_date_filter()` - Processes single date filter using `generate_icicle_chart_indication()`
- `extract_indication_fields()` - Parses `ids` for indication charts (trust, search_term, drug_sequence)
- Shared functions:
- `convert_to_records(ice_df, chart_type)` - Converts ice_df to list of dicts with `chart_type` column
- `process_all_date_filters()` - Convenience function to process all 6 filters - `process_all_date_filters()` - Convenience function to process all 6 filters
**Data Loaders:** **Data Loaders:**
@@ -182,20 +213,25 @@ Each node in `pathway_nodes` contains:
- Query caching with TTL-based invalidation - Query caching with TTL-based invalidation
- Fallback chain: cache → Snowflake → local files - Fallback chain: cache → Snowflake → local files
**GP Diagnosis Validation:** **GP Diagnosis Lookup (`diagnosis_lookup.py`):**
- Uses pre-built SNOMED clusters from `ClinicalCodingClusterSnomedCodes` - `CLUSTER_MAPPING_SQL` - Embedded SQL constant with ~148 Search_Term → Cluster_ID mappings plus explicit SNOMED codes
- `patient_has_indication(patient_pseudonym, cluster_ids)` checks GP records - `get_patient_indication_groups(patient_pseudonyms)` - Batch queries Snowflake to match patients to GP diagnoses:
- `validate_indication(patient_pseudonym, drug_name)` returns full validation result - Embeds cluster mapping as CTE, joins with `PrimaryCareClinicalCoding`
- Adds `Indication_Source` column: "GP_SNOMED" | "HCD_SNOMED" | "NONE" - Uses `PseudoNHSNoLinked` (not PersonKey) to match `PatientPseudonym` in GP records
- Returns most recent match per patient via `QUALIFY ROW_NUMBER()`
- Batches 500 patients per query, returns DataFrame with PatientPseudonym, Search_Term, EventDateTime
- `patient_has_indication(patient_pseudonym, cluster_ids)` - Single-patient GP record check (legacy)
- `validate_indication(patient_pseudonym, drug_name)` - Full validation result with source tracking (legacy)
### Analysis Module (`analysis/`) ### Analysis Module (`analysis/`)
Refactored from the original 267-line `generate_graph()` function: Refactored from the original 267-line `generate_graph()` function:
- **prepare_data()** - Filter DataFrame by date range, trusts, drugs, directories - **prepare_data()** - Filter DataFrame by date range, trusts, drugs, directories (copies df to prevent mutation)
- **calculate_statistics()** - Compute frequency, cost, duration statistics - **calculate_statistics()** - Compute frequency, cost, duration statistics
- **build_hierarchy()** - Create Trust → Directory → Drug → Pathway structure - **build_hierarchy()** - Create Trust → Directory → Drug → Pathway structure
- **prepare_chart_data()** - Format data for Plotly icicle chart - **prepare_chart_data()** - Format data for Plotly icicle chart
- **generate_icicle_chart_indication(df, indication_df, ...)** - Build indication-based hierarchy using Search_Term instead of Directory. Takes an `indication_df` (UPID → Search_Term mapping) alongside the main activity DataFrame.
### Visualization Module (`visualization/`) ### Visualization Module (`visualization/`)
@@ -205,12 +241,20 @@ Refactored from the original 267-line `generate_graph()` function:
### Reflex Application (`pathways_app/`) ### Reflex Application (`pathways_app/`)
The `State` class manages all application state: The `AppState` class manages all application state:
- **Chart type**: `selected_chart_type` ("directory" or "indication"), toggled via `set_chart_type()`
- **Computed vars**: `chart_hierarchy_label` (dynamic "Trust → Directorate → ..." or "Trust → Indication → ..."), `chart_type_label`
- Filter variables: dates, drugs, trusts, directories - Filter variables: dates, drugs, trusts, directories
- Reference data: available options loaded from CSV/SQLite - Reference data: available options loaded from CSV/SQLite
- Analysis state: running flag, status messages, chart data - Analysis state: running flag, status messages, chart data
- Data source state: file path, source type, row counts - Data source state: file path, source type, row counts
**Chart Type Toggle** (`chart_type_toggle()` component):
- Segmented control with "By Directory" and "By Indication" pill buttons
- Placed first in the filter strip before date filters
- Switching reloads pathway data from SQLite filtered by `chart_type`
- Note: Directory filter only applies to directory charts (indication charts store Search_Terms in the directory column)
### Legacy Modules (`tools/`) ### Legacy Modules (`tools/`)
Still used during transition: Still used during transition:
@@ -227,7 +271,7 @@ Still used during transition:
**Pre-Computed Pathway Architecture (Current):** **Pre-Computed Pathway Architecture (Current):**
``` ```
[CLI: python -m cli.refresh_pathways] [CLI: python -m cli.refresh_pathways --chart-type all]
Snowflake Data Warehouse Snowflake Data Warehouse
@@ -239,19 +283,42 @@ Still used during transition:
│ → department_identification() → Dir │ │ → department_identification() → Dir │
└──────────────────────────────────────────┘ └──────────────────────────────────────────┘
▼ (process_all_date_filters) ├─── Directory Charts ──────────────────────────────────────┐
┌──────────────────────────────────────────┐ │ │
│ Pathway Pipeline (pathway_pipeline.py) │ ┌──────────────────────────────────────────┐
For each of 6 date filter combos: │ For each of 6 date filter combos:
│ → generate_icicle_chart() │ → generate_icicle_chart()
│ → extract_denormalized_fields() │ → extract_denormalized_fields()
│ → convert_to_records() │ → convert_to_records("directory")
└──────────────────────────────────────────┘ └──────────────────────────────────────────┘
│ │
├─── Indication Charts ─────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ GP Diagnosis Lookup (diagnosis_lookup.py)│ │
│ │ → Extract PseudoNHSNoLinked from HCD │ │
│ │ → get_patient_indication_groups() │ │
│ │ (SNOMED cluster CTE + GP records) │ │
│ │ → Build indication_df: UPID → Search │ │
│ │ Term (matched) or Directorate (no GP)│ │
│ └──────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ For each of 6 date filter combos: │ │
│ │ → generate_icicle_chart_indication() │ │
│ │ → extract_indication_fields() │ │
│ │ → convert_to_records("indication") │ │
│ └──────────────────────────────────────────┘ │
│ │
└───────────────────────┬───────────────────────────────────┘
▼ (insert_pathway_records) ▼ (insert_pathway_records)
┌──────────────────────────────────────────┐ ┌──────────────────────────────────────────┐
│ SQLite: pathway_nodes table │ │ SQLite: pathway_nodes table │
│ → 293 nodes for all_6mo filter │ → ~3,600 nodes across 12 datasets
│ → UNIQUE(date_filter_id, chart_type, │
│ ids) prevents cross-type overwrites │
│ → Indexed for fast filtering │ │ → Indexed for fast filtering │
└──────────────────────────────────────────┘ └──────────────────────────────────────────┘
@@ -259,8 +326,16 @@ Still used during transition:
[Reflex App: reflex run] [Reflex App: reflex run]
┌──────────────────────────────────────────┐ ┌──────────────────────────────────────────┐
│ Chart Type Toggle (segmented control) │
│ → "By Directory" | "By Indication" │
│ → Triggers set_chart_type() handler │
└──────────────────────────────────────────┘
┌──────────────────────────────────────────┐
│ AppState.load_pathway_data() │ │ AppState.load_pathway_data() │
│ → Query pathway_nodes WHERE date_filter │ → Query pathway_nodes WHERE
│ date_filter AND chart_type │
│ → Apply drug/directory filters │ │ → Apply drug/directory filters │
│ → recalculate_parent_totals() │ │ → recalculate_parent_totals() │
└──────────────────────────────────────────┘ └──────────────────────────────────────────┘
@@ -278,6 +353,7 @@ Still used during transition:
│ Reflex UI (rx.plotly component) │ │ Reflex UI (rx.plotly component) │
│ → <50ms filter response time │ │ → <50ms filter response time │
│ → Treatment statistics in tooltips │ │ → Treatment statistics in tooltips │
│ → Dynamic hierarchy label updates │
└──────────────────────────────────────────┘ └──────────────────────────────────────────┘
``` ```
@@ -339,7 +415,17 @@ The `department_identification()` function has 5 levels of fallback:
4. **UPID_INFERENCE** - Inferred from other records with same UPID 4. **UPID_INFERENCE** - Inferred from other records with same UPID
5. **UNDEFINED** - No directory could be determined 5. **UNDEFINED** - No directory could be determined
**Indication Validation Workflow:** **Indication Lookup Workflow (for indication charts):**
1. Extract unique `PseudoNHSNoLinked` values from HCD activity data
2. Query Snowflake in batches of 500 patients:
- Embed `CLUSTER_MAPPING_SQL` (~148 Search_Term → Cluster_ID mappings) as CTE
- Join `ClinicalCodingClusterSnomedCodes` to get SNOMED codes per cluster
- Join `PrimaryCareClinicalCoding` on `PatientPseudonym` = `PseudoNHSNoLinked`
- Use `QUALIFY ROW_NUMBER() OVER (PARTITION BY PatientPseudonym ORDER BY EventDateTime DESC) = 1` for most recent match
3. Build `indication_df` mapping UPID → Search_Term (matched) or Directorate + " (no GP dx)" (unmatched)
4. Pass to `generate_icicle_chart_indication()` for pathway hierarchy building
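The batching and "most recent match per patient" steps above can be sketched without a Snowflake connection. This is an illustrative pure-Python equivalent, not the project's code: `batched` and `most_recent_match` are hypothetical helpers, and the row tuples stand in for the `PatientPseudonym`, `Search_Term`, `EventDateTime` columns returned by the query.

```python
from datetime import datetime
from typing import Iterable, Iterator

BATCH_SIZE = 500  # patients per Snowflake query, as documented above


def batched(items: list[str], size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` patient pseudonyms."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def most_recent_match(
    rows: Iterable[tuple[str, str, datetime]]
) -> dict[str, str]:
    """Keep the Search_Term with the latest EventDateTime per patient.

    Mirrors the intent of QUALIFY ROW_NUMBER() ... ORDER BY EventDateTime DESC = 1.
    On exact timestamp ties the first row seen is kept (an arbitrary choice).
    """
    latest: dict[str, tuple[str, datetime]] = {}
    for pseudonym, term, event_dt in rows:
        current = latest.get(pseudonym)
        if current is None or event_dt > current[1]:
            latest[pseudonym] = (term, event_dt)
    return {pseudonym: term for pseudonym, (term, _) in latest.items()}
```

At 500 patients per batch, the roughly 37K patients mentioned earlier would need about 74 queries, which is consistent with the GP lookup dominating the indication refresh time.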
**Indication Validation Workflow (legacy, per-patient):**
1. Map drug → SNOMED cluster IDs (e.g., ADALIMUMAB → RARTH_COD, PSORIASIS_COD) 1. Map drug → SNOMED cluster IDs (e.g., ADALIMUMAB → RARTH_COD, PSORIASIS_COD)
2. Get all SNOMED codes for those clusters 2. Get all SNOMED codes for those clusters
3. Check GP records (PrimaryCareClinicalCoding) for matching codes 3. Check GP records (PrimaryCareClinicalCoding) for matching codes
@@ -369,18 +455,20 @@ The `department_identification()` function has 5 levels of fallback:
### File Tracking ### File Tracking
- `processed_files` - Hash-based tracking for incremental loading - `processed_files` - Hash-based tracking for incremental loading
### Pathway Tables (New) ### Pathway Tables
- `pathway_date_filters` - 6 pre-defined date filter combinations - `pathway_date_filters` - 6 pre-defined date filter combinations
- Columns: `id`, `initiated`, `last_seen`, `is_default`, `description` - Columns: `id`, `initiated`, `last_seen`, `is_default`, `description`
- Auto-populated via migration - Auto-populated via migration
- `pathway_nodes` - Pre-computed pathway hierarchy nodes - `pathway_nodes` - Pre-computed pathway hierarchy nodes (~3,600 rows for 12 datasets)
- Routing: `chart_type` ("directory" or "indication"), `date_filter_id`
- Hierarchy: `parents`, `ids`, `labels`, `level` - Hierarchy: `parents`, `ids`, `labels`, `level`
- Metrics: `value`, `cost`, `costpp`, `cost_pp_pa`, `colour` - Metrics: `value`, `cost`, `costpp`, `cost_pp_pa`, `colour`
- Dates: `first_seen`, `last_seen`, `first_seen_parent`, `last_seen_parent` - Dates: `first_seen`, `last_seen`, `first_seen_parent`, `last_seen_parent`
- Statistics: `average_spacing`, `average_administered`, `avg_days` - Statistics: `average_spacing`, `average_administered`, `avg_days`
- Denormalized: `trust_name`, `directory`, `drug_sequence` - Denormalized: `trust_name`, `directory`, `drug_sequence`
- Foreign key: `date_filter_id` → `pathway_date_filters.id` - Foreign key: `date_filter_id` → `pathway_date_filters.id`
- Indexed for: date_filter_id, trust_name, directory, level - Unique constraint: `UNIQUE(date_filter_id, chart_type, ids)` — critical for INSERT OR REPLACE correctness
- Indexed for: date_filter_id, chart_type, trust_name, directory, level
- `pathway_refresh_log` - Tracks data refresh status - `pathway_refresh_log` - Tracks data refresh status
- Columns: `refresh_id`, `started_at`, `completed_at`, `status`, `records_processed`, `error_message` - Columns: `refresh_id`, `started_at`, `completed_at`, `status`, `records_processed`, `error_message`
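To show why `chart_type` belongs in the unique constraint, here is a small self-contained demonstration using an in-memory SQLite database. The table is cut down to four columns and the `ids` value `"root"` is a placeholder, so this is a sketch of the behaviour rather than the real schema.

```python
import sqlite3


def demo_unique_constraint() -> list[tuple[str, int]]:
    """Insert nodes for both chart types that share the same `ids` value."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for pathway_nodes (most columns omitted).
    conn.execute(
        """
        CREATE TABLE pathway_nodes (
            date_filter_id TEXT NOT NULL,
            chart_type     TEXT NOT NULL,
            ids            TEXT NOT NULL,
            value          INTEGER,
            UNIQUE(date_filter_id, chart_type, ids)
        )
        """
    )
    rows = [
        ("all_6mo", "directory", "root", 100),
        ("all_6mo", "indication", "root", 100),  # same ids, different chart type
        ("all_6mo", "directory", "root", 120),   # re-run: replaces the first row
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO pathway_nodes VALUES (?, ?, ?, ?)", rows
    )
    result = conn.execute(
        "SELECT chart_type, value FROM pathway_nodes ORDER BY chart_type"
    ).fetchall()
    conn.close()
    return result
```

Both chart types keep their own row, and only the repeated directory row is replaced. If the constraint covered only `(date_filter_id, ids)`, the indication insert would have overwritten the directory node.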
@@ -388,6 +476,7 @@ The `department_identification()` function has 5 levels of fallback:
The input data (CSV/Parquet) must contain columns including: The input data (CSV/Parquet) must contain columns including:
- `Provider Code`, `PersonKey` - Used to create UPID - `Provider Code`, `PersonKey` - Used to create UPID
- `PseudoNHSNoLinked` - NHS pseudonym for GP record matching (indication charts)
- `Drug Name`, `Intervention Date`, `Price Actual` - `Drug Name`, `Intervention Date`, `Price Actual`
- `OrganisationName` - `OrganisationName`
- Various `Additional Detail/Description` columns for directory extraction - Various `Additional Detail/Description` columns for directory extraction
@@ -395,7 +484,9 @@ The input data (CSV/Parquet) must contain columns including:
## Output ## Output
Interactive Plotly icicle chart showing: Interactive Plotly icicle chart with toggleable views:
- **Directory view**: Trust → Directorate → Drug → Patient Pathway
- **Indication view**: Trust → GP Diagnosis (Search_Term) → Drug → Patient Pathway
- Patient counts and percentages at each hierarchy level - Patient counts and percentages at each hierarchy level
- Total and average costs - Total and average costs
- Treatment duration and dosing frequency information - Treatment duration and dosing frequency information
@@ -458,13 +549,23 @@ The pre-computed pathway architecture introduces these changes:
### State Variables ### State Variables
- **Removed**: `start_date`, `end_date`, `set_start_date()`, `set_end_date()` - **Removed**: `start_date`, `end_date`, `set_start_date()`, `set_end_date()`
- **Added**: `selected_initiated`, `selected_last_seen`, `date_filter_id` - **Added**: `selected_initiated`, `selected_last_seen`, `date_filter_id`
- **Added**: `load_pathway_data()` - queries pre-computed `pathway_nodes` - **Added**: `selected_chart_type` ("directory" or "indication"), `chart_type_options`
- **Added**: `set_chart_type()` - switches chart type and reloads data
- **Added**: `chart_hierarchy_label`, `chart_type_label` - computed vars for dynamic UI text
- **Added**: `load_pathway_data()` - queries pre-computed `pathway_nodes` filtered by `date_filter_id` AND `chart_type`
- **Added**: `recalculate_parent_totals()` - adjusts hierarchy after filtering - **Added**: `recalculate_parent_totals()` - adjusts hierarchy after filtering
### Chart Type Toggle
- **New**: Segmented control ("By Directory" | "By Indication") in filter strip
- **Added**: `selected_chart_type` state variable, `set_chart_type()` handler
- **Added**: Dynamic hierarchy label ("Trust → Directorate → ..." or "Trust → Indication → ...")
- **Note**: Directory filter only applies to directory charts; for indication charts the `directory` column stores Search_Terms
### Icicle Chart ### Icicle Chart
- **Enhanced**: Now includes full 10-field customdata structure - **Enhanced**: Now includes full 10-field customdata structure
- **Added**: Treatment statistics (average_spacing, cost_pp_pa) in hover tooltips - **Added**: Treatment statistics (average_spacing, cost_pp_pa) in hover tooltips
- **Added**: First/last seen dates for drug nodes - **Added**: First/last seen dates for drug nodes
- **Added**: Indication chart uses `generate_icicle_chart_indication()` with Search_Term hierarchy
## Development ## Development
+4 -4
@@ -158,10 +158,10 @@ python -m reflex compile
- reflex run crashes on Windows due to Granian/watchfiles FileNotFoundError (environment issue, not code) - reflex run crashes on Windows due to Granian/watchfiles FileNotFoundError (environment issue, not code)
### 5.2 Update Documentation ### 5.2 Update Documentation
- [ ] Update CLAUDE.md with new architecture - [x] Update CLAUDE.md with new architecture
- [ ] Document new CLI arguments - [x] Document new CLI arguments
- [ ] Document chart_type toggle behavior - [x] Document chart_type toggle behavior
- [ ] Update data flow diagrams - [x] Update data flow diagrams
--- ---
+40
@@ -416,3 +416,43 @@ The previous `batch_lookup_indication_groups()` function in `diagnosis_lookup.py
- The app is feature-complete — only documentation and optional visual polish remain - The app is feature-complete — only documentation and optional visual polish remain
### Blocked items: ### Blocked items:
- Visual testing at multiple viewport sizes blocked by Granian/watchfiles Windows crash - Visual testing at multiple viewport sizes blocked by Granian/watchfiles Windows crash
## Iteration 8 — 2026-02-05
### Task: 5.2 Update Documentation
### Why this task:
- This is the only remaining non-blocked task in the implementation plan
- Previous iteration explicitly recommended completing Task 5.2 next
- All code is complete and validated — only documentation remains
### Status: COMPLETE
### What was done:
Updated CLAUDE.md comprehensively to document all indication chart features:
1. **Project Overview** — Added dual chart types as key feature, GP diagnosis matching with ~93% match rate
2. **CLI Commands** — Documented `--chart-type` argument (all/directory/indication) with timing estimates
3. **Pathway Data Architecture** — Added chart types table, indication hierarchy, chart_type in node structure, UNIQUE constraint
4. **Data Processing Module** — Documented `get_patient_indication_groups()`, `CLUSTER_MAPPING_SQL`, indication pipeline functions
5. **Analysis Module** — Added `generate_icicle_chart_indication()` documentation
6. **Reflex Application** — Documented chart type state, toggle component, known limitation (directory filter)
7. **Data Flow Diagram** — Rewrote to show dual chart type processing with GP diagnosis lookup branch
8. **Database Schema** — Added `chart_type` column, updated UNIQUE constraint, indexes
9. **Key Patterns** — Added indication lookup workflow, kept legacy validation workflow
10. **Breaking Changes** — Added chart type toggle section, updated state variables
11. **Input Data** — Added `PseudoNHSNoLinked` column requirement
12. **Output** — Updated to describe both chart views
### Validation results:
- Tier 1 (Code): N/A (documentation only)
- Tier 2 (Data): N/A
- Tier 3 (Functional): N/A
### Files changed:
- `CLAUDE.md` — comprehensive documentation updates for indication chart features
- `IMPLEMENTATION_PLAN.md` — marked Task 5.2 items complete
### Committed: 315065c "docs: update CLAUDE.md with indication chart architecture and CLI docs (Task 5.2)"
### Patterns discovered:
- None (documentation task)
### Next iteration should:
- Note that all implementation plan tasks are now complete except the deferred viewport testing (blocked by Granian crash)
- Check if `<promise>COMPLETE</promise>` signal can be issued (all tasks [x] except one deferred sub-item)
- If the deferred viewport testing sub-item prevents completion, consider whether it should be marked [B] (blocked)
### Blocked items:
- Task 5.1 sub-item "Test at multiple viewport sizes" still blocked by Granian/watchfiles Windows crash