HighCostDrugsDemo/IMPROVEMENT_RECOMMENDATIONS.md

# Patient Pathway Analysis - Improvement Recommendations

This document outlines recommended improvements to modernize the Patient Pathway Analysis application, based on multi-domain expert analysis.

---

## Executive Summary

| Area | Current State | Recommended Change | Priority |
|------|--------------|-------------------|----------|
| **GUI Framework** | CustomTkinter | **Reflex** (browser-based, native Plotly) | High |
| **Data Storage** | CSV files (90MB+) | SQLite with caching | High |
| **Data Source** | Manual CSV export | Direct Snowflake connection | Medium |
| **Directory Assignment** | Multi-stage fallback | GP diagnosis codes as primary | Medium |
| **Code Quality** | Monolithic, no types | Modular, typed, tested | Low |

---

## 1. GUI Framework: Replace CustomTkinter with Reflex or Flet

### What
Replace the CustomTkinter-based GUI with a modern Python framework. Two strong options:
- **[Reflex](https://reflex.dev)** - React-based, runs in browser
- **[Flet](https://flet.dev)** - Flutter-based, native desktop or browser

### Why

Since Python is approved and standalone `.exe` distribution isn't required, **both frameworks are viable**.

| Criterion | CustomTkinter | Reflex | Flet |
|-----------|---------------|--------|------|
| UI paradigm | Native desktop | Browser (localhost) | Desktop or browser |
| Component richness | Limited | 60+ React components | Material Design |
| Styling | Manual/limited | Full CSS/Tailwind | Flutter theming |
| Plotly integration | External HTML | **Native embed** | WebView needed |
| State management | Manual | Automatic re-render | Manual updates |
| Learning curve | Low | Moderate (React-like) | Low-moderate |
| Community | Small | 22k+ GitHub stars | 12k+ GitHub stars |
| Maturity | Stable | Active (v0.6+) | Active (v0.80+) |

### Recommendation: **Reflex**

Given that:
1. Python is approved for users
2. Standalone `.exe` not required
3. **Interactive Plotly is required** (Reflex has native `rx.plotly()` component)

Reflex is now the better choice because:
- **Native Plotly support** - no need to open external browser windows
- **Modern React-based UI** - cleaner, more customizable
- **Simpler state management** - automatic re-rendering on state changes
- **Better for data apps** - designed for dashboards and data visualization

### How (Reflex)

**Basic app structure:**

```python
import reflex as rx

class State(rx.State):
    """Application state."""
    start_date: str = "2019-04-01"
    end_date: str = "2025-04-30"
    selected_drugs: list[str] = []
    selected_trusts: list[str] = []
    analysis_running: bool = False
    chart_data: dict = {}

    async def run_analysis(self):
        self.analysis_running = True
        yield  # Update UI

        # Run analysis (async)
        df = await self.load_and_process_data()
        self.chart_data = generate_plotly_figure(df)

        self.analysis_running = False

def index() -> rx.Component:
    return rx.box(
        rx.hstack(
            # Sidebar with filters
            rx.vstack(
                rx.date_picker(
                    value=State.start_date,
                    on_change=State.set_start_date,
                ),
                rx.checkbox_group(
                    items=drug_list,
                    value=State.selected_drugs,
                    on_change=State.set_selected_drugs,
                ),
                rx.button(
                    "Run Analysis",
                    on_click=State.run_analysis,
                    loading=State.analysis_running,
                ),
                width="300px",
            ),
            # Main content - interactive Plotly chart
            rx.plotly(data=State.chart_data, layout=chart_layout),
            width="100%",
        )
    )

app = rx.App()
app.add_page(index)
```

**Key components mapping:**

| Current Component | Reflex Equivalent |
|-------------------|-------------------|
| `CTkFrame` | `rx.box`, `rx.vstack`, `rx.hstack` |
| `CTkButton` | `rx.button` |
| `CTkCheckBox` | `rx.checkbox` |
| `CTkSlider` | `rx.slider` |
| `DateEntry` | `rx.date_picker` |
| `CTkScrollableFrame` | `rx.scroll_area` |
| `filedialog` | `rx.upload` |
| Plotly HTML file | **`rx.plotly()`** - native embed! |

**Running the app:**

```bash
# Install
pip install reflex

# Initialize (first time)
reflex init

# Run development server
reflex run
# Opens http://localhost:3000 in browser
```

**Background tasks with progress:**

```python
class State(rx.State):
    progress: int = 0
    status: str = ""

    async def run_analysis(self):
        self.status = "Loading data..."
        self.progress = 10
        yield

        df = load_data()
        self.status = "Processing..."
        self.progress = 50
        yield

        result = process_data(df)
        self.status = "Complete"
        self.progress = 100
        yield
```

### Alternative: Flet

If you prefer a more desktop-like feel, Flet remains a good option:

```python
import flet as ft

def main(page: ft.Page):
    page.title = "HCD Analysis"

    async def run_analysis(e):
        # Background task
        page.run_task(do_analysis)

    page.add(
        ft.Row([
            # Sidebar
            ft.Column([
                ft.DatePicker(),
                ft.ElevatedButton("Run", on_click=run_analysis),
            ]),
            # Chart area (opens in browser for interactivity)
            ft.ElevatedButton("View Chart", on_click=open_chart),
        ])
    )

ft.app(target=main)  # Desktop window
# OR
ft.app(target=main, view=ft.WEB_BROWSER)  # Browser
```

### Effort Estimate
- Learning Reflex basics: 2-3 days
- Rewriting GUI: 1-2 weeks
- Testing and polish: 3-5 days

---

## 2. Data Storage: SQLite Architecture

### What
Replace CSV-based data loading with a SQLite database that stores reference data in normalized tables and caches processed patient data.

### Why

| Aspect | Current (CSV) | SQLite |
|--------|---------------|--------|
| Startup time | 90MB+ file read + full processing | Load reference data once (< 1MB) |
| Memory usage | Entire dataset in memory | Incremental queries |
| Incremental updates | Full reprocess required | Only process new/changed records |
| Query performance | Pandas groupby/merge | Indexed SQL with CTEs |
| Data consistency | Multiple CSVs can drift | Single source of truth with FK constraints |
| Caching | None | Materialized views |

**Expected improvements:**
- 60-80% faster startup
- 50-70% memory reduction
- 90%+ time savings on incremental updates

### How

**Recommended schema (simplified):**

```sql
-- Reference tables
CREATE TABLE ref_drug_names (
    drug_name_raw TEXT PRIMARY KEY,
    drug_name_std TEXT NOT NULL
);

CREATE TABLE ref_organizations (
    org_code TEXT PRIMARY KEY,
    org_name TEXT NOT NULL
);

CREATE TABLE ref_directories (
    directory_id INTEGER PRIMARY KEY,
    directory_name TEXT UNIQUE NOT NULL
);

CREATE TABLE ref_drug_directory_map (
    drug_name_std TEXT,
    directory_id INTEGER,
    is_single_valid BOOLEAN DEFAULT FALSE,
    PRIMARY KEY (drug_name_std, directory_id)
);

-- Patient data (fact table)
CREATE TABLE fact_interventions (
    intervention_id INTEGER PRIMARY KEY,
    upid TEXT NOT NULL,
    provider_code TEXT,
    drug_name_std TEXT NOT NULL,
    intervention_date DATE NOT NULL,
    price_actual REAL,
    directory_id INTEGER,
    directory_assignment_method TEXT,
    data_load_batch_id INTEGER
);

-- Critical indexes
CREATE INDEX idx_upid ON fact_interventions(upid);
CREATE INDEX idx_upid_drug ON fact_interventions(upid, drug_name_std);
CREATE INDEX idx_intervention_date ON fact_interventions(intervention_date);

-- Materialized view for patient summaries (cached aggregations)
CREATE TABLE mv_patient_treatment_summary (
    upid TEXT PRIMARY KEY,
    first_seen DATE,
    last_seen DATE,
    total_cost REAL,
    drug_count INTEGER,
    last_refresh TIMESTAMP
);

-- File tracking for incremental updates
CREATE TABLE processed_files (
    file_path TEXT PRIMARY KEY,
    file_hash TEXT NOT NULL,
    last_processed TIMESTAMP
);
```

**Migration strategy:**

1. **Phase 1**: Create schema, load reference tables from existing CSVs
2. **Phase 2**: Develop incremental load scripts for patient data
3. **Phase 3**: Build materialized views for aggregations
4. **Phase 4**: Modify `dashboard_gui.py` to query SQLite instead of processing CSVs

**Key query replacing pandas aggregation:**

```sql
-- Replaces ~200 lines of pandas groupby/merge
WITH patient_drugs AS (
    SELECT
        upid,
        drug_name_std,
        MIN(intervention_date) as first_date,
        MAX(intervention_date) as last_date,
        COUNT(*) as intervention_count,
        SUM(price_actual) as drug_cost
    FROM fact_interventions
    WHERE intervention_date BETWEEN :start_date AND :end_date
        AND provider_code IN (:trust_filters)
    GROUP BY upid, drug_name_std
)
SELECT * FROM patient_drugs;
```

### Effort Estimate
- Schema design and setup: 2-3 days
- Migration scripts: 3-4 days
- Query optimization: 2-3 days
- Integration testing: 2-3 days

---

## 3. Snowflake Integration

### What
Enable direct download of HCD activity data from Snowflake servers, replacing manual CSV exports.

### Why
- Eliminates manual export step
- Enables date-range filtering at query level (faster)
- Automatic caching with TTL
- Graceful fallback to local files if Snowflake unavailable

### How

**Authentication: SSO Browser Login**

Using `externalbrowser` authenticator - opens system browser for SSO authentication:

```python
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account.region",
    user="your.email@nhs.net",
    authenticator="externalbrowser",
    warehouse="ANALYTICS_WH",
    database="data_hub",
    schema="dwh"
)
```

**Note**: User will see browser popup on first connection each session.

**Configuration (`config/snowflake.toml`):**

```toml
[snowflake]
account = "your_account.region"
warehouse = "ANALYTICS_WH"
database = "DataWarehouse"
schema = "dwh"

[query]
default_timeout = 300
chunk_size = 100000

[cache]
enabled = true
ttl_hours = 24
directory = "./data/cache"
```

**Core connector pattern:**

```python
from snowflake.connector import connect

class SnowflakeConnector:
    def fetch_activity_data(self, start_date, end_date, provider_codes=None):
        query = """
        SELECT
            "Provider Code",
            "PersonKey",
            "ProductDescription" as "Drug Name",
            "Intervention Date",
            "Price Actual",
            -- ... other columns
        FROM DataWarehouse.dwh.FactHighCostDrugs
        WHERE "Intervention Date" BETWEEN :start_date AND :end_date
        """

        with self.connect() as conn:
            cursor = conn.cursor()
            cursor.execute(query, {'start_date': start_date, 'end_date': end_date})
            return cursor.fetch_pandas_all()
```

**Caching strategy:**

| Scenario | Action |
|----------|--------|
| Same date range within 24 hours | Use cache |
| Date range includes today | Query Snowflake (data may be updating) |
| User clicks "Refresh" | Query Snowflake |
| Snowflake unavailable | Fallback to local CSV/Parquet |

**Data loader with fallback:**

```python
class DataLoader:
    def load_data(self, start_date, end_date, force_refresh=False):
        # 1. Try cache
        if self.cache and not force_refresh:
            cached = self.cache.get(start_date, end_date)
            if cached is not None:
                return cached, "cache"

        # 2. Try Snowflake
        try:
            df = self.snowflake.fetch_activity_data(start_date, end_date)
            self.cache.set(df, start_date, end_date)
            return df, "snowflake"
        except SnowflakeConnectionError:
            pass

        # 3. Fallback to local files
        if self.fallback_file.exists():
            return pd.read_parquet(self.fallback_file), "local_file"

        raise RuntimeError("No data source available")
```

**Dependencies to add:**

```toml
dependencies = [
    "snowflake-connector-python[pandas]>=3.12.0",
    "cryptography>=42.0.0",
]
```

### Effort Estimate
- Snowflake connector setup: 2-3 days
- Caching layer: 1-2 days
- GUI integration (data source selector): 1-2 days
- Testing with real data: 2-3 days

---

## 4. GP Diagnosis Code Integration

### What
Use GP diagnosis codes as the **primary source** for directory/specialty assignment, with existing logic as fallback.

### Why
- More accurate: Diagnosis directly indicates specialty
- Reduces "Undefined" assignments
- Leverages existing NHS data linkage
- Maintains current logic as safety net

### How

**NHS diagnosis code landscape:**

| Code System | Usage | Notes |
|-------------|-------|-------|
| **SNOMED CT** | GP systems (mandatory since 2018) | Primary source |
| **ICD-10** | Secondary care | Maps FROM SNOMED CT |
| **Read Codes** | Legacy only | Historical records |

**New priority chain:**

```
1. Drug has single valid directory → use that (unchanged)
2. [NEW] GP diagnosis available → map SNOMED/ICD-10 to directory
3. Extract from clinical data fields (existing)
4. Most frequent for same patient/drug (existing)
5. UPID-based inference (existing)
6. Default to "Undefined" (existing)
```

**ICD-10 to Directory mapping (examples):**

```python
ICD10_TO_DIRECTORY = {
    # Neoplasms (Chapter II)
    "C": ["MEDICAL ONCOLOGY", "CLINICAL ONCOLOGY", "CLINICAL HAEMATOLOGY"],

    # Blood diseases (Chapter III)
    "D5": ["CLINICAL HAEMATOLOGY"],
    "D6": ["CLINICAL HAEMATOLOGY"],

    # Endocrine (Chapter IV)
    "E10": ["DIABETIC MEDICINE"],  # Type 1 diabetes
    "E11": ["DIABETIC MEDICINE"],  # Type 2 diabetes

    # Eye (Chapter VII)
    "H0": ["OPHTHALMOLOGY"],
    "H1": ["OPHTHALMOLOGY"],
    "H2": ["OPHTHALMOLOGY"],
    "H3": ["OPHTHALMOLOGY"],

    # Musculoskeletal (Chapter XIII)
    "M05": ["RHEUMATOLOGY"],  # Rheumatoid arthritis
    "M06": ["RHEUMATOLOGY"],
    "M32": ["RHEUMATOLOGY"],  # SLE

    # Genitourinary (Chapter XIV)
    "N0": ["NEPHROLOGY"],
    "N1": ["NEPHROLOGY"],
    "N18": ["NEPHROLOGY"],  # CKD
}
```

**Multi-diagnosis resolution:**

```python
def resolve_directory_from_diagnoses(diagnoses, drug_valid_dirs):
    """
    When patient has multiple diagnoses:
    1. Filter to diagnoses mapping to directories valid for this drug
    2. Oncology diagnoses take priority (ICD-10 chapter C)
    3. Use most recent active diagnosis
    4. Default to first alphabetically (deterministic)
    """
    valid_matches = []

    for dx in diagnoses:
        icd10_prefix = dx.icd10_code[:3]
        possible_dirs = ICD10_TO_DIRECTORY.get(icd10_prefix, [])
        matching = set(possible_dirs) & set(drug_valid_dirs)

        if matching:
            valid_matches.append({
                'directories': matching,
                'is_oncology': dx.icd10_code.startswith('C'),
                'date': dx.diagnosis_date
            })

    if not valid_matches:
        return None  # Fall back to existing logic

    # Oncology priority
    oncology = [m for m in valid_matches if m['is_oncology']]
    if oncology:
        return sorted(oncology[0]['directories'])[0]

    # Most recent
    valid_matches.sort(key=lambda x: x['date'], reverse=True)
    return sorted(valid_matches[0]['directories'])[0]
```

**Data source options:**

1. **Snowflake linked data** (recommended): Query `data_hub.dwh.DimClinicalCoding` joined via `PatientPseudo`
2. **Local CSV cache**: Pre-extracted GP diagnosis data for offline use
3. **Hybrid**: Cache with Snowflake refresh

**GP Diagnosis Query (confirm column names via Snowflake MCP):**

```sql
SELECT
    PatientPseudo,
    SNOMEDCode,           -- or similar
    ICD10Code,            -- may need mapping from SNOMED
    DiagnosisDate,
    DiagnosisStatus       -- Active/Resolved if available
FROM data_hub.dwh.DimClinicalCoding
WHERE PatientPseudo IN (:patient_pseudo_list)
ORDER BY DiagnosisDate DESC
```

**New reference file needed (`./data/diagnosis_directory_map.csv`):**

```csv
icd10_prefix,directory,priority,notes
C,MEDICAL ONCOLOGY,1,All malignancies
C81,CLINICAL HAEMATOLOGY,1,Hodgkin lymphoma
C90,CLINICAL HAEMATOLOGY,1,Multiple myeloma
E10,DIABETIC MEDICINE,1,Type 1 diabetes
E11,DIABETIC MEDICINE,1,Type 2 diabetes
G35,NEUROLOGY,1,Multiple sclerosis
H0,OPHTHALMOLOGY,1,Eye disorders
M05,RHEUMATOLOGY,1,Rheumatoid arthritis
N18,NEPHROLOGY,1,Chronic kidney disease
```

**Tracking assignment source (for audit):**

```python
df['Directory_Source'] = pd.NA  # New column

# After each assignment step:
df.loc[assigned_mask, 'Directory_Source'] = 'DRUG_SINGLE'      # Step 1
df.loc[assigned_mask, 'Directory_Source'] = 'GP_DIAGNOSIS'     # Step 2 (NEW)
df.loc[assigned_mask, 'Directory_Source'] = 'CLINICAL_EXTRACT' # Step 3
# ... etc
```

### Prerequisites
- Explore `data_hub.dwh.DimClinicalCoding` schema to confirm exact column names (use Snowflake MCP)
- Map `PatientPseudo` to your HCD data (may need to add PatientPseudo to your data extract)
- Obtain SNOMED CT to ICD-10 mapping table from NHS TRUD (if DimClinicalCoding only has SNOMED)

### Effort Estimate
- Mapping table creation: 2-3 days
- Snowflake GP query development: 2-3 days
- Integration with existing logic: 2-3 days
- Validation and testing: 3-5 days

---

## 5. Code Quality Improvements

### What
Modernize the codebase with better structure, type hints, error handling, and testing.

### Why
- `generate_graph()` is 267 lines with complexity >30
- Zero type hints across entire codebase
- Global variables create hidden state
- No automated tests
- Print statements instead of logging

### How

**Quick wins (implement first):**

1. **Replace global variables** with dataclass:
```python
@dataclass
class AnalysisFilters:
    start_date: date
    end_date: date
    last_seen: date
    minimum_patients: int
    selected_trusts: list[str]
    selected_drugs: list[str]
    selected_directories: list[str]
    custom_title: str = ""

    def validate(self) -> list[str]:
        errors = []
        if self.start_date >= self.end_date:
            errors.append("Start date must be before end date")
        return errors
```

2. **Externalize configuration:**
```python
@dataclass
class PathConfig:
    data_dir: Path = Path("./data")

    @property
    def drug_names_file(self) -> Path:
        return self.data_dir / "include.csv"

    @property
    def org_codes_file(self) -> Path:
        return self.data_dir / "org_codes.csv"

    # ... etc for all 7 reference files

    def validate(self) -> list[str]:
        """Check all required files exist at startup."""
        errors = []
        for file_path in [self.drug_names_file, self.org_codes_file, ...]:
            if not file_path.exists():
                errors.append(f"Required file not found: {file_path}")
        return errors
```

3. **Add logging:**
```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("./logs/analysis.log"),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger("PatientPathway")

# Replace all print() with:
logger.info("Starting analysis...")
logger.error(f"Failed to load file: {e}")
```

4. **Extract `generate_graph()` into smaller functions:**
```python
def generate_graph(df, filters: AnalysisFilters, config: PathConfig):
    df = prepare_data(df, filters)           # ~50 lines
    stats = calculate_statistics(df)          # ~80 lines
    hierarchy = build_hierarchy(df, stats)    # ~60 lines
    chart_data = prepare_chart_data(hierarchy) # ~40 lines
    return render_icicle_chart(chart_data, filters.custom_title)  # ~40 lines
```

**Recommended project structure:**

```
project/
├── gui.py                    # Entry point only
├── core/
│   ├── config.py            # PathConfig, AnalysisFilters
│   ├── models.py            # Data models
│   └── exceptions.py        # Custom exceptions
├── data_processing/
│   ├── loader.py            # File/Snowflake loading
│   ├── transformer.py       # Data transformations
│   └── validator.py         # Data validation
├── analysis/
│   ├── pathway_analyzer.py  # Patient pathway calculations
│   └── statistics.py        # Statistical calculations
├── visualization/
│   └── plotly_generator.py  # Graph generation
└── tests/
    ├── test_data_processing.py
    ├── test_analysis.py
    └── test_config.py
```

**Add development dependencies:**

```toml
[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-cov>=4.1.0",
    "mypy>=1.8.0",
    "black>=24.0.0",
    "ruff>=0.2.0",
]
```

**Priority tests to write:**

```python
# tests/test_data_processing.py
def test_drop_duplicate_treatments_ascending():
    """Verify first intervention kept when ascending=True."""
    # ...

def test_drop_duplicate_treatments_descending():
    """Verify last intervention kept when ascending=False."""
    # ...

# tests/test_config.py
def test_path_config_validates_missing_files():
    """Verify validation catches missing reference files."""
    # ...

def test_analysis_filters_validates_date_range():
    """Verify start date must be before end date."""
    # ...
```

### Effort Estimate
- Dataclasses and config: 1-2 days
- Logging setup: 0.5 days
- Extract `generate_graph()`: 2-3 days
- Add type hints (public API): 1-2 days
- Basic test coverage: 2-3 days

---

## Implementation Roadmap

### Phase 1: Foundation (2-3 weeks)
1. Create `PathConfig` and `AnalysisFilters` dataclasses
2. Set up logging infrastructure
3. Design and create SQLite schema
4. Migrate reference data CSVs to SQLite

### Phase 2: Data Layer (2-3 weeks)
1. Implement Snowflake connector with SSO browser auth
2. Build caching layer with TTL
3. Create data loader with fallback chain
4. Migrate `dashboard_gui.py` to use SQLite queries

### Phase 3: Diagnosis Integration (2-3 weeks)
1. Explore `data_hub.dwh.DimClinicalCoding` schema via Snowflake MCP
2. Create ICD-10 to directory mapping table
3. Implement GP diagnosis lookup using `PatientPseudo` linkage
4. Integrate into `department_identification()` as step 2
5. Add `Directory_Source` tracking column

### Phase 4: GUI Modernization (3-4 weeks)
1. Learn Reflex fundamentals
2. Recreate main window and navigation with `rx.vstack`/`rx.hstack`
3. Implement filter panels (date pickers, checkbox groups)
4. Integrate Plotly charts with native `rx.plotly()` component
5. Test with `reflex run`

### Phase 5: Quality & Polish (1-2 weeks)
1. Add type hints to public API
2. Write priority unit tests
3. Extract `generate_graph()` into smaller functions
4. Documentation and cleanup

---

## Configuration Decisions

Based on requirements, the following decisions have been made:

| Question | Decision |
|----------|----------|
| **Snowflake auth** | SSO browser login (`authenticator='externalbrowser'`) |
| **GP diagnosis data** | `data_hub.dwh.DimClinicalCoding` |
| **Patient linkage** | Use `PatientPseudo` (anonymized identifier) - NOT UPID |
| **Plotly interactivity** | Must be interactive - **Reflex has native `rx.plotly()` component** |
| **Distribution** | Python script (`reflex run`) - no .exe needed |

### Implications

**Snowflake SSO**: Connection code becomes:
```python
conn = snowflake.connector.connect(
    account="your_account.region",
    user=os.environ.get("SNOWFLAKE_USER"),
    authenticator="externalbrowser",  # Opens browser for SSO
    warehouse="ANALYTICS_WH",
    database="data_hub",
    schema="dwh"
)
```

**Patient Linkage**: The GP diagnosis query needs to join on `PatientPseudo`, not UPID:
```sql
SELECT
    cc.PatientPseudo,
    cc.SNOMEDCode,      -- Confirm actual column names
    cc.ICD10Code,
    cc.DiagnosisDate
FROM data_hub.dwh.DimClinicalCoding cc
WHERE cc.PatientPseudo IN (:patient_list)
```

**Note**: You'll need to confirm the exact column names in `DimClinicalCoding` - explore via Snowflake MCP or SQL client.

**Plotly Interactivity**: Reflex solves this elegantly with native embedding:
```python
# Interactive Plotly chart directly in the Reflex app
rx.plotly(data=State.chart_data, layout=chart_layout)
```
Full interactivity (zoom, pan, hover tooltips) works in the browser-based app - no external HTML files needed.

---

## References

- [Reflex Documentation](https://reflex.dev/docs/)
- [Reflex Plotly Component](https://reflex.dev/docs/library/graphing/plotly/)
- [Flet Documentation](https://flet.dev/docs/) (alternative)
- [Snowflake Python Connector](https://docs.snowflake.com/en/developer-guide/python-connector/python-connector)
- [NHS SNOMED CT](https://digital.nhs.uk/services/terminology-and-classifications/snomed-ct)
- [NHS ICD-10 Classifications](https://isd.digital.nhs.uk/trud/users/guest/filters/0/categories/28)