NHS High-Cost Drug Patient Pathway Analysis Tool
A web-based application for analyzing secondary care patient treatment pathways. It processes clinical activity data to visualize hierarchical treatment patterns (Trust → Directory/Specialty → Drug → Patient pathway) as interactive Plotly icicle charts.
Features
- Interactive Visualization: Plotly icicle charts showing patient treatment hierarchies with cost and frequency statistics
- Multi-Source Data Loading: CSV/Parquet files, SQLite database, or direct Snowflake integration
- GP Diagnosis Validation: Validate patient indications against GP SNOMED codes via NHS Snowflake
- Modern Web Interface: Browser-based UI using Reflex framework with NHS branding
- Flexible Filtering: Filter by date range, NHS trusts, drugs, and medical directories
- Export Options: Export charts as interactive HTML or data as CSV
Requirements
- Python 3.10 or higher
- pip or uv package manager
Optional (for Snowflake integration)
- snowflake-connector-python package
- Access to NHS Snowflake data warehouse with SSO authentication
Installation
Using pip
# Clone the repository
git clone <repository-url>
cd patient-pathway-analysis
# Install dependencies
pip install -r requirements.txt
Using uv (recommended)
# Install uv if not already installed
pip install uv
# Sync dependencies
uv sync
Install with test dependencies
pip install -e ".[test]"
Quick Start
1. Run the Web Application (Recommended)
reflex run
Open http://localhost:3000 in your browser.
Usage
Web Interface (Reflex)
- Load Data: On the home page, select your data source:
  - SQLite Database: Uses pre-loaded data from data/pathways.db
  - File Upload: Drag and drop a CSV or Parquet file
  - Snowflake: Fetch data directly from NHS Snowflake (requires configuration)
- Configure Filters:
  - Set date range (Start Date, End Date, Last Seen After)
  - Navigate to Drug/Trust/Directory selection pages using the sidebar
  - Use search boxes to find and select items
  - Set minimum patient threshold to filter small groups
- Run Analysis: Click "Run Analysis" to generate the icicle chart
- Export Results:
  - Export HTML: Save the interactive chart as a standalone HTML file
  - Export CSV: Export the filtered data as a CSV file
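The filter settings from the Configure Filters step can be pictured as a set of conditions that must all hold. The sketch below is a minimal illustration under assumed semantics: the `PathwayGroup` record, its field names, and the rule that "Last Seen After" keeps groups last seen strictly after the given date are hypothetical and may not match the application's behavior.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PathwayGroup:
    """Hypothetical record shape; the real column names may differ."""
    trust: str
    directory: str
    drug: str
    first_seen: date
    last_seen: date
    patients: int


def apply_filters(groups, start, end, last_seen_after,
                  trusts=(), drugs=(), directories=(), min_patients=0):
    """Return groups passing every filter; an empty selection means no restriction."""
    kept = []
    for g in groups:
        in_range = start <= g.first_seen <= end
        still_active = g.last_seen > last_seen_after
        selected = (
            (not trusts or g.trust in trusts)
            and (not drugs or g.drug in drugs)
            and (not directories or g.directory in directories)
        )
        if in_range and still_active and selected and g.patients >= min_patients:
            kept.append(g)
    return kept


# Hypothetical sample data
GROUPS = [
    PathwayGroup("Trust A", "Rheumatology", "ADALIMUMAB", date(2022, 3, 1), date(2023, 9, 1), 42),
    PathwayGroup("Trust A", "Dermatology", "SECUKINUMAB", date(2022, 5, 1), date(2023, 8, 1), 3),
    PathwayGroup("Trust B", "Rheumatology", "ETANERCEPT", date(2019, 1, 1), date(2020, 1, 1), 20),
]

result = apply_filters(
    GROUPS,
    start=date(2022, 1, 1),
    end=date(2023, 12, 31),
    last_seen_after=date(2023, 1, 1),
    trusts={"Trust A"},
    min_patients=5,
)
```

Here the minimum patient threshold drops the three-patient group, and the date range drops the Trust B group.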
Data Migration
To populate the SQLite database from CSV files:
# Initialize database schema
python -m data_processing.migrate
# Load reference data from CSV files
python -m data_processing.migrate --reference-data --verify
# Load patient data from a CSV/Parquet file
python -m data_processing.migrate --load-patient-data path/to/data.csv
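For readers curious what loading patient data into SQLite involves, here is a minimal sketch using only the standard library. The table name `patient_activity`, its columns, and the sample rows are assumptions for illustration; the real schema is defined by the project's own migration code.

```python
import csv
import io
import sqlite3

# Hypothetical CSV layout; the real export may use different columns.
SAMPLE_CSV = """patient_id,provider_code,drug_name,intervention_date,cost
P001,RX1,ADALIMUMAB,2023-01-15,350.00
P001,RX1,ETANERCEPT,2023-06-02,410.50
P002,RY2,SECUKINUMAB,2023-03-20,600.00
"""


def load_patient_rows(conn, csv_text):
    """Create the (hypothetical) table if needed and insert every CSV row."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS patient_activity (
               patient_id TEXT NOT NULL,
               provider_code TEXT NOT NULL,
               drug_name TEXT NOT NULL,
               intervention_date TEXT NOT NULL,
               cost REAL NOT NULL
           )"""
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (r["patient_id"], r["provider_code"], r["drug_name"],
         r["intervention_date"], float(r["cost"]))
        for r in reader
    ]
    # The connection context manager commits on success, rolls back on error.
    with conn:
        conn.executemany(
            "INSERT INTO patient_activity VALUES (?, ?, ?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database for the example
loaded = load_patient_rows(conn, SAMPLE_CSV)
```

Parameterized inserts (`?` placeholders) keep values out of the SQL text, which matters when loading externally supplied files.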
Snowflake Configuration
To use Snowflake integration, edit config/snowflake.toml:
[connection]
account = "your-account-identifier"
warehouse = "your-warehouse"
database = "DATA_HUB"
schema = "CDM"
authenticator = "externalbrowser" # NHS SSO authentication
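A quick way to sanity-check this file before launching the app is to parse it and confirm the expected keys are present. The sketch below is an assumption-laden example: `load_connection_params` and `REQUIRED_KEYS` are hypothetical names, and the application may read its configuration differently.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # Python 3.10: third-party backport with the same API
    import tomli as tomllib

# Same content as the config/snowflake.toml example above.
SAMPLE_TOML = """
[connection]
account = "your-account-identifier"
warehouse = "your-warehouse"
database = "DATA_HUB"
schema = "CDM"
authenticator = "externalbrowser"  # NHS SSO authentication
"""

# Hypothetical list of keys this sketch treats as mandatory.
REQUIRED_KEYS = ("account", "warehouse", "database", "schema", "authenticator")


def load_connection_params(toml_text):
    """Parse the TOML text and return the [connection] table as a dict."""
    config = tomllib.loads(toml_text)
    connection = config.get("connection", {})
    missing = [key for key in REQUIRED_KEYS if not connection.get(key)]
    if missing:
        raise ValueError(f"Missing connection settings: {', '.join(missing)}")
    return dict(connection)


params = load_connection_params(SAMPLE_TOML)
```

With snowflake-connector-python installed, a dict like `params` could be passed as keyword arguments to `snowflake.connector.connect`; SSO logins typically also need a `user` value, which the sample configuration does not show.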
Project Structure
.
├── core/ # Core configuration and models
├── data_processing/ # Data layer (SQLite, Snowflake, loaders)
├── analysis/ # Analysis pipeline (refactored from generate_graph)
├── visualization/ # Chart generation (Plotly)
├── pathways_app/ # Reflex web application
├── tools/ # Legacy modules (original analysis engine)
├── config/ # Configuration files
├── data/ # Reference data and SQLite database
├── docs/ # Additional documentation
└── tests/ # Test suite
See CLAUDE.md for detailed architecture documentation.
Documentation
- docs/USER_GUIDE.md - End-user guide for using the web interface
- docs/DEPLOYMENT.md - Production deployment guide (Docker, nginx, cloud)
- CLAUDE.md - Technical architecture documentation for developers
Deployment
Quick production start:
# Run in production mode
reflex run --env prod
Running Tests
# Run all tests
python -m pytest tests/ -v
# Run with coverage
python -m pytest tests/ -v --cov=core --cov=data_processing --cov=analysis
# Run only fast tests (exclude slow/integration)
python -m pytest tests/ -v -m "not slow"
Reference Data Files
The data/ directory contains essential reference files:
| File | Purpose |
|---|---|
| include.csv | Drug filter list with default selections |
| defaultTrusts.csv | NHS Trust list for filtering |
| directory_list.csv | Medical specialties/directories |
| drugnames.csv | Drug name standardization mapping |
| org_codes.csv | Provider code to organization name mapping |
| drug_directory_list.csv | Valid drug-to-directory mappings |
| drug_indication_clusters.csv | Drug to SNOMED cluster mappings |
| ta-recommendations.xlsx | NICE TA recommendations |
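To show how a standardization mapping such as drugnames.csv might be applied, the sketch below builds a lookup from a small inline sample. The column names `raw_name` and `standard_name`, the sample rows, and both helper functions are hypothetical; check the actual file header before adapting this.

```python
import csv
import io

# Hypothetical layout; the real drugnames.csv may use different columns.
SAMPLE_DRUGNAMES = """raw_name,standard_name
HUMIRA,ADALIMUMAB
Adalimumab 40mg pen,ADALIMUMAB
ENBREL,ETANERCEPT
"""


def load_name_map(csv_text):
    """Build a case-insensitive lookup from raw drug names to standard names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["raw_name"].strip().upper(): row["standard_name"].strip()
        for row in reader
    }


def standardize(name, name_map):
    """Return the standard name, or the cleaned input if no mapping exists."""
    cleaned = name.strip().upper()
    return name_map.get(cleaned, cleaned)


name_map = load_name_map(SAMPLE_DRUGNAMES)
```

Falling back to the cleaned input keeps unmapped drugs visible in the output instead of silently dropping them.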
Troubleshooting
Reflex compilation errors
If you encounter compilation errors when running reflex run:
# Clear the build cache and restart
rm -rf .web
reflex run
Snowflake connection issues
- Ensure snowflake-connector-python is installed: pip install snowflake-connector-python
- Check that config/snowflake.toml has the correct account identifier
- For SSO authentication, a browser window will open automatically
SQLite database not found
If data/pathways.db doesn't exist, create it:
python -m data_processing.migrate
python -m data_processing.migrate --reference-data
Development
Code Quality
# Type checking
python -m mypy core/ data_processing/ analysis/ --ignore-missing-imports
# Run tests with coverage report
python -m pytest tests/ -v --cov=core --cov=data_processing --cov-report=html
Adding New Reference Data
- Add CSV file to data/ directory
- Define schema in data_processing/schema.py
- Create migration function in data_processing/reference_data.py
- Add path to PathConfig in core/config.py
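The last step above, registering a path, might look something like the following. This is only a stand-in: the real `PathConfig` in core/config.py is not shown here, so the `data_dir` field, the property names, and the file new_reference.csv are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PathConfig:
    """Hypothetical stand-in for the PathConfig in core/config.py."""
    data_dir: Path = Path("data")

    @property
    def drug_names(self) -> Path:
        # Existing reference file listed in the table of reference data.
        return self.data_dir / "drugnames.csv"

    @property
    def new_reference(self) -> Path:
        # Hypothetical new entry for the CSV added in the first step.
        return self.data_dir / "new_reference.csv"


paths = PathConfig()
```

Deriving every file from one base directory means tests can point the whole configuration at a temporary folder by passing a different `data_dir`.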
License
Internal NHS use only. Not for distribution.
Support
For questions or issues, contact the Medicines Intelligence team.