Initial commit before Ralph loop
This commit is contained in:
@@ -0,0 +1,229 @@
|
||||
# NHS High-Cost Drug Patient Pathway Analysis Tool
|
||||
|
||||
A web-based application for analyzing secondary care patient treatment pathways. It processes clinical activity data to visualize hierarchical treatment patterns (Trust → Directory/Specialty → Drug → Patient pathway) as interactive Plotly icicle charts.
|
||||
|
||||
## Features
|
||||
|
||||
- **Interactive Visualization**: Plotly icicle charts showing patient treatment hierarchies with cost and frequency statistics
|
||||
- **Multi-Source Data Loading**: CSV/Parquet files, SQLite database, or direct Snowflake integration
|
||||
- **GP Diagnosis Validation**: Validate patient indications against GP SNOMED codes via NHS Snowflake
|
||||
- **Modern Web Interface**: Browser-based UI using Reflex framework with NHS branding
|
||||
- **Flexible Filtering**: Filter by date range, NHS trusts, drugs, and medical directories
|
||||
- **Export Options**: Export charts as interactive HTML or data as CSV
|
||||
|
||||
## Requirements
|
||||
|
||||
- Python 3.10 or higher
|
||||
- pip or uv package manager
|
||||
|
||||
### Optional (for Snowflake integration)
|
||||
- `snowflake-connector-python` package
|
||||
- Access to NHS Snowflake data warehouse with SSO authentication
|
||||
|
||||
## Installation
|
||||
|
||||
### Using pip
|
||||
|
||||
```bash
|
||||
# Clone the repository
|
||||
git clone <repository-url>
|
||||
cd patient-pathway-analysis
|
||||
|
||||
# Install dependencies
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
### Using uv (recommended)
|
||||
|
||||
```bash
|
||||
# Install uv if not already installed
|
||||
pip install uv
|
||||
|
||||
# Sync dependencies
|
||||
uv sync
|
||||
```
|
||||
|
||||
### Install with test dependencies
|
||||
|
||||
```bash
|
||||
pip install -e ".[test]"
|
||||
```
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Run the Web Application (Recommended)
|
||||
|
||||
```bash
|
||||
reflex run
|
||||
```
|
||||
|
||||
Open http://localhost:3000 in your browser.
|
||||
|
||||
## Usage
|
||||
|
||||
### Web Interface (Reflex)
|
||||
|
||||
1. **Load Data**: On the home page, select your data source:
|
||||
- **SQLite Database**: Uses pre-loaded data from `data/pathways.db`
|
||||
- **File Upload**: Drag and drop a CSV or Parquet file
|
||||
- **Snowflake**: Fetch data directly from NHS Snowflake (requires configuration)
|
||||
|
||||
2. **Configure Filters**:
|
||||
- Set date range (Start Date, End Date, Last Seen After)
|
||||
- Navigate to Drug/Trust/Directory selection pages using the sidebar
|
||||
- Use search boxes to find and select items
|
||||
- Set minimum patient threshold to filter small groups
|
||||
|
||||
3. **Run Analysis**: Click "Run Analysis" to generate the icicle chart
|
||||
|
||||
4. **Export Results**:
|
||||
- **Export HTML**: Save the interactive chart as a standalone HTML file
|
||||
- **Export CSV**: Export the filtered data as a CSV file
|
||||
|
||||
### Data Migration
|
||||
|
||||
To populate the SQLite database from CSV files:
|
||||
|
||||
```bash
|
||||
# Initialize database schema
|
||||
python -m data_processing.migrate
|
||||
|
||||
# Load reference data from CSV files
|
||||
python -m data_processing.migrate --reference-data --verify
|
||||
|
||||
# Load patient data from a CSV/Parquet file
|
||||
python -m data_processing.migrate --load-patient-data path/to/data.csv
|
||||
```
|
||||
|
||||
### Snowflake Configuration
|
||||
|
||||
To use Snowflake integration, edit `config/snowflake.toml`:
|
||||
|
||||
```toml
|
||||
[connection]
|
||||
account = "your-account-identifier"
|
||||
warehouse = "your-warehouse"
|
||||
database = "DATA_HUB"
|
||||
schema = "CDM"
|
||||
authenticator = "externalbrowser" # NHS SSO authentication
|
||||
```
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
.
|
||||
├── core/ # Core configuration and models
|
||||
├── data_processing/ # Data layer (SQLite, Snowflake, loaders)
|
||||
├── analysis/ # Analysis pipeline (refactored from generate_graph)
|
||||
├── visualization/ # Chart generation (Plotly)
|
||||
├── pathways_app/ # Reflex web application
|
||||
├── tools/ # Legacy modules (original analysis engine)
|
||||
├── config/ # Configuration files
|
||||
├── data/ # Reference data and SQLite database
|
||||
├── docs/ # Additional documentation
|
||||
└── tests/ # Test suite
|
||||
```
|
||||
|
||||
See `CLAUDE.md` for detailed architecture documentation.
|
||||
|
||||
## Documentation
|
||||
|
||||
- [docs/USER_GUIDE.md](docs/USER_GUIDE.md) - End-user guide for using the web interface
|
||||
- [docs/DEPLOYMENT.md](docs/DEPLOYMENT.md) - Production deployment guide (Docker, nginx, cloud)
|
||||
- [CLAUDE.md](CLAUDE.md) - Technical architecture documentation for developers
|
||||
|
||||
## Deployment
|
||||
|
||||
Quick production start:
|
||||
|
||||
```bash
|
||||
# Run in production mode
|
||||
reflex run --env prod
|
||||
```
|
||||
|
||||
## Running Tests
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
python -m pytest tests/ -v
|
||||
|
||||
# Run with coverage
|
||||
python -m pytest tests/ -v --cov=core --cov=data_processing --cov=analysis
|
||||
|
||||
# Run only fast tests (exclude slow/integration)
|
||||
python -m pytest tests/ -v -m "not slow"
|
||||
```
|
||||
|
||||
## Reference Data Files
|
||||
|
||||
The `data/` directory contains essential reference files:
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `include.csv` | Drug filter list with default selections |
|
||||
| `defaultTrusts.csv` | NHS Trust list for filtering |
|
||||
| `directory_list.csv` | Medical specialties/directories |
|
||||
| `drugnames.csv` | Drug name standardization mapping |
|
||||
| `org_codes.csv` | Provider code to organization name mapping |
|
||||
| `drug_directory_list.csv` | Valid drug-to-directory mappings |
|
||||
| `drug_indication_clusters.csv` | Drug to SNOMED cluster mappings |
|
||||
| `ta-recommendations.xlsx` | NICE TA recommendations |
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Reflex compilation errors
|
||||
|
||||
If you encounter compilation errors when running `reflex run`:
|
||||
|
||||
```bash
|
||||
# Clear the build cache and restart
|
||||
rm -rf .web
|
||||
reflex run
|
||||
```
|
||||
|
||||
### Snowflake connection issues
|
||||
|
||||
1. Ensure `snowflake-connector-python` is installed:
|
||||
```bash
|
||||
pip install snowflake-connector-python
|
||||
```
|
||||
|
||||
2. Check that `config/snowflake.toml` has the correct account identifier
|
||||
|
||||
3. For SSO authentication, a browser window will open automatically
|
||||
|
||||
### SQLite database not found
|
||||
|
||||
If `data/pathways.db` doesn't exist, create it:
|
||||
|
||||
```bash
|
||||
python -m data_processing.migrate
|
||||
python -m data_processing.migrate --reference-data
|
||||
```
|
||||
|
||||
## Development
|
||||
|
||||
### Code Quality
|
||||
|
||||
```bash
|
||||
# Type checking
|
||||
python -m mypy core/ data_processing/ analysis/ --ignore-missing-imports
|
||||
|
||||
# Run tests with coverage report
|
||||
python -m pytest tests/ -v --cov=core --cov=data_processing --cov-report=html
|
||||
```
|
||||
|
||||
### Adding New Reference Data
|
||||
|
||||
1. Add CSV file to `data/` directory
|
||||
2. Define schema in `data_processing/schema.py`
|
||||
3. Create migration function in `data_processing/reference_data.py`
|
||||
4. Add path to `PathConfig` in `core/config.py`
|
||||
|
||||
## License
|
||||
|
||||
Internal NHS use only. Not for distribution.
|
||||
|
||||
## Support
|
||||
|
||||
For questions or issues, contact the Medicines Intelligence team.
|
||||
Reference in New Issue
Block a user