Initial commit before Ralph loop

Andrew Charlwood
2026-02-04 13:04:29 +00:00
commit fdd33a67af
89 changed files with 20660 additions and 0 deletions
+496
@@ -0,0 +1,496 @@
# Reflex Deployment Guide
This guide covers deployment options for the Patient Pathway Analysis web application built with Reflex.
## Overview
Reflex applications compile to a FastAPI backend and Next.js frontend. This creates two deployment artifacts that can be deployed together or separately depending on your infrastructure requirements.
## Development Mode
For local development:
```bash
# Start development server with hot reload
reflex run
# Access the application at http://localhost:3000
```
## Production Deployment Options
### Option 1: Simple Production (Single Server)
The simplest approach for internal deployments:
```bash
# Run in production mode (optimized build)
reflex run --env prod
```
This starts:
- FastAPI backend on port 8000
- Next.js frontend on port 3000
For background execution:
```bash
# Using nohup (Linux/macOS)
nohup reflex run --env prod > reflex.log 2>&1 &
# Using PowerShell (Windows)
Start-Process -NoNewWindow -FilePath "reflex" -ArgumentList "run --env prod"
```
### Option 2: Separate Backend and Frontend
For more control, run backend and frontend separately:
```bash
# Terminal 1: Start backend only
reflex run --env prod --backend-only
# Terminal 2: Start frontend only
reflex run --env prod --frontend-only
```
### Option 3: Static Export
Export the frontend as static files for deployment on static hosting or CDN:
```bash
# Export application
reflex export
# This creates:
# - frontend.zip (static Next.js build)
# - backend.zip (Python application source)
```
Then:
1. Unzip `frontend.zip` and serve via nginx, Apache, or any static file server
2. Run the backend separately using uvicorn/gunicorn
### Option 4: Docker Deployment
Create a `Dockerfile` for containerized deployment:
```dockerfile
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install Node.js for Reflex frontend build
RUN apt-get update && apt-get install -y curl && \
    curl -fsSL https://deb.nodesource.com/setup_18.x | bash - && \
    apt-get install -y nodejs && \
    rm -rf /var/lib/apt/lists/*
# Copy requirements and install dependencies
COPY requirements.txt pyproject.toml ./
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Initialize Reflex (downloads frontend dependencies)
RUN reflex init --loglevel debug
# Expose ports
EXPOSE 3000 8000
# Start in production mode
CMD ["reflex", "run", "--env", "prod"]
```
Build and run:
```bash
# Build the image
docker build -t pathway-analysis .
# Run the container
docker run -p 3000:3000 -p 8000:8000 \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/config:/app/config \
  pathway-analysis
```
### Option 5: Docker Compose (Recommended for Production)
Create `docker-compose.yml` for multi-container deployment:
```yaml
version: '3.8'
services:
  backend:
    build: .
    command: reflex run --env prod --backend-only
    ports:
      - "8000:8000"
    volumes:
      - ./data:/app/data
      - ./config:/app/config
    environment:
      - REFLEX_ENV=prod
    restart: unless-stopped
  frontend:
    build: .
    command: reflex run --env prod --frontend-only
    ports:
      - "3000:3000"
    depends_on:
      - backend
    environment:
      - REFLEX_ENV=prod
    restart: unless-stopped
```
Run with:
```bash
docker-compose up -d
```
## Reverse Proxy Configuration
### Nginx
For production deployments behind nginx:
```nginx
# /etc/nginx/sites-available/pathway-analysis
server {
    listen 80;
    server_name your-server.nhs.uk;

    # Backend API endpoints
    location /admin {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    location /ping {
        proxy_pass http://localhost:8000;
    }
    location /upload {
        proxy_pass http://localhost:8000;
        client_max_body_size 100M; # For large data file uploads
    }

    # WebSocket connections (required for Reflex state sync)
    location /_event/ {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 86400; # 24 hours for long-running connections
    }

    # Frontend (all other requests)
    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```
Enable the site:
```bash
sudo ln -s /etc/nginx/sites-available/pathway-analysis /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```
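To sanity-check the routing rules, the sketch below mirrors the nginx prefix matching above in plain Python. It is illustrative only: `upstream_for` is a hypothetical helper, not part of nginx or Reflex, and it assumes the four backend prefixes listed in the config are the complete set.

```python
# Backend prefixes copied from the nginx config above; all other paths go to the frontend.
BACKEND_PREFIXES = ("/admin", "/ping", "/upload", "/_event/")
BACKEND = "http://localhost:8000"
FRONTEND = "http://localhost:3000"


def upstream_for(path: str) -> str:
    """Return the upstream that the config above would proxy this URL path to."""
    if any(path.startswith(prefix) for prefix in BACKEND_PREFIXES):
        return BACKEND
    return FRONTEND


if __name__ == "__main__":
    for path in ("/", "/ping", "/_event/", "/admin"):
        print(f"{path} -> {upstream_for(path)}")
```

Pass only the URL path (no query string), since nginx matches `location` prefixes against the path.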
### Caddy (Alternative)
Caddy provides automatic HTTPS:
```caddyfile
# Caddyfile
your-server.nhs.uk {
    # Backend API
    handle /admin/* {
        reverse_proxy localhost:8000
    }
    handle /ping {
        reverse_proxy localhost:8000
    }
    handle /upload {
        reverse_proxy localhost:8000
    }
    handle /_event/* {
        reverse_proxy localhost:8000
    }

    # Frontend
    handle {
        reverse_proxy localhost:3000
    }
}
```
## Process Management
### Systemd (Linux)
Create service files for automatic startup:
```ini
# /etc/systemd/system/pathway-backend.service
[Unit]
Description=Pathway Analysis Backend
After=network.target
[Service]
Type=simple
User=www-data
WorkingDirectory=/opt/pathway-analysis
ExecStart=/usr/bin/reflex run --env prod --backend-only
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
```ini
# /etc/systemd/system/pathway-frontend.service
[Unit]
Description=Pathway Analysis Frontend
After=network.target pathway-backend.service
[Service]
Type=simple
User=www-data
WorkingDirectory=/opt/pathway-analysis
ExecStart=/usr/bin/reflex run --env prod --frontend-only
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
Enable and start:
```bash
sudo systemctl daemon-reload
sudo systemctl enable pathway-backend pathway-frontend
sudo systemctl start pathway-backend pathway-frontend
```
### Windows Service
Use NSSM (Non-Sucking Service Manager) on Windows:
```powershell
# Install NSSM
choco install nssm
# Create service
nssm install PathwayAnalysis "C:\Path\To\reflex.exe" "run --env prod"
nssm set PathwayAnalysis AppDirectory "C:\Path\To\Patient pathway analysis"
nssm start PathwayAnalysis
```
## Environment Configuration
### Production Environment Variables
Set these environment variables for production:
```bash
# Reflex configuration
export REFLEX_ENV=prod
# Database paths (if using custom locations)
export PATHWAY_DB_PATH=/var/data/pathways.db
export PATHWAY_CACHE_DIR=/var/cache/pathway-analysis
# Snowflake (if using)
export SNOWFLAKE_ACCOUNT=your-account
export SNOWFLAKE_WAREHOUSE=your-warehouse
```
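On the Python side, these variables could be read once at startup. The following is a minimal sketch, not the application's actual configuration code: the `Settings` class and `load_settings` function are hypothetical, and the cache directory fallback is an assumption chosen for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    db_path: str
    cache_dir: str
    snowflake_account: Optional[str]
    snowflake_warehouse: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read deployment settings from environment variables.

    The fallback values are assumptions for illustration only.
    """
    return Settings(
        db_path=env.get("PATHWAY_DB_PATH", "data/pathways.db"),
        cache_dir=env.get("PATHWAY_CACHE_DIR", "data/cache"),
        snowflake_account=env.get("SNOWFLAKE_ACCOUNT"),
        snowflake_warehouse=env.get("SNOWFLAKE_WAREHOUSE"),
    )
```

Accepting the environment as a parameter keeps the function easy to test without touching the real process environment.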
### Snowflake Configuration
Ensure `config/snowflake.toml` is properly configured for production:
```toml
[connection]
account = "your-production-account"
warehouse = "ANALYTICS_WH"
database = "DATA_HUB"
schema = "CDM"
authenticator = "externalbrowser" # or "oauth" for service accounts
[cache]
enabled = true
directory = "/var/cache/pathway-analysis"
ttl_seconds = 86400 # 24 hours
```
## Reflex Cloud
For managed hosting, consider [Reflex Cloud](https://reflex.dev/cloud/):
```bash
# Deploy to Reflex Cloud
reflex deploy
```
Benefits:
- Zero-configuration deployment
- Automatic scaling
- Built-in SSL certificates
- Managed Redis-backed state
## Security Considerations
### Network Security
1. **Firewall Rules**: Only expose necessary ports (typically just 80/443)
2. **HTTPS**: Use TLS certificates (Let's Encrypt or organizational certs)
3. **VPN**: Consider restricting access to NHS network only
### Data Security
1. **Database Access**: Ensure SQLite database permissions are restricted
2. **File Uploads**: Validate file types and scan for malware
3. **Snowflake**: Use least-privilege service accounts
### Authentication
For NHS deployments, consider adding authentication:
```python
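# Example: placeholder showing where authentication would be configured.
# As written, this snippet does not enforce any authentication.
import reflex as rx
from starlette.middleware import Middleware
from starlette.middleware.authentication import AuthenticationMiddleware

# In rxconfig.py
config = rx.Config(
    app_name="pathways_app",
    # Placeholder: attach authentication middleware (for example
    # AuthenticationMiddleware) to the backend app; the exact hook
    # depends on your Reflex version.
)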
```
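Whatever hook is used to attach it, the core of basic auth is validating the `Authorization` header. The sketch below shows only that check, using the standard library. `check_basic_auth` is a hypothetical helper and the credentials in the usage note are made up. A real NHS deployment would more likely delegate to organisational SSO than hold a shared password.

```python
import base64
import hmac


def check_basic_auth(header: str, expected_user: str, expected_password: str) -> bool:
    """Return True if an HTTP Basic Authorization header carries the expected credentials."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers invalid base64 (binascii.Error) and invalid UTF-8 (UnicodeDecodeError).
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok
```

A header built from `analyst:s3cret` passes `check_basic_auth(header, "analyst", "s3cret")`, while a wrong password, a non-Basic scheme, or malformed base64 all return `False`. Basic auth sends credentials in reversible encoding, so it is only acceptable over HTTPS.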
## Monitoring
### Health Checks
The application provides endpoints for monitoring:
- `/ping` - Basic health check
- Backend port 8000 - a response on this port confirms the FastAPI backend is running
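A monitoring job could poll the `/ping` endpoint with a few lines of standard-library Python. This is a sketch under assumptions: `is_healthy` is a hypothetical helper, the default URL assumes the backend is reachable on localhost port 8000, and any HTTP 200 response is treated as healthy without inspecting the body.

```python
import urllib.request


def is_healthy(url: str = "http://localhost:8000/ping", timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, as are socket timeouts
        # and refused connections.
        return False
```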
### Logging
Configure logging for production:
```python
# In pathways_app/pathways_app.py
import logging

# The log directory must already exist and be writable by the service user.
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('/var/log/pathway-analysis/app.log'),
        logging.StreamHandler(),
    ],
)
```
## Troubleshooting
### Common Issues
**Port already in use:**
```bash
# Find and kill process using port 3000
lsof -i :3000
kill -9 <PID>
```
**Build cache issues:**
```bash
# Clear Reflex build cache
rm -rf .web
reflex run --env prod
```
**Database connection errors:**
```bash
# Verify database exists and has correct permissions
ls -la data/pathways.db
sqlite3 data/pathways.db ".tables"
```
**Snowflake authentication:**
- Ensure a browser is available for the SSO popup (the `externalbrowser` authenticator opens one on the machine running the application)
- Check firewall allows connections to Snowflake endpoints
- Verify account identifier is correct
## Performance Tuning
### Backend (FastAPI/Uvicorn)
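For high-traffic deployments, run the backend with multiple workers. Workers do not share in-memory state, so configure Redis as described under State Management first: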
```bash
# Run with multiple workers
uvicorn pathways_app:app --workers 4 --host 0.0.0.0 --port 8000
```
### State Management
For multi-instance deployments, configure Redis for state management:
```python
# rxconfig.py
import reflex as rx

config = rx.Config(
    app_name="pathways_app",
    state_manager_mode="redis",
    redis_url="redis://localhost:6379/0",
)
```
### Caching
Enable aggressive caching for Snowflake queries in `config/snowflake.toml`:
```toml
[cache]
enabled = true
ttl_seconds = 86400 # 24 hours for historical data
ttl_current_data_seconds = 3600 # 1 hour for recent data
max_size_mb = 1000 # 1GB cache
```
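The two TTL values imply a simple freshness rule, sketched below. The guide does not say how the application decides whether a query covers "recent" or "historical" data, so that decision is passed in as a flag. `is_cache_fresh` is a hypothetical helper, not the application's cache implementation.

```python
from datetime import datetime, timedelta

TTL_SECONDS = 86400              # historical data, from the config above
TTL_CURRENT_DATA_SECONDS = 3600  # recent data, from the config above


def is_cache_fresh(cached_at: datetime, now: datetime, is_current_data: bool) -> bool:
    """Return True while a cached query result is still within its TTL."""
    ttl = TTL_CURRENT_DATA_SECONDS if is_current_data else TTL_SECONDS
    return now - cached_at < timedelta(seconds=ttl)
```

Under this rule, a result cached two hours ago is still served for a historical query but is refetched for a recent-data query.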
---
## Quick Reference
| Mode | Command | Ports / Output |
|------|---------|----------------|
| Development | `reflex run` | 3000, 8000 |
| Production | `reflex run --env prod` | 3000, 8000 |
| Backend only | `reflex run --backend-only` | 8000 |
| Frontend only | `reflex run --frontend-only` | 3000 |
| Export | `reflex export` | Static files |
| Cloud | `reflex deploy` | Managed |
For more information, see:
- [Reflex Documentation](https://reflex.dev/docs/)
- [Reflex Cloud](https://reflex.dev/cloud/)
- [FastAPI Deployment](https://fastapi.tiangolo.com/deployment/)
+403
@@ -0,0 +1,403 @@
# User Guide - NHS Patient Pathway Analysis Tool
This guide explains how to use the NHS High-Cost Drug Patient Pathway Analysis Tool to analyze treatment pathways for secondary care patients.
## Table of Contents
1. [Getting Started](#getting-started)
2. [Interface Overview](#interface-overview)
3. [Selecting Your Data Source](#selecting-your-data-source)
4. [Configuring Analysis Filters](#configuring-analysis-filters)
5. [Selecting Drugs, Trusts, and Directories](#selecting-drugs-trusts-and-directories)
6. [Running the Analysis](#running-the-analysis)
7. [Understanding the Pathway Chart](#understanding-the-pathway-chart)
8. [Exporting Results](#exporting-results)
9. [GP Indication Validation](#gp-indication-validation)
10. [Keyboard Navigation and Accessibility](#keyboard-navigation-and-accessibility)
11. [Troubleshooting](#troubleshooting)
---
## Getting Started
### Accessing the Application
Start the application by running:
```bash
reflex run
```
Then open your browser to **http://localhost:3000**
The application will automatically load reference data (drugs, trusts, directories) when you first access it.
### First-Time Setup
1. If the filter options did not populate automatically, click **Load Reference Data** on the Home page
2. Select your preferred data source (SQLite, File Upload, or Snowflake)
3. Configure your date range and other filters
4. Click **Run Analysis** to generate your first pathway chart
---
## Interface Overview
The application has four main pages, accessible from the sidebar navigation:
| Page | Purpose |
|------|---------|
| **Home** | Main analysis dashboard with data source selection, filters, and chart display |
| **Drug Selection** | Select which high-cost drugs to include in the analysis |
| **Trust Selection** | Filter by specific NHS trusts |
| **Directory Selection** | Filter by medical directories/specialties |
### Navigation
- **Desktop**: Use the sidebar on the left to switch between pages
- **Mobile**: Use the top navigation bar
- **Keyboard**: Press Tab to navigate, Enter to select
---
## Selecting Your Data Source
The application supports three data sources:
### 1. SQLite Database (Recommended)
Pre-loaded patient data stored locally for fast performance.
**Advantages:**
- Fastest analysis performance
- Works offline
- No authentication required
**To use:** Click "Use SQLite" in the Data Source section
### 2. File Upload
Upload CSV or Parquet files directly.
**Supported formats:**
- CSV files (.csv)
- Apache Parquet files (.parquet, .pq)
**To use:**
1. Drag and drop a file, or click the upload area
2. Wait for the file to process
3. Click "Use File" to select it as your data source
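The format check implied by the supported extensions can be sketched as follows. `detect_upload_format` is a hypothetical helper, not the application's upload handler, and it assumes the format is decided by file extension alone, case-insensitively.

```python
from pathlib import Path
from typing import Optional

SUPPORTED_EXTENSIONS = {".csv": "csv", ".parquet": "parquet", ".pq": "parquet"}


def detect_upload_format(filename: str) -> Optional[str]:
    """Map a file name to "csv" or "parquet", or None if the extension is unsupported."""
    return SUPPORTED_EXTENSIONS.get(Path(filename).suffix.lower())
```

Only the final suffix is considered, so a compressed file such as `data.csv.gz` is rejected in this sketch.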
### 3. Snowflake
Query live data from the NHS data warehouse.
**Requirements:**
- Snowflake must be configured (see `config/snowflake.toml`)
- Browser-based NHS SSO authentication
**To use:** Click "Use Snowflake" - you'll be prompted to authenticate via your browser
---
## Configuring Analysis Filters
The Home page provides several filter options:
### Date Range
| Field | Description |
|-------|-------------|
| **Start Date** | Include patients initiated from this date onwards |
| **End Date** | Include patients initiated until this date |
| **Last Seen After** | Only include patients with activity after this date (excludes patients who haven't been seen recently) |
**Tip:** The default range is the last 12 months.
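The three date fields combine into a single per-patient test, sketched below. The boundary handling is an assumption: Start Date and End Date are treated as inclusive and Last Seen After as exclusive. The function name is hypothetical.

```python
from datetime import date


def in_date_range(initiated: date, last_seen: date,
                  start: date, end: date, last_seen_after: date) -> bool:
    """Apply the three date filters to one patient.

    Assumes Start Date and End Date are inclusive and Last Seen After is exclusive.
    """
    return start <= initiated <= end and last_seen > last_seen_after
```

A patient initiated inside the range is still excluded if their most recent activity falls on or before the Last Seen After date.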
### Minimum Patients
Filter out pathways with fewer patients than the threshold you set.
- Use the slider for quick adjustment (0-100)
- Or type a specific number in the text field
- Set to 0 to show all pathways regardless of patient count
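In code terms, the threshold is a filter over pathway patient counts. The sketch below is illustrative: the function name is hypothetical, and the pathway names and counts in the usage note are made-up examples rather than real data.

```python
def apply_minimum_patients(pathway_counts: dict[str, int], minimum: int) -> dict[str, int]:
    """Keep pathways whose patient count is at least `minimum`; 0 keeps everything."""
    return {name: count for name, count in pathway_counts.items() if count >= minimum}
```

With counts of `{"ADALIMUMAB": 120, "INFLIXIMAB": 8}` and a threshold of 10, only `ADALIMUMAB` remains. A threshold of 0 returns both.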
### Custom Title
Override the automatically generated chart title with your own text.
- Leave empty to use the default title: "Patients initiated [start date] to [end date]"
- Useful for specific reports or presentations
---
## Selecting Drugs, Trusts, and Directories
Each selection page works the same way:
### Navigation
1. Click "Drug Selection", "Trust Selection", or "Directory Selection" in the sidebar
2. The page shows all available options with checkboxes
### Search
Type in the search box to filter the list. The list updates as you type.
### Selection Actions
| Button | Action |
|--------|--------|
| **Select All** | Check all visible items |
| **Clear All** | Uncheck all items |
| **Select Defaults** | (Drugs only) Select pre-configured default drugs (Include=1 in include.csv) |
### Selection Behavior
- **No items selected** = Include ALL items in analysis
- **Some items selected** = Include ONLY the selected items
This means leaving a filter empty is equivalent to "select all".
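The rule reduces to one line of logic. A minimal sketch, with a hypothetical function name:

```python
from typing import Collection


def is_included(value: str, selected: Collection[str]) -> bool:
    """An empty selection means no filter, so every value is included."""
    return not selected or value in selected
```

So `is_included("ADALIMUMAB", set())` is true, while `is_included("ADALIMUMAB", {"INFLIXIMAB"})` is false.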
---
## Running the Analysis
### Steps
1. Ensure your data source is selected and configured
2. Set your date range and other filters
3. Select desired drugs, trusts, and directories (or leave empty for all)
4. Click the green **Run Analysis** button
### During Analysis
- The button shows a spinner while analysis is running
- Status messages appear below the button
- The interface remains responsive - you can review settings
### After Analysis
- The pathway chart appears in the chart section
- Export buttons become available
- GP indication validation results appear (if Snowflake is connected)
---
## Understanding the Pathway Chart
The analysis generates an interactive **icicle chart** showing patient treatment pathways.
### Hierarchy Structure
The chart displays a hierarchical structure:
```
N&WICS (Regional Total)
└─ Trust Name (e.g., "Norfolk and Norwich University Hospitals")
└─ Directory (e.g., "Rheumatology", "Gastroenterology")
└─ Drug Name (e.g., "ADALIMUMAB", "INFLIXIMAB")
```
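The hierarchy above can be thought of as patient counts keyed by path. The sketch below builds those counts from flat records. It is not the application's implementation: `build_hierarchy` is a hypothetical helper, and counting unique patients at each level (rather than rows) is an assumption.

```python
from collections import defaultdict

ROOT = "N&WICS"


def build_hierarchy(records):
    """Count unique patients at each level of root / trust / directory / drug.

    `records` is an iterable of (patient_id, trust, directory, drug) tuples.
    Returns a dict mapping a path tuple to its unique patient count.
    """
    patients = defaultdict(set)
    for patient_id, trust, directory, drug in records:
        path = (ROOT, trust, directory, drug)
        # Register the patient at every prefix of the path, from the root down.
        for depth in range(1, len(path) + 1):
            patients[path[:depth]].add(patient_id)
    return {path: len(ids) for path, ids in patients.items()}
```

Because each level holds a set of patient IDs, a patient who received two drugs in the same directory is counted once at the directory, trust, and root levels.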
### Reading the Chart
- **Width** of each section indicates relative patient count
- **Color intensity** indicates proportion of patients at that level
- **Labels** show the category name and patient count
### Interacting with the Chart
| Action | Effect |
|--------|--------|
| **Click** a section | Zoom in to show details for that branch |
| **Click** the root | Zoom out to show full hierarchy |
| **Hover** over a section | See tooltip with patient count |
| Use the **toolbar** | Reset, download image, pan, zoom |
### Plotly Toolbar
The chart includes a Plotly toolbar (top right) with:
- **Download as PNG** - Save static image
- **Zoom controls** - Zoom in/out
- **Pan** - Click and drag to move
- **Reset** - Return to original view
---
## Exporting Results
Two export options are available after running an analysis:
### Export HTML
Creates an interactive HTML file that can be opened in any browser.
- **Output**: `data/exports/pathway_chart_[timestamp].html`
- **Use case**: Sharing interactive charts via email or file share
- **Features**: Full interactivity, no software required to view
### Export CSV
Exports the underlying data as a spreadsheet.
- **Output**: `data/exports/pathway_data_[timestamp].csv`
- **Use case**: Further analysis in Excel, importing to other tools
- **Includes**: Patient IDs, drugs, dates, costs, directories, indication validation status
### Export Location
All exports are saved to the `data/exports/` directory with timestamped filenames to prevent overwriting.
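The naming scheme can be sketched as below. The exact timestamp format used by the application is not stated in this guide, so `%Y%m%d_%H%M%S` is an assumption, and `export_path` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import PurePosixPath

EXPORT_DIR = PurePosixPath("data/exports")


def export_path(kind: str, now: datetime) -> PurePosixPath:
    """Build a timestamped export path for kind "html" or "csv"."""
    stems = {"html": "pathway_chart", "csv": "pathway_data"}
    timestamp = now.strftime("%Y%m%d_%H%M%S")  # assumed format
    return EXPORT_DIR / f"{stems[kind]}_{timestamp}.{kind}"
```

Second-level timestamps keep successive exports distinct as long as they are not produced within the same second.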
---
## GP Indication Validation
When connected to Snowflake, the application validates whether patients have appropriate GP diagnoses for their prescribed drugs.
### What It Does
1. Looks up the drug's licensed indications (e.g., ADALIMUMAB for rheumatoid arthritis)
2. Finds corresponding SNOMED codes for those indications
3. Checks each patient's GP records for matching diagnoses
4. Reports the match rate per drug
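Steps 3 and 4 amount to a set intersection per patient followed by a percentage. The sketch below assumes the indication codes for a drug have already been looked up (steps 1 and 2). The function name is hypothetical and the codes in the usage note are placeholders, not real SNOMED codes.

```python
def match_rate(patient_codes: dict[str, set[str]], indication_codes: set[str]) -> float:
    """Percentage of patients with at least one GP code matching the drug's indications.

    `patient_codes` maps patient ID to the SNOMED codes in that patient's GP record.
    """
    if not patient_codes:
        return 0.0
    matched = sum(1 for codes in patient_codes.values() if codes & indication_codes)
    return 100.0 * matched / len(patient_codes)
```

For four patients of whom two have a code in the indication set, the function returns `50.0`. An empty patient list returns `0.0` rather than dividing by zero.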
### Understanding Results
After analysis, a table shows:
| Column | Meaning |
|--------|---------|
| **Drug Name** | The high-cost drug |
| **Total Patients** | Number of patients prescribed this drug |
| **With GP Indication** | Patients with matching GP diagnosis |
| **Match Rate** | Percentage with valid indication |
### Match Rate Interpretation
| Rate | Meaning | Color |
|------|---------|-------|
| **80%+** | Good coverage - most patients have GP diagnoses | Green |
| **50-79%** | Moderate coverage - investigate missing cases | Orange |
| **<50%** | Low coverage - may indicate data quality issues or off-label use | Red |
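The bands in the table translate to two thresholds. In this sketch, rates between 79% and 80% fall into the Orange band, which is an assumption since the table does not cover fractional values. `classify_match_rate` is a hypothetical helper.

```python
def classify_match_rate(rate_percent: float) -> str:
    """Map a match rate to the colour band in the table above."""
    if rate_percent >= 80:
        return "Green"
    if rate_percent >= 50:
        return "Orange"
    return "Red"
```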
### Why Rates May Be Low
Low match rates don't necessarily indicate problems:
- **Cross-provider treatment**: Patient's GP is outside the data coverage
- **Recent diagnoses**: Diagnosis not yet recorded in GP system
- **Specialist-only conditions**: Some conditions are only managed in secondary care
- **Off-label prescribing**: Legitimate use for indications not in the mapping
### Enabling/Disabling
Indication validation is enabled by default when Snowflake is connected. It requires:
- Active Snowflake connection
- Drug-to-cluster mappings in the database
---
## Keyboard Navigation and Accessibility
The application is designed to be accessible:
### Skip Link
Press **Tab** when the page loads to reveal a "Skip to main content" link that bypasses navigation.
### Keyboard Navigation
| Key | Action |
|-----|--------|
| **Tab** | Move to next interactive element |
| **Shift+Tab** | Move to previous element |
| **Enter** | Activate buttons, links, checkboxes |
| **Space** | Toggle checkboxes |
| **Arrow keys** | Adjust sliders |
### Screen Reader Support
- All buttons and inputs have descriptive labels
- Status messages are announced via ARIA live regions
- Charts include figure descriptions
### Theme Toggle
A dark/light mode toggle is available at the bottom of the sidebar for visual preference.
---
## Troubleshooting
### "No data available" Error
**Cause**: No data matches your current filter settings
**Solutions:**
1. Check your date range - is it too narrow?
2. Verify your data source has data loaded
3. Check if selected trusts/drugs have any matching records
4. Try clearing all selections (to include everything)
### Chart Not Displaying
**Cause**: Analysis completed but no data met the minimum patients threshold
**Solutions:**
1. Lower the minimum patients threshold
2. Expand your date range
3. Select more drugs or trusts
### Snowflake Connection Failed
**Cause**: Unable to connect to Snowflake
**Solutions:**
1. Check that `config/snowflake.toml` exists and is configured
2. Complete browser authentication when prompted
3. Verify your network allows Snowflake connections
4. Try using SQLite as an alternative data source
### File Upload Failed
**Cause**: File format or content issue
**Solutions:**
1. Ensure file is CSV or Parquet format
2. Check file isn't corrupted or empty
3. Verify file contains required columns
4. Try a smaller file to test
### Slow Performance
**Cause**: Large data volume or complex filtering
**Solutions:**
1. Use SQLite instead of file upload for large datasets
2. Narrow your date range
3. Select fewer drugs/trusts to analyze
4. Increase minimum patients threshold to reduce chart complexity
### Reference Data Not Loading
**Cause**: Missing or corrupted reference files
**Solutions:**
1. Click "Load Reference Data" to retry
2. Check that `data/` directory contains required CSV files:
- `include.csv`
- `defaultTrusts.csv`
- `directory_list.csv`
3. Verify files aren't empty or malformed
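Checks 2 and 3 can be automated with a short script. This is a sketch, not part of the application: `missing_reference_files` is a hypothetical helper, and it only detects absent or zero-byte files, not malformed content.

```python
from pathlib import Path

REQUIRED_FILES = ("include.csv", "defaultTrusts.csv", "directory_list.csv")


def missing_reference_files(data_dir: str = "data") -> list[str]:
    """Return required reference files that are absent or empty in `data_dir`."""
    base = Path(data_dir)
    return [
        name for name in REQUIRED_FILES
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]
```

Run it from the project root. An empty list means all three files are present and non-empty.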
---
## Getting Help
If you encounter issues not covered in this guide:
1. Check the [README](../README.md) for installation and setup information
2. Review [DEPLOYMENT.md](./DEPLOYMENT.md) for server configuration
3. Consult [CLAUDE.md](../CLAUDE.md) for technical architecture details
4. Contact your local support team for NHS-specific questions