- Remove reflex dependency from pyproject.toml - Move pathways_app/ and rxconfig.py to archive/ - Update CLAUDE.md: Dash app structure, callback chain, run command - All completion criteria validated (10/10 pass)
30 KiB
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
NHS High-Cost Drug Patient Pathway Analysis Tool - a web-based application that analyzes secondary care patient treatment pathways. It processes clinical activity data to visualize hierarchical treatment patterns as interactive Plotly icicle charts.
Key Features:
- Dual chart types: Directory-based (Trust → Directory → Drug → Pathway) and Indication-based (Trust → GP Diagnosis → Drug → Pathway) views with toggle
- Pre-computed pathway architecture: Treatment pathways pre-processed and stored in SQLite for instant filtering
- GP diagnosis matching: Patient indications matched from GP records using SNOMED cluster codes queried directly from Snowflake (~93% match rate)
- Data pipeline: Snowflake → pre-computed SQLite pathway nodes (CSV/Parquet file loading retained for legacy compatibility)
- Interactive browser-based UI using Dash (Plotly) + Dash Mantine Components
- 6 pre-defined date filter combinations × 2 chart types = 12 pre-computed datasets with sub-50ms response times
Running the Application
# Install dependencies
uv sync
# One-time dev setup: adds src/ to Python path via .pth file
uv run python setup_dev.py
# Initialize/migrate the database (creates pathway tables)
python -m data_processing.migrate
# Refresh pathway data from Snowflake (requires SSO auth)
python -m cli.refresh_pathways
# Run the Dash web application
python run_dash.py
The application requires Python 3.10+ and runs on http://localhost:8050 by default.
CLI Commands
Refresh Pathway Data:
# Full refresh — both chart types (directory + indication), all date filters
python -m cli.refresh_pathways --chart-type all
# Directory charts only (faster, skips GP diagnosis lookup)
python -m cli.refresh_pathways --chart-type directory
# Indication charts only
python -m cli.refresh_pathways --chart-type indication
# Dry run (test without database changes)
python -m cli.refresh_pathways --chart-type all --dry-run -v
# Custom minimum patient threshold
python -m cli.refresh_pathways --minimum-patients 10
# Help
python -m cli.refresh_pathways --help
The --chart-type argument controls which pathway types are processed:
all(default) — generates both directory and indication charts (~15 minutes)directory— directory-based charts only (~5 minutes)indication— indication-based charts only (~12 minutes, includes GP lookup)
The refresh command:
- Fetches activity data from Snowflake (656K+ records, ~7 seconds)
- Applies UPID, drug name, and directory transformations (~6 minutes)
- For indication charts: queries GP records via SNOMED clusters (~9 minutes for 37K patients)
- Processes 6 date filter combinations × selected chart types
- Inserts pathway nodes to SQLite for fast Dash filtering
Architecture
Package Structure
.
├── src/ # All application library code
│ ├── core/ # Foundation: paths, models, logging
│ │ ├── config.py # PathConfig dataclass for file paths
│ │ ├── models.py # AnalysisFilters dataclass
│ │ └── logging_config.py # Structured logging setup
│ │
│ ├── config/ # Service configuration
│ │ ├── __init__.py # SnowflakeConfig + loader
│ │ └── snowflake.toml # Connection settings (co-located with loader)
│ │
│ ├── data_processing/ # Data layer
│ │ ├── database.py # SQLite connection management
│ │ ├── schema.py # Database schema (reference + pathway tables)
│ │ ├── pathway_pipeline.py # Pipeline: Snowflake → SQLite
│ │ ├── transforms.py # Data transformations (UPID, drug names, directory)
│ │ ├── loader.py # FileDataLoader for CSV/Parquet files
│ │ ├── reference_data.py # Reference data migration
│ │ ├── snowflake_connector.py # Snowflake integration
│ │ ├── cache.py # Query result caching
│ │ ├── data_source.py # Data source fallback chain
│ │ └── diagnosis_lookup.py # GP diagnosis lookup (SNOMED clusters)
│ │
│ ├── analysis/ # Analysis pipeline
│ │ ├── pathway_analyzer.py # prepare_data, calculate_statistics, build_hierarchy
│ │ └── statistics.py # Statistical calculation functions
│ │
│ ├── visualization/ # Chart generation
│ │ └── plotly_generator.py # create_icicle_figure, create_icicle_from_nodes
│ │
│ └── cli/ # CLI tools
│ └── refresh_pathways.py # Data refresh command
│
├── dash_app/ # Dash web application
│ ├── app.py # Dash app, layout root, dcc.Store, register_callbacks
│ ├── assets/
│ │ └── nhs.css # NHS design system CSS (from 01_nhs_classic.html)
│ ├── data/
│ │ ├── queries.py # Thin wrapper calling src/data_processing/pathway_queries.py
│ │ └── card_browser.py # DimSearchTerm.csv → directorate tree for drawer
│ ├── components/
│ │ ├── header.py # Top header bar with data freshness indicator
│ │ ├── sidebar.py # Left navigation with drawer triggers
│ │ ├── kpi_row.py # 4 KPI cards (patients, drugs, cost, match rate)
│ │ ├── filter_bar.py # Chart type toggle pills + date filter dropdowns
│ │ ├── chart_card.py # Chart area with tabs + dcc.Graph + loading spinner
│ │ ├── drawer.py # dmc.Drawer with drug/trust chips + directorate cards
│ │ └── footer.py # Page footer
│ ├── callbacks/
│ │ ├── __init__.py # register_callbacks(app)
│ │ ├── filters.py # Reference data loading + filter state management
│ │ ├── chart.py # Pathway data loading + icicle chart rendering
│ │ ├── drawer.py # Drawer open/close + drug/trust selection
│ │ └── kpi.py # KPI card value updates
│ └── utils/
│ └── __init__.py
│
├── run_dash.py # Entry point: python run_dash.py
├── tests/ # Test suite (113 tests)
├── data/ # Reference data + SQLite DB
├── docs/ # Documentation
├── assets/ # Static assets (logo, favicon)
├── archive/ # Historical/deprecated (includes old Reflex app)
└── logs/ # Runtime logs
Path resolution: src/ is added to sys.path via a .pth file (created by setup_dev.py).
All imports use package names directly: from core import ..., from data_processing import ..., etc.
Pathway Data Architecture
The application uses a pre-computed pathway architecture for performance:
Architecture: Snowflake → Pathway Processing → SQLite (pre-computed) → Dash (filter & view)
Key Benefits:
- Performance: Pathway calculation done once during data refresh, not on every filter change
- Simplicity: Dash callbacks filter pre-computed data with simple SQL WHERE clauses
- Full Pathways: Sequential treatment pathways (drug_0 → drug_1 → drug_2...) with statistics
Chart Types:
| Type | Hierarchy | Level 2 Source |
|---|---|---|
directory |
Trust → Directory → Drug → Pathway | Assigned directorate (5-level fallback) |
indication |
Trust → GP Diagnosis → Drug → Pathway | SNOMED cluster Search_Term from GP records |
For indication charts, ~93% of patients are matched to a GP diagnosis (Search_Term). Unmatched patients use their directorate as a fallback label (e.g., "RHEUMATOLOGY (no GP dx)").
Date Filter Combinations:
| ID | Initiated | Last Seen | Default |
|---|---|---|---|
all_6mo |
All years | Last 6 months | Yes |
all_12mo |
All years | Last 12 months | No |
1yr_6mo |
Last 1 year | Last 6 months | No |
1yr_12mo |
Last 1 year | Last 12 months | No |
2yr_6mo |
Last 2 years | Last 6 months | No |
2yr_12mo |
Last 2 years | Last 12 months | No |
Total pre-computed datasets: 6 date filters × 2 chart types = 12 datasets (~3,600 pathway nodes).
Pathway Node Structure:
Each node in pathway_nodes contains:
- Routing:
chart_type("directory" or "indication"),date_filter_id - Hierarchy:
parents,ids,labels,level(0=Root, 1=Trust, 2=Directory/Indication, 3=Drug, 4+=Pathway) - Counts:
value(patient count) - Costs:
cost,costpp,cost_pp_pa(per patient per annum) - Dates:
first_seen,last_seen,first_seen_parent,last_seen_parent - Statistics:
average_spacing,average_administered,avg_days - Denormalized:
trust_name,directory,drug_sequence(for efficient filtering) - Unique constraint:
UNIQUE(date_filter_id, chart_type, ids)
Core Module (core/)
- PathConfig - Dataclass encapsulating all file paths, with
validate()method - AnalysisFilters - Dataclass for filter state (dates, drugs, trusts, directories)
- logging_config - Structured logging with file and console output
CLI Module (cli/)
- refresh_pathways.py - Command-line tool to refresh pre-computed pathway data:
refresh_pathways()- Main function orchestrating the full pipelineinsert_pathway_records()- SQLite insertion with parameterized querieslog_refresh_start/complete/failed()- Refresh tracking inpathway_refresh_logget_default_filters()- Load trusts/drugs/directories from CSV files
Data Processing Module (data_processing/)
Database Management:
DatabaseManager- SQLite connection pooling and transaction management- Reference Tables:
ref_drug_names,ref_organizations,ref_directories,ref_drug_directory_map,ref_drug_indication_clusters - Pathway Tables:
pathway_date_filters,pathway_nodes,pathway_refresh_log
Pathway Pipeline (pathway_pipeline.py):
DateFilterConfig- Dataclass for date filter configurationDATE_FILTER_CONFIGS- All 6 pre-defined date combinationscompute_date_ranges(config, max_date)- Computes actual ISO dates from configfetch_and_transform_data()- Snowflake fetch + UPID/drug/directory transformations- Directory chart functions:
process_pathway_for_date_filter()- Processes single date filter usinggenerate_icicle_chart()extract_denormalized_fields()- Parsesidscolumn to extract trust, directory, drug_sequence
- Indication chart functions:
process_indication_pathway_for_date_filter()- Processes single date filter usinggenerate_icicle_chart_indication()extract_indication_fields()- Parsesidsfor indication charts (trust, search_term, drug_sequence)
- Shared functions:
convert_to_records(ice_df, chart_type)- Converts ice_df to list of dicts withchart_typecolumnprocess_all_date_filters()- Convenience function to process all 6 filters
Data Loaders:
FileDataLoader- Loads from CSV/Parquet files (used by legacy pipeline, not by Dash app)- Factory function
get_loader()creates aFileDataLoader
Snowflake Integration:
- SSO authentication via
externalbrowserauthenticator fetch_activity_data(start_date, end_date, provider_codes)method- Query caching with TTL-based invalidation
GP Diagnosis Lookup (diagnosis_lookup.py):
CLUSTER_MAPPING_SQL- Embedded SQL constant with ~148 Search_Term → Cluster_ID mappings plus explicit SNOMED codesget_patient_indication_groups(patient_pseudonyms)- Batch queries Snowflake to match patients to GP diagnoses:- Embeds cluster mapping as CTE, joins with
PrimaryCareClinicalCoding - Uses
PseudoNHSNoLinked(not PersonKey) to matchPatientPseudonymin GP records - Returns most recent match per patient via
QUALIFY ROW_NUMBER() - Batches 500 patients per query, returns DataFrame with PatientPseudonym, Search_Term, EventDateTime
- Embeds cluster mapping as CTE, joins with
patient_has_indication(patient_pseudonym, cluster_ids)- Single-patient GP record check (legacy)validate_indication(patient_pseudonym, drug_name)- Full validation result with source tracking (legacy)
Analysis Module (analysis/)
Refactored from the original 267-line generate_graph() function:
- prepare_data() - Filter DataFrame by date range, trusts, drugs, directories (copies df to prevent mutation)
- calculate_statistics() - Compute frequency, cost, duration statistics
- build_hierarchy() - Create Trust → Directory → Drug → Pathway structure
- prepare_chart_data() - Format data for Plotly icicle chart
- generate_icicle_chart_indication(df, indication_df, ...) - Build indication-based hierarchy using Search_Term instead of Directory. Takes an
indication_df(UPID → Search_Term mapping) alongside the main activity DataFrame.
Visualization Module (visualization/)
- create_icicle_figure(ice_df) - Generate Plotly icicle chart from DataFrame (legacy/pipeline use)
- create_icicle_from_nodes(nodes, title) - Generate icicle chart from list-of-dicts (Dash use). Accepts JSON-serializable node dicts from
dcc.Store. Uses NHS blue gradient colorscale, 10-field customdata, Source Sans 3 font. - save_figure_html() - Save interactive HTML file
- open_figure_in_browser() - Open chart in default browser
Shared Data Queries (data_processing/pathway_queries.py)
Shared query functions used by both the Dash app and potentially other consumers:
- load_initial_data(db_path) - Returns available drugs (42), directorates (14), indications (32), trusts (7), total_patients, last_updated
- load_pathway_nodes(db_path, filter_id, chart_type, selected_drugs, selected_directorates, selected_trusts) - Returns pathway nodes, unique_patients, total_drugs, total_cost, last_updated. Parameterized SQL with optional drug/directorate/trust filters.
Dash Application (dash_app/)
State Management via 3 dcc.Store components:
- app-state (session):
chart_type,initiated,last_seen,date_filter_id,selected_drugs,selected_directorates,selected_trusts - chart-data (memory):
nodes[],unique_patients,total_drugs,total_cost,last_updated - reference-data (session):
available_drugs,available_directorates,available_indications,available_trusts,total_patients,last_updated
Callback Chain (unidirectional):
Page Load → load_reference_data → reference-data store + header indicators
→ update_app_state → app-state store (default filters)
→ load_pathway_data → chart-data store
├→ update_kpis → KPI cards
└→ update_chart → dcc.Graph
Filter change → update_app_state → app-state → load_pathway_data → (chain above)
Drawer selection → all-drugs-chips/trust-chips → update_app_state → (chain above)
Key Components:
- Header (
header.py): NHS branding, data freshness indicator (patient count + relative time) - Sidebar (
sidebar.py): Navigation items with drawer trigger IDs for Drug Selection, Trust Selection, Indications - Filter Bar (
filter_bar.py): Chart type toggle pills (By Directory / By Indication) + date filter dropdowns - KPI Row (
kpi_row.py): 4 cards — Unique Patients, Drug Types, Total Cost, Indication Match Rate (~93%) - Chart Card (
chart_card.py): Icicle chart withdcc.Loadingspinner, dynamic subtitle, tab row - Drawer (
drawer.py):dmc.Drawerwith drug chips (dmc.ChipGroup), trust chips, directorate accordion with indication sub-items and drug fragment badges - Footer (
footer.py): NHS Norfolk and Waveney ICB branding
Drawer Drug Browser:
- "All Drugs" section: flat
dmc.ChipGroupwith 42 drugs from pathway_nodes level 3 - "Trusts" section:
dmc.ChipGroupwith 7 trusts - "By Directorate" section: nested
dmc.Accordion— 19 directorates → indications → drug fragmentdmc.Badgeitems - Clicking a drug fragment badge selects all full drug names containing that fragment (substring match)
- "Clear All Filters" button resets drug and trust selections
Data Transformations (data_processing/transforms.py)
Core data transformation functions used by the pipeline:
patient_id()- Creates UPID = Provider Code (first 3 chars) + PersonKeydrug_names()- Standardizes via drugnames.csv lookupdepartment_identification()- 5-level fallback chain for directory assignment
Data Flow
Pre-Computed Pathway Architecture (Current):
[CLI: python -m cli.refresh_pathways --chart-type all]
Snowflake Data Warehouse
│
▼ (fetch_and_transform_data)
┌──────────────────────────────────────────┐
│ Data Transformations (data_processing/transforms.py) │
│ → patient_id() creates UPID │
│ → drug_names() standardizes names │
│ → department_identification() → Dir │
└──────────────────────────────────────────┘
│
├─── Directory Charts ──────────────────────────────────────┐
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ For each of 6 date filter combos: │ │
│ │ → generate_icicle_chart() │ │
│ │ → extract_denormalized_fields() │ │
│ │ → convert_to_records("directory") │ │
│ └──────────────────────────────────────────┘ │
│ │
├─── Indication Charts ─────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ GP Diagnosis Lookup (diagnosis_lookup.py)│ │
│ │ → Extract PseudoNHSNoLinked from HCD │ │
│ │ → get_patient_indication_groups() │ │
│ │ (SNOMED cluster CTE + GP records) │ │
│ │ → Build indication_df: UPID → Search │ │
│ │ Term (matched) or Directorate (no GP)│ │
│ └──────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ For each of 6 date filter combos: │ │
│ │ → generate_icicle_chart_indication() │ │
│ │ → extract_indication_fields() │ │
│ │ → convert_to_records("indication") │ │
│ └──────────────────────────────────────────┘ │
│ │
└───────────────────────┬───────────────────────────────────┘
│
▼ (insert_pathway_records)
┌──────────────────────────────────────────┐
│ SQLite: pathway_nodes table │
│ → ~3,600 nodes across 12 datasets │
│ → UNIQUE(date_filter_id, chart_type, │
│ ids) prevents cross-type overwrites │
│ → Indexed for fast filtering │
└──────────────────────────────────────────┘
[Dash App: python run_dash.py]
┌──────────────────────────────────────────┐
│ Filter Bar + Drawer (toggle pills, │
│ date dropdowns, drug/trust chips) │
│ → Triggers update_app_state callback │
└──────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ load_pathway_data callback │
│ → Input: app-state dcc.Store │
│ → Calls pathway_queries.load_pathway_ │
│ nodes() with filters │
│ → Output: chart-data dcc.Store │
└──────────────────────────────────────────┘
│
├──────────────────────────────┐
▼ ▼
┌────────────────────┐ ┌──────────────────────┐
│ update_kpis │ │ update_chart │
│ → 4 KPI cards │ │ → create_icicle_ │
│ → formatted │ │ from_nodes() │
│ counts/costs │ │ → 10-field custom- │
└────────────────────┘ │ data + NHS blue │
│ → dcc.Graph figure │
└──────────────────────┘
Reference Data Files (data/)
| File | Purpose |
|---|---|
include.csv |
Drug filter list with default selections (Include=1) |
defaultTrusts.csv |
NHS Trust list for filter |
directory_list.csv |
Medical specialties/directories |
drugnames.csv |
Drug name standardization mapping |
org_codes.csv |
Provider code to organization name mapping |
drug_directory_list.csv |
Valid drug-to-directory mappings (pipe-separated) |
treatment_function_codes.csv |
NHS treatment function code mappings |
drug_indication_clusters.csv |
Drug to SNOMED cluster mappings |
ta-recommendations.xlsx |
NICE TA recommendations |
pathways.db |
SQLite database (~3.5 MB: reference tables + pathway nodes) |
Key Patterns
Department Identification Fallback Chain:
The department_identification() function has 5 levels of fallback:
- SINGLE_VALID_DIR - Drug has only one valid directory
- EXTRACTED - Extracted from Additional Detail/Description fields
- CALCULATED_MOST_FREQ - Most frequent valid directory for UPID/Drug
- UPID_INFERENCE - Inferred from other records with same UPID
- UNDEFINED - No directory could be determined
Indication Lookup Workflow (for indication charts):
- Extract unique
PseudoNHSNoLinkedvalues from HCD activity data - Query Snowflake in batches of 500 patients:
- Embed
CLUSTER_MAPPING_SQL(~148 Search_Term → Cluster_ID mappings) as CTE - Join
ClinicalCodingClusterSnomedCodesto get SNOMED codes per cluster - Join
PrimaryCareClinicalCodingonPatientPseudonym=PseudoNHSNoLinked - Use
QUALIFY ROW_NUMBER() OVER (PARTITION BY PatientPseudonym ORDER BY EventDateTime DESC) = 1for most recent match
- Embed
- Build
indication_dfmapping UPID → Search_Term (matched) or Directorate + " (no GP dx)" (unmatched) - Pass to
generate_icicle_chart_indication()for pathway hierarchy building
Data Source Fallback Chain (for raw data loading, not used by Dash app):
- Query cache for recent results
- Attempt Snowflake connection
- Fall back to CSV/Parquet files
Database Schema (~3.5 MB)
Reference Tables
ref_drug_names- Drug name standardizationref_organizations- Provider code to name mappingref_directories- Valid directory namesref_drug_directory_map- Valid drug-directory pairsref_drug_indication_clusters- Drug to SNOMED cluster mapping
Pathway Tables
pathway_date_filters- 6 pre-defined date filter combinations- Columns:
id,initiated,last_seen,is_default,description - Auto-populated via migration
- Columns:
pathway_nodes- Pre-computed pathway hierarchy nodes (~3,600 rows for 12 datasets)- Routing:
chart_type("directory" or "indication"),date_filter_id - Hierarchy:
parents,ids,labels,level - Metrics:
value,cost,costpp,cost_pp_pa,colour - Dates:
first_seen,last_seen,first_seen_parent,last_seen_parent - Statistics:
average_spacing,average_administered,avg_days - Denormalized:
trust_name,directory,drug_sequence - Foreign key:
date_filter_id→pathway_date_filters.id - Unique constraint:
UNIQUE(date_filter_id, chart_type, ids)— critical for INSERT OR REPLACE correctness - Indexed for: date_filter_id, chart_type, trust_name, directory, level
- Routing:
pathway_refresh_log- Tracks data refresh status- Columns:
refresh_id,started_at,completed_at,status,records_processed,error_message,source_row_count
- Columns:
Input Data Requirements
The input data (CSV/Parquet) must contain columns including:
Provider Code,PersonKey- Used to create UPIDPseudoNHSNoLinked- NHS pseudonym for GP record matching (indication charts)Drug Name,Intervention Date,Price ActualOrganisationName- Various
Additional Detail/Descriptioncolumns for directory extraction Treatment Function Code
Output
Interactive Plotly icicle chart with toggleable views:
- Directory view: Trust → Directorate → Drug → Patient Pathway
- Indication view: Trust → GP Diagnosis (Search_Term) → Drug → Patient Pathway
- Patient counts and percentages at each hierarchy level
- Total and average costs
- Treatment duration and dosing frequency information
- Color gradient based on patient volume
Testing
# Run all tests with coverage
python -m pytest tests/ -v --cov=core --cov=analysis
# Run specific test file
python -m pytest tests/test_config.py -v
# Run specific test class
python -m pytest tests/test_data_transformations.py::TestPatientId -v
Test coverage includes:
- PathConfig validation (23 tests)
- AnalysisFilters validation (26 tests)
- Data transformation functions (23 tests)
- Directory assignment logic (19 tests)
Configuration
Snowflake Connection (src/config/snowflake.toml)
[snowflake]
account = "your-account"
database = "DATA_HUB"
schema = "CDM"
warehouse = "your-warehouse"
authenticator = "externalbrowser" # Required for NHS SSO
Logging
Logs are written to logs/ directory with structured format.
Configure via src/core/logging_config.py.
Breaking Changes from Original App
The pre-computed pathway architecture introduces these changes:
Date Filters
- Old: Date pickers for arbitrary
start_dateandend_date - New: Two dropdowns:
- "Treatment Initiated": All years, Last 2 years, Last 1 year
- "Last Seen": Last 6 months, Last 12 months
- Reason: Pre-computed pathways require fixed date combinations for performance
Data Refresh
- Old: Real-time pathway calculation on each filter change
- New: Pre-computed pathways stored in SQLite, refreshed via CLI command
- Impact: Data is as fresh as the last
python -m cli.refresh_pathwaysrun - Benefit: Sub-50ms filter response time vs multi-minute calculations
State Management (Dash)
- State lives in 3
dcc.Storecomponents:app-state,chart-data,reference-data - Filter state:
chart_type,initiated,last_seen,date_filter_id,selected_drugs,selected_directorates,selected_trusts - Chart type toggle: "By Directory" / "By Indication" pills in filter bar
- Dynamic subtitle: "Trust → Directorate → Drug → Pathway" or "Trust → Indication → Drug → Pathway"
- Drug/trust selection via
dmc.ChipGroupin right-side drawer
Icicle Chart
- Full 10-field customdata structure (value, colour, cost, costpp, first_seen, last_seen, first_seen_parent, last_seen_parent, average_spacing, cost_pp_pa)
- NHS blue gradient colorscale: Heritage Blue #003087 → Pale Blue #E3F2FD
- Treatment statistics (average_spacing, cost_pp_pa) in hover tooltips
- First/last seen dates for drug nodes
create_icicle_from_nodes()insrc/visualization/plotly_generator.py— shared function accepting list-of-dicts
Development
Adding New Analysis Features
- Add statistical functions to
src/analysis/statistics.py - Integrate into pipeline in
src/analysis/pathway_analyzer.py - Update visualization in
src/visualization/plotly_generator.py
Adding New Reference Data
- Add CSV file to
data/directory - Define schema in
src/data_processing/schema.py - Create migration function in
src/data_processing/reference_data.py - Add path to
PathConfiginsrc/core/config.py