# data_processing Package
Data layer for the NHS High-Cost Drug Patient Pathway Analysis Tool.
## Core Responsibilities
**Data Pipeline:** `Snowflake → Transforms → Pathway Generation → SQLite`
## Key Modules
**transforms.py** — Core data transformations (moved from tools/data.py):
- `patient_id()` — Creates UPID = Provider Code (first 3 chars) + PersonKey
- `drug_names()` — Standardizes drug names via drugnames.csv lookup
- `department_identification()` — 5-level fallback chain for directory assignment
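
As a rough illustration of the UPID rule described for `patient_id()`, the sketch below joins the first three characters of a provider code to a PersonKey. The helper name `make_upid` and the sample values are hypothetical; the real function in `transforms.py` may operate on whole DataFrame columns and handle missing values differently.

```python
def make_upid(provider_code: str, person_key) -> str:
    """Hypothetical helper: first 3 chars of the provider code + PersonKey."""
    # Slicing is safe even if the code is shorter than 3 characters.
    return f"{provider_code[:3]}{person_key}"


# Sample values are invented for illustration only.
example_upid = make_upid("RGT01", 12345)
print(example_upid)  # RGT12345
```
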
**pathway_pipeline.py** — Pipeline orchestration:
- Processes 6 date filter combinations × 2 chart types (directory + indication)
- `fetch_and_transform_data()` — Snowflake fetch + UPID/drug/directory transforms
- `process_pathway_for_date_filter()` — Directory charts using `generate_icicle_chart()`
- `process_indication_pathway_for_date_filter()` — Indication charts using `generate_icicle_chart_indication()`
- `insert_pathway_records()` — SQLite insertion with parameterized queries
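
The parameterized insertion mentioned for `insert_pathway_records()` could look roughly like the sketch below, which uses the standard `sqlite3` module. The table name `demo_pathway_nodes`, its columns, the `insert_records` helper, and the sample rows are all assumptions made for this example and are not taken from `schema.py`.

```python
import sqlite3

# Hypothetical table and columns, for illustration only.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS demo_pathway_nodes (
    date_filter_id TEXT NOT NULL,
    chart_type     TEXT NOT NULL,
    node_path      TEXT NOT NULL,
    patient_count  INTEGER NOT NULL
)
"""

INSERT_SQL = (
    "INSERT INTO demo_pathway_nodes "
    "(date_filter_id, chart_type, node_path, patient_count) "
    "VALUES (?, ?, ?, ?)"
)


def insert_records(conn: sqlite3.Connection, records) -> int:
    """Insert rows using bound parameters; return how many rows were passed in."""
    rows = list(records)
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(INSERT_SQL, rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_SQL)
inserted = insert_records(
    conn,
    [
        ("demo_filter", "directory", "Trust/Rheumatology", 12),
        # The apostrophe is stored verbatim because the value is bound, not interpolated.
        ("demo_filter", "indication", "Trust/Crohn's disease", 3),
    ],
)
```

Binding values through `?` placeholders keeps quoting out of the SQL string, which is why the apostrophe in the second row needs no escaping.
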
**diagnosis_lookup.py** — GP diagnosis matching:
- `get_patient_indication_groups()` — Batch queries Snowflake (500 patients at a time)
- Embeds ~148 Search_Term → Cluster_ID mappings as SQL CTE
- Returns most recent match per patient via `QUALIFY ROW_NUMBER()`
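
The 500-patient batching used by `get_patient_indication_groups()` can be sketched with a small chunking generator. The name `chunked` and the sample IDs are hypothetical, and the Snowflake query itself is omitted here; only the splitting step is shown.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Invented patient IDs, for illustration only.
patient_ids = [f"P{n}" for n in range(1200)]
batch_sizes = [len(batch) for batch in chunked(patient_ids)]
print(batch_sizes)  # [500, 500, 200]
```

Each batch would then be passed to one query, which keeps the size of any `IN (...)` list or bound-parameter set bounded.
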
**database.py** — SQLite connection pooling and transaction management
**schema.py** — SQL schema definitions (reference tables + pathway_nodes)
**snowflake_connector.py** — Snowflake SSO integration via externalbrowser authenticator
**cache.py** — Query result caching with TTL-based invalidation
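
To make "TTL-based invalidation" concrete, here is a minimal sketch of a cache whose entries expire after a fixed number of seconds. The `TTLCache` class, its methods, and the injectable `clock` parameter are assumptions for this example; the actual interface of `cache.py` is not shown in this document.

```python
import time


class TTLCache:
    """Hypothetical in-memory cache; entries expire `ttl_seconds` after being set."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def set(self, key, value) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return default
        return value
```

A caller would typically check `get()` first and run the query only on a miss, storing the result with `set()`.
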
## Import Pattern
All imports are absolute and reference the `data_processing` package by name:
```python
from data_processing.transforms import patient_id, drug_names, department_identification
from data_processing.pathway_pipeline import process_all_date_filters
```