Restructured src into a more logical hierarchy
@@ -0,0 +1,42 @@
# data_processing Package

Data layer for the NHS High-Cost Drug Patient Pathway Analysis Tool.

## Core Responsibilities

**Data Pipeline:** `Snowflake → Transforms → Pathway Generation → SQLite`

## Key Modules

**transforms.py** - Core data transformations (moved from `tools/data.py`):
- `patient_id()` - Creates the UPID = Provider Code (first 3 chars) + PersonKey
- `drug_names()` - Standardizes drug names via a `drugnames.csv` lookup
- `department_identification()` - Applies a 5-level fallback chain for directory assignment
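
The three transforms above can be illustrated with a minimal standalone sketch. The helper names `build_upid`, `standardise_drug_name`, and `first_available` are hypothetical and work on plain values; the real functions' signatures, the contents of `drugnames.csv`, and the five actual fallback sources are not described here, so the sample values below are placeholders.

```python
from typing import Mapping, Optional, Sequence


def build_upid(provider_code: str, person_key: str) -> str:
    """UPID = first 3 characters of the provider code + PersonKey."""
    return provider_code[:3] + str(person_key)


def standardise_drug_name(raw_name: str, lookup: Mapping[str, str]) -> str:
    """Map a raw drug name to its standard form; unknown names pass through.

    Assumption: matching is case-insensitive on trimmed text. The real
    drugnames.csv lookup may follow different rules.
    """
    key = raw_name.strip().upper()
    return lookup.get(key, raw_name)


def first_available(candidates: Sequence[Optional[str]]) -> Optional[str]:
    """Return the first non-empty candidate, mimicking a fallback chain."""
    for value in candidates:
        if value:
            return value
    return None


# Placeholder inputs for illustration only.
upid = build_upid("RXA01", "123456")
drug = standardise_drug_name(" humira ", {"HUMIRA": "ADALIMUMAB"})
directory = first_available([None, "", "RHEUMATOLOGY", "GASTROENTEROLOGY", None])
```

Ordering the candidates from most to least trusted source keeps the fallback logic in one place, which is one plausible way to express a multi-level chain.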

**pathway_pipeline.py** - Pipeline orchestration:
- Processes 6 date filter combinations × 2 chart types (directory + indication)
- `fetch_and_transform_data()` - Snowflake fetch + UPID/drug/directory transforms
- `process_pathway_for_date_filter()` - Directory charts using `generate_icicle_chart()`
- `process_indication_pathway_for_date_filter()` - Indication charts using `generate_icicle_chart_indication()`
- `insert_pathway_records()` - SQLite insertion with parameterized queries
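
As a rough sketch of the orchestration above, the snippet below enumerates the 6 × 2 combinations and inserts one row per combination with a parameterized query. The filter labels, the simplified `pathway_nodes` columns, and the in-memory database are assumptions made for illustration; the real schema is defined in `schema.py`.

```python
import sqlite3
from itertools import product

# Hypothetical labels: the real date filter identifiers are not listed here.
DATE_FILTERS = [f"filter_{i}" for i in range(1, 7)]
CHART_TYPES = ["directory", "indication"]

# 6 date filters x 2 chart types = 12 combinations.
runs = list(product(DATE_FILTERS, CHART_TYPES))

conn = sqlite3.connect(":memory:")
# Assumed, simplified table; not the actual pathway_nodes definition.
conn.execute(
    "CREATE TABLE pathway_nodes "
    "(date_filter TEXT, chart_type TEXT, node_id TEXT, value INTEGER)"
)

records = [(date_filter, chart_type, "root", 0) for date_filter, chart_type in runs]

# "?" placeholders keep values out of the SQL text entirely.
with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany(
        "INSERT INTO pathway_nodes (date_filter, chart_type, node_id, value) "
        "VALUES (?, ?, ?, ?)",
        records,
    )

row_count = conn.execute("SELECT COUNT(*) FROM pathway_nodes").fetchone()[0]
```

Binding values through placeholders, rather than formatting them into the SQL string, avoids quoting bugs and SQL injection.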

**diagnosis_lookup.py** - GP diagnosis matching:
- `get_patient_indication_groups()` - Queries Snowflake in batches (500 patients at a time)
- Embeds ~148 Search_Term → Cluster_ID mappings as a SQL CTE
- Returns the most recent match per patient via `QUALIFY ROW_NUMBER()`
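
The batching and "most recent match" logic above might look roughly like the following. This is a sketch under stated assumptions: the table name `gp_diagnoses`, its columns, and the two CTE rows are placeholders, and the `%s` placeholders assume a format-style parameter binding. No query is executed here; the code only builds the batches and the SQL text.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 500  # patients per Snowflake query, as described above


def batched(items: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical query shape. The real CTE holds ~148 mappings; two placeholder
# rows stand in for them here.
QUERY_TEMPLATE = """
WITH search_terms (search_term, cluster_id) AS (
    SELECT * FROM VALUES ('term_a', 'CLUSTER_A'), ('term_b', 'CLUSTER_B')
)
SELECT d.patient_id, t.search_term, d.event_date
FROM gp_diagnoses AS d
JOIN search_terms AS t ON d.cluster_id = t.cluster_id
WHERE d.patient_id IN ({placeholders})
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY d.patient_id ORDER BY d.event_date DESC
) = 1
"""


def build_query(n_patients: int) -> str:
    """Return the SQL text with one bind placeholder per patient in the batch."""
    placeholders = ", ".join(["%s"] * n_patients)
    return QUERY_TEMPLATE.format(placeholders=placeholders)


patient_ids = [f"P{i:04d}" for i in range(1200)]  # synthetic IDs
batches = list(batched(patient_ids))
queries = [build_query(len(batch)) for batch in batches]
```

`QUALIFY` filters on the window function result, so each patient keeps only the row ranked first by descending date.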

**database.py** - SQLite connection pooling and transaction management
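
For illustration, transaction management around a SQLite connection could be sketched as below. The `transaction` helper is hypothetical and is not taken from `database.py`; connection pooling is left out of this sketch.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def transaction(conn: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    """Commit if the block succeeds, roll back if it raises."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (value INTEGER)")
conn.commit()

# Successful block: the insert is committed.
with transaction(conn):
    conn.execute("INSERT INTO demo (value) VALUES (?)", (1,))

# Failing block: the insert is rolled back and the error propagates.
try:
    with transaction(conn):
        conn.execute("INSERT INTO demo (value) VALUES (?)", (2,))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

values = [row[0] for row in conn.execute("SELECT value FROM demo")]
```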

**schema.py** - SQL schema definitions (reference tables + pathway_nodes)

**snowflake_connector.py** - Snowflake SSO integration via the `externalbrowser` authenticator

**cache.py** - Query result caching with TTL-based invalidation
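
TTL-based invalidation can be sketched as follows. The `TTLCache` class and its `get_or_compute` method are hypothetical names, not the API of `cache.py`; the clock is injectable so that expiry can be demonstrated without sleeping.

```python
import time
from typing import Any, Callable, Dict, Hashable, List, Tuple


class TTLCache:
    """Cache values per key and recompute them once they are older than the TTL."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: reuse the cached value
        value = compute()  # missing or expired: recompute and store
        self._entries[key] = (now, value)
        return value


# Demo with a fake clock so that time can be advanced manually.
fake_now = [0.0]
calls: List[int] = []


def run_query() -> str:
    calls.append(1)
    return f"result-{len(calls)}"


cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get_or_compute("q", run_query)
fake_now[0] = 30.0
second = cache.get_or_compute("q", run_query)  # within TTL, served from cache
fake_now[0] = 61.0
third = cache.get_or_compute("q", run_query)  # past TTL, recomputed
```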

## Import Pattern

All imports use package names directly:
```python
from data_processing.transforms import patient_id, drug_names, department_identification
from data_processing.pathway_pipeline import process_all_date_filters
```