feat: complete Task 2.2 - test refresh pipeline with Snowflake data

Tested full refresh pipeline end-to-end with real Snowflake data:
- Fixed trust filter to read Name column from defaultTrusts.csv
- Fixed Decimal type handling in calculate_cost_per_patient_per_annum
- Fixed array handling in convert_to_records for average_administered
- Added required reference CSV files to data/ directory
- Configured Snowflake connection (account, warehouse, user)
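
For illustration only, the Decimal and array fixes listed above might look roughly like the sketch below. It assumes the Snowflake connector returned NUMBER columns as decimal.Decimal (which cannot be mixed with float in arithmetic) and that average_administered arrived as an array-like value. The helper names, arguments, and the annualisation formula are hypothetical and are not taken from the changed files.

```python
from decimal import Decimal


def to_float(value):
    """Coerce Decimal/int/float to float; None is passed through unchanged."""
    if value is None:
        return None
    return float(value)


def to_plain_list(value):
    """Normalise an array-like value (list, tuple, numpy array, scalar) to a list of floats."""
    if value is None:
        return []
    if hasattr(value, "tolist"):
        # numpy arrays expose tolist(); a 0-d array returns a bare scalar
        value = value.tolist()
    if not isinstance(value, (list, tuple)):
        value = [value]
    return [to_float(item) for item in value]


def cost_per_patient_per_annum(total_cost, patients, days_observed):
    """Hypothetical formula: cost per patient scaled to a 365.25-day year."""
    total = to_float(total_cost)
    days = to_float(days_observed)
    if not patients or not days:
        return None  # avoid division by zero on empty groups
    return total / patients * (365.25 / days)


if __name__ == "__main__":
    print(cost_per_patient_per_annum(Decimal("730.50"), 2, Decimal("365.25")))
    print(to_plain_list((Decimal("1.5"), 2)))
```

Converting at the boundary keeps the downstream arithmetic in plain floats. If exact decimal precision matters for cost figures, the alternative is to keep everything as Decimal instead.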

Results:
- Snowflake fetch: 656,695 records in ~7s
- Transformations: 519,848 records after the UPID/drug/directory steps
- Pathway nodes: 293 for all_6mo (8 trusts, 14 directories)
- Total processing time: ~6.2 minutes
Andrew Charlwood
2026-02-05 00:20:12 +00:00
parent 8b65dfd9a8
commit adc1dbfc58
12 changed files with 1708 additions and 21 deletions
@@ -0,0 +1,9 @@
Code,Name
RM1,NORFOLK AND NORWICH UNIVERSITY HOSPITALS NHS FOUNDATION TRUST
RGR,WEST SUFFOLK NHS FOUNDATION TRUST
RGT,CAMBRIDGE UNIVERSITY HOSPITALS NHS FOUNDATION TRUST
RCX,THE QUEEN ELIZABETH HOSPITAL
RGP,JAMES PAGET UNIVERSITY HOSPITALS NHS FOUNDATION TRUST
RGM,ROYAL PAPWORTH HOSPITAL NHS FOUNDATION TRUST
RGN,NORTH WEST ANGLIA NHS FOUNDATION TRUST
RRV,UNIVERSITY COLLEGE LONDON HOSPITALS NHS FOUNDATION TRUST
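
A minimal sketch of how a trust filter could read the Name column from a Code,Name file like the one above. Whether this file is defaultTrusts.csv is an assumption, and the function names and the "trust_name" record key are hypothetical, not taken from the pipeline code.

```python
import csv
import io

# Two rows copied from the CSV above, used as inline sample data.
SAMPLE_CSV = """Code,Name
RM1,NORFOLK AND NORWICH UNIVERSITY HOSPITALS NHS FOUNDATION TRUST
RGR,WEST SUFFOLK NHS FOUNDATION TRUST
"""


def load_trust_names(handle):
    """Return the set of trust names found in the Name column."""
    reader = csv.DictReader(handle)
    return {row["Name"].strip() for row in reader if row.get("Name")}


def filter_by_trust(records, trust_names, key="trust_name"):
    """Keep only records whose trust name is in the allowed set."""
    return [record for record in records if record.get(key) in trust_names]


if __name__ == "__main__":
    names = load_trust_names(io.StringIO(SAMPLE_CSV))
    rows = [
        {"trust_name": "WEST SUFFOLK NHS FOUNDATION TRUST", "upid": "A1"},
        {"trust_name": "SOME OTHER TRUST", "upid": "B2"},
    ]
    print(filter_by_trust(rows, names))
```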