feat: integrate drug-aware indication matching into refresh pipeline (Task 3.1)

Replace old per-patient indication matching in refresh_pathways.py with
drug-aware matching via assign_drug_indications(). Each drug is now
cross-referenced against both the patient's GP diagnoses AND the
DimSearchTerm.csv drug mapping. GP codes are restricted to the HCD data
window via the earliest_hcd_date parameter.
Andrew Charlwood
2026-02-05 23:11:01 +00:00
parent d9891c8991
commit 920570b437
3 changed files with 91 additions and 98 deletions
+3 -3
@@ -143,14 +143,14 @@ Only assign a drug to an indication if BOTH conditions are met. If a patient's d
## Phase 3: Pipeline Integration
### 3.1 Update `refresh_pathways.py` indication processing
-- [ ] In the `elif current_chart_type == "indication":` block:
+- [x] In the `elif current_chart_type == "indication":` block:
1. Call `get_patient_indication_groups()` as before (but now returns ALL matches)
2. Load drug mapping: `drug_mapping = load_drug_indication_mapping()`
3. Call `assign_drug_indications(df, gp_matches_df, drug_mapping)`
4. Use modified_df (with indication-aware UPIDs) for pathway processing
5. Use indication_df for the indication mapping
-- [ ] Pass modified_df (not original df) to `process_indication_pathway_for_date_filter()`
-- [ ] Verify: Pipeline compiles, `python -m py_compile cli/refresh_pathways.py`
+- [x] Pass modified_df (not original df) to `process_indication_pathway_for_date_filter()`
+- [x] Verify: Pipeline compiles, `python -m py_compile cli/refresh_pathways.py`
### 3.2 Test with dry run
- [ ] Run `python -m cli.refresh_pathways --chart-type indication --dry-run -v`
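The five steps in 3.1 can be sketched roughly as follows. This is an illustrative sketch only, not the code in `refresh_pathways.py`: the function names come from the task list, but their bodies, the list-of-dicts data shape (standing in for DataFrames), the `UPID|indication` key format, the sample drug mapping, and the handling of unmatched or multiply matched drugs are all assumptions made for the example.

```python
from datetime import date

# Hypothetical stand-ins for the helpers named in the task list.
# Real signatures and return types may differ.

def get_patient_indication_groups(gp_records, earliest_hcd_date):
    """Return ALL GP matches, keeping only codes on or after earliest_hcd_date
    (assumed meaning of "restricted to HCD data window")."""
    return [
        {"upid": r["upid"], "indication": r["indication"]}
        for r in gp_records
        if r["date"] >= earliest_hcd_date
    ]

def load_drug_indication_mapping():
    """Stand-in for reading DimSearchTerm.csv: drug -> set of indications.
    The entries here are made-up sample data."""
    return {
        "adalimumab": {"rheumatoid arthritis", "psoriasis"},
        "omalizumab": {"asthma"},
    }

def assign_drug_indications(rows, gp_matches, drug_mapping):
    """Assign an indication only when BOTH the GP diagnoses and the drug
    mapping agree. Returns (modified_rows, indication_map)."""
    gp_by_patient = {}
    for m in gp_matches:
        gp_by_patient.setdefault(m["upid"], set()).add(m["indication"])

    modified_rows, indication_map = [], {}
    for row in rows:
        candidates = (
            drug_mapping.get(row["drug"], set())
            & gp_by_patient.get(row["upid"], set())
        )
        if not candidates:
            # Assumption: unmatched drugs keep their original UPID.
            modified_rows.append(dict(row))
            continue
        # Assumption: ties are broken alphabetically.
        indication = sorted(candidates)[0]
        new_upid = f"{row['upid']}|{indication}"  # hypothetical key format
        modified_rows.append({**row, "upid": new_upid})
        indication_map[new_upid] = indication
    return modified_rows, indication_map

# Sample data (invented for the example).
rows = [
    {"upid": "P1", "drug": "adalimumab"},
    {"upid": "P1", "drug": "omalizumab"},
    {"upid": "P2", "drug": "adalimumab"},
]
gp_records = [
    {"upid": "P1", "indication": "psoriasis", "date": date(2021, 3, 1)},
    {"upid": "P1", "indication": "asthma", "date": date(2018, 1, 1)},
    {"upid": "P2", "indication": "asthma", "date": date(2022, 1, 1)},
]
earliest_hcd_date = date(2020, 1, 1)

gp_matches = get_patient_indication_groups(gp_records, earliest_hcd_date)  # step 1
drug_mapping = load_drug_indication_mapping()                              # step 2
modified_rows, indication_map = assign_drug_indications(                   # step 3
    rows, gp_matches, drug_mapping
)
# Steps 4 and 5: modified_rows (not rows) would feed pathway processing,
# and indication_map would serve as the indication mapping.
```

In this sample, P1's asthma code falls before `earliest_hcd_date` and is ignored, so only the adalimumab row for P1 receives an indication-aware UPID; the other two rows pass through unchanged.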